August 31, 2011

3 reasons the Obama victory predictor could be wrong

There’s a pretty reliable prediction out that President Obama will win re-election in 2012. As reported by Allahpundit over at Hot Air, the Keys System sees Obama winning. Predictive modelling, while not exactly my forte, is right up my alley. I’m quite familiar with the design and upkeep of models in my day job - I work with them regularly and have even built one or two. Without a great deal of insight into the specifics of the model, I can see a few potential problems with the predicted outcome.


The type of ‘model’ used sounds less like a model and more like a scoring system of yes/no indicators. It’s not exactly a logistic regression model, though there are some similarities. A number of discrete variables are used, but unlike a logistic regression model, where each variable gets its own weight, here they are all weighted equally.

As an aside, the 13 variables being used include incumbency, uncontested primary, third party candidacy, major domestic policy change, domestic social unrest, major scandals, foreign policy failures, foreign policy achievements, opponent charisma, mid-term election results, long term economic outlook, and incumbent charisma.
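
To make the distinction concrete, here’s a minimal sketch of the two approaches, using the variable names above. The yes/no values, the eight-key threshold, and the weights are all placeholders I’ve made up; the point is only the structural difference between counting keys and weighting them.

```python
import math

# Yes/no indicators ("keys") named after the variables listed above.
# The True/False values, the threshold of 8, and the weights are all
# placeholders invented for illustration -- not the system's actual scoring.
keys = {
    "incumbency": True,
    "uncontested_primary": True,
    "no_third_party_candidacy": True,
    "major_domestic_policy_change": True,
    "no_domestic_social_unrest": True,
    "no_major_scandals": True,
    "no_foreign_policy_failures": True,
    "foreign_policy_achievements": True,
    "opponent_lacks_charisma": True,
    "favourable_midterm_results": False,
    "good_long_term_economic_outlook": False,
    "incumbent_charisma": False,
}

# Scoring-system approach: every key counts the same, and the incumbent is
# called the winner if the count clears a threshold (8 is my assumption).
count = sum(keys.values())
print("keys in favour:", count, "->", "incumbent" if count >= 8 else "challenger")

# Logistic-regression-style approach: each variable carries its own weight,
# so some factors can matter more than others. These weights are invented.
weights = {k: 0.4 for k in keys}
weights["good_long_term_economic_outlook"] = 1.2  # e.g. weight the economy more
z = sum(w * (1 if keys[k] else -1) for k, w in weights.items())
print("incumbent win probability: %.2f" % (1 / (1 + math.exp(-z))))
```

The first version has no way to say that one key matters more than another; the second does, which is the weighting issue I come back to under bias.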

The current de-construction has all of the above items in the Obama column, except the last three, which are in favour of the eventual GOP candidate. That’s a de facto win for Obama, case closed, wrap it up, the election is over. The guy’s ‘model’ has a fantastic predictive record, so it looks like an Obama slam dunk. Hold the phone. There are some serious problems with this unbeatable system.

Let me start with an anecdote. When I was in Grade 9, I was really, really into football. Being a math geek, I decided to make some predictions on the upcoming season based on a number of team attributes – successful draft score (which I assigned), offensive yards per game (previous year), defensive yards per game (previous year), and a number of other factors, totalling 12 discrete attributes. Running the scores through for each team and cross-referencing the resulting scores against the schedule, I was able to come up with the prediction that the San Francisco 49ers (who were 6-10 the previous year and 2-14 the year before) would win the Super Bowl against the San Diego Chargers. It was 1981. The San Francisco 49ers won their first Super Bowl, against the Cincinnati Bengals, who beat out the San Diego Chargers in the Freezer Bowl (which clearly helped the Bengals).

Being as young as I was, I knew nothing about predictive modelling, but I did know about gambling. I heard the odds on the 49ers that year were over 60-1. What an opportunity! Although I never gambled on the system I had created, I ran it again the next year and got the Dolphins winning the Super Bowl. Close – they lost to the Redskins. The following year I got both Super Bowl teams wrong. My logic had degraded over time. That’s one of the problems with models, and it comes down to robustness: the predictive power of a model tends to degrade over time. Models need to be re-validated and tweaked, or at times even thrown out and replaced. The world is a dynamic place. Things change. If modelling were easy, the predictive models for the path of hurricanes like Irene would be very precise, every time. It doesn’t work that way. And that’s for models. What I did (and from the sound of it, what the Keys System does) doesn’t constitute a model so much as a calculation. That’s not even as strong as a logistic regression model (or any other sort of model). So the expectation that it would degrade over time is even more reasonable.

Robustness 

I mentioned I had a few issues with the Keys System. The points above fall under the first of them: robustness. In evaluating a model (or any other predictive ‘system’) you need to account for two things - how accurate its predictions are, and how long that predictive ability is going to last. The latter is the robustness (you want stats? Check this out).
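
With the historical calls and outcomes laid out, checking both of those would look something like the sketch below. The records are invented placeholders, not the Keys System’s actual track record; the point is only that you’d want accuracy measured over time, not just a single overall hit rate.

```python
# Hypothetical backtest records: (election, predicted winner, actual winner).
# These are invented placeholders -- not the Keys System's actual track record.
history = [
    ("election 1", "incumbent", "incumbent"),
    ("election 2", "incumbent", "incumbent"),
    ("election 3", "challenger", "challenger"),
    ("election 4", "incumbent", "incumbent"),
    ("election 5", "incumbent", "challenger"),
    ("election 6", "challenger", "incumbent"),
    ("election 7", "incumbent", "incumbent"),
    ("election 8", "challenger", "challenger"),
]

def accuracy(records):
    """Fraction of records where the predicted winner matched the actual one."""
    return sum(pred == actual for _, pred, actual in records) / len(records)

# Overall accuracy answers the first question: how good are the predictions?
print("overall accuracy: %.0f%%" % (100 * accuracy(history)))

# Accuracy split by era answers the second: is the predictive power decaying?
early, late = history[:4], history[4:]
print("earlier elections: %.0f%%" % (100 * accuracy(early)))
print("later elections:   %.0f%%" % (100 * accuracy(late)))
```

In my football example the decay showed up by year three; a system validated only on its overall hit rate won’t show you when that starts to happen.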

The robustness of the Keys System is difficult to gauge because it doesn’t provide a margin of victory. It predicts a winner, but it doesn’t seem to predict by how much (keep in mind I haven’t had any direct insight into the system, so I could be wrong on that). Without a margin of victory, it’s difficult to determine whether the model is robust enough to absorb some decay and still make the correct prediction. If the model indicated Obama was going to win by the skin of his teeth, I’d be very concerned about how robust it actually is.

Bias

The second reason for concern is bias. In a political model, personal viewpoints can creep into the data and cloud the variables being used. In this case there is a significant bias, whether intended or not. As Allahpundit notes:

He’s got The One winning on nine of 13 counts: 
1. No contested primary 
2. Incumbency 
3. No third-party candidate 
4. Major domestic-policy changes in his first term 
5. No social unrest 
6. No major scandals 
7. No major foreign-policy failures 
8. Major foreign-policy achievements in his first term (killing Bin Laden)
9. Little charisma by his likely opponent 
The GOP wins three categories: 
1. The incumbent’s party lost seats in the last House election 
2. The long-term economy looks poor
3. Little charisma by the incumbent
When you look at the nine keys supposedly working in Obama’s favour, Allahpundit rightly points out some questionable assessments. My list of items that might belong on the other side of the ledger includes:

  • No social unrest – just what are the Tea Party rallies?
  • No major scandals – from Fast and Furious all the way back to TurboTax Geithner, we’ve seen a number of scandals.
  • No major foreign policy failures – from borrowing from China, to the missed opportunity in Iran, to misreading the Arab Spring, to botching protocol with the Queen of England while bowing to seemingly every other leader in the free and non-free world, there have been a number of failures. Major ones? The continued development of the Iranian nuclear program and the abandonment of Israel are big deals.
  • Little charisma by his likely opponent – that seems a little presumptuous. Perry, Romney and Bachmann don’t appeal to everyone, but that assessment isn’t entirely fair. It also discounts a possible Palin entry.
  • Little charisma by the incumbent – I’m not a fan of Obama, but you can argue this one isn’t in the right column either (personally, I won’t argue it).

My point is that these ratings are subjective. That being the case, the ‘model’ is subject to manipulation. Another form of bias that possibly exists is that each of the factors is given equal weighting. While the system clearly has a good predictive track record, the long-term economy is probably twice as important as the opponent’s charisma in the minds of a lot of voters. Equal weighting doesn’t make a whole lot of logical sense.
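
As a rough illustration of how much those subjective calls matter, here’s a sketch that starts from the nine-to-three split above and flips just the assessments I questioned. The eight-key threshold for an incumbent win is my own assumption, not something spelled out in the write-up I’m working from.

```python
# The nine-to-three split described above (True = key counted for Obama).
baseline = {
    "uncontested_primary": True,
    "incumbency": True,
    "no_third_party_candidate": True,
    "major_domestic_policy_change": True,
    "no_social_unrest": True,
    "no_major_scandals": True,
    "no_foreign_policy_failures": True,
    "foreign_policy_achievements": True,
    "opponent_lacks_charisma": True,
    "favourable_midterm_results": False,
    "good_long_term_economy": False,
    "incumbent_charisma": False,
}

THRESHOLD = 8  # assumed cut-off for an incumbent win; my assumption, not the system's stated rule

def call(keys):
    return "incumbent wins" if sum(keys.values()) >= THRESHOLD else "challenger wins"

print("as scored:", sum(baseline.values()), "keys ->", call(baseline))

# Flip the assessments I consider questionable and see how quickly the call changes.
disputed = ["no_social_unrest", "no_major_scandals", "no_foreign_policy_failures"]
flipped = dict(baseline, **{key: False for key in disputed})
print("disputed keys flipped:", sum(flipped.values()), "keys ->", call(flipped))
```

Three debatable judgment calls are enough to swing the prediction the other way, and in an equally weighted tally there’s nothing to dampen that.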

Lastly, as far as bias goes, again as Allahpundit points out, major domestic policy change is counted as a positive even though polling indicates that support for Obamacare is at an all-time low, and support for the failed mega-stimulus-that-did-nothing isn’t exactly all warm and fuzzy either. That’s not exactly a big positive as far as I can see. Change for the sake of change, it turns out, is not always a good idea.

Chosen variables

The third reason the prediction falls down is the chosen variables themselves. Sometimes the variables used have hidden correlations. For example, the incumbent’s party lost a lot of seats in the midterm election most likely because of the long-term economic outlook. Those two variables appear to be correlated, yet both are in the model, potentially overstating a single effect (or key, in this case). When that happens, models tend to confuse cause and consequence.
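
One quick way to check for that kind of overlap is to look at how often the two keys move together across past elections. The 0/1 series below are invented for illustration only; the point is that a high correlation means the two keys are largely counting the same underlying thing, the economy, twice.

```python
from math import sqrt

# Invented 0/1 indicators across eight hypothetical elections
# (1 = key goes against the incumbent party, 0 = key goes in its favour).
lost_midterm_seats     = [1, 1, 0, 1, 0, 1, 1, 0]
poor_long_term_economy = [1, 1, 0, 1, 0, 1, 0, 0]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

print("correlation between the two keys: %.2f" % pearson(lost_midterm_seats, poor_long_term_economy))

# Count how often a bad economy costs the incumbent *both* keys at once.
both = sum(a and b for a, b in zip(lost_midterm_seats, poor_long_term_economy))
print("elections where both keys go against the incumbent:", both)
```

When keys are correlated like that, one underlying shock can flip two of them at once, which in an equally weighted count amounts to double-counting it.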

Regression models typically don’t have that many variables, but 13 is an acceptable number. I’d be more comfortable with 5-10 variables being used. Once again, this isn’t a regression model, but it does appear to be attempting to achieve the same result in a similar but less sophisticated way. I’m also not clear on how these variables were arrived at, as opposed to other indicators like the unemployment rate, which is often cited in the media.

The design of the Keys System is also based on the assumption that the election hinges almost entirely on attributes related to incumbency and the incumbent. The challenger almost doesn’t matter in this analysis. That may well be true, but how is that a known starting point? It’s an assumption. Most models contain assumptions as part of their development. That doesn’t mean the assumptions are always correct.

So I’m not convinced that the model is well built. Perhaps the result will prove correct and the Keys System will be vindicated, but I would attribute that ‘success’ less to the validity of the Keys System than to luck. I’m also not sure why someone would put that prediction out there for their system/model at this point. I can see only two reasons for that - overconfidence or bias.

Exit Question: Too dry?
