2012 election predictions

Compiling academic and media forecasters’ 2012 American Presidential election predictions and statistically judging correctness; Nate Silver was not the best.
statistics, politics, R, Bayes
2012-11-05 to 2015-04-21 · finished · certainty: highly likely · importance: 6


Statistically analyzing in R hundreds of predictions compiled for ~10 forecasters of the 2012 American Presidential election, and ranking them by Brier, RMSE, & log scores; the best overall performance seems to be by Drew Linzer and Wang & Ferguson, while Nate Silver appears somewhat over-rated and the famous Intrade prediction market turned in a disappointing overall performance.

In Novem­ber 2012, I was hired by CFAR to com­pile an exten­sive dataset of pun­dits, mod­el­ers, hob­by­ists, and aca­d­e­mics who had attempted to sta­tis­ti­cally fore­cast the 2012 Amer­i­can pres­i­den­tial race and other minor races; the results were inter­est­ing in that they con­tra­dicted the lion­iza­tion of fore­casts in The New York Times. This page is a full list­ing of the R source code I used to pro­duce my analy­sis for the CFAR essay; notes on the deriva­tion of each dataset are stored at 2012-gwern-notes.txt.

The essay itself lives at “Was Nate Sil­ver the Most Accu­rate 2012 Elec­tion Pun­dit?”.

Background

This election prediction judgment is divided into several sections dealing with different categories of predictions:

  1. the over­all Pres­i­den­tial race pre­dic­tions: prob­a­bil­ity of Obama vic­to­ry, final elec­toral vote count, and per­cent­age of pop­u­lar vote
  2. the Pres­i­den­tial state-by-s­tate pre­dic­tions: the per­cent­age Obama will take (vote share/margin/edge), as well as the prob­a­bil­ity he will win that state at all
  3. the Sen­ate state-by-s­tate pre­dic­tions: sim­i­lar, but nor­mal­ized for the Demo­c­ra­tic can­di­date

Few forecasters made predictions in all categories, and those who did make predictions did not always make their full predictions public. Note that all percentages are normalized to the share going to Obama, the Democratic candidate, or in some cases, Independents/Greens. The “Reality” ‘forecaster’ is the ground truth; these figures were all updated 23 November in what is hopefully a final update.

The point of these calculations is to extract Brier scores (for categorical predictions like the probability of an Obama victory) and RMSEs (for continuous/quantitative predictions like vote share). Intrade prices were interpreted as straightforward probabilities without any correction for Intrade’s long-shot bias (see footnote 1).
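As a rough illustration of what a Brier score measures, the sketch below computes it by hand in base R as the mean squared difference between forecast probabilities and 0/1 outcomes. The helper name `brier_by_hand` is hypothetical and is not part of the original analysis, which uses `brier()` from the 'verification' library.

```r
# Brier score: mean squared difference between forecast probabilities
# and outcomes coded as 0/1. Lower is better; 0 is perfect.
# (brier_by_hand is an illustrative name, not from the original analysis.)
brier_by_hand <- function(obs, pred) mean((obs - pred)^2, na.rm = TRUE)

# A single 90.9% forecast for an event that happened: (1 - 0.909)^2
silver_like <- brier_by_hand(1, 0.909)
print(silver_like)   # 0.008281

# A 50% guess scores 0.25 whichever way each event turns out
coin_flip <- brier_by_hand(c(1, 0), c(0.5, 0.5))
print(coin_flip)     # 0.25
```

The first value matches the presidential-win Brier score reported for a 0.909 forecast in the next section.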

Presidential

presidential <- read.csv("https://www.gwern.net/docs/elections/2012-presidential.csv", row.names=1)
# Reality=2012 result; 2008=2008 results
presidential
                probability electoral popular
Reality              1.0000       332   50.79
2008                 1.0000       365   53.00
Nate Silver          0.9090       313   50.80
Drew Linzer          0.9900       332      NA
Simon Jackman        0.9140       332   50.80
DeSart               0.8862       303   51.37
Margin of Error      0.6800       303   51.50
Wang & Ferguson      1.0000       303   51.10
Intrade              0.6580       291   50.75
Josh Putnam              NA       332      NA
Unskewed Polls           NA       263   48.88

# probability can be scored as a Brier score; available in 'verification' library
install.packages("verification")
library(verification)
# handle lists & vectors for later
br <- function(obs, pred) brier(unlist(obs),
                                unlist(pred),
                                bins=FALSE)$bs # bins=FALSE avoids rounding
# convenience function
brp <- function(p) brier(presidential["Reality",]$probability,
                         presidential[p,]$probability,
                         bins=FALSE)$bs
lapply(rownames(presidential)[1:9], brp)
  • Real­i­ty: 0
  • 2008: 0
  • Wang: 0
  • Linz­er: 0.0001
  • Jack­man: 0.007396
  • Sil­ver: 0.008281
  • DeSart: 0.01295044
  • Mar­gin: 0.1024
  • Intrade: 0.116964
  • Ran­dom: 0.25 (50% guess is always 0.25)
# To score electorals and populars, we use RMSE
rmse <- function(obs, pred) sqrt(mean((obs-pred)^2,na.rm=TRUE))
rpe <- function(p) rmse(presidential["Reality",]$electoral, presidential[p,]$electoral)
lapply(rownames(presidential), rpe)
  • Real­i­ty: 0
  • Linz­er: 0
  • Jack­man: 0
  • Put­nam: 0
  • Sil­ver: 19
  • DeSart: 29
  • Mar­gin: 29
  • Wang: 29
  • 2008: 33
  • Intrade: 41
  • Unskewed: 69
rpp <- function(p) rmse(presidential["Reality",]$popular, presidential[p,]$popular)
lapply(rownames(presidential)[c(1:9,11)], rpp)
  • Reality: 0
  • Jackman: 0.01
  • Silver: 0.01
  • Intrade: 0.04
  • Wang: 0.31
  • DeSart: 0.58
  • Margin: 0.71
  • Unskewed: 1.91
  • 2008: 2.21

State

State Win Probabilities

# Reality=final 2012 result - 0 for Romney states, 1 for Obama states
# 2008=2008 state results (same as Reality except Indiana & North Carolina, which Obama won in 2008 but lost in 2012)
statewin <- read.csv("https://www.gwern.net/docs/elections/2012-statewin.csv", row.names=1)
statewin
                    al       ak     az     ar     ca      co     ct      de
Reality         0.0000 0.000000 0.0000 0.0000 1.0000 1.00000 1.0000 1.00000
2008            0.0000 0.000000 0.0000 0.0000 1.0000 1.00000 1.0000 1.00000
Nate Silver     0.0000 0.000000 0.0200 0.0000 1.0000 0.80000 1.0000 1.00000
Drew Linzer     0.0000 0.000086 0.0000 0.0000 1.0000 0.98333 1.0000 0.98333
Margin of Error 0.0269 0.099800 0.4388 0.0451 0.9443 0.64710 0.9125 0.93770
Intrade         0.0000 0.000000 0.0600 0.0000 0.9500 0.55600 0.9900 0.96000
DeSart          0.0000 0.090000 0.0390 0.0000 1.0000 0.52300 0.9990 1.00000
Simon Jackman   0.0052 0.000000 0.0050 0.0000 1.0000 0.76520 1.0000 1.00000
Wang & Ferguson 0.0000 0.000000 0.0000 0.0000 1.0000 0.84000 1.0000 1.00000
Josh Putnam         NA       NA     NA     NA     NA      NA     NA      NA
Unskewed Polls      NA       NA     NA     NA     NA      NA     NA      NA
                   dc     fl     ga     hi     id     il indiana     ia      ks
Reality         1.000 1.0000 0.0000 1.0000 0.0000 1.0000  0.0000 1.0000 0.00000
2008            1.000 1.0000 0.0000 1.0000 0.0000 1.0000  1.0000 1.0000 0.00000
Nate Silver     1.000 0.5000 0.0000 1.0000 0.0000 1.0000  0.0000 0.8400 0.00000
Drew Linzer        NA 0.6040 0.0000 1.0000 0.0000 1.0000  0.0000 0.9966 0.03866
Margin of Error 1.000 0.4575 0.1972 0.9987 0.0086 0.9569  0.3273 0.6467 0.09710
Intrade         0.975 0.3300 0.0300 0.9750 0.0000 0.9890  0.0200 0.6630 0.00000
DeSart          1.000 0.4910 0.0300 1.0000 0.0000 1.0000  0.0020 0.7700 0.00000
Simon Jackman   1.000 0.5216 0.0014 1.0000 0.0000 1.0000  0.0000 0.8376 0.00000
Wang & Ferguson 1.000 0.5000 0.0000 1.0000 0.0000 1.0000  0.0000 0.8400 0.00000
Josh Putnam        NA     NA     NA     NA     NA     NA      NA     NA      NA
Unskewed Polls     NA     NA     NA     NA     NA     NA      NA     NA      NA
                      ky     la     me      md     ma     mi     mn         ms
Reality         0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
2008            0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
Nate Silver     0.000000 0.0000 1.0000 1.00000 1.0000 0.9900 1.0000 0.00000000
Drew Linzer     0.000000 0.0000 1.0000 1.00000 1.0000 1.0000 1.0000 0.07866667
Margin of Error 0.019100 0.0442 0.8403 0.96837 0.8988 0.6837 0.7149 0.13470000
Intrade         0.000000 0.0000 0.9300 0.94000 0.9950 0.8840 0.8490 0.00000000
DeSart          0.000000 0.0000 0.9930 1.00000 1.0000 0.9350 0.9610 0.00000000
Simon Jackman   0.000004 0.0000 1.0000 1.00000 1.0000 0.9998 0.9992 0.00000000
Wang & Ferguson 0.000000 0.0200 1.0000 1.00000 1.0000 1.0000 1.0000 0.00000000
Josh Putnam           NA     NA     NA      NA     NA     NA     NA         NA
Unskewed Polls        NA     NA     NA      NA     NA     NA     NA         NA
                      mo     mt     ne        nv     nh     nj     nm     ny
Reality         0.000000 0.0000 0.0000 1.0000000 1.0000 1.0000 1.0000 1.0000
2008            0.000000 0.0000 0.0000 1.0000000 1.0000 1.0000 1.0000 1.0000
Nate Silver     0.000000 0.0200 0.0000 0.9300000 0.8500 1.0000 0.9900 1.0000
Drew Linzer     0.000000 0.0000 0.0000 0.9993333 0.9980 1.0000 1.0000 1.0000
Margin of Error 0.447300 0.2436 0.0562 0.7710000 0.6886 0.8647 0.8579 0.9697
Intrade         0.050000 0.0500 0.0000 0.8370000 0.6490 0.9790 0.9390 0.9500
DeSart          0.052000 0.0080 0.0000 0.7680000 0.7560 0.9980 0.9740 1.0000
Simon Jackman   0.000004 0.0032 0.0000 0.9120000 0.8324 0.9998 0.9968 1.0000
Wang & Ferguson 0.000000 0.0000 0.0000 0.9900000 0.8400 1.0000 1.0000 1.0000
Josh Putnam           NA     NA     NA        NA     NA     NA     NA     NA
Unskewed Polls        NA     NA     NA        NA     NA     NA     NA     NA
                        nc     nd        oh     ok        or     pa     ri
Reality         0.00000000 0.0000 1.0000000 0.0000 1.0000000 1.0000 1.0000
2008            1.00000000 0.0000 1.0000000 0.0000 1.0000000 1.0000 1.0000
Nate Silver     0.26000000 0.0000 0.9100000 0.0000 1.0000000 0.9900 1.0000
Drew Linzer     0.08533333 0.0000 0.9986667 0.0000 0.9986667 1.0000 1.0000
Margin of Error 0.50030000 0.1284 0.6038000 0.0029 0.7886000 0.7562 0.9684
Intrade         0.23000000 0.0030 0.6550000 0.0010 0.9590000 0.8200 0.9500
DeSart          0.06600000 0.0000 0.7040000 0.0000 0.9430000 0.8810 1.0000
Simon Jackman   0.28120000 0.0000 0.9298000 0.0000 0.9726000 0.9910 1.0000
Wang & Ferguson 0.16000000 0.0000 0.9300000 0.0000 1.0000000 0.9300 1.0000
Josh Putnam             NA     NA        NA     NA        NA     NA     NA
Unskewed Polls          NA     NA        NA     NA        NA     NA     NA
                       sc     sd       tn     tx     ut     vt     va     wa
Reality         0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 1.0000 1.0000
2008            0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 1.0000 1.0000
Nate Silver     0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 0.7900 1.0000
Drew Linzer     0.1386667 0.0000 0.000000 0.0000 0.0000 1.0000 0.9760 1.0000
Margin of Error 0.1345000 0.1665 0.053100 0.0545 0.0035 0.9846 0.5046 0.8473
Intrade         0.0400000 0.0500 0.020000 0.0200 0.0450 0.9800 0.5800 0.9750
DeSart          0.0030000 0.0010 0.000000 0.0000 0.0000 1.0000 1.0000 0.9980
Simon Jackman   0.1290000 0.0068 0.000004 0.0000 0.0000 1.0000 0.7840 1.0000
Wang & Ferguson 0.0000000 0.0000 0.000000 0.0000 0.0000 1.0000 0.8400 1.0000
Josh Putnam            NA     NA       NA     NA     NA     NA     NA     NA
Unskewed Polls         NA     NA       NA     NA     NA     NA     NA     NA
                         wv     wi          wy
Reality         0.000000000 1.0000 0.000000000
2008            0.000000000 1.0000 0.000000000
Nate Silver     0.000000000 0.9700 0.000000000
Drew Linzer     0.001333333 1.0000 0.000666667
Margin of Error 0.042700000 0.6448 0.006900000
Intrade         0.020000000 0.7460 0.000000000
DeSart          0.000000000 0.8560 0.000000000
Simon Jackman   0.005400000 0.9698 0.000000000
Wang & Ferguson 0.000000000 0.9900 0.000000000
Josh Putnam              NA     NA          NA
Unskewed Polls           NA     NA          NA

brstate <- function(p) br(statewin["Reality",], statewin[p,])
lapply(rownames(statewin)[1:9], brstate)
  • Real­i­ty: 0
  • Drew Linz­er: 0.00384326
  • Wang/Ferguson: 0.007615686
  • Nate Sil­ver: 0.00911372
  • Simon Jack­man: 0.00971369
  • DeSart/Holbrook: 0.01605542
  • Intrade: 0.02811906
  • 2008: 0.03921569
  • Mar­gin of Error: 0.05075311
  • ran­dom (50%) guesser 0.25000000

State win vote-shares

Sources:

statemargin <- read.csv("https://www.gwern.net/docs/elections/2012-statemargin.csv", row.names=1)
statemargin
                      al       ak       az       ar       ca       co       ct
Reality         38.42829 40.79253 44.44855 36.87899 59.69455 51.56536 58.38274
2008            38.80000 37.70000 45.00000 38.80000 60.90000 53.50000 60.50000
Nate Silver     36.70000 38.60000 46.20000 38.60000 58.10000 50.80000 56.60000
Drew Linzer     40.30000 37.50000 46.20000 37.10000 59.80000 51.20000 56.80000
Margin of Error 37.00000 41.00000 49.00000 39.00000 61.00000 53.00000 59.00000
Josh Putnam           NA       NA 46.59500       NA 58.39500 50.87500 55.92000
Unskewed Polls  37.78000 36.40000 43.95000 44.68000 57.65000 49.48000 54.55000
Intrade               NA       NA       NA       NA       NA       NA       NA
Simon Jackman   38.70000       NA 46.10000 36.40000 58.60000 51.00000 56.80000
DeSart          35.20000 32.20000 46.40000 38.70000 59.20000 50.10000 57.70000
Wang & Ferguson 42.50000 39.00000 46.00000 38.00000 57.50000 51.00000 56.50000
                      de       dc       fl       ga       hi       id       il
Reality         58.61074 90.91402 50.00787 45.48216 70.54523 32.62233 57.53322
2008            61.90000 92.90000 50.90000 47.00000 71.80000 36.10000 61.80000
Nate Silver     59.60000 93.00000 49.80000 45.50000 66.50000 32.10000 59.80000
Drew Linzer     61.00000       NA 50.20000 46.00000 65.60000 31.20000 60.20000
Margin of Error 60.00000 91.00000 49.00000 44.00000 70.00000 35.00000 61.00000
Josh Putnam           NA       NA 50.08000 45.38000       NA       NA 59.58000
Unskewed Polls  86.88000 57.40000 47.60000 43.20000 58.55000 30.95000 55.25000
Intrade               NA       NA       NA       NA       NA       NA       NA
Simon Jackman         NA 91.60000 50.10000 45.50000 65.00000 32.00000 59.60000
DeSart          60.50000 95.80000 49.90000 45.50000 66.60000 29.10000 60.80000
Wang & Ferguson 62.50000 90.00000 50.00000 46.00000 63.50000 32.00000 59.50000
                 indiana      ia       ks       ky       la       me       md
Reality         44.08345 51.9882 37.82721 37.80994 40.57746 55.96352 61.97419
2008            49.90000 54.0000 41.40000 41.10000 39.90000 57.60000 61.90000
Nate Silver     45.30000 51.1000 37.90000 40.30000 39.30000 55.90000 60.90000
Drew Linzer     44.30000 51.6000 41.10000 45.10000 39.70000 56.40000 61.30000
Margin of Error 47.00000 52.0000 41.00000 36.00000 39.00000 57.00000 62.00000
Josh Putnam     44.36500 51.2750       NA       NA 43.06500 56.17000 60.64500
Unskewed Polls  41.90000 49.8800 36.60000 40.80000 43.63000 51.90000 55.83000
Intrade               NA      NA       NA       NA       NA       NA       NA
Simon Jackman   44.80000 51.4000       NA 40.90000 38.90000 56.00000 61.00000
DeSart          43.10000 51.8000 39.40000 41.90000 39.30000 55.90000 61.90000
Wang & Ferguson 43.50000 51.0000 41.50000 44.50000 43.50000 55.50000 61.00000
                      ma       mi      mn       ms       mo       mt       ne
Reality         60.74886 54.30391 52.6497 43.54862 44.34962 41.70813 37.86805
2008            62.00000 57.40000 54.2000 42.80000 49.30000 47.20000 41.50000
Nate Silver     59.00000 53.00000 53.7000 45.60000 45.60000 45.20000 40.40000
Drew Linzer     60.00000 52.70000 54.2000 41.80000 45.30000 45.30000 42.50000
Margin of Error 58.00000 53.00000 54.0000 43.00000 49.00000 45.00000 40.00000
Josh Putnam     56.17000 52.78500 53.7650       NA 45.92500 45.46500       NA
Unskewed Polls  60.10000 51.75000 51.0300 39.83000 46.20000 38.80000 34.40000
Intrade               NA       NA      NA       NA       NA       NA       NA
Simon Jackman   59.70000 53.60000 54.0000       NA 45.30000 46.00000 42.80000
DeSart          62.80000 53.70000 54.3000 40.00000 46.10000 44.20000 39.80000
Wang & Ferguson 59.50000 52.75000 53.7500 44.00000 45.25000 45.75000 43.00000
                      nv       nh       nj       nm       ny       nc       nd
Reality         52.35625 51.98268 57.85939 52.99547 62.62461 48.35097 38.69731
2008            55.10000 54.30000 56.80000 56.70000 62.20000 49.90000 44.70000
Nate Silver     51.80000 51.40000 55.50000 54.10000 62.40000 48.90000 42.00000
Drew Linzer     52.20000 51.60000 56.60000 54.40000 63.20000 49.10000 41.70000
Margin of Error 55.00000 53.00000 57.00000 57.00000 62.00000 50.00000 43.00000
Josh Putnam     52.02500 51.51500 56.18000 54.56500 62.51000 49.22000       NA
Unskewed Polls  52.15000 50.03000 53.80000 53.53000 58.75000 44.98000 37.15000
Intrade               NA       NA       NA       NA       NA       NA       NA
Simon Jackman   51.90000 51.30000 56.00000 54.40000 62.70000 49.20000 43.00000
DeSart          51.80000 51.70000 57.10000 54.70000 64.60000 47.70000 40.10000
Wang & Ferguson 52.50000 51.00000 56.00000 53.00000 62.00000 49.00000 43.00000
                      oh       ok       or       pa       ri       sc       sd
Reality         50.14323 33.22768 54.30016 51.75834 62.70096 44.08803 39.86614
2008            51.20000 34.40000 57.10000 54.70000 63.10000 44.90000 44.70000
Nate Silver     51.30000 33.80000 53.60000 52.50000 61.80000 43.20000 42.50000
Drew Linzer     51.60000 33.50000 53.60000 52.70000 63.10000 44.30000 44.80000
Margin of Error 52.00000 31.00000 55.00000 55.00000 62.00000 43.00000 44.00000
Josh Putnam     51.47000       NA       NA 52.84500       NA       NA 44.79000
Unskewed Polls  47.75000 35.55000 50.53000 50.28000 59.73000 41.58000 39.88000
Intrade               NA       NA       NA       NA       NA       NA       NA
Simon Jackman   51.90000 34.50000 53.10000 53.10000 62.80000 44.80000 44.90000
DeSart          51.30000 35.40000 53.80000 52.90000 65.20000 43.30000 42.70000
Wang & Ferguson 51.50000 35.50000 53.00000 51.50000 62.00000 47.00000 44.50000
                      tn       tx       ut       vt       va       wa       wv
Reality         39.07377 41.36371 24.73251 66.57055 51.15646 56.13941 35.50631
2008            41.80000 43.80000 34.20000 67.80000 52.70000 57.50000 42.60000
Nate Silver     41.40000 41.20000 27.80000 66.20000 50.70000 56.20000 41.30000
Drew Linzer     43.30000 41.40000 26.70000 70.50000 51.10000 57.10000 42.80000
Margin of Error 39.00000 39.00000 31.00000 64.00000 50.00000 57.00000 38.00000
Josh Putnam     43.93500 42.44000 27.31000       NA 50.89500 56.68000       NA
Unskewed Polls  43.70000 39.85000 28.75000 56.53000 48.88000 51.43000 44.55000
Intrade               NA       NA       NA       NA       NA       NA       NA
Simon Jackman   44.00000 40.80000 26.90000 68.80000 51.00000 56.50000 41.50000
DeSart          41.30000 40.70000 25.90000 70.70000 50.10000 57.10000 39.80000
Wang & Ferguson 44.50000 42.00000 27.50000 65.50000 51.00000 57.00000 41.50000
                      wi       wy
Reality         52.80191 27.81889
2008            56.30000 32.70000
Nate Silver     52.40000 30.90000
Drew Linzer     52.50000 32.00000
Margin of Error 52.00000 33.00000
Josh Putnam     52.30500       NA
Unskewed Polls  49.98000 30.55000
Intrade               NA       NA
Simon Jackman   52.50000       NA
DeSart          52.60000 30.10000
Wang & Ferguson 52.25000 34.00000

What’s the equivalent of the Brier score for outcomes which aren’t yes/no binary? We need a more quantitative measure; a common choice is the RMSE (which punishes outliers). In this case, we’re looking at the difference between the predicted edge in votes and the actual edge, over all the states for which a predictor gave us numbers:

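# unlist() flattens one-row data frames into numeric vectors so that mean() works on them
rmse <- function(obs, pred) sqrt(mean((unlist(obs)-unlist(pred))^2,na.rm=TRUE))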
rmsesm <- function(p) rmse(statemargin["Reality",], statemargin[p,])
lapply(rownames(statemargin), rmsesm)
  • Real­i­ty: 0
  • Nate Sil­ver: 1.863676
  • Josh Put­nam: 2.033683
  • Simon Jack­man: 2.25422
  • DeSart & Hol­brook: 2.414322
  • Mar­gin of Error: 2.426244
  • Drew Linz­er: 2.5285
  • Wang & Fer­gu­son: 2.79083
  • 2008: 3.206457
  • Unskewed Polls: 7.245104
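To make the "punishes outliers" point concrete, here is a small sketch comparing RMSE with mean absolute error (MAE) on made-up numbers. The vectors and the helper names `rmse_vec` and `mae_vec` are illustrative assumptions only; none of the values come from the election data.

```r
# Illustrative helpers for plain numeric vectors (names are hypothetical)
rmse_vec <- function(obs, pred) sqrt(mean((obs - pred)^2, na.rm = TRUE))
mae_vec  <- function(obs, pred) mean(abs(obs - pred), na.rm = TRUE)

actual  <- c(50, 50, 50, 50)  # made-up vote shares, for illustration only
steady  <- c(52, 48, 52, 48)  # off by 2 points in every state
outlier <- c(50, 50, 50, 58)  # perfect except for one 8-point miss

# Both forecasters have the same average absolute error...
print(c(mae_vec(actual, steady), mae_vec(actual, outlier)))    # 2 2
# ...but RMSE squares errors first, so the single big miss costs more
print(c(rmse_vec(actual, steady), rmse_vec(actual, outlier)))  # 2 4
```

This is why a forecaster with one badly-missed state can rank below a forecaster who is slightly off everywhere.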

Senate

Senate Win Probabilities

senatewin <- read.csv("https://www.gwern.net/docs/elections/2012-senatewin.csv", row.names=1)
                   az    ca    ct   de    fl   hi indiana    me   md    ma   mi
Reality         0.000 1.000 1.000 1.00 1.000 1.00    1.00 1.000 1.00 1.000 1.00
Nate Silver     0.040 1.000 0.960 1.00 1.000 1.00    0.70 0.930 1.00 0.940 1.00
Intrade         0.225 0.998 0.888 0.99 0.859 0.96    0.85 0.957 0.96 0.786 0.95
Wang & Ferguson 0.120 0.950 0.998 0.95 0.950 0.95    0.84 0.950 0.95 0.960 0.96
                  mn   ms    mo    mt   ne   nv   nj   nm   ny    nd   oh   pa
Reality         1.00 0.00 1.000 1.000 0.00 0.00 1.00 1.00 1.00 1.000 1.00 1.00
Nate Silver     1.00 0.00 0.980 0.340 0.01 0.17 1.00 0.97 1.00 0.080 0.97 0.99
Intrade         0.95 0.00 0.703 0.371 0.06 0.06 0.96 0.95 1.00 0.155 0.84 0.86
Wang & Ferguson 0.95 0.05 0.960 0.690 0.05 0.27 0.95 0.95 0.95 0.750 0.95 0.95
                  ri   tn    tx   ut   vt   va   wa    wv    wi   wy
Reality         1.00 0.00 0.000 0.00 0.00 1.00 1.00 1.000 1.000 0.00
Nate Silver     1.00 0.00 0.000 0.00 0.00 0.88 1.00 0.920 0.790 0.00
Intrade         0.99 0.00 0.025 0.00 0.05 0.78 0.96 0.951 0.626 0.00
Wang & Ferguson 0.95 0.05 0.050 0.05 0.05 0.96 0.95 0.950 0.720 0.05

The Sen­ate win pre­dic­tions (done only by Wang, Sil­ver, & Intrade in this dataset):

brsw <- function (pundit) br(senatewin["Reality",], senatewin[pundit,])
lapply(rownames(senatewin), brsw)
  • Wang: 0.01246376
  • Sil­ver: 0.04484545
  • Intrade: 0.04882958

To com­bine the state win pre­dic­tions with the pres­i­dency win pre­dic­tion and also the Sen­ate race win pre­dic­tions requires data on all 3, so still Wang vs Sil­ver vs Intrade:

combineBinaryForecasts <- function(p) c(statewin[p,],
                                        senatewin[p,],
                                        presidential[p,]$probability)
brpssw <- function(pundit) br(combineBinaryForecasts("Reality"), combineBinaryForecasts(pundit))
lapply(rownames(senatewin), brpssw)
  • Wang: 0.009408282
  • Sil­ver: 0.02297625
  • Intrade: 0.03720485

Senate win vote-shares

Source: Wash­ing­ton Post

senatemargin <- read.csv("https://www.gwern.net/docs/elections/2012-senatemargin.csv", row.names=1)
senatemargin
              az   ca   ct   de   fl   hi   id   me   md   ma   mi   mn   ms
Reality     45.8 61.6 55.2 66.4 55.2 62.6 49.9 52.9 55.3 53.7 54.7 65.3 40.3
Nate Silver 46.6 59.6 52.6 66.5 53.2 56.6 50.0 53.0 60.8 51.7 56.0 63.7 32.3
              mo   mt   ne   nv   nj   nm   ny   nd   oh   pa   ri   tn   tx
Reality     54.7 48.7 41.8 44.7 58.5 51.0 71.9 50.5 50.3 53.6 64.8 30.4 40.5
Nate Silver 52.2 48.4 45.6 47.5 56.1 53.4 67.5 47.2 51.9 52.9 59.1 35.5 41.5
              ut   vt   va   wa   wv   wi   wy
Reality     30.2 24.8 52.5 60.2 60.6 51.5 21.6
Nate Silver 32.4 25.0 51.0 59.3 56.0 51.1 27.7

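# unlist() flattens one-row data frames into numeric vectors so that mean() works on them
rmse <- function(obs, pred) sqrt(mean((unlist(obs)-unlist(pred))^2,na.rm=TRUE))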
r <- function(x) rmse(senatemargin["Reality",], senatemargin[x,])
r("Reality"); r("Nate Silver"); # no one else's predictions are available
  • Real­i­ty: 0
  • Nate Sil­ver: 3.272197

Not bad at all.

Let’s com­bine the state mar­gin with the elec­toral / pop­u­lar to get an over­all RMSE pic­ture of the pre­dic­tors:

r <- function(p) rmse(unlist(c(statemargin["Reality",],
                        presidential["Reality",]$electoral,
                        presidential["Reality",]$popular)),
                        unlist(c(statemargin[p,], presidential[p,]$electoral,
                                                  presidential[p,]$popular)))
lapply(rownames(statemargin), r)
  • Real­i­ty: 0
  • Josh Put­nam: 2.002633
  • Simon Jack­man: 2.206758
  • Drew Linz­er: 2.503588
  • Nate Sil­ver: 3.186463
  • DeSart: 4.635004
  • Mar­gin of Error: 4.641332
  • Wang & Fer­gu­son: 4.83369
  • 2008: 5.525641
  • Unskewed Polls: 11.84946

(Which shows you how bad Unskewed Polls was: we could fit Put­nam, Jack­man, Linz­er, and Sil­ver’s errors into his and have room left over.)

Log scores of win predictions

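# unlist() lets this accept data-frame rows and lists as well as plain vectors
logScore <- function(obs, pred) sum(ifelse(unlist(obs), log(unlist(pred)), log(1-unlist(pred))), na.rm=TRUE)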

Exam­ple of the dif­fer­ence between Brier and log score:

# Oops!
brier(0,1,bins=FALSE)$bs
1
# But we can recover by getting the second right
brier(c(0,1),c(1,1),bins=FALSE)$bs
0.5

# Oops!
logScore(1, 0)
-Inf
# Can we recover? ...we're screwed
logScore(c(1,1), c(0,1))
-Inf

Pres­i­dency win pre­dic­tion:

lsp <- function(p) logScore(1, presidential[p,]$probability)
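# rows 10-11 (Putnam, Unskewed Polls) gave no win probability; with na.rm=TRUE they would score a spurious 0
lapply(rownames(presidential)[1:9], lsp)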
  • Real­i­ty: 0
  • 2008: 0
  • Wang & Fer­gu­son: 0
  • Linz­er: -0.01005034
  • Jack­man: -0.08992471
  • Sil­ver: -0.09541018
  • DeSart: -0.1208126
  • Mar­gin of Error: -0.3856625
  • Intrade: -0.4185503

Applied to state win pre­dic­tions:

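lsst <- function(p) logScore(statewin["Reality",], statewin[p,]) # renamed from 'ls' to avoid masking base::ls
lapply(rownames(statewin)[1:9], lsst) # first 9 rows only: Putnam & Unskewed Polls gave no state probabilities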
  • Real­i­ty: 0
  • Linz­er: -0.9327548
  • Wang & Fer­gu­son: -1.750359
  • Sil­ver: -2.057887
  • Jack­man: -2.254638
  • DeSart: -3.30201
  • Intrade: -5.719922
  • Mar­gin of Error: -10.20808
  • 2008: -Inf

Now Sen­ate win pre­dic­tions:

lss <- function(p) logScore(senatewin["Reality",], senatewin[p,])
lapply(rownames(senatewin), lss)
  • Real­i­ty: 0
  • Wang & Fer­gu­son: -2.89789
  • Sil­ver: -4.911792
  • Intrade: -5.813129

And all of them togeth­er:

combineBinaryForecasts <- function(p) c(statewin[p,], senatewin[p,], presidential[p,]$probability)
lssp <- function(pundit) logScore(combineBinaryForecasts("Reality"), combineBinaryForecasts(pundit))
lapply(c("Reality", "Wang & Ferguson", "Nate Silver", "Intrade"), lssp)
  • Real­i­ty: 0
  • Wang & Fer­gu­son: -4.648249
  • Sil­ver: -7.06509
  • Intrade: -11.9516

Summary tables

RMSEs

Predictor  Presidential electoral  Presidential popular  State margins  S+Pp+Sm (footnote 2)  Senate margins
Silver     19                      0.01                  1.81659        20.82659              3.272197
Linzer     0                       NA                    2.5285         NA                    NA
Wang       29                      0.31                  2.79083        32.10083              NA
Jackman    0                       0.01                  2.25422        2.26422               NA
DeSart     29                      0.58                  2.414322       31.99432              NA
Intrade    41                      0.04                  NA             NA                    NA
2008       33                      2.21                  3.206457       38.41646              NA
Margin     29                      0.71                  2.426244       32.13624              NA
Putnam     0                       NA                    2.033683       NA                    NA
Unskewed   69                      1.91                  7.245104       78.1551               NA
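A quick check of why the summed column (see footnote 2) mostly reflects the electoral-vote error: the sketch below takes Silver's three RMSEs as listed in the table above and computes each metric's share of the sum. The variable names are illustrative.

```r
# Silver's three RMSEs, copied from the RMSE summary table
silver <- c(electoral = 19, popular = 0.01, state_margins = 1.81659)

total  <- sum(silver)
shares <- silver / total   # fraction of the sum contributed by each metric

print(total)               # 20.82659
print(round(shares, 3))    # electoral accounts for roughly 0.91 of the total
```

Because electoral votes are counted on a scale of hundreds while vote shares are in percentage points, the electoral term swamps the others in any such sum.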

Brier scores

(0 is a per­fect Brier score or RMSE.)

Predictor  Presidential win  State win   Senate win  St+Sn+P
Silver     0.008281          0.00911372  0.04484545  0.02297625
Linzer     0.0001            0.00384326  NA          NA
Wang       0                 0.00761569  0.01246376  0.009408282
Jackman    0.007396          0.00971369  NA          NA
DeSart     0.012950          0.01605542  NA          NA
Intrade    0.116964          0.02811906  0.04882958  0.03720485
2008       0                 0.03921569  NA          NA
Margin     0.1024            0.05075311  NA          NA
Random     0.2500            0.25000000  0.25000000  0.25000000

Log scores

We mentioned there were other proper scoring rules besides the Brier score; another binary-outcome rule, less used by political forecasters, is the “logarithmic scoring rule” (see Eliezer Yudkowsky’s “Technical Explanation”); it has some deep connections to areas like information theory, data compression, and Bayesian inference, which makes it invaluable in some contexts. But because a log score ranges between 0 and negative infinity (bigger is better, smaller is worse) rather than between 0 and 1 (smaller is better), and behaves differently in some respects, it’s a bit harder to understand than a Brier score.

(One way in which the log score dif­fers from the Brier score is treat­ment of 100/0% pre­dic­tions: the log score of a 100% pre­dic­tion which is wrong is neg­a­tive Infin­i­ty, while in Brier it’d sim­ply be 1 and one can recov­er; hence if you say 100% twice and are wrong once, your Brier score would recover to 0.5 but your log score will still be neg­a­tive Infin­i­ty! This is what hap­pens with the “2008” bench­mark.)

Forecaster        State win probabilities
Reality            0
Linzer            -0.9327548
Wang & Ferguson   -1.750359
Silver            -2.057887
Jackman           -2.254638
DeSart            -3.30201
Intrade           -5.719922
Margin of Error   -10.20808
2008              -Infinity

Forecaster        Presidential win probability
Reality            0
2008               0
Wang & Ferguson    0
Linzer            -0.01005034
Jackman           -0.08992471
Silver            -0.09541018
DeSart            -0.1208126
Margin of Error   -0.3856625
Intrade           -0.4185503

Note that the 2008 benchmark and Wang & Ferguson took a risk here by giving an outright 100% chance of victory, which the log score rewarded with a 0: if somehow Obama had lost, then the log score of any set of their predictions which included the presidential win probability would automatically be -Infinity, rendering them officially The Worst Predictors In The World. This is why one should allow for the unthinkable by including some fraction of a percent; of course, I’m sure Wang & Ferguson don’t mean 100% literally but more like “it’s so close to 100% we can’t be bothered to report the tiny remaining possibility”.
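One way to "allow for the unthinkable" is to clamp probabilities away from exactly 0 and 1 before scoring. The sketch below is only an illustration: the helper names `clamp` and `log_score_vec` and the floor `eps = 0.001` are assumptions chosen for the example, not values or functions used in the original analysis.

```r
# Pull probabilities into [eps, 1 - eps]; eps is an arbitrary illustrative floor
clamp <- function(p, eps = 0.001) pmin(pmax(p, eps), 1 - eps)

# Log score for plain numeric vectors (hypothetical helper for this example)
log_score_vec <- function(obs, pred) {
  sum(ifelse(obs == 1, log(pred), log(1 - pred)), na.rm = TRUE)
}

obs  <- c(1, 1)
pred <- c(0, 1)   # one certain-and-wrong call, one certain-and-right call

raw     <- log_score_vec(obs, pred)          # -Inf: no recovery possible
clamped <- log_score_vec(obs, clamp(pred))   # log(0.001) + log(0.999), about -6.91

print(raw)
print(clamped)
```

The clamped forecaster is still punished heavily for the wrong call, but the score stays finite and comparable with other forecasters. As footnote 1 argues for Intrade, whether it is fair to adjust someone else's stated probabilities this way is a separate question.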

Forecaster        Senate win probabilities
Reality            0
Wang              -2.89789
Silver            -4.911792
Intrade           -5.813129


  1. I have been told that once Intrade prices have been corrected for this, the new results are comparable to Silver & Wang. This doesn’t necessarily surprise me, but during the original analysis I did not look into doing the long-shot bias correction for three reasons: hardly anyone does in discussions of prediction markets; it would’ve been more work; and I’m not sure it’s really legitimate, since if Intrade is biased, then it’s biased. If someone produces extreme estimates which can be easily improved by regressing to some relevant mean, it doesn’t seem quite honest to present your corrected version instead as what they “really” meant.

  2. Summing together RMSEs from different metrics is statistically illegitimate & misleading since the summation will reflect almost entirely the electoral vote performance, since it’s on a scale much bigger than the other metrics. I include it for curiosity only.