Prediction Markets

My prediction/betting strategies and track record, with reflections on rationality and prediction judgment
psychology, statistics, predictions, politics, Bitcoin, survey, Bayes
2009-01-10–2019-05-16 · finished · certainty: highly likely · importance: 9


Everything old is new again. Wikipedia is the collaboration of amateur gentlemen, writ in countless epistolary IRC or email or talk page messages. And the American public’s untrammeled betting on elections and victories has been reborn as prediction markets.

Prediction markets

Wikipedia summarizes the idea:

Prediction markets…are speculative markets created for the purpose of making predictions. Assets are created whose final cash value is tied to a particular event (e.g., will the next US president be a Republican) or parameter (e.g., total sales next quarter). The current market prices can be interpreted as predictions of the probability of the event or the expected value of the parameter1. Prediction markets are thus structured as betting exchanges, without any risk for the bookmaker.

Emphasis is added on the most important characteristic of a prediction market, the way in which it differs from regular stock markets. The idea is that by tracking accuracy—punishing ignorance & rewarding knowledge in equal measure—a prediction market can elicit one’s true beliefs, and avoid the failure mode of predictions as a pundit’s bloviation or wishful thinking or signaling alignment:

“The usual touchstone of whether what someone asserts is mere persuasion or at least a subjective conviction, i.e., firm belief, is betting. Often someone pronounces his propositions with such confident and inflexible defiance that he seems to have entirely laid aside all concern for error. A bet disconcerts him. Sometimes he reveals that he is persuaded enough for one ducat but not for ten. For he would happily bet one, but at ten he suddenly becomes aware of what he had not previously noticed, namely that it is quite possible that he has erred.”2

Events, not dividends or sales

Imagine a prediction market in which every day the administrator sells off pairs of shares (he doesn’t want to risk paying out more than he received) for $1 a share, and all the shares say either heads or tails. Then he flips a coin and gives everyone with a ‘right’ share $2. Obviously if people bid up heads to $5, this is crazy and irrational—even if heads wins today, one would still lose. Similarly for any amount greater than $2. But $2 is also crazy: the only way this share price doesn’t lose money is if heads is 100% guaranteed. Of course, it isn’t; it is quite precisely guaranteed to not be the case half the time. Anything above 50%—any price above $1—is going to lose in the long run.

A smart investor could come into this market, and blindly buy any share whatsoever that was less than $1; they would make money. If their shares were even 99¢, then about half would turn into $2 and half into 0…
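
To make the arithmetic concrete, here is a minimal Python sketch (my illustration, not part of the original example) of the expected value of a share in this coin-flip market:

```python
# Expected profit per "heads" share in the coin-flip market:
# the share pays $2 if the fair coin lands heads (probability 0.5), else $0.
def expected_profit(price, p_heads=0.5, payout=2.0):
    """Expected profit from buying one share at `price`."""
    return p_heads * payout - price

for price in [0.50, 0.99, 1.00, 2.00, 5.00]:
    print(f"buy at ${price:.2f}: expected profit ${expected_profit(price):+.2f}")
# buy at $0.99 -> +$0.01 per share: the blind strategy described above
# buy at $2.00 or $5.00 -> certain or expected losses
```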

This is all elementary and obvious, and it’s how we can convince ourselves that market prices can indeed be interpreted as predictions of expected value. But that’s only because the odds are known in advance! We specified it was a fair coin. If the odds of the event were not known, then things would be much more interesting. No one bets on a coin flip: we bet on whether John is bluffing.

Real prediction markets famously prefer to make the subject of a share a topic like the party of the victor of the 2008 American presidential elections; a topic with a relatively clear outcome (barring the occasional George W. Bush or coin landing on its edge) and of considerable interest to many.

Interest, I mean, not merely for speculating on, but possibly of real-world importance. Advocates for prediction markets as tools, such as Robin Hanson, tirelessly remind us of the possible benefits in ‘aggregating information’. A prediction market rewards clear thinking and insider information, but they focus on topics it’d be difficult to clearly bet for or against on regular financial markets.

Yes, if I thought the financial markets were undervaluing green power stocks because they were weighing Senator John McCain’s presidential candidacy too heavily, then I could do something like short those stocks. But suppose that’s all I know about the green power stocks and the financial markets? It’d be madness to go and trade on that belief alone. I’d be exposing myself to countless risks, countless ways for the price of green stocks to be unconnected to McCain’s odds, countless intermediaries, countless other relations of green stocks which may cancel out my correct appraisal of one factor. Certainly in the long run, weakly related factors will have exactly the effect they deserve to have. But this is a long run in which the investor is quite dead.

Prediction markets offer a way to cut through all the confounding effects of proxies, and bet directly and precisely on that bit of information. If I believe Senator Barack Obama has been unduly discounted, then I can directly buy shares in him instead of casting about for some class of stocks that might be correlated with him—which is a formidable task in and of itself; perhaps oil stocks will rise because Obama’s platform includes withdrawal from Iraq, which would render the Middle East less stable, or perhaps green stocks will rise for similar reasons, or perhaps they’ll all fall because people think he’ll be incompetent, or perhaps optimism over a historic election of a half-black man and enthusiasm over his plans will lift all boats…

One will never get a faithful summation of all the information about Obama scattered among hundreds or thousands of traders if one places multiple difficult barriers in front of a trader who wishes to turn his superior knowledge or analysis into money.

Or here’s another example: many of the early uses of prediction markets have been inside corporations, betting on metrics like quarterly sales. Now, all of those metrics are important and will in the long run affect stock prices or dividends. But what employee working down in the R&D department is going to say ‘People are too optimistic about next year’s sales, the prototypes just aren’t working as well as they would need to’ and go short the company’s stock? No one, of course. A small difference in their assessment from everyone else’s is unlikely to make a noticeable price difference, even if the transaction costs of shorting didn’t bar it. And yet, the company wants to know what this employee knows.

How much to bet

There’s something of an issue with prediction markets: unlike the regular stock market, trades in prediction markets are usually zero-sum3, and so lots of traders are going to be net losers. If you don’t have any particular reason to think you are one of the wolves canny enough to make money off the sheep, then you’re one of the sheep, and why trade at all? (I understand poker players have a saying—if you can’t spot the fish at the table, you’re the fish.)

So, the bad-and-self-aware won’t participate. If you are trading in a prediction market, you are either good-and-aware, good-and-ignorant, or bad-but-ignorant. Ironically, the latter two can’t tell whether they are in the first group or not. It reminds me of the smoking lesion puzzle or “King Solomon’s problem” in decision theory: you may have any number of biases (the lesion), and they may cause you to fail or succeed on the prediction market (get cancer or not) and also cause you to want to participate therein. What do you do?

Best of course is to test for the lesion directly—to test whether our predictions are well-calibrated4, whether events we confidently predict at 0% do in fact never happen, and so on. If we manage to overcome our biases, we can give calibrated estimates. We can do this sort of testing with the relevant biases—just knowing about them and introspecting about one’s predictions can improve them. Coming up with the precise reasons one is making a prediction improves one’s predictions5 and can also help with hindsight bias6 or the temptation to falsify your memories based on social feedback, all of which is important to figuring out how well you will do in the future. We can quickly test calibration using our partial ignorance about many factual questions, e.g. the links in “Test Your Calibration!”. My recent practice with thousands of real-world predictions on PredictionBook.com has surely helped my calibration.
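
A minimal sketch of such a calibration test (my illustration; PredictionBook’s actual scoring may differ): bucket one’s predictions by stated confidence and compare each bucket’s claimed probability against the observed frequency.

```python
from collections import defaultdict

def calibration_table(predictions):
    """predictions: iterable of (stated_probability, came_true) pairs.
    Groups predictions into deciles and prints observed frequencies."""
    buckets = defaultdict(list)
    for p, outcome in predictions:
        buckets[round(p, 1)].append(outcome)
    for p in sorted(buckets):
        outcomes = buckets[p]
        freq = sum(outcomes) / len(outcomes)
        print(f"stated {p:.0%}: happened {freq:.0%} of the time "
              f"({len(outcomes)} predictions)")

# Toy data: the 60% bucket is well-calibrated, the 90% bucket is overconfident.
calibration_table([(0.6, True), (0.6, False), (0.6, True), (0.6, True), (0.6, False),
                   (0.9, True), (0.9, False), (0.9, False)])
```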

So, how much better are you than your competing traders? What is your edge? This, believe it or not, is pretty much all you need to know to know how much to bet on any contract. The exact fraction of your portfolio to bet given a particular edge is defined by the Kelly criterion (more details), which gives the greatest possible expected utility of your growth rate. (But you need to be psychologically tough7 to use it lest you begin to deviate from it: it’s not a risk-free strategy. And strictly speaking, it doesn’t immediately apply to multiple bets you can choose from, but let’s say that whatever we’re looking at is the bet we feel is the most mispriced and we can do the best on.)

The formula is x = e ⁄ o, where:

  • o = the net odds on the bet (profit per $1 staked)
  • e = your edge: the expected net profit per $1 staked, i.e. e = (p × o) − q, where p is your probability of winning and q = 1 − p
  • x = the fraction of your portfolio to invest

To quote the Wikipedia explanation:

As an example, if a gamble has a 60% chance of winning (p = 0.60, q = 0.40), but the gambler receives 1-to-1 odds on a winning bet (o = 1), then the gambler should bet 20% of the bankroll at each opportunity (x = 0.20), in order to maximize the long-run growth rate of the bankroll.

So, suppose the President’s re-election contract was floating at 50%, but based on his performance and past incumbent re-election rates, you decide the true odds are 60%; you can buy the contract at 50% and if you hold until the election and are right, you get back double your money, so the odds are 1:1. The filled-in equation looks like x = ((0.6 × 1) − 0.4) ⁄ 1 = 0.2.

Hence, you ought to put 20% of your portfolio into buying the President’s contract. (If we assume that all bets are double-or-nothing, Wikipedia tells us it simplifies to x = 2p − 1, which in this example would be x = 2(0.6) − 1 = 0.2. But usually our contracts in prediction markets won’t be that simple, so the simplification isn’t very useful here.)

It’s not too hard to apply this to more complex situations. Suppose the president were at, say, 10% but you are convinced the unfortunate equine sex scandal will soon be forgotten and the electorate will properly appreciate el Presidente for winning World War III, making his true re-election odds 80%. You can buy in at 10% and you resolve to sell out at 80%, for a reward of 70% or 7 times your initial stake (7:1). And we’ll again say you’re right 60% of the time. So your Kelly criterion looks like: x = ((0.6 × 7) − 0.4) ⁄ 7 = 3.8 ⁄ 7 ≈ 0.54.

Wow! We’re supposed to bet more than half our portfolio despite knowing we’ll lose the bet 40% of the time? Well, yes. With an upside like 7×, we can lose several bets in a row and eventually make up our loss. And if we win the first time, we just won huge.

It goes both ways, of course. If we have a market/true-odds of 80%/90% and we do the same thing, we have a return of 12.5% (9/8) rather than 100%, and for that little return ought to risk only: x = ((0.9 × 0.125) − 0.1) ⁄ 0.125 = 0.0125 ⁄ 0.125 = 0.1.

As one would expect, with a smaller reward but equal risk compared to our first example, the KC recommends investing a fraction smaller than 0.2.

If one doesn’t enjoy calculating the KC by hand, one could always write a program to do so; Russell O’Connor has a nice Haskell blog post on “Implementing the Kelly Criterion” (he also has an interesting post on the KC and the lottery).
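
In the same spirit, a minimal Python sketch (my own illustration of the formula as reconstructed above, not O’Connor’s code), checked against the three examples:

```python
def kelly_fraction(p, odds):
    """Kelly criterion: fraction of bankroll to stake.
    p    -- probability the bet wins
    odds -- net odds received: profit per $1 staked (e.g. 7 for 7:1)
    Returns x = (p*odds - q) / odds, i.e. edge divided by odds."""
    q = 1 - p
    return (p * odds - q) / odds

print(kelly_fraction(0.6, 1))      # 0.20  -- the Wikipedia example
print(kelly_fraction(0.6, 7))      # ~0.54 -- the 10%-to-80% re-election example
print(kelly_fraction(0.9, 0.125))  # 0.10  -- the 80%-to-90% example
```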

Specific markets

So once we are interested in prediction markets and would like to try them out, we need to pick one. There are several. I generally ignore the ‘play money’ markets, despite their similar levels of accuracy to the real money markets; I just have a prejudice that if I make a killing, then I ought to have a real reward like a nice steak dinner and not just increment some bits on a computer. The primary markets to consider are:

  • Betfair and BETDAQ are probably the 2 largest prediction markets, but unfortunately, it is difficult for Americans to make use of them—Betfair bans them outright.
  • Intrade is another European prediction market, similar to Betfair and BETDAQ, but it does not go out of its way to bar Americans, and thus is likely the most popular market in the United States. (Its sister site TradeSports was sport-only, and is now defunct.)
  • Another is some sort of hybrid of derivatives and predictions. I know little about it.
  • The Iowa Electronic Markets (IEM) is an old prediction market, and one of the better covered in the American press. It’s a research prediction market, so it handles only small quantities of money and trades and has only a few traders8. Accounts max out at $500, a major factor in limiting the depth & liquidity of its markets.

I didn’t want to wager too much money on what was only a lark, and the IEM has the favorable distinction of being clearly legal in the USA. So I chose them.

IEM

In 2003, I sent in a check for $20. A given market’s contracts in the IEM are supposed to sum to $1, so $20 would let me buy around 40 shares—enough to play around with.

My IEM trading

2004

“Like all weak men he laid an exaggerated stress on not changing one’s mind.”

Prediction markets are known to have a number of biases. Some of these biases are shared with other betting exchanges; horse-racing is plagued with a ‘long-shot favoritism’ just like prediction markets are. (An example of long-shot favoritism would be Intrade and IEM shares for libertarian Ron Paul winning the 2008 Republican nomination trading at ludicrous valuations like 10¢, or Al Gore—who wasn’t even running—for the Democratic nomination at 5¢.) The financial structure of markets also seems to make shorting of such low-value (but still over-valued) shares more difficult. They can be manipulated, consciously or unconsciously, due to not being very good markets (“They are thin, trading volumes are anemic, and the dollar amounts at risk are pitifully small”), and that’s when they aren’t simply reflecting the prejudices of their users (one can’t help but suspect Ron Paul shares were overpriced because he has so many fans among techies).

I began experimenting with some small trades on IEM’s Federal Reserve interest rate market; I had a theory that there was a ‘favorites bias’ (the inverse of long-shot favoritism, where traders buck the conventional wisdom despite it being more correct). I simply based my trades on what I read in the New York Times. It worked fairly well. In 2005, I also dabbled in the markets on Microsoft and Apple share prices, but I didn’t find any values I liked.

2004 was, of course, a presidential election year. I couldn’t resist, and traded heavily. I avoided the Democratic nominations, reasoning that I was too ignorant that year—which was true, I did not expect John Kerry to eventually win the nomination—and focused on the party-victory market. The traders there were far too optimistic about a Democratic victory; I knew ‘Bush is a war-time president’ (in addition to the incumbency!) as people said, and that this mattered a lot to the half of the electorate that voted for him in 2000. Giving him a re-election probability of under 40% was too foolish for words.

I did well on these trades, and then in October, I closed out all my trades, sold my Republican/Bush shares, and bought Kerry. I thought the debates had gone well for Kerry and was confident the Swift Boating wouldn’t do much in the end, and certainly couldn’t compensate for the albatross of Iraq.

As you know, I was quite wrong in this strategy. Bush did win, and won by more than in 2000. And I lost $5-10. (Between a quarter and a half of my initial capital. Ouch! I was glad I hadn’t invested some more substantial sum like $200.) I had profited early on from people who had confused what they wanted to happen with what would, but then I had succumbed to the same thing. Yes, everyone around me (I live in a liberal state) was sure Kerry would win, but that’s no excuse for starting off with a correct assessment and then choosing a false one. It was a valuable lesson for me; this experience makes me sometimes wonder whether ‘personal’ prediction markets, if you will, could be a useful tool.

2005/2006

In 2005 & 2006, I did minimal interesting trading. I largely continued my earlier strategies in the interest rate markets. Slowly, I made up for my failures in 2004.

2007

In 2007, the presidential markets started back up! I surveyed the markets and the political field with great excitement. As anyone who lived through it remembers, it was the most interesting election in a very long time, with such memorable characters (Hillary Clinton, Ron Paul, Barack Obama, John McCain, Sarah Palin) and unexpected twists.

The Republicans

As in 2004, the odds of an ultimate Republican victory were far too low—hovering in the range of 30-40%. This is obviously wrong on purely historical considerations (Democrats don’t win the presidency that often), and seems particularly wrong when we consider that George W. Bush won in 2004. Anyone arguing that GWB poisoned the well for a succeeding Republican administration faces the difficult task of explaining (at least) 2 things:

  1. How association with GWB would be so damaging when GWB himself was re-elected in 2004 with a larger percentage of votes than in 2000
  2. How association with GWB policies like Iraq would be so damaging when the daily security situation in Iraq had clearly improved since 2004.
  3. And in general: how a fresh Republican face (with the same old policies) could do any worse than GWB did, given that he would possess all the benefits of GWB’s policies and none of the personal animus against GWB.

The key to Republican betting was figuring out who was hopeless, and working from there by essentially short-selling them. As time passed, one could sharpen one’s bets and begin betting for a candidate rather than against. My list ultimately looked like this:

  1. Ron Paul was so obviously not going to win. He appealed to only a small minority of the Republican party, had views idiosyncratic where they weren’t offensive, and wanted to destroy important Republican constituencies. If the Internets were America, perhaps he could’ve won.
  2. Rudy Giuliani was another easy candidate to bet against. He had multiple strikes: he was far too skeevy, questionable ethically (the investigations of Bernard Kerik were well underway at this point), had made himself a parody, had few qualifications, and a campaign strategy that was as ruinous as it was perplexing. He was unacceptable culturally, what with his divorces, loose living, humorous cross-dressing, and New York ways. He would not play well in Peoria.
  3. Fred Thompson was undone by being a bad version of Reagan. He didn’t campaign nearly as industriously as he needed to. The death knell, as far as I was concerned, was when national publications began mentioning the “lazy like a fox” joke as an old joke. No special appeal, no special resources, no conventional ability…
  4. Mitt Romney had 2 problems: he was slick and seemed inauthentic, and people focused too much on his being Mormon and on his Massachusetts governorship (a position that would’ve been a great aid—if it hadn’t been in that disgustingly liberal state). I was less confident about striking him off, but I decided his odds of 20% or so were too generous.
  5. Mike Huckabee struck me as not having the resources to make it to the nomination. I was even less sure about this one than Mitt, but I lucked out—the supporters of Huckabee began infighting with Romney supporters.

This didn’t leave very many candidates for consideration. By this process of elimination, I was in fact left with only John McCain as a serious Republican contender. If you remember the early days, this was in fact a very strange result to reach: John McCain appeared tired, a beaten man from 2004 making one last pro forma try, his campaign inept and riven by infighting, and he was just in general—old, old, old.

But hey, his shares were trading in the 5-15% range. They were the best bargain going in the market. I held them for a long time and ultimately would sell them at 94-99¢ for a roughly 900% gain. (I sold them instead of waiting for the Republican convention because I was forgoing minimal gains, and I was concerned by reports on his health.)

The Democrats

A similar process obtained for the Democrats. A certain dislike of Hillary Clinton led me to think that her status as the heir presumptive (reflected in share prices) would be damaged at some point. All of the other candidates struck me as flakes and hopeless causes, with the exception of John Edwards and Barack Obama.

I eventually ruled out John Edwards as having no compelling characteristics and smacking of phoniness (much like Romney). I was never tempted to change my mind on him, and the adultery and hair flaps turned out to be waiting in the wings for him. So I could get rid of Edwards as a choice.

Is it any surprise I lighted on Obama? He had impressed me (and just about everyone else) with his 2004 convention speech, his campaign seemed quite competent and well-funded, the media clearly loved him, and so on. Best of all, his shares were relatively low (30-40%) and I had money left after the Republicans. So I bought Obama and sold Clinton. I eventually sold out of Obama at the quite respectable 78¢.

Summing up

By the end of the election, I had made a killing on my Obama and McCain shares. My account balance stood at $38; so over the 3 or 4 years of trading I had nearly doubled my investment. $18 is perhaps enough for a steak dinner.

Further, I had learned a valuable lesson in 2004 about my own political biases and irrationality, and had earned the right in 2008 to be smug about foreseeing a McCain and Obama match-up when the majority of pundits were trying to figure out whether Hillary would be running against Huckabee or Romney.

And finally, I’ve concluded that, my few observations aside, prediction markets are pretty accurate. I often use them to sanity-check myself by asking ‘If I disagree, what special knowledge do I have?’ Often I have none.

When I got out of the IEM, I reflected on my trades: I learned some valuable lessons, I had a good experience, and I came out a believer. I resolved that one day I’d like to try out a more substantial and varied market, like Intrade.

IEM logs

The following is an edited IEM trading history for me, removing many limit positions and other expired or canceled trades:

Order date O.time Market Contract Order # Unit price Expiry Resolution type R.# R.price
12/29/04 20:16:23 FedPolicyB FRsame0205 Purchase 20 0.048 Traded 10 0.048
12/29/04 20:17:26 FedPolicyB FRup0205 Purchase 2 0.956 Traded 2 0.956
12/29/04 20:17:46 FedPolicyB FRsame0205 Purchase 10 0.049 Traded 10 0.049
02/12/05 17:48:51 Comp-Ret AAPL-05b Bid 5 0.96 3/14/2005 11:59PM Cancel-Manager
02/13/05 16:43:33 Comp-Ret AAPL-05b Bid 7 0.982 3/15/2005 11:59PM Traded 7 0.98
02/21/05 10:03:45 FedPolicyB FRsame0505 Bid 12 0.053 4/23/2005 11:59PM Traded 12 0.053
02/21/05 10:04:35 FedPolicyB FRup0305 Bid 7 0.988 3/23/2005 11:59PM Traded 7 0.988
02/21/05 10:04:35 FedPolicyB FRup0305 Traded 3 0.007 3/3/2005 9:23AM
02/21/05 10:06:59 FedPolicyB FRsame0305 Bid 6 0.007 3/23/2005 11:59PM Traded 3 0.007
02/21/05 10:07:51 Comp-Ret AAPL-05b Bid 5 0.998 3/23/2005 11:59PM Cancel-Manager
02/21/05 10:07:51 Comp-Ret AAPL-05b Traded 4 0.889 2/28/2005 8:56AM
02/26/05 10:14:08 Comp-Ret AAPL-05c Bid 5 0.889 3/28/2005 11:59PM Traded 1 0.889
02/26/05 10:14:30 Comp-Ret MSFT-05c Bid 1 0.889 3/28/2005 11:59PM Traded 1 0.889
02/26/05 10:15:43 MSFT-Price ? Traded 1 0.4 3/5/2005 10:39PM
03/05/05 12:51:45 MSFT-Price MS025-05cL Bid 5 0.4 4/7/2005 11:59PM Traded 4 0.4
03/05/05 12:53:27 Comp-Ret AAPL-05c Ask 4 0.95 7/7/2005 11:59PM Cancel-Manager
03/05/05 12:53:56 Comp-Ret MSFT-05c Ask 1 0.5 7/7/2005 11:59PM Cancel-Manager
03/05/05 12:54:38 FedPolicyB FRsame0505 Ask 12 0.7 9/7/2005 11:59PM Cancel-Manager
03/05/05 12:55:07 FedPolicyB FRsame0305 Ask 6 0.2 9/7/2005 11:59PM Cancel-Manager
03/05/05 12:55:33 FedPolicyB FRup0305 Ask 6 0.998 6/7/2005 11:59PM Traded 6 0.998
03/05/05 12:55:33 FedPolicyB ? Traded 2 0.803 9/16/2005 3:37PM
03/05/05 12:55:33 FedPolicyB ? Traded 5 0.803 9/16/2005 3:34PM
09/16/05 14:38:57 FedPolicyB FRup0905 Bid 12 0.803 9/20/2005 11:59PM Traded 5 0.803
09/16/05 14:39:34 FedPolicyB FRsame0905 Bid 6 0.17 9/22/2005 11:59PM Traded 6 0.17
09/28/05 23:49:01 FedPolicyB FRsame1105 Bid 15 0.066 10/1/2005 11:59PM Traded 15 0.066
10/07/05 12:28:48 FedPolicyB FRsame1105 Ask 15 0.07 10/9/2006 11:59PM Cancel-Manager
10/07/05 12:29:23 FedPolicyB FRup1105 Bid 2 0.95 10/9/2006 11:59PM Cancel-Manager
10/10/05 14:54:45 FedPolicyB FRup1105 Bid 3 0.97 10/12/2005 11:59PM Traded 3 0.97
12/09/05 15:02:02 FedPolicyB FRup1205 Bid 15 0.995 12/12/2005 11:59PM Traded 15 0.995
12/09/05 15:02:20 FedPolicyB FRsame1205 Bid 10 0.002 12/12/2005 11:59PM Traded 10 0.002
12/09/05 15:02:43 FedPolicyB FRdown1205 Bid 2 0.001 12/13/2005 11:59PM Traded 2 0.001
12/09/05 15:02:43 FedPolicyB ? Traded 2 0.719 6/2/2006 8:41:40AM
12/09/05 15:02:43 FedPolicyB ? Traded 10 0.719 6/2/2006 8:39:46AM
05/31/06 21:28:25 FedPolicyB FRup0606 Bid 22 0.719 6/6/2006 11:59PM Traded 10 0.719
08/07/06 21:19:08 FedPolicyB FRup0806 Bid 20 0.27 8/22/2006 11:59PM Traded 20 0.27
08/07/06 21:19:08 FedPolicyB ? Traded 7 0.608 8/8/2006 1:13:17PM
08/07/06 21:19:47 FedPolicyB FRsame0806 Bid 10 0.608 8/9/2006 11:59PM Traded 3 0.608
08/07/06 21:19:47 FedPolicyB ? Traded 7 0.7 8/7/2006 9:52:43PM
08/07/06 21:20:29 FedPolicyB FRsame0906 Bid 10 0.7 8/9/2006 11:59PM Traded 3 0.7
08/07/06 21:20:54 FedPolicyB FRdown0906 Bid 10 0.006 8/9/2006 11:59PM Traded 10 0.006
08/07/06 21:23:04 PRES08-WTA DEM08-WTA Bid 15 0.5 12/23/2006 11:59PM Traded 15 0.5
08/28/06 09:20:10 PRES08-VS UREP08-VS Bid 10 0.48 12/30/2006 11:59PM Traded 10 0.48
08/28/06 09:20:10 PRES08-VS ? Traded 3 0.5 9/19/2006 10:24AM
08/28/06 09:20:26 PRES08-VS UDEM08-VS Bid 10 0.5 12/30/2006 11:59PM Traded 1 0.5
06/01/07 20:00:20 PRES08-WTA DEM08-WTA Ask 10 0.66 9/3/2007 11:59PM Traded 10 0.66
06/01/07 20:01:24 PRES08-WTA DEM08-WTA Ask 5 0.7 6/3/2008 11:59PM Traded 5 0.7
06/01/07 20:02:21 PRES08-WTA REP08-WTA Bid 10 0.33 9/3/2007 11:59PM Traded 10 0.33
06/01/07 20:04:26 RConv08 ROMN-NOM Bid 5 0.2 7/3/2007 11:59PM Traded 5 0.2
06/01/07 20:05:33 DConv08 OBAM-NOM Purchase 5 0.322 6/1/2007 8:05:33PM Traded 1 0.322
06/06/07 23:41:39 DConv08 DConv08 Buy-bundle 3 1 Traded 3 1
06/06/07 23:42:20 DConv08 EDWA-NOM Ask 3 0.1 6/8/2008 11:59PM Traded 3 0.1
06/06/07 23:42:46 DConv08 DROF-NOM Ask 3 0.13 6/8/2008 11:59PM Traded 3 0.13
06/06/07 23:44:29 RConv08 RConv08 Buy-bundle 3 1 Traded 3 1
06/06/07 23:45:12 RConv08 GIUL-NOM Ask 3 0.21 9/20/2007 11:59PM Traded 3 0.21
06/06/07 23:45:34 RConv08 MCCA-NOM Ask 3 0.15 9/20/2007 11:59PM Traded 3 0.15
06/06/07 23:46:55 PRES08-VS UDEM08-VS Ask 4 0.56 6/8/2008 11:59PM Traded 4 0.56
12/11/07 16:08:57 RConv08 HUCK-NOM Ask 3 0.22 12/13/2007 11:59PM Traded 3 0.22
12/11/07 16:10:08 RConv08 ROMN-NOM Ask 4 0.25 12/13/2007 11:59PM Traded 4 0.25
12/11/07 16:14:22 RConv08 RROF-NOM Ask 3 0.03 12/13/2007 11:59PM Traded 3 0.03
12/11/07 16:16:12 RConv08 MCCA-NOM Bid 5 0.1 12/13/2008 11:59PM Traded 5 0.1
12/11/07 16:16:57 RConv08 RConv08 Buy-bundle 5 1 12/11/2007 4:16PM Traded 5 1
12/11/07 16:17:39 RConv08 GIUL-NOM Sell 5 0.375 12/11/2007 4:17PM Traded 5 0.375
12/11/07 16:18:01 RConv08 HUCK-NOM Sell 5 0.207 12/11/2007 4:18PM Traded 5 0.207
12/11/07 16:18:10 RConv08 MCCA-NOM Sell 5 0.108 12/11/2007 4:18PM Traded 5 0.108
12/11/07 16:18:22 RConv08 ROMN-NOM Sell 5 0.241 12/11/2007 4:18PM Traded 5 0.241
12/11/07 16:18:33 RConv08 THOMF-NOM Sell 5 0.04 12/11/2007 4:18PM Traded 5 0.04
12/11/07 16:18:46 RConv08 RROF-NOM Sell 5 0.02 12/11/2007 4:18PM Traded 5 0.02
12/11/07 16:19:03 RConv08 ROMN-NOM Sell 4 0.24 12/11/2007 4:19PM Traded 4 0.24
12/11/07 16:20:28 DConv08 DConv08 Buy-bundle 10 1 12/11/2007 4:20PM Traded 10 1
12/11/07 16:20:51 DConv08 DROF-NOM Ask 10 0.03 12/13/2008 11:59PM Traded 10 0.03
12/11/07 16:20:51 DConv08 ? Traded 5 0.09 12/19/2007 3:34PM
12/11/07 16:21:31 DConv08 EDWA-NOM Ask 10 0.09 12/13/2008 11:59PM Traded 5 0.09
12/11/07 16:21:31 DConv08 ? Traded 1 0.58 12/11/2007 9:40PM
12/11/07 16:21:31 DConv08 ? Traded 9 0.58 12/11/2007 9:40PM
12/11/07 16:25:21 DConv08 CLIN-NOM Ask 13 0.58 12/13/2008 11:59PM Traded 3 0.58
12/11/07 16:26:08 DConv08 OBAM-NOM Ask 14 0.45 12/13/2008 11:59PM Traded 14 0.45
12/11/07 16:27:05 DConv08 OBAM-NOM Bid 5 0.3 12/31/2007 11:59PM Traded 5 0.3
12/11/07 16:28:51 FedPolicyB FRsame0108 Bid 3 0.31 12/31/2007 11:59PM Traded 3 0.31
02/05/08 22:41:41 RConv08 THOMF-NOM Sell 3 0.002 2/5/2008 10:41PM Traded 3 0.002
02/05/08 22:47:46 DConv08 OBAM-NOM Bid 10 0.42 2/7/2008 11:59PM Traded 10 0.42
02/05/08 22:48:09 DConv08 OBAM-NOM Bid 5 0.43 2/7/2008 11:59PM Traded 5 0.425
02/07/08 14:46:34 DConv08 DConv08 Buy-bundle 5 1 2/7/2008 2:46PM Traded 5 1
02/07/08 14:47:21 DConv08 EDWA-NOM Sell 5 0.002 2/7/2008 2:47PM Traded 5 0.002
02/07/08 14:47:34 DConv08 DROF-NOM Sell 5 0.006 2/7/2008 2:47PM Traded 5 0.006
02/07/08 14:47:54 DConv08 OBAM-NOM Ask 15 0.6 2/9/2008 11:59PM Traded 15 0.6
02/07/08 15:11:51 PRES08-WTA REP08-WTA Ask 10 0.51 2/9/2009 11:59PM Traded 10 0.51
02/07/08 15:13:24 RConv08 RConv08 Buy-bundle 4 1 2/7/2008 3:13PM Traded 4 1
02/07/08 15:13:42 RConv08 GIUL-NOM Sell 4 0.001 2/7/2008 3:13PM Traded 4 0.001
02/07/08 15:13:49 RConv08 HUCK-NOM Sell 4 0.017 2/7/2008 3:13PM Traded 4 0.017
02/07/08 15:13:58 RConv08 ROMN-NOM Purchase 4 0.005 2/7/2008 3:13PM Traded 4 0.005
02/07/08 15:14:06 RConv08 THOMF-NOM Sell 4 0.003 2/7/2008 3:14PM Traded 4 0.003
02/07/08 15:14:14 RConv08 RROF-NOM Sell 4 0.009 2/7/2008 3:14PM Traded 4 0.009
02/07/08 15:14:29 RConv08 RConv08 Buy-bundle 1 1 2/7/2008 3:14PM Traded 1 1
02/07/08 15:14:44 RConv08 ROMN-NOM Sell 9 0.002 2/7/2008 3:14PM Traded 9 0.002
02/07/08 15:14:54 RConv08 GIUL-NOM Sell 1 0.001 2/7/2008 3:14PM Traded 1 0.001
02/07/08 15:15:02 RConv08 HUCK-NOM Sell 1 0.017 2/7/2008 3:15PM Traded 1 0.017
02/07/08 15:15:10 RConv08 THOMF-NOM Purchase 1 0.006 2/7/2008 3:15PM Traded 1 0.006
02/07/08 15:15:22 RConv08 RROF-NOM Sell 1 0.009 2/7/2008 3:15PM Traded 1 0.009
02/07/08 15:15:30 RConv08 THOMF-NOM Sell 2 0.003 2/7/2008 3:15PM Traded 2 0.003
04/06/08 13:52:28 DConv08 CLIN-NOM Ask 5 0.15 4/8/2008 11:59PM Traded 4 0.15
04/06/08 13:52:51 DConv08 CLIN-NOM Ask 1 0.14 4/8/2008 11:59PM Traded 1 0.14
04/06/08 13:52:51 DConv08 ? Traded 3 0.79 4/10/2008 6:45PM
04/06/08 13:55:08 DConv08 OBAM-NOM Bid 5 0.79 4/8/2009 11:59PM Traded 2 0.79
04/06/08 13:59:43 RConv08 RConv08 Buy-bundle 10 1 4/6/2008 1:59PM Traded 10 1
04/06/08 14:00:27 RConv08 GIUL-NOM Sell 10 0.004 4/6/2008 2:00PM Traded 10 0.004
04/06/08 14:00:41 RConv08 HUCK-NOM Sell 10 0.007 4/6/2008 2:00PM Traded 10 0.007
04/06/08 14:00:54 RConv08 ROMN-NOM Sell 10 0.01 4/6/2008 2:00PM Traded 10 0.01
04/06/08 14:01:07 RConv08 THOMF-NOM Sell 10 0.004 4/6/2008 2:01PM Traded 10 0.004
04/06/08 14:01:20 RConv08 RROF-NOM Sell 10 0.025 4/6/2008 2:01PM Traded 10 0.025
04/14/08 13:51:41 DConv08 OBAM-NOM Bid 3 0.78 4/16/2008 11:59PM Traded 3 0.78
05/03/08 12:06:18 DConv08 OBAM-NOM Ask 18 0.78 5/5/2008 11:59PM Traded 18 0.78
05/05/08 20:21:52 RConv08 MCCA-NOM Ask 20 0.94 5/7/2008 11:59PM Traded 20 0.94
05/20/08 15:44:10 PRES08-VS UREP08-VS Sell 10 0.483 5/20/2008 3:44PM Traded 1 0.483
05/20/08 15:45:29 PRES08-VS UREP08-VS Sell 10 0.482 5/20/2008 3:45PM Traded 9 0.482

Intrade

In 2010, I signed up for Intrade since the IEM was too small and had too few contracts to maintain my interest.

Payment

Paying Intrade, as a foreign company in Ireland, was a little tricky. I first looked into paying via debit card, but Intrade demanded considerable documentation, so I abandoned that approach. I then tried a bank transfer since that would be quick; but my credit union failed me and said Intrade had not provided enough information (which seemed unlikely to me, and Intrade’s customer service agreed)—and even if they had, they would charge me $10! Finally, I decided to just snail-mail them a check. I was pleasantly surprised to see that postage to Ireland was ~$1, and it made it there without a problem. But very slowly: perhaps 15 days or so before the check finally cleared and my initial $200 was deposited.

My Intrade trading

Intrade has a considerably less usable system than IEM. In IEM, selling short is very easy: you purchase a pair of contracts (yes/no) which sum to $1, and then you sell off the opposite. If I think DEM08 is too high compared to REP08, I get 1 share of each and sell the DEM08. Intrade, on the other hand, requires you to ‘sell’ a share. I don’t entirely understand it, but it seems to be equivalent.

I wanted to sell short some of the more crazy probabilities such as on Japan going nuclear or the USA attacking North Korea or Iran, but it turned out that to make even small profits on them, I would have to hold them a long time, and because their probabilities were so low already, Intrade was demanding large margins—to buy 4 or 5 shorts would lock up half my account!9

My first trade was to sell short the Intrade contract on California’s Proposition 19, which would legalize non-medical marijuana possession. I reasoned that California had recently banned gay marriage at the polls, that medical marijuana is well-known as a joke (lessening the incentive to pass Prop 19), and that its true probability of passing was more like 30%—well below its current price. The contract would expire in just 2 months, making it even more attractive.

It was at 49 when I shorted it. I put in around 20% of my portfolio (or ~$40) after consulting the Kelly criterion. 2 days later, the price had increased to 53.3, and on 4 October, it had spiked all the way to 76%. I began to seriously consider how confident I was in my prediction, and whether I was faced with a choice between losing the full $40 I had invested or buying shares at 76% (to fulfill my shorting contracts) and eating the loss of ~$20. I meditated, and reasoned that there wasn’t that much liquidity and I had found no germane information online (like a poll registering strong public support), and decided to hold onto my shares. As of 27 October, the price had plummeted all the way to 27%, and continued to bounce around the 25-35% price range. I had at the beginning decided that the true probability was in the 30% decile, and if anything, it was now underpriced. Given that, I was running a risk holding onto my shorts. So on 30 October, I bought 10 shares at 26%, closing out my shorts, and netting me $75.83, for a return of $25.83, or 50% over the month I held it.

My second trade dipped into the highly liquid 2012 US presidential elections. The partisan contracts were trading at ~36% for the Republicans and ~73% for the Democrats. I would agree that the true odds are >50% for the Democrats, since presidents are usually re-elected and the Republicans have few good-looking candidates compared to Obama, who has accomplished quite a bit in office. However, I think 73% is overstated, and further, that the markets always panic during an election and squish the ratio to around 50:50. So I sold Democrat and bought Republican. (I wound up purchasing more Republican contracts than selling Democrat contracts because of the aforementioned margin issues.)

I bought 5 Reps at 39, and shorted 1 Dem at 60.8. 2 days later, they had changed to 37.5 and 62.8 respectively. By 2010-11-26, it was 42 and 56.4. By 2011-01-01, Republicans were at 39.8 and Democrats at 56.8.

Finally, I decided that Sarah Palin had next to no chance at the Republican nomination, since she blew a major hole in her credentials by her bizarre resignation as governor, and her shares at 18% were just crazy.

I shorted 10 at 18%, since I thought the true odds were more like 10%. 2 days later, they had risen to 19%. By 26 November, they were still at 19%, but the odds of her announcing a candidacy had risen to 75%. I’d put the odds of her announcing a run at ~90% (a mistake, given that she ultimately decided against running in October 2011), but I don’t have any spare cash to buy contracts. I could sell out of the anti-nomination contracts and put that money into the announcement, but I’m not sure this is a good idea—the announcement contract is very volatile, and I dislike eating the fees. She hasn’t done too well as the Tea Party éminence grise, but maybe she prefers it to the hard work of a national campaign?

By 2011-01-01, the nominee odds were still stuck at 18% but the announcement had fallen to 62%. The latter is dramatic enough that I wondered whether my 90% odds really were correct (they probably weren’t). By June, I had begun to think that Palin knows she has little chance of winning either the nomination or presidency, and is just milking the speculation for all it’s worth. Checking on 8 June, I see that the odds of an announcement have fallen from 62% to 33% and a nomination from 18% to 5.9%—so I would have made out very nicely on the nomination contract had I held the short, but been mauled if I had made any shorts on the announcement. I am not sure what lesson to draw from this observation; probably that I am better at assessing outcomes which depend on a great many people (like a nomination) than outcomes which depend on a single individual’s psychology (like whether to announce a run or not).

Cashing out

In January 2011,10 Intrade announced a new fee structure—instead of paying a few cents per trade, one has free trading, but one’s account is charged $5 every month, or $60 a year (see also the forum announcement). Fees have been a problem with Intrade in the past due to the small amounts usually wagered—see for example one financial journalist’s 2008 complaints.

Initially, the new changes didn’t seem so bad to me, but then I compared the annual cost of this fee to my trading stake, ~$200. I would have to earn a return of 30% just to cover the fee! (This is also pointed out by many in the forum thread above.)

I don’t trade very often since I think I’m best at spotting mispricings over the long term (the CA Proposition 19 contract (WP) being a case in point; despite being ultimately correct, I could have been mauled by some of the spikes if I had tried only short-term trades). If this fee had been in place since I joined, I would be down by $30 or $40.

I’m confident that I can earn a good return like 10 or 20%, but I can’t do >30% without taking tremendous risks and wiping myself out.

And more generally, assuming that this isn’t raiding accounts11 as a prelude to shutting down (as a number of forumers claim), Intrade is no longer useful for LessWrongers like me, as it heavily penalizes small long-term bets like the ones we are usually concerned with—bets intended to be educational or informative. It may be time to investigate other prediction markets like Betfair, or just resign ourselves to non-monetary/play-money sites like PredictionBook.com.

Fortunately for my decision to cash out (I didn’t see anything I wanted to risk holding for more than a few weeks), prices had moved enough that I didn’t have to take any losses on any positions12, and I wound up with $223.32. The $5 for January had already been assessed, and there is a 5 euro fee for a check withdrawal, so my check will actually be for something more like $217, a net profit of $17.

I requested my account be closed on 5 January and the check arrived 16 January; the fee for withdrawal was $5.16 and my sum total $218.16 (a little higher than the $217 I had guessed).

Bitcoin

In May-June 2011, Bitcoin, an online currency, underwent approximately 5-6 doublings of its exchange rate against the US dollar, drawing the interest of much of the tech world and myself. (I had first heard of it when it was at 50 cents to the dollar, but had written it off as not worth my time to investigate in detail.)

During the first doubling, when it hit parity with the dollar, I began reading up on it and acquired a bitcoin of my own—a donation from Kiba to try out Witcoin, which was a social news site where votes are worth fractions of bitcoins. I then gave my thoughts on LessWrong when the topic came up:

After thinking about it and looking at the current community and the surprising amount of activity being conducted in bitcoins, I estimate that Bitcoin has somewhere between 0 and 1% chance of eventually replacing a decent-size fiat currency, which would put the value of a bitcoin at anywhere upwards of $10,000 a bitcoin. (Match the existing outstanding number of whatever currency to 21m bitcoins. Many currencies have billions or trillions outstanding.) Cut that in half to $5000, and call the probability an even 0.5% (average of 0 and 1%), and my expected utility/value for possessing a coin is $25 a bitcoin ($5000 × 0.005 = $25).

I was more than a little surprised that by June, my expected value had already been surpassed by the market value of bitcoins. Which leads to a tricky question: should I sell now? If Bitcoin is a bubble, as frequently argued, then I would be foolish not to sell my 5 bitcoins for a cool $130 (excluding transaction costs). But… I had not expected Bitcoin to rise so much, and if Bitcoin did better than I expected, doesn’t it follow that I should no longer believe the probability of success is merely 0.5%? Shouldn’t it have increased a bit? Even if it increased only to 0.7%, that would make the EV more like $35, and so I would continue to hold bitcoins.
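
The underlying arithmetic is a bare expected-value calculation; a sketch (my restatement of the quoted reasoning, using its figures):

```python
def bitcoin_ev(p_success, value_if_success=5000.0):
    """EV of one bitcoin under a win-or-worthless model."""
    return p_success * value_if_success

print(bitcoin_ev(0.005))  # $25: the original estimate
print(bitcoin_ev(0.007))  # $35: the mildly-updated estimate
# Holding remains rational whenever the EV exceeds the market price, so a
# price rise which also justifies raising p can justify continuing to hold.
```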

The stakes are high. It is a curious problem, but it’s also a prediction market. One is simply predicting what the ultimate price of bitcoins will be. Will they be worthless, or a global currency? The current price is the probability, against an unknown payoff. To predict the latter, one simply holds bitcoins. To predict the former, one simply sells bitcoins. Bitcoins are not commodities in any sense. Buying a cow is not a prediction market on beef, because the value of beef can’t drop to literally 0: you can always eat it. You can’t eat bitcoins or do anything at all with them. They are even more purely money than fiat money (the US government having perpetual problems with the zinc or nickel or copper in its coins being worth more as metal than as coins, and dollars are a tough linen fabric).

Mencius Moldbug turns out to have a similar analysis of the situation:

If Bitcoin becomes the new global monetary system, one bitcoin purchased today (for 90 cents, last time I checked) will make you a very wealthy individual. You are essentially buying Manhattan for a quarter. There are only 21 million bitcoins (including those not yet minted). (In my design, this was a far more elegant 2^64, with quantities in exponential notation. Just sayin’.) Mapped to $100 trillion of global money, to pull a random number out of the air, you become a millionaire. Wow!

So even if the probability of Bitcoin succeeding is epsilon, a million to one, it’s still worthwhile for anyone to buy at least a few bitcoins now. The currency thus derives an initial value from this probability, and boots itself into existence from pure worthlessness—becoming a viable repository of savings. If a very strange, dangerous and unstable one. I think the probability of Bitcoin succeeding is very low. I would not put it at a million to one, though, so I recommend that you go out and buy a few bitcoins if you have the technical chops. My financial advice is to not buy more than ten13, which should be F-U money if Bitcoin wins.

Bitcoin cumulatively represents my largest-ever wager in a prediction market; at stake was >$130 in losses (if bitcoins went to zero), or indefinite thousands. It will be very interesting to see what happens. By 2011-08-05, Bitcoin had worked its way down to around $10/₿, making my net worth $26; I did spend several bitcoins on the Silk Road, though. By 2011-11-23, it had trended down to $2.35/₿, but due to a large donation of 20 bitcoins, I spent most of my balance at the Silk Road, leaving me with 4.7 bitcoins. Overall, not a good start. By July 2012, donations brought my stock up to ₿12.5 with prices trading at $5-7. After an unexpected spike on 17 July to $9, I did some reading and learned that “pirateat40” (the operator of a possible Ponzi scheme) was boasting in #bitcoin (Reddit discussion) of using the funds to manipulate the market in an apparent scheme, and also mocking the ignorance of most buyers and sellers for not paying attention to the Bitcoin forums or IRC channel. pirateat40’s manipulation and insinuation of future plans soured me on holding many bitcoins, and I resolved to sell if the price on MtGox went quickly back up to >$9; it did so the next day (18 July), and I sold at $9.17. Withdrawing from MtGox turned out to be a major pain, with withdrawal requiring documentation like a passport, and a bank transfer costing $25. I ultimately used the #bitcoin-otc channel to arrange a swap with “nanotube” of my $115 MtGox dollars for an equivalent donation to my PayPal account. The next day, the price had fallen to $7.77; demonstrating why I don’t try to time markets, by 11 August the price had jumped to $11.50. This was a little worrisome for my long-term view that there was a good chance the Ponzi scheme would be used in market manipulation or collapse, but there was still much time left. A few days later, the price had spiked as high as $15, and I felt like quite a fool; but that’s the marvelous thing about markets, one day you are a genius and the next you are a fool. Unexpectedly, pirateat40 announced the dissolution of his BTCST. Was it a Ponzi or not? No one knew. Perhaps on fears of that, or perhaps because pirateat40 was fleeing with the funds, on 18-19 August the price began dropping, and kept dropping, all the way through $10, then $9, then $8. Watching this, I resolved to buy back in. It was very difficult to find anyone who would accept PayPal on #bitcoin-otc, but ultimately Namegduf agreed to a MtGox voucher swap, and I got $60 which I then spent at $7.8 for ₿7.6. In late February 2013, Bitcoin was almost at its all-time high of $31, and I happened to also need cash badly; I had received additional donations, so I sold out my ₿5.79 at $31.5 even as the price reached $32—I just wanted to be out of what might have been another bubble. I then watched slack-jawed as the bubble failed to pop, failed to keep its price level, and instead doubled to $60, doubled again to $120, hit $159 on 2013-04-07 (having quintupled since I decided to sell out), and finally peaked at $266 2 days later before falling back down to a steady state of ~$100. That sale was not a great testament to my market-timing skills, and prompted me to rethink my opinions about Bitcoin.
At various points through August 2013, I sold on #bitcoin-otc ₿0.5 for $52, ₿0.28 for $50, ₿1.15 for $120, ₿0.5 for $66 & $64, ₿0.25 for $32, ₿0.1 for $13, and ₿1.0 for $127 & $129—leaving me uncomfortably exposed at ₿18 (having had difficulty finding trustworthy buyers). On 2013-10-02, the news burst that Silk Road had been busted & DPR arrested & charged; Bitcoin immediately began dropping by $20-$40 from ~$127 (depending on exchange), so I purchased ₿2.7 at $105 each.

(One might wonder why I don’t use the fairly active Bets of Bitcoin prediction market; that is because the payout rules are insane and I have no idea how to translate the “total weighted bets” into actual probabilities—betting blind is never a good idea. And I have no interest in ever using BitBet, as they brazenly steal from users.)

Zerocoin

A research paper (overview) introduced zero-knowledge proofs of the destruction of coins in a hypothetical Bitcoin variant (Zerocoin); this allowed the creation of new coins out of nothing while still keeping total coins constant (simply require a proof that for every new coin, an older coin was destroyed). In other words, truly anonymous coins, rather than the pseudonymity and trackability of Bitcoin. Existing coin mixes are not guaranteed to work & to not steal your coins, so this scheme could be useful to Bitcoin users and worth adding. Efficiency concerns meant that the original version was impossible to add, but the researchers/developers kept working on it and shrunk the proofs to the point where they should be feasible to use. But they also announced they were looking into launching the functionality as an altcoin.

This raises a question: would this potential “Zerocoin” altcoin be worth possessing? That is, might it be more than simply a testbed for the zero-knowledge proofs, to see how they perform before merging into Bitcoin proper?

I am generally extremely cynical about altcoins as being generally pump-and-dump schemes like Litecoin; I except Namecoin, because distributed domain names are an interesting application of the global ledger, and the proof-of-stake altcoins, as interesting experiments on alternatives to Bitcoin’s proof-of-work solution. Anonymity seems to me to be even more important than Namecoin’s DNS functionality—witness the willingness of people to pay the fees to laundries like Bitcoin Fog without even a guarantee they will receive safe bitcoins back (or look at the Tor network itself). So I see basically a few possible long-term outcomes:

  1. Zerocoin fizzles out and the network disintegrates because no one cares
  2. Zerocoin core functionality is captured in Bitcoin and it disintegrates because it is now redundant
  3. Zerocoin survives as an anonymity layer: people buy zerocoins with tainted bitcoins, then sell the zerocoins for unlinked bitcoins
  4. Zerocoin replaces Bitcoin

Probability-wise, I’d rank outcome #1 as the most likely; #2 is likely but not very likely, because the Bitcoin Foundation seems increasingly beholden to corporate and government overseers and, even if not actively opposed, will engage in motivated reasoning, looking for reasons to reject Zerocoin functionality and avoid rocking its boat; #3 seems a little less likely, since people can use the laundries or alternative tumbling solutions like CoinJoin, but still fairly probable; #4 very improbable, like 1%.

To elaborate a little more on the reasoning for believing #2 unlikely: my belief that the Foundation & core developers are not keen on Zerocoin is based on my personal intuition about a number of things:

  • the decision by the Zerocoin developers to pursue an altcoin at all, which is a massive waste of effort if they had no reason to expect it to be hard to merge it in (or if the barriers to Zerocoin use were purely technical); the altcoin is a very recent decision, and they were clear upfront that “Zerocoin is not intended as a replacement for Bitcoin” (written 2013-04-11).
  • signs that the Foundation & core developers may be gradually shifting into an accommodationist mode of thought—attending government hearings to defend Bitcoin, repeatedly stating Bitcoin is not anonymous but pseudonymous and so is no threat to the status quo (which is misleading and, even technically interpreted, would be torpedoed by Zerocoin), and discussing whitelisting addresses. To put it crudely, we may be in the early stages of them “selling out”: moderating their positions and cooperating with the Powers That Be to avoid rocking the boat and achieve things they value more, like mainstream acceptance & praise. (I believe something very similar has happened to other movements.)
  • the lack of any really positive statements about Zerocoin, despite the technical implications: the holy grail achieved—truly anonymous decentralized digital cash! With Zerocoin added in, the impossible will have become possible. It says a lot about how far Bitcoin has drifted from its libertarian cypherpunk roots that Zerocoin is not a top priority.

Price-wise, #1 and #2 mean zerocoins go to zero, but on the plus side, mining or buying at least signals support and may have positive effects on the Foundation or Bitcoin community. Outcome #4 (replacing Bitcoin) means obviously ludicrous profits as Zerocoin goes from pennies or a dollar each to $500+ (assuming for convenience Zerocoin also sets 21m coins). Interestingly, outcome #3 (anonymity layer) also means substantial profits, because the price of zerocoins will be more than pennies due to the float from Bitcoin users washing coins. Imagine that there are 1m zerocoins actively traded, that Bitcoin users want to launder $10m of bitcoins a year, and that it takes on average a day for each Bitcoin user to finish moving in and out of zerocoins; then each day there’s ~$27,378 locked up in zerocoins, and spread over the 1m zerocoins, solely from the float alone, each zerocoin must be worth ~3¢ (which is a nice profit for anyone who, say, bought zerocoins at 1¢ after the Zerocoin genesis block).
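
A sketch of that float arithmetic (mine; the $10m/year, 1-day, and 1m-coin figures are the hypothetical inputs from the paragraph above):

```python
def float_value_per_coin(laundered_per_year, days_held, total_coins):
    """Value per coin implied solely by the laundering float:
    dollars locked up at any moment, spread over all coins."""
    locked_up = laundered_per_year / 365.25 * days_held
    return locked_up / total_coins

# $10m/year laundered, 1 day per wash, 1m zerocoins traded:
print(float_value_per_coin(10_000_000, 1, 1_000_000))  # ~0.027, i.e. ~3 cents
```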

I personally think Bitcoin should incorporate Zerocoin if the resource requirements are not too severe, and supporting Zerocoin may help this. And if it doesn’t, then it may well be profitable. In either case, I benefit. So if/when the Zerocoin genesis block is released, I will consider trying to mine it or establishing a price floor (e.g. publicly committing $100 to buying zerocoins at 1¢ from any and all comers).

Predictions:

  • Zerocoin as functioning altcoin network within a year: 65%
  • Zerocoin market cap >$7,700,000,000 within 5 years (conditional on launch): 1%
  • Zerocoin market cap >$7,000,000 within 5 years (conditional on launch): 7%
  • Zerocoin functionality incorporated into Bitcoin within 1 year: 33%
  • Zerocoin functionality incorporated into Bitcoin within 5 years: 45%

Personal bets

Overall, I am for betting because I am against bullshit. Bullshit is polluting our discourse and drowning the facts. A bet costs the bullshitter more than the non-bullshitter, so the willingness to bet signals honest belief. A bet is a tax on bullshit; and it is a just tax, tribute paid by the bullshitters to those with genuine knowledge.14

Besides prediction markets, one can make person-to-person bets. These are not common because they require a degree of trust due to the issue of who will judge a bet, and I have not found many people online that I would be willing to bet with or vice versa. Below is a list of attempts:

| Person | Bet | Accepted | Date offered | Expiration | Theirs | My $ | My P | Bet position | Result | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| mostlyacoustic | Entrance fee/RSVP required at NYU lecture | No | 2011-03-03 | 2 days | $5 | $100 | <5% | Against | Win | LW discussion |
| Eliezer Yudkowsky | HP MoR will win Hugo for Best Novel 2013-2017 | Yes | 2012-04-12 | 2017-09-05 | $5 | $100 | 5% | Against | Win | LW discussion |
| Filipe | Cosma Shalizi believes that P=NP | Yes | 2012-06-04 | 1 week | $100 | $100 | 1% | Against | Win | I forgave the amount due to his personal circumstances. |
| mtaran | Kim Suozzi's donation solicitations not a scam | No | 2012-08-19 | 2013-01-01 | $10 | $100 | 90% | Against | Loss | LW discussion; in negotiating the details, mtaran didn't seem to understand betting, so the bet fell through. |
| chaosmosis | Mitt Romney loses the 2012 Presidential election | No | 2012-10-15 | 2013-11-03 | $30 | $20 | 70% | For | Win | |
| David Lee | >1m people using Google Glass-style HUD in 10 years | No | 2013-06-08 | 10 years | ? | ? | 50% | Against | | Fortune discussion; Lee's cavalier acceptance of 100:1 odds indicated he was not serious, so I declined. As of 4 years later, Google Glass is off the market and there is no apparent trend towards anyone using them. |
| chaosmosis | HP MoR: the dead character Hermione to reappear as a ghost | No | 2013-06-30 | 1 year | ? | $25 | 30% | Against | Win | Reddit discussion |
| jacoblyles | MIRI/CFAR to evolve into terrorist organizations | No | 2012-10-18 | 30 years | ? | <$1000 | <1% | Against | | LW discussion |
| Patrick Robotham | Whether he could prove to a third party that he took an economics course | Yes | 2013-09-20 | immediate | $50 | $10 | 50% | Against | Loss | |
| Mparaiso | >30 Silk Road-related arrests in the year after the bust | No | 2013-10-08 | 2014-10-01 | $20 | $100 | 20% | Against | | offer, PB.com |
| qwertyoruiop | Bitcoin ≤$50/₿ between October & December 2013 | Yes | 2013-10-19 | 2013-12-19 | ₿0.1 | ₿0.1 | 5% | Against | Win | PB.com; signed contract; qwertyoruiop paid early, since once Bitcoin reached a peak of $900 it was obviously not going to be ≤$50 again, as indeed it was not. |
| everyone | Sheep Marketplace to shut down in 6 months | No | 2013-10-30 | 2014-04-30 | ₿2.3 | ₿1.0 | 40% | For | Loss | Reddit post |
| * | Sheep Marketplace to shut down in 12 months | No | 2013-10-30 | 2014-10-30 | ₿0.66 | ₿1.0 | 50% | For | Win | * |
| * | BlackMarket Reloaded to shut down in 6 months | No | 2013-10-30 | 2014-04-30 | ₿3.0 | ₿1.0 | 35% | For | Win | * |
| * | BlackMarket Reloaded to shut down in 12 months | No | 2013-10-30 | 2014-10-30 | ₿1.5 | ₿1.0 | 50% | For | Win | * |
| Delerrar | Nanotube is providing escrow for the 4 BMR/Sheep bets | No | 2013-10-30 | 2013-10-31 | ₿0.1 | ₿0.1 | <5% | For | Win | Offer on Reddit |
| Robin Hanson | The first AI to be an upload/brain emulation | No? | 2016-10-06 | NA | $100 | $1000 | <10% | Against | | Offer on OB |
| Carl King | Any use of US nukes in first 6 months of Trump presidency | No | 2017-01-20 | 2017-07-20 | $100 | $0 | <3% | Against | Win | On Twitter. Carl King offered to bet anyone and then stopped responding when Geoff Greer & I tried to make one with him at even and then 3:1 odds. |
| quanticle | Trump will still be President 2019-01-01 | Yes | 2017-03-30 | 2019-01-01 | ₿0.10 | ₿0.01819 | 17% | Against | Loss | Twitter. While I did lose (as expected), in light of subsequent events I still think my probability was more accurate than quanticle's, as prediction markets like PredictIt rapidly & consistently gave Trump a >10% and usually >20% probability of not finishing his first 2 years. |

Predictions

“I re­call, for ex­am­ple, sug­gest­ing to a reg­u­lar loser at a weekly poker game that he keep a record of his win­nings and loss­es. His re­sponse was that he used to do so but had given up be­cause it proved to be un­lucky.”

Ken Bin­more, Ra­tio­nal De­ci­sions

Markets teach humility to all except those who have very good or very poor memories. Writing down precise predictions is like spaced repetition: it's brutal to do because it is almost a paradigmatic long-term activity, being wrong is unpleasant & unrewarding15, and it requires 2 skills, formulating precise predictions and then actually predicting. (For spaced repetition, writing good flashcards and then actually regularly reviewing.) There are lots of exercises to try (calibrating yourself using trivia questions on obscure historical events, geography, etc.), but they only take you so far; it's the real-world near-term and long-term predictions that give you the most food for thought, and those require a year or three at minimum. I've used PB heavily for 11 months now, and I used prediction markets for years before PB, and only now do I begin to feel like I am getting a grasp on predicting. We'll look at these alternatives.

Prediction sites

“The best salve for fail­ure—to have quite a lot else go­ing on.”

Alain de Bot­ton

Besides the specific mechanism of prediction markets, one can just make and keep track of predictions oneself. They are much cheaper than prediction markets or informal betting and correspondingly tend to elicit many more responses.16

There are a num­ber of rel­e­vant web­sites I have a lit­tle ex­pe­ri­ence with; some as­pire to be like David Brin’s pro­posed pre­dic­tion reg­istries, some do not:

  1. PredictionBook (PB) is a general-purpose free-form prediction site. PB is a site intended for personal use and small groups registering predictions; the hope was that LessWrongers would use it whenever they made predictions about things (as they ought to, in order to keep their theories grounded in reality). It hasn't seen much uptake, though not for lack of my trying.

    I per­son­ally use it heav­ily and have in­put some­where around 1000 pre­dic­tions, of which around 300 have been judged. (I ap­par­ently am rather un­dercon­fi­den­t.) A good way to get started is to go to the list of up­com­ing pre­dic­tions and start en­ter­ing in your own as­sess­ment; this will give you feed­back quick­ly.

  2. I find the Long Bets concept interesting, but it has serious flaws for anyone who wants to do more than make a public statement like Warren Buffett has. Forcing people to put up money has kept real-money prediction markets pretty small in both participants and volume; and how much more so when all proceeds go to charity? No wonder that, half a decade or more later, there are only a score of money-bets going, even with prominent participants like Warren Buffett. Non-money markets or prediction registries can work in the higher volumes necessary for learning to predict better. Single-handedly on PB, I have made 10 times the number of predictions as on all of Long Bets. Where will I learn & improve more, Long Bets or PB? (It was easy for me to borrow all the decent predictions and register them on PB.)

  3. Fu­ture­Time­line is a main­tained list of pro­jected tech­no­log­i­cal mile­stones, events like the Olympics, and mega-con­struc­tion dead­lines.

    Fu­ture­Time­line does not as­sign any prob­a­bil­i­ties and does­n’t at­tempt to track which came true; hence, it’s more of a list of sug­ges­tions than pre­dic­tions. I have copied over many of the more fal­si­fi­able ones to PB.

  4. Wrong­To­mor­row: a site that was de­voted solely to reg­is­ter­ing and judg­ing pre­dic­tions made by pun­dits (such as the in­fa­mous ).

    Unfortunately, WT was moderated, and when WT didn't see a sudden massive surge in contributions, moderation fell behind badly until eventually the server was just turned off in favor of the author's other projects. I still managed to copy a number of predictions off it into PB, however. WT is an example of a general failure mode for collections of predictions: no follow-through. Predictions are the paradigmatic Long Content, and WT will probably not be the last site to learn this the hard way.

And the last site demonstrates how registries like Brin's proposed prediction registries have failed to come into existence. One of the few approximations to a prediction registry, Philip Tetlock's justly famous 2005 book Expert Political Judgment: How Good Is It? How Can We Know?, which discusses an ongoing study that has tracked >28,000 predictions by >284 experts, proves why: experts are not accurate and can be outperformed by embarrassingly simple models, and they do not learn from their experience, attempting to retroactively justify their predictions with reference to counterfactuals. (If wishes were fishes… Predictions are about the real world, and in the real world, hacks and bubbles are normal, expected phenomena. A verse I saw somewhere runs: "Since the beginning / not one unusual thing has happened". If your predictions can't handle normal exogenous events, then they are still wrong. Tetlock identifies this as a common failure mode of hedgehog-style experts: "I was actually right! but for X Y Z…") And looking around, I think I agree with Eliezer Yudkowsky that when the vast majority of people make a prediction, it is not an actual prediction to be judged right or wrong but an entertainment intended to signal partisan loyalties.

Another feature worth mentioning is that prediction sites do not generally allow retrospective predictions, because that is easily abused even by the honest (who may be suffering from hindsight bias). Prediction markets, needless to say, universally ban retrospective predictions. So predicting generally doesn't give fast feedback—intrinsically, you can't learn very much from short-term predictions, because either there's serious randomness involved, such that it takes hundreds of predictions to begin to improve, or the predictions are so over-determined by available information that one learns little from the successes.

Prediction sources

A short list of sites which make it easy to find new­ly-cre­ated pre­dic­tions or (for quicker grat­i­fi­ca­tion & cal­i­bra­tion) pre­dic­tions which are about to reach their due dates:

IARPA: The Good Judgment Project

In 2011, the Intelligence Advanced Research Projects Activity (IARPA) began the Aggregative Contingent Estimation (ACE) Program, pitting 5 research teams against each other to investigate and improve prediction of geopolitical events. One team, the Good Judgment Project (see the Wired interview with ), solicited college graduates for the 4-year time period of ACE to register predictions on selected events, for a $150 honorarium. A last-minute notice was posted on LessWrong, and I immediately signed up and was accepted, as I predicted.

The ini­tial sur­vey upon my ac­cep­tance was long and de­tailed (cal­i­bra­tion on geopol­i­tics, fi­nance, and re­li­gion; per­son­al­ity sur­veys with a lot of fox/hedgehog ques­tions; ba­sic prob­a­bil­i­ty; a crit­i­cal think­ing test, the CRT; ed­u­ca­tional test scores; and then what looked like a full ma­trix IQ test—we were al­lowed to see some of our own re­sults, like the sea­son 2 cal­i­bra­tion test17). The fi­nal re­sults will no doubt turn up many in­ter­est­ing cor­re­la­tions or lack of cor­re­la­tion. I look for­ward to com­plet­ing the study. At the very least, they will sup­ply a few hun­dred pre­dic­tions I can put on Pre­dic­tion­Book.­com—­for­mu­lat­ing a qual­ity pre­dic­tion (fal­si­fi­able, ob­jec­tive, and in­ter­est­ing) can be the hard­est part of pre­dict­ing.

Season 1 results

My initial batch of short-term predictions did well; even though I made a major mistake when I fumble-fingered a prediction about Mugabe (I bet that he would fall from office in a month, when I believed the opposite), I was still up by $700 in its play-money. I have, naturally, been copying my predictions onto PredictionBook.com the entire time.

De­spite a very ques­tion­able pre­dic­tion clo­sure by IARPA which cost me $20018, I fin­ished 2011 well in the green. My re­sults:

  • Your to­tal earn­ings for 84 out of 85 closed fore­casts is 15,744.
  • You were ranked 28 among the 204 fore­cast­ers in Group 3c.

Not too shab­by; I was ac­tu­ally un­der the im­pres­sion I was do­ing a lot worse than that. Hope­fully I can do bet­ter in 2012—I seem fairly ac­cu­rate, so I ought to make my bets larg­er.

Season 2 results

Naturally I signed up for Season 2. But it took the GJP months to actually send us the honorarium, and for Season 2 they switched to a much harder-to-use prediction-market interface which I did not like at all. I used up my initial allotment of money, but I'm not sure how actively I will participate: there's still some novelty, but the UI was bad enough that all the fun is gone. The later addition of 'trading agents', where one could just specify one's probability and it would make appropriate trades automatically, lured me back in for some trading, but as one would expect from my disengagement, my final results were far worse than for season 1: I ranked 184 out of 245 (~75th percentile).

I might as well stick around for sea­son 3. Maybe I will try harder this time.

Season 3 results

For some rea­son, I never saw my sea­son 3 re­sults; search­ing my emails turns up no men­tions of the offi­cial re­sults be­ing re­leased (only some is­sues with a few con­tro­ver­sial con­tract de­ci­sion­s). I don’t re­call my sea­son 3 re­sults be­ing un­usu­ally good or bad.

Season 4 results

My re­sults in 2014-2015 (the fi­nal sea­son of the IARPA com­pe­ti­tion) ranked me 41 of 343 in my ex­per­i­men­tal group. This was much bet­ter than sea­son 2, and slightly bet­ter than sea­son 1 (~12th per­centile vs 13th).

Calibration

“The best lack all con­vic­tion, while the worst are full of pas­sion­ate in­ten­si­ty.”

Yeats, "The Second Coming"

Faster even than mak­ing one’s own pre­dic­tions is the pro­ce­dure of cal­i­brat­ing your­self. Sim­ply put, in­stead of buy­ing shares or not, you give a di­rect prob­a­bil­i­ty: your 10% pre­dic­tions should come true 10% of the time, your 20% pre­dic­tions true 20% of the time, etc. This is not so much about fig­ur­ing out the true prob­a­bil­ity of the event or fact in the real world but rather about your own ig­no­rance. It is as much about learn­ing hu­mil­ity and avoid­ing hubris as it is about ac­cu­ra­cy. You can be well-cal­i­brated even mak­ing pre­dic­tions about top­ics you are com­pletely ig­no­rant of—sim­ply flip a coin to choose be­tween 2 pos­si­bil­i­ties. You are still bet­ter than some­one who is equally ig­no­rant but ar­ro­gantly tries to pick the right an­swers any­way and fail­s—he will be re­vealed as mis­cal­i­brat­ed. If they are ig­no­rant and don’t know it, they will come out over­con­fi­dent; and if they are knowl­edge­able and don’t re­al­ize it, they will come out un­der­con­fi­dent. (Note that learn­ing of your over­con­fi­dence is less painful than in a pre­dic­tion mar­ket, where you lose your mon­ey.)
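
Mechanically, a calibration check is nothing more than bucketing one's predictions by stated probability and comparing each bucket against the observed frequency; a minimal sketch (the predictions below are invented toy data):

```python
# Toy calibration check: group predictions by stated probability and see
# how often each group actually came true. The data below is invented.
from collections import defaultdict

predictions = [  # (stated probability, did it come true?)
    (0.9, True), (0.9, True), (0.9, True), (0.9, False),
    (0.6, True), (0.6, False), (0.6, True),
    (0.1, False), (0.1, False), (0.1, True),
]

buckets = defaultdict(list)
for stated, outcome in predictions:
    buckets[stated].append(outcome)

for stated in sorted(buckets):
    outcomes = buckets[stated]
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {stated:.0%}: observed {observed:.0%} ({len(outcomes)} predictions)")
```

A well-calibrated predictor's observed frequencies track the stated probabilities; a consistent gap in one direction is over- or under-confidence.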

Thus, one can sim­ply com­pile a trivia list and test peo­ple on their cal­i­bra­tion; there are at least 4 such on­line quizzes along with the board game Wits & Wa­gers. (Con­sul­tant Dou­glas Hub­bard has a book How to Mea­sure Any­thing: Find­ing the Value of “In­tan­gi­bles” in Busi­ness which is prin­ci­pally on the topic of ap­ply­ing a com­bi­na­tion of cal­i­bra­tion and Fermi es­ti­mates to many busi­ness prob­lems, which I found imag­i­na­tive & in­ter­est­ing.) These tests are also use­ful for oc­ca­sional in­de­pen­dent checks on whether you eas­ily suc­cumb to bias or mis­cal­i­bra­tion in other do­mains; I per­son­ally seem to do rea­son­ably well19.

Some professional groups do much better at forecasting than others. Two of the key factors found by Armstrong and other forecasting researchers are that the better groups have fast and clear feedback20 and, conversely, that Tetlock's "hedgehogs" were characterized by constant attempts to rationalize unexpected outcomes and a refusal to falsify their cherished world-view. Trivia questions, and to a lesser extent the predictions on PredictionBook.com, offer both factors.

1001 PredictionBook Nights

I ex­plain what I’ve learned from cre­at­ing and judg­ing thou­sands of pre­dic­tions on per­sonal and re­al-world mat­ters: the chal­lenges of main­te­nance, the lim­i­ta­tions of pre­dic­tion mar­kets, the in­ter­est­ing ap­pli­ca­tions to my other es­says, skep­ti­cism about pun­dits and un­re­flec­tive per­sons’ opin­ions, my own bi­ases like op­ti­mism & plan­ning fal­la­cy, 3 very use­ful heuristics/approaches, and the costs of these ac­tiv­i­ties in gen­er­al. (Plus an ex­tremely geeky par­ody of Fate/Stay Night.)

(Ini­tial dis­cus­sion on Less­Wrong.)

I am the
Be­lief is my body and choice is my blood.
I have recorded over a thou­sand pre­dic­tions,

Nor aware of hope
Have with­stood pain to up­date many times
Wait­ing for truth’s ar­rival.
This is the one un­cer­tain path.
My whole life has been…
Un­lim­ited Bayes Works!21

In October 2009, the site PredictionBook.com was announced on LW. I signed up in July 2010, as tracking free-form predictions was the logical endpoint of my dabbling in prediction markets, and I had recently withdrawn from Intrade due to fee changes. Since then I have been the principal user of PB.com, and a while ago, I registered my 1001st prediction. (I am currently up to >1628 predictions, with >383 judged; PB in total has >4258 predictions.) I had to write and research most of them myself and they represent a large time investment. To what use have I put the site, and what have I gotten out of the predictions?

Using PredictionBook

“Our er­rors are surely not such aw­fully solemn things. In a world where we are so cer­tain to in­cur them in spite of all our cau­tion, a cer­tain light­ness of heart seems health­ier than this ex­ces­sive ner­vous­ness on their be­half.”

William James, "The Will to Believe", section VII

Using PredictionBook taught me three things as far as such sites go:

  1. I learned the value of centralizing (and ) predictions of interest to me. I ransacked LongBets.org, WrongTomorrow.com, , FutureTimeline.net, and various collections of predictions like Arthur C. Clarke's list, LessWrong's own annual prediction threads (2010, 2011), or simply random comments on LW (sometimes Reddit too). This makes searching for previous predictions easier, graphs all my registered predictions, and makes backups a little simpler. WrongTomorrow promptly vindicated my paranoia by dying without notice. I now have a reply to the oft-repeated plea for a "predictions registry": no one cares, so if you want one, you need to do it yourself.

  2. I realized that using prediction markets had narrowed my appreciation of what predictions are good for. IEM & Intrade had taught me contempt for certain pundits (and respect for ) because they would yammer on about issues where I knew better from the relevant market; but there are very few liquid markets on either site, and so I learned this for only a few things like the US Presidential elections. Prediction markets will be flawed for the foreseeable future, with individual contracts subject to long-shot bias22 or simply bizarre claims due to illiquidity23; for these things, one must go elsewhere or not go at all.

    At worst, this fix­a­tion on pre­dic­tion mar­ket­s—and re­al-money pre­dic­tion mar­ket­s—­may lead one to en­gage in epic yak-shav­ing in striv­ing to change US laws to per­mit pre­dic­tion mar­kets! I am re­minded of Thore­au:

    This spend­ing of the best part of one’s life earn­ing money in or­der to en­joy a ques­tion­able lib­erty dur­ing the least valu­able part of it re­minds me of the Eng­lish­man who went to In­dia to make a for­tune first, in or­der that he might re­turn to Eng­land and live the life of a po­et. He should have gone up the gar­ret at once.

  3. I learned how hard it is to write good pre­dic­tions. Most of my be­liefs are not even fal­si­fi­able. Even the ones which seem con­crete and em­pir­i­cal wig­gle and squirm when one tries to pin them down to the point where a third-party could judge whether they came true (they es­pe­cially squirm when they don’t come true).

Noted predictions

“Robert Mor­ris has a very un­usual qual­i­ty: he’s never wrong. It might seem this would re­quire you to be om­ni­scient, but ac­tu­ally it’s sur­pris­ingly easy. Don’t say any­thing un­less you’re fairly sure of it. If you’re not om­ni­scient, you just don’t end up say­ing much. More pre­cise­ly, the trick is to pay care­ful at­ten­tion to how you qual­ify what you say…He has an al­most su­per­hu­man in­tegri­ty. He’s not just gen­er­ally cor­rect, but also cor­rect about how cor­rect he is. You’d think it would be such a great thing never to be wrong that every­one would do this. It does­n’t seem like that much ex­tra work to pay as much at­ten­tion to the er­ror on an idea as to the idea it­self. And yet prac­ti­cally no one does.”

Paul Gra­ham

Do any par­tic­u­lar sets of pre­dic­tions come to my mind? Yes:

  1. My largest out­stand­ing col­lec­tion are about the un­re­leased Evan­ge­lion movies & man­ga; I re­gard their up­com­ing re­leases as ex­cel­lent chances to test my the­o­ries about Evan­ge­lion in­ter­pre­ta­tion in a way that is usu­ally im­pos­si­ble when it comes to lit­er­ary in­ter­pre­ta­tion
  2. For my per­sonal Adder­all dou­ble-blind tri­al, I recorded 16 pre­dic­tions about a trial (guess­ing whether it was placebo or Adder­all) to try to see how strong an effect I could di­ag­nose, in ad­di­tion to whether there was one at all. (I also did one for modafinil & )
  3. Dur­ing the big Bit­coin bub­ble, I recorded a num­ber of pre­dic­tions on Red­dit & LW and fol­lowed up on a num­ber of them; I be­lieve this was ed­u­ca­tional for those in­volved—at the least, I think I tem­pered my own en­thu­si­asm by not­ing the reg­u­lar fail­ure of the most op­ti­mistic pre­dic­tions and the very low Out­side View prob­a­bil­ity of a take-off
  4. I made qual­i­ta­tive pre­dic­tions in for 2010 & 2011, but I’ve re­frained from record­ing them be­cause I’ve been ac­cused of be­ing sub­jec­tive in my eval­u­a­tions; for 2012 & 2013, I bit the bul­let.
  5. For my mod­el­ing & pre­dic­tions of when Google will kill its var­i­ous prod­ucts, I reg­is­tered my own ad­just­ments to the fi­nal set of 5-year sur­vival pre­dic­tions so as to com­pare my per­for­mance with the mod­el’s per­for­mance 5 years later

Benefits from making predictions

Day ends, mar­ket closes up or down, re­porter looks for good or bad news re­spec­tive­ly, and writes that the mar­ket was up on news of In­tel’s earn­ings, or down on fears of in­sta­bil­ity in the Mid­dle East. Sup­pose we could some­how feed these re­porters false in­for­ma­tion about mar­ket clos­es, but give them all the other news in­tact. Does any­one be­lieve they would no­tice the anom­aly, and not sim­ply write that stocks were up (or down) on what­ever good (or bad) news there was that day? That they would say, “hey, wait a min­ute, how can stocks be up with all this un­rest in the Mid­dle East?”24

When I do use pre­dic­tions, I’ve no­ticed some di­rect ben­e­fits:

  • Giving probabilities can make an analysis clearer (how do I know what I think until I see what I predict?); when I speculated on the identity of Mike Darwin's patron (above, 'Notes'), the very low probabilities I assigned in the conclusion to any particular billionaire make clear that I repose no real confidence in any of my guesses and that this is more of a Fermi-problem puzzle or exercise than anything else. (And indeed, none of them were correct.) I believe that sharpening my analyses has also made me better at spotting political bloviation and pundits pontificating:

    “Don’t ask whether pre­dic­tions are made, ask whether pre­dic­tions are im­plied.” –Steven Kaas

  • Go­ing on the record with time-stamps can turn sour-grapes into a small vic­to­ry. If one read my ar­ti­cle and saw a foot­note to the effect that the Bit­coin fo­rum ad­min­is­tra­tors were cen­sors who re­moved any dis­cus­sion of the Silk Road, such an ac­cu­sa­tion is rather less con­vinc­ing than a foot­note link­ing to a pre­dic­tion that a par­tic­u­lar thread would be re­moved and not­ing that as the reader can ver­ify for them­selves, said thread was in­deed sub­se­quently delet­ed.

One of the things I hoped would make my site un­usual was reg­u­larly em­ploy­ing pre­dic­tion; I haven’t been able to do it as often as I hoped, but I’ve still used it in 19 pages:

  • : pro­jec­tions about fin­ish­ing writing/research pro­jects, and site pageviews
  • : whether I will con­tinue to use cer­tain soft­ware tools cho­sen in ac­cor­dance with its prin­ci­ples
  • : suc­cess of the 2012 projects
  • : pre­dict­ing the WMF’s half-hearted efforts at ed­i­tor re­ten­tion will fail; pre­dic­tions about in­for­mal ex­per­i­ments I’ve car­ried out
  • : com­puter Go
  • : check­ing the suc­cess of blind­ing Adder­all, day-time modafinil, iodine, and nico­tine ex­per­i­ments (see above);
  • : check­ing blind­ing of 2 vi­t­a­min D ex­per­i­ments
  • Notes: pre­dic­tions on Steve Job­s’s lack of char­i­ty, cor­rect­ness of spec­u­la­tive analy­sis
  • : in my de­scrip­tion of the fail­ure of Knol as a Wikipedia or blog com­peti­tor, I nat­u­rally reg­is­tered sev­eral es­ti­mates of when I ex­pected it to die; I was cor­rect to ex­pect it to die quick­ly, in 2012 or 2013, but not that the con­tent would re­main pub­lic. This ex­pe­ri­ence was part of the mo­ti­va­tion for my later .
  • : see above
  • : an exercise similar to the Evangelion predictions
  • Pre­dic­tion mar­kets: po­lit­i­cal pre­dic­tions, In­trade fail­ure pre­dic­tions, GJP ac­cep­tance
  • : pre­dic­tion of cen­sor­ship on main Bit­coin fo­rums (see above), and of no le­gal reper­cus­sions
  • : as­serts semi­con­duc­tor man­u­fac­tur­ing is frag­ile and hence Kry­der’s law has been per­ma­nently set back by 2011 Thai floods
  • The Notenki Mem­oirs: per­pet­u­ally in­-plan­ning movie Aoki Uru will not be re­leased.
  • : cor­rectly pre­dicted tol­er­ance for a par­tic­u­larly fre­quent user
  • : I reg­is­tered pre­dic­tions on what replies I ex­pected from Par­la­panides, ask­ing about whether he wrote the leaked script be­ing an­a­lyzed, to fore­stall ac­cu­sa­tions of hind­sight bias
  • , The New Yorker 2011; men­tioned my own failed pre­dic­tion of a gov­ern­ment crack­down
  • : as part of my sta­tis­ti­cal mod­el­ing of the likely life­times of Google prod­ucts, I took the fi­nal mod­el’s pre­dic­tions of 5-year sur­vival (to 2018) and ad­justed them to what I felt in­tu­itively was more right.
  • : blind­ing index/guessing whether ac­tive or placebo
  • : email in­ter­view with Bit­coin ex­change Mt­Gox founder Jed Mc­Caleb, pre­dict­ing whether the ori­gin story was true (I guessed it was a leprechaun/urban-legend but it turned out to be sort of true)
  • : guess­ing about whether an anony­mous ex­tor­tion­ist would re­ply to my re­fusal or try a sec­ond time (he did nei­ther)

Lessons learned

“We should not be up­set that oth­ers hide the truth from us, when we hide it so often from our­selves.”

La Rochefoucauld, Maximes 11

To sum things up, like the haunted ra­tio­nal­ist, I learned in my gut things that I al­ready sup­pos­edly knew—the bi­ases are now more sat­is­fy­ing; the fol­low­ing are my sub­jec­tive im­pres­sions:

  • I knew (to quote Julius Caesar) that "What we wish, we readily believe, and what we ourselves think, we imagine others think also." or (to quote Orwell) that "Politics…is a sort of sub-atomic or non-Euclidean world where it is quite easy for the part to be greater than the whole or for two objects to be in the same place simultaneously."25, but it wasn't until I was sure that George Bush would not be re-elected in 2004 that I knew I could succumb to that even in abstract issues on which I had read enormous quantities of information & speculation.

  • while I am weak in ar­eas close to me, in other ar­eas I am un­der­con­fi­dent, which is a sin and as much to be reme­died as over­con­fi­dence. (Specifi­cal­ly, it seemed I was ini­tially over­con­fi­dent on 95%+ pre­dic­tions and un­der­con­fi­dent in the 60-90% regime; I think I’ve learned my lesson, but by the na­ture of these things, my recorded cal­i­bra­tion will take many pre­dic­tions to re­cover in the ex­treme ranges.)

  • I am too op­ti­mistic and not cyn­i­cal enough; the car­di­nal ex­am­ple, per­son­al­ly, would be the five-year XiX­iDu pre­dic­tion which was fal­si­fied in one month. The Out­side View heav­ily mil­i­tated against it, as did my fel­low pre­dic­tors, and if it had been for­mu­lated as some­thing so­cially dis­ap­proved of like al­co­hol or smok­ing, I would prob­a­bly have gone with 10 or 20% like JoshuaZ; but be­cause it was a fel­low Less­Wronger try­ing to get his life straight…

  • I am considerably more skeptical of op-eds and other punditry, after tracking the rare clear predictions they made. (I was already wary due to Tetlock, and a more recent study of major pundits, but not enough, it seems.)

    The rareness of such predictions has instilled in me an appreciation of Hansonian signaling theories of politics—it is so hard to get falsifiable predictions out of writings even when they look clear; for example, leading up to the 2011 US Federal debt crisis and ratings downgrade, everyone prognosticated furiously—but did they mean any rating agency, or all of them, or just a majority?

  • I re­spect fun­da­men­tal trends more; they are pow­er­ful pre­dic­tors in­deed, and like Philip Tet­lock’s ex­perts, I find that it’s hard to out­-per­form the past in pre­dict­ing. I no longer ex­pect much of politi­cians, who are as trapped as the rest of us.

    This could be seen as more use of base rates as the pri­or, or as mov­ing to­wards more of an Out­side View. I am fre­quently re­minded of the power of re­duc­tion­ism and analy­sis—­pace MoR Quir­rel’s ques­tion to Harry26, what states of the world would a pre­dic­tion com­ing true im­ply had be­come more like­ly? Some­times when I record pre­dic­tions, I see some­one who has clearly not con­sid­ered what his pre­dic­tions com­ing true im­plies about the cur­rent state of the world; I sigh and re­flect on how you just can’t get there from here.

  • Merely con­tem­plat­ing se­ri­ously my pre­dic­tions over years and decades makes the fu­ture much more con­crete to me; I will live most of my life there, so I should take a longer-term per­spec­tive.

  • Mak­ing thou­sands of pre­dic­tions has helped me gain de­tach­ment from par­tic­u­lar po­si­tions and ideas (which made it eas­ier for me to write my es­say and pub­licly ad­mit them—after so many ‘fail­ures’ on PB.­com, what were a few de­scribed in more de­tail?) To quote Alain de Bot­ton:

    The best salve for fail­ure – to have quite a lot else go­ing on.

    This detachment itself seems to help accuracy; I was struck by a psychology study demonstrating that not only are people better at falsifying theories put forth by other people, they are better at falsifying their own theories when pretending the theory is held by an imaginary friend27!

  • Raw prob­a­bil­i­ties are more in­tu­itive; I can’t de­scribe this much bet­ter than the poker ar­ti­cle, “This is what 5% feels like.”

  • Plan­ning fal­lacy: I knew it per­fectly well, but still com­mit­ted it un­til I tracked pre­dic­tions; this is true both of my own mun­dane ac­tiv­i­ties like writ­ing, and larger more global events (re­cent­ly, run­ning out the clock on the Pales­tin­ian na­tion­hood UN vote)

    This was in­ter­est­ing be­cause it’s so easy to make ex­cus­es—‘I would’ve suc­ceeded if not for X!’ The ques­tion (in the clas­sic study) is whether stu­dents could pre­dict their pro­jects’ ac­tual com­ple­tion time; they’re not try­ing to pre­dict project com­ple­tion time given a hy­po­thet­i­cal ver­sion of them­selves which did­n’t pro­cras­ti­nate. If they aren’t self­-aware enough to know they pro­cras­ti­nate and to take that into ac­coun­t—their pre­dic­tions are still bad, no mat­ter why they’re bad. (And some­one on the out­side who is told that in the past the stu­dents had fin­ished -1 days be­fore the due date will just shrug and say: ‘re­gard­less of whether they took so long be­cause of pro­cras­ti­na­tion, or be­cause of , or be­cause of a 3rd rea­son, I have no rea­son to be­lieve they’ll fin­ish early this time.’ And they’d be ab­solutely cor­rec­t.) It’s like a fel­low who pre­dicts he won’t fall off a cliff, but falls off any­way. ‘If only that cliff had­n’t been there, I would­n’t’ve fal­l­en!’ Well, duh. But you still fell. How can you cor­rect this un­til you stop mak­ing ex­cus­es?

  • Less hind­sight bias; when I have my pre­vi­ous opin­ions writ­ten down, it’s harder to claim I knew it all along (when I did­n’t), and as in­di­cat­ed, writ­ing down my rea­sons (even in Twit­ter-sized com­ments) helped pre­vent it.

    Ex­am­ple: I had put the 2011 S&P down­grade at 5%, and re­minded of my skep­ti­cism, I can see the dou­ble-s­tan­dards be­ing ap­plied by pun­dit­s—all of a sud­den they re­mem­ber how the rat­ings agen­cies failed in the hous­ing bub­ble and how the aca­d­e­mic lit­er­a­ture has proven they are in­fe­rior to the mar­kets and how they are a bad gov­ern­men­t-granted monopoly, even though they were happy to cite the AAA rat­ing be­fore­hand and are still happy to cite the other rat­ings agen­cies… In short, while base rates are pow­er­ful in­deed, there are still many ex­oge­nous events and mul­ti­plic­i­ties of low prob­a­bil­ity events.

I think, but am not sure, that I really have internalized these lessons; they simply seem… obvious to me, now. I was surprised when I looked up my earliest work and saw it was only around 14 months ago—I felt like I'd been recording predictions for far longer.

Non-benefits

“If peo­ple don’t want to come to the ball­park how are you go­ing to stop them?”

Yogi Berra, The Yogi Book (1997), p. 36

Making predictions has been personally costly; while some predictions have been total time investments of a score of seconds, other predictions required considerable research, and thinking carefully is no picnic, as we've all noticed. I justify the invested time as a learning experience which will hopefully pay off for others as well, who can free-ride off the many predictions (eg. the soon-to-expire predictions) I have laboriously added to PB.com. (Only a fool learns only from his own mistakes.)

What I have not no­ticed? It was sug­gested that pre­dic­tions might help me in res­o­lu­tions based on some ex­per­i­men­tal ev­i­dence28; I did not no­tice any­thing, but I did­n’t care­fully track it or put in pre­dic­tions about many rou­tine tasks. Mak­ing pre­dic­tions seems to be largely effec­tive for im­prov­ing one’s epis­temic ra­tio­nal­i­ty; I make no promises or im­plied war­ranties as to whether it is in­stru­men­tally ra­tio­nal.

How I make predictions

A pre­dic­tion can be bro­ken up into 3 steps:

  1. The spec­i­fi­ca­tion
  2. The due-date
  3. The prob­a­bil­ity

The first issue is simply formulating the prediction. The goal is to make a statement on an objective and easily checkable fact; imagine that the other people predicting are yourself if you had been raised in some completely opposite fashion, like an evangelical Republican household, and they are quite as suspicious of you as you are of them, and believe you to be suffering from as many partisan and self-serving biases as you believe them to. Wording is important, as words shape how we think about things and can directly bias us29. The prediction should be so clear that they would expose themselves to mockery even among their own kind if they were to seriously disagree about the judgment30. For example, 'Obama will be the next President' is perfectly precise—everyone knows and understands what it is to be President and how one would decide—and so there's no need to do any more; it would be risible to try to deny it. On the other hand, 'the globe will warm 1° Fahrenheit' may initially sound good, but your dark counterpart immediately objects: 'what if it's colder in Russia? When is this increase going to happen? Is this exactly 1° or are you going to try to claim as success only 0.9° too? Who's deciding this anyway?' A good resolution might be 'OK, global temperatures will increase >=1.0° Fahrenheit on average according to the next IPCC report'.

Deciding the due-date of a prediction is usually trivial and not worth discussing; when making open-ended predictions about people (eg. 'X will receive a Nobel Prize'), I find it helpful to consult actuarial tables like Social Security's to figure out their average life expectancy and then set the due-date to that. (This both minimizes the number of changes to the due date and helps calibrate us by pointing out what time spans we're really dealing with.)
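
As a rough sketch of that due-date procedure (the remaining-life-expectancy figures below are invented placeholders, not the Social Security Administration's actual numbers):

```python
# Hypothetical due-date helper for open-ended predictions about a person:
# due date = current year + the subject's remaining life expectancy.
# The expectancy figures are invented placeholders, not real SSA data.
REMAINING_YEARS = {40: 40, 50: 31, 60: 23, 70: 15}

def due_date(subject_age: int, current_year: int) -> int:
    return current_year + REMAINING_YEARS[subject_age]

# 'X will receive a Nobel Prize', X being 60 in 2012 -> due in 2035
print(due_date(60, 2012))
```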

When we be­gin de­cid­ing what prob­a­bil­ity to give the pre­dic­tion, we can em­ploy a num­ber of heuris­tics (par­tially drawn from “Tech­niques for prob­a­bil­ity es­ti­mates”):

  1. Base rates. Already discussed, but base rates should be your mental starting point for every prediction, before you take into account any other opinion or belief.

    Base rates are eas­ily ex­pressed in terms of fre­quen­cies: “of the last Y years, X hap­pened only on­ce, so I will start with 1/Y%”. (“There are 10 can­di­dates for the 2012 Re­pub­li­can nom­i­nee, so I will as­sume 10% un­til I’ve looked at each can­di­date more close­ly.”) Fre­quen­cies have a long his­tory in the aca­d­e­mic lit­er­a­ture of mak­ing sub­op­ti­mal or fal­la­cious per­for­mance just dis­ap­pear31, and there’s no rea­son to think that is not true for your pre­dic­tions as well. This works for per­sonal pre­dic­tions as well—­fo­cus on what sort of per­son you are, how you’ve done in sim­i­lar cases over years, and you’ll im­prove your pre­dic­tions32.

    In mak­ing a pre­dic­tion, I start with an Out­side View, ask­ing, “how many times has this been true in the past?” Then I try many Out­side Views. For ex­am­ple, if asked about a North Ko­rean nuke test within a year from to­day, I would ask how often it has hap­pened in the past 1 year; the past 10 years; the past 20 years; how often do nu­clear pow­ers run nuke tests; how often does NK run its cy­cles of diplo­matic ex­tor­tion ini­tia­tives? After a bunch of this, there should be a strong gut feel­ing of cer­tain­ty.

    An example: "A Level 7 (Chernobyl/2011 Japan level) nuclear accident will take place by end of 2020". One's gut impression is a very bad place to start, because Fukushima and Chernobyl—mentioned in the very prediction!—are such vivid and available examples. 60%? 50%? Read the coverage of Fukushima and many people give every impression of expecting fresh disasters in coming years. (Look at Germany quickly announcing the shutdown of its nuclear reactors, despite tsunamis not being a frequent problem in northern Europe, shall we say.) But if we start with base rates and look up nuclear accidents, we realize something interesting: Chernobyl and Fukushima come to mind readily in part because they are—literally—the only such level-7 accidents over the past >40 years. So the frequency would be 1 in ~20 years, which puts a different face on a prediction spanning 9 years: this gives us a base rate more like ~40% (worked through in the sketch after this list). This is our starting point for asking how much the rate goes down because Fukushima has prompted additional safety improvements or closure of older plants (Fukushima's equally-outdated sibling nuclear plants will have a harder time getting stays of execution) and how much the rate goes up due to global warming or aging nuclear plants. But from here we can hope to arrive at a sensible answer and not be spooked by a recent incident.

  2. What does the pre­dic­tion about the fu­ture world im­ply about the present world?

    Every pre­dic­tion one makes is also a retro­d­ic­tion: you are claim­ing that the world is now and in the past on a course to­wards the fu­ture you have picked out of all the pos­si­bil­i­ties (or not on that course), and on that course to the de­gree you spec­i­fied. What does your claim im­ply about the world as it is now? The world has to be in a state which can progress of its own in­ter­nal logic to the fu­ture state, and so we can work back­wards to fig­ure out what that im­plies about the present or past. (You can think of this as a kind of proof by con­tra­dic­tion: as­sum­ing pre­dic­tion X is true, what can we in­fer from X about the present world which is ab­sur­d?)

    Another way to put it would be: if I used this in a Fermi estimate of something I know, does this deliver ridiculous estimates? If the market says 10%, does this make sense given the past year? How many nukes does that imply over the past 10 or 20 years? What frequencies can I convert it to and test against an Outside View? For example, many people think false-paternity rates in the USA as of 2018 are >10%; 23andMe sells millions of DNA test kits each year and sold record amounts, something like a million post-Thanksgiving in 2017 & 2018, and this feeds into a database of >5 million Americans—if you use this in a Fermi estimate about American divorce-court daily averages, you get something thoroughly absurd. Naturally, American divorce courts do not collapse under the weight of divorces sparked by revelations of cuckoldry.

    In our first example, suppose someone predicted "Within ten years [2020] either genetic manipulation or embryo selection will have been used on at least 50% of Chinese babies to increase the babies' expected intelligence". This initially seems reasonable from the standpoint of 2010: China is a big place with known interests in eugenics. But then we start working backwards—this prediction implies handling >=9 million pregnancies annually, which entails hundreds of thousands of gynecologists, geneticists, lab technicians etc., which all have lead-times measured in years or decades. (It takes a long time to train a doctor even if your standards are low.) And the program must be set up with hundreds of thousands of employees, policies experimented with and implemented, and so on. As matters stand, even in the United States mere genotyping could barely be done for 9 million people annually, and genetic sequencing for embryos is much more expensive & difficult, and genetic modification is even hairier. If we work backwards, we would expect to see such a program already begun and active as it frantically tries to scale up to handle those millions of cases a year in order to hit the deadline. But as far as I know, all the pieces are absent in China as of the day it was predicted; hence, it's already too late. And then there are the politics; it is a deeply doubtful assertion that the Chinese population would countenance this, given the stress over the and the continuing crisis. People just don't organize their reproduction like that—no population worldwide in 2010 used IVF for more than a few percent of births at the highest. Even if the prediction comes true eventually, it definitely will not come true in time. (The same logic applies to "Within ten years the SAT testing service will require students to take a blood test to prove they are not on cognitive enhancing drugs."; ~1.65 million test-takers implies scores of thousands of phlebotomists, who do not exist, although in theory they could be trained in under a year—but whence the trainers?)

    A second example would be a series of predictions on anti-aging/life-extension registered in November 2011. The first and earliest prediction—"By 2025 there will be at least one confirmed person who has lived to 130"—initially seems at least possible (I am optimistic about the approaches suggested by ), and so I assigned it a reasonable probability of 3%. But I felt troubled—something about it seemed wrong. So I applied this heuristic: what does the existence of a 130-year-old in 2025 imply about people in 2011? Well, if someone is 130 in 2025, then that implies they are now 116 years old. Then I looked up the then-oldest person in the world: , aged 115 years old. Oops. It's impossible for the prediction to come true, but because we didn't think about what its coming true implied about the present world, we made an absurdly high prediction. We can do this for all the other anti-aging predictions; for example "By 2085 there will be at least one confirmed person who has lived to 150" can be rephrased as 'someone aged 76 now will live to 2085', which, because people aged 76 are so badly damaged by aging and disease already, seems implausible except with a of some sort ("Hmm, phrased in that context, my estimate has to go down"). This can be applied to financial or economic questions, too, since under even the weakest version of the efficient-market hypothesis, the markets are smarter than you— asks why we don't see investors piling into solar power if it's following an exponential curve downwards and is such a great idea (Robin Hanson appeals to discount rates and purblind investors).

    The idea of ‘rephras­ing’ leads di­rectly into the next heuris­tic.

  3. Break­ing pre­dic­tions down into con­junc­tions

    Similar to heuristic #1, we may not realize what a prediction implies internally, and so wind up giving high probability to highly conjunctive claims.

    “Hillary Clin­ton will be­come Pres­i­dent in 2016” is speci­fic, eas­ily date­able, im­plies things about the present world like ru­mors of Clin­ton run­ning and strong po­lit­i­cal con­nec­tions (as do ex­ist), and yet this pre­dic­tion is still easy to mess up for some­one in 2012. Why? Be­cause be­com­ing Pres­i­dent is ac­tu­ally the out­come of a long se­ries of steps, every one of which must be suc­cess­ful and every one of which is doubt­ful: Hillary must re­sign from the White House where she was then Sec­re­tary of State, she must an­nounce a run, she must be­come De­mo­c­ra­tic nom­i­nee (out of sev­eral can­di­dates), and she must ac­tu­ally win. It’s the ex­cep­tional nom­i­nee who ever has >50% odds, so we start with a coin flip and work our way down to per­haps a few per­cent. This is more plau­si­ble than most na­tion­al-level De­moc­rats, but not as plau­si­ble as pun­dits might lead you to be­lieve.

    We can see a particularly striking failure to analyze in the prediction "Obama gets reelected and during that time Hillary Clinton brokers the middle east peace deal between Israel and Palestine for the two state solution. This secures her presidency in 2016.", where the predictor gave it a flabbergasting 80%; before clicking through, the reader is invited to assign probabilities to the following events and then multiply them to obtain the probability that they will all come true (a mechanical version appears in the sketch after this list):

    1. Barack Obama is re-elected
    2. A Mid­dle East peace deal is bro­kered
    3. The peace deal is for a two state so­lu­tion
    4. Hillary Clin­ton runs in 2016
    5. Hillary Clin­ton is the 2016 De­mo­c­ra­tic nom­i­nee
    6. Hillary Clin­ton is elected

    (Sometimes the examples are even more extreme than 6 clauses.) This heuristic is not perfect, as it works best on step-by-step processes where every step must happen. If this is not true, the heuristic will be overly pessimistic. Worse, it is possible to lie to ourselves by simply breaking down the steps into ever tinier steps and giving them relatively small probabilities like 99%: the opposite of the good heuristic is the bad one, where if we then multiply out each of our exaggerated sub-steps, we wind up being absurdly skeptical. Steven Kaas furnishes an example:

    Walk­ing re­quires dozens of differ­ent mus­cles work­ing to­geth­er, so if you think you can walk you’re just com­mit­ting the con­junc­tion fal­la­cy.

    (One more complex use of this heuristic is to combine it with a final date: decide the odds of it ever happening and the last date it could happen by, and then you can figure out how your confidence will change in each year that goes by without it happening. I have found this useful in thinking about Artificial Intelligence, which is something that may or may not happen, but on which one should somehow be changing one's opinion as another year goes by with no H.A.L.)

  4. Build­ing pre­dic­tions up into dis­junc­tions

    One of the problems with non-frequency information is that we don't always have an 'absolute pitch' for probability—we may have intuitive probabilities but they are fuzzy. On the other hand, comparisons are much easier: I may not be able to say that Obama had a 52.5% chance of election vs McCain at 47.3%, but I can tell you which guy was on the happier side of 50%. This suggests we pit predictions against each other: I pit my intuition about Obama against my intuition about McCain and I see Obama comes out on top. The more predictions you can pit against each other the better, which ultimately leads to an exhaustive list of outcomes, a full disjunction: "either Obama (52.5%) or McCain (47.3%) or Nader (0.2%) will win"

    Sur­prised to see Ralph Nader there? He ran too, you know. This is one of the pit­falls of dis­junc­tive rea­son­ing (as over­stated con­di­tion­al­ity and floors on per­cent­ages are a pit­fall of con­junc­tive rea­son­ing), the pit­fall of the pos­si­bil­i­ties you for­got to list and make room for.

    Nader is pretty trivial, but imagine you were discussing Middle Eastern politics and your interlocutor immediately goes "either Israel will aerially attack Iran or Israel will launch covert ops or the US will aerially attack Iran or…" If you dutifully begin assigning probabilities ("let's see, 15% sounds reasonable, and covert ops is a lot less probable so we'll give that just 5%, and then the US is just as likely to attack Iran so that's 15% too, and…"), you find you have somehow concluded Iran will be attacked, 35%+, when no prediction market remotely agrees with you! What happened? You read about one disjunct ("Iran will be attacked, period") divided up into fine detail, anchored on it, and ignored how many possibilities were also being tucked away under "Iran will not be attacked, period". If you had constructed your own disjunction before listening to the other guy, you might have instead said that no-attack was 80%+ probable, and then correctly divvied up the remaining percentage among the various attack options. Even domain-experts have problems when the tree of categories or outcomes is presented to them with modifications, unfortunately33.

  5. Sets of pre­dic­tions must be con­sis­tent: a full set of dis­junc­tions must add to 100%, the prob­a­bil­ity some­thing will hap­pen and will not hap­pen must also sum to 100%, etc.34 It’s sur­pris­ing how often peo­ple mess this up.

  6. Bias heuris­tics: Is this pre­dic­tion po­lit­i­cally ap­peal­ing and par­ti­san­ly-po­lar­ized? Am I in a bub­ble of some sort, where every­one takes for granted some­thing (like that John Kerry or Hillary Clin­ton will win)? Is this some­thing peo­ple want to hap­pen? Am I be­ing in­suffi­ciently cyn­i­cal in think­ing about how peo­ple will act? Have I over­es­ti­mated how fast things will move? What sort of fric­tions are there slow­ing things down? Adam Smith re­minds us “there is a great deal of ruin in a na­tion”, and there is a great deal of ruin in many things in­deed. (Case in point: Venezue­la, North Ko­rea, Don­ald Trump.) re­minds us that in fore­cast­ing things, we tend to overem­pha­size the short­-term and ig­nore that slow steady changes can ac­cu­mu­late to enor­mous de­grees. And “if peo­ple don’t want to go to the ball game, how are you go­ing to stop them?”

    In some cas­es, it’s ob­vi­ous just from read­ing the writ­ings that other peo­ple are bi­ased: Ron Paul fans were delu­sional and dri­ving up In­trade prices way back when, and an­ti-Trumpers & an­ti-Bit­coin­ers were clearly let­ting their de­sires drive their fac­tual as­sess­ments and there was lit­tle need to even read the pro- ar­gu­ments. (In the case of Bit­coin, it was par­tic­u­larly ob­vi­ous be­cause many elite crit­ics were mak­ing glar­ing fac­tual er­rors which showed they had not even read the whitepa­per and so their opin­ion had no cor­re­la­tion with truth.) It’s a sim­ple heuris­tic: iden­tify the more ir­ra­tional sound­ing side, and go to the op­po­site side of the mean. (eg if the an­ti-Trumpers are un­hinged and the PM is 50-50 Trump vs Hillary, then ad­just one’s pre­dic­tion to a 55%/45% split). Here you are treat­ing con­sen­suses as a kind of bi­ased Out­side View and di­rec­tion­ally up­dat­ing away to ad­just for the bias.
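
Several of these heuristics reduce to a few lines of arithmetic. Here is a minimal sketch of heuristics #1, #3, & #5 applied to the examples above; the per-step probabilities in the Clinton conjunction are my own illustrative guesses, not the original predictor's:

```python
import math

# Heuristic #1, base rates as frequencies: 2 level-7 nuclear accidents in
# roughly 45 years of reactor operation -> chance of at least one in the
# next 9 years, treating accidents as a Poisson process.
rate = 2 / 45                            # accidents per year
print(f"{1 - math.exp(-rate * 9):.0%}")  # ~33%; the cruder 9/20 frequency gives ~40-45%

# Heuristic #3, conjunctions: the 80% 'Clinton brokers peace and becomes
# President' prediction, multiplied out step by step (guessed probabilities).
steps = [0.60,  # Obama re-elected
         0.05,  # a Middle East peace deal is brokered
         0.50,  # the deal is a two-state solution
         0.60,  # Clinton runs in 2016
         0.50,  # Clinton wins the nomination
         0.50]  # Clinton wins the election
print(f"{math.prod(steps):.2%}")         # ~0.23%: nowhere near 80%

# Heuristic #5, consistency: an exhaustive disjunction must sum to 100%.
outcomes = {"Obama": 0.525, "McCain": 0.473, "Nader": 0.002}
assert abs(sum(outcomes.values()) - 1.0) < 1e-9
```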

See Also

Appendices

Modus tollens vs modus ponens

A log­i­cal­ly-valid ar­gu­ment which takes the form of a modus po­nens may be in­ter­preted in sev­eral ways; a ma­jor one is to in­ter­pret it as a kind of re­duc­tio ad ab­sur­dum, where by ‘prov­ing’ a con­clu­sion be­lieved to be false, one might in­stead take it as a modus tol­lens which proves that one of the premises is false. This is a pow­er­ful strat­egy which has been de­ployed against many skep­ti­cal & meta­phys­i­cal claims in phi­los­o­phy, where often the con­clu­sion is ex­tremely un­likely and lit­tle ev­i­dence can be pro­vided for the premises used in the proofs; and it is rel­e­vant to many other de­bates, par­tic­u­larly method­olog­i­cal ones.

The Hidden Library of the Long Now

Mike Darwin told an interesting story in August 2011 of a long-term project that is under way or may have been completed:

…he publicly posts them [his predictions], time stamps them and does statistics. That's just brilliant, and it is something I started doing privately in March of 2006. Within a year I found out that I was useless at predicting the kinds of events I thought I would be best at—such as, say, developments in the drug treatment of diseases which I knew a lot about. What I turned out to be really good at was anything to do with failure analysis, where a lot of both quantitative and qualitative data were available. For reasons I'll mention only elliptically here, I became interested in econometric data, and I also had the opportunity to travel the world specifically for the purpose of doing "failure analysis reports" on various kinds of infrastructure: the health care system in Mexico, food distribution and pricing in North Africa, the viability of the cruise (ship) industry over the period from 2010 to 2010, potential problems with automated, transoceanic container shipping… The template I was given was to collect data from a wide range of sources—some of which no self-respecting academic would, or could, approach. There were lots of other people in the study doing the same thing, sometimes in parallel.

I got "recruited" because "the designer" of the study had a difficult problem he and his chosen experts could not solve, namely, how to encode vast amounts of information in a substrate that would, verifiably, last tens of millions of years. One of the participants in this working group brought me along as a guest to one of their sessions. There were all kinds of proposals, from the L. Ron Hubbard-Scientology one of writing text on stainless steel plates, to nanolithography using gold… The discussion went on for hours and what impressed me was that no one had any real data or any demonstrated experience with (or for) their putative technology. At lunch, I was introduced to "the designer" and his first question was, "What are you here for?" I told him I was there to solve his problem and that, if he liked, I could tell him how to do what he wanted absent any wild new technology or accelerated aging tests. I said one word to him: GLASS. Organisms trapped in amber are, of course, the demonstrated proof that even a very fragile and perishable substrate can be stabilized and retain the information encoded in it for tens of millions of years, if not longer. Pick a stable glass, protect it properly, and any reasonable information-containing substrate will be stable over geological time periods35. There were a lot of pissed-off people who didn't get to stay for the expected (and elaborate) evening meal. As it turned out, "the designer" had another passion, and that was that he collected and used people whom he deemed (and ultimately objectively tested) were "brilliant" at failure analysis. Failure analysis can be either prospective or retrospective, but what it consists of is someone telling you what's likely to go wrong with whatever it is you are doing, or why things went wrong after they already have.

Dar­win en­larges in an email:

My second concern is pretty well addressed in my last post, “Fucked.” The geopolitical situation is atrocious; much worse than the media sense and vastly, vastly worse than most of the politicians sense. At the very top, in a few places, such as B of A, Citicorp and the IMF, there are people shitting themselves morning, noon and night. Before, they were just shitting themselves in the morning. The “study” I allude to in my response was the work of a really bizarrely interesting human being who is richer than Croesus and completely obsessed with information. He has spent tens of millions analyzing the “planetary social, economic & geopolitical situation” for failure. He wanted a timeline to failure and was smart enough to understand he could never get precision. He wanted, and I think he got, a “best completed by” date for his project. By now, I would guess that there are massive packets of glass going into very, very deep holes in a number of places…Let’s just say I read the final report of the study group and I think I have every good reason to be in one hell of a hurry.

This all is quite interesting. Can one guess who this mysterious figure is? Let’s do some ad hoc reasoning in the spirit of Fermi estimates!

Let’s see, tens of millions just on the preliminary studies rules out millionaires; add in land purchases and fabrication costs and the project would run into hundreds of millions (eg. it cost something like $50m for the Clock of the Long Now, and most of the work had already been done by the Long Now Foundation!), so we can rule out multi-millionaires, leaving just billionaire-class wealth.
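A toy restatement of that Fermi arithmetic (all figures are the essay’s own rough guesses, not data):

```python
# Which wealth class could plausibly fund the project?
preliminary_studies = 50e6    # "tens of millions" on studies alone
land_and_fabrication = 250e6  # hypothetical: land purchases + glass + burial
total_cost = preliminary_studies + land_and_fabrication  # ~$300m

for wealth_class, net_worth in [("millionaire", 1e6),
                                ("multi-millionaire", 100e6),
                                ("billionaire", 1e9)]:
    verdict = "plausible" if net_worth >= total_cost else "ruled out"
    print(f"{wealth_class}: {verdict}")
```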

Pri­vate phil­an­thropy is al­most non-ex­is­tent in China36, Rus­sia, and In­dia so al­though they have many bil­lion­aires we can rule out those na­tion­al­i­ties. Aus­tralian bil­lion­aires are fairly rare and mostly in busi­ness or the ex­trac­tive in­dus­tries, so we can prob­a­bly rule out Aus­tralia too. Com­bined with Dar­win be­ing an Eng­lish mono­lin­gual (as far as I can tel­l), one can re­strict the search to Amer­ica and Eng­land, Eu­ro­pean at the most ex­ot­ic.

To strike Darwin—a cryonicist—as weird and extremely intelligent, he probably has a high-Openness sort of personality, suggesting he either inherited his money or made it in tech or finance. Being obsessed with information fits the two latter better than the former. He implies starting in 2006 or 2007, and it’s unlikely he was brought in on the ground floor or the obsession started only then, so our billionaire’s wealth was probably made in the ’80s or ’90s or early 2000s at the very latest, in the first or second dot-com boom. This describes a relatively small subset of the 400 or so American billionaires.

Without trawling through Wikipedia’s categories, the most obvious suspects for a weird extremely intelligent tech billionaire interested in information are Jeff Bezos, Larry Page & Sergey Brin, Peter Thiel, Charles Simonyi37, Jay S. Walker38, Jerry Yang, and Larry Ellison. Of those, I think I would rank them by plausibility as follows:

  1. Jeff Be­zos

    Scat­ter­ing glass cap­sules of in­for­ma­tion is an ex­tremely Long Now idea and Be­zos has al­ready bought into the Long Now to the tune of dozens of mil­lions. This alone makes him the most plau­si­ble can­di­date, al­though his plau­si­bil­ity is dam­aged by the fact that he is a very busy CEO and has been for the last 2 decades and pre­sum­ably would have diffi­cul­ties de­vot­ing a lot of time to such a pro­ject.

  2. Pe­ter Thiel

    He has no direct Long Now links I know of, but he fits the described man even better than Bezos in some respects: he is acutely aware of upcoming risks and of globalization39, and scatters money widely over highly speculative investments (the 20 under 20 fellowships, etc.). An additional point in his favor is that he lives in San Francisco, near Darwin and various Long Now figures.

  3. Charles Si­monyi; sim­i­lar to Jay S. Walker

  4. Page & Brin; while I generally get a utopian Singulitarian vibe off them and their projects and they seem to like publicizing their works, Google Books is relatively impressive and I could see them interested in this sort of thing as an ‘insurance policy’.

  5. Yang; I don’t see any­thing es­pe­cially im­plau­si­ble about him, but noth­ing in fa­vor ei­ther.

  6. Jay S. Walk­er; his Li­brary quite im­pressed me when I saw it, in­di­cat­ing con­sid­er­able re­spect for the past, a re­spect con­ducive to such a pro­ject. I ini­tially ranked him at #3 based on old in­for­ma­tion about his for­tune be­ing at $6-7 bil­lion in 2000, but Time re­ported that the dot-com crash had re­duced his for­tune to $0.33 bil­lion.

  7. Ellison; like Jobs, his heart is cold, but he does seem to donate40 and claims to donate large sums quietly, consistent with the story. As someone who made his billions off database software licensed long-term, hopefully he has an appreciation of information and a longer-term perspective than most techies.

(I do not include Steve Jobs although he is famous and matches a few criteria; as far as I or others can tell, his past charity has been trivial41, he has essentially never used his wealth for anything but his own good like buying new organs, and he comes off in accounts as having sociopathic characteristics; an anonymous Jobs adviser remarked in 2010 “Steve expresses contempt for everyone—unless he’s controlling them.” It’s interesting that Apple’s current charitable-matching program was instituted after Jobs resigned, by Tim Cook; Apple’s original philanthropic programs were shut down in 1997 by Jobs within weeks of his return42. I would be shocked if Jobs was the mysterious employer.)

All this said, I am well aware I haven’t looked at even a small percentage of American billionaires, and I could be wrong in focusing on techies—finance is equally plausible, and inherited wealth is still common enough to not be ignored. Pondering the imponderables, I’d give a 15% chance that one of those candidates was the employer, and perhaps a 9% chance that the employer was either Bezos, Thiel, or Simonyi, with Bezos being 4%, Thiel ~3% and Simonyi 2%.

And in­deed, Dar­win said he did­n’t rec­og­nize sev­eral of those names, and im­plied they were all wrong. Well, it would have been fairly sur­pris­ing if 15% con­fi­dence as­ser­tions de­rived through such du­bi­ous rea­son­ing were right.


  1. As is true of every short de­scrip­tion, this is a lit­tle over-sim­pli­fied. Peo­ple are risk-a­verse and fun­da­men­tally un­cer­tain, so their be­liefs about the true prob­a­bil­ity won’t di­rectly trans­late into the percentage/price they will buy at, and one can’t even av­er­age out and say ‘this is what the mar­ket be­lieves the prob­a­bil­ity is’. See econ­o­mist Ra­jiv Sethi’s “On the In­ter­pre­ta­tion of Pre­dic­tion Mar­ket Data” & “From Or­der Books to Be­lief Dis­tri­b­u­tions”; for more rig­or, see Wolfers & Zitze­witz’s pa­per, “In­ter­pret­ing Pre­dic­tion Mar­ket Prices as Prob­a­bil­i­ties”↩︎

  2. Immanuel Kant, Critique of Pure Reason (A824/B852)↩︎

  3. Or neg­a­tive-sum, when you con­sider the costs of run­ning the pre­dic­tion mar­ket and the var­i­ous fees that might be as­sessed on par­tic­i­pants—the house needs a cut. In some cir­cum­stances, pre­dic­tion mar­kets can be pos­i­tive-sum for traders: if some party ben­e­fits from the in­for­ma­tion and will sub­si­dize it to en­cour­age trad­ing. For ex­am­ple, when com­pa­nies run in­ter­nal pre­dic­tion mar­kets they tend to sub­si­dize the mar­kets.

    Pub­lic pre­dic­tion mar­ket sub­si­dies are much rar­er—the only in­stance I know of is Pe­ter Mc­Cluskey sub­si­diz­ing 2008 In­trade mar­kets (an­nounce­ment). As far as he could tell in No­vem­ber 2008, his sub­si­dies did not do much. I emailed him May 2012, and he said:

    I was some­what dis­ap­pointed with the re­sults.

    I don’t ex­pect a small num­ber of sub­si­dized mar­kets is enough to ac­com­plish much. I sus­pect it would re­quire many donors (or a bil­lion­aire donor) to cre­ate the mar­kets needed for me to con­sider them suc­cess­ful. I see no hint that my efforts en­cour­aged any­one else to sub­si­dize such mar­kets.

    ↩︎
  4. See also the Less­Wrong ar­ti­cles on the top­ic.↩︎

  5. Rowe & Wright 2001, reviewing studies of the Delphi method:

    When one re­stricts the ex­change of in­for­ma­tion among pan­elists so se­verely and de­nies them the chance to ex­plain the ra­tio­nales be­hind their es­ti­mates, it is no sur­prise that feed­back loses its po­tency (in­deed, the sta­tis­ti­cal in­for­ma­tion may en­cour­age the sort of group pres­sures that Del­phi was de­signed to pre-emp­t). We (Rowe and Wright 1996) com­pared a sim­ple it­er­a­tion con­di­tion (with no feed­back) to a con­di­tion in­volv­ing the feed­back of sta­tis­ti­cal in­for­ma­tion (means and me­di­ans) and to a con­di­tion in­volv­ing the feed­back of rea­sons (with no av­er­ages) and found that the great­est de­gree of im­prove­ment in ac­cu­racy over rounds oc­curred in the “rea­sons” con­di­tion. Fur­ther­more, we found that, al­though sub­jects were less in­clined to change their fore­casts as a re­sult of re­ceiv­ing rea­sons feed­back than they were if they re­ceived ei­ther “sta­tis­ti­cal” feed­back or no feed­back at all, when “rea­sons” con­di­tion sub­jects did change their fore­casts they tended to change to­wards more ac­cu­rate re­spons­es. Al­though pan­elists tended to make greater changes to their fore­casts un­der the “it­er­a­tion” and “sta­tis­ti­cal” con­di­tions than those un­der the ‘rea­sons’ con­di­tion, these changes did not tend to be to­ward more ac­cu­rate pre­dic­tions. This sug­gests that in­for­ma­tional in­flu­ence is a less com­pelling force for opin­ion change than nor­ma­tive in­flu­ence, but that it is a more effec­tive force. Best (1974) has also pro­vided some ev­i­dence that feed­back of rea­sons (in ad­di­tion to av­er­ages) can lead to more ac­cu­rate judg­ments than feed­back of av­er­ages (e.g., me­di­ans) alone.

    It may be a stretch to generalize this to a single person predicting on their own, though many tools involve groups, or you could view predicting as a Delphi method involving temporally-separated selves. (If multiple selves works for George Ainslie in explaining addiction, why not predicting?)↩︎

  6. See Arkes 1988.↩︎

  7. Wikipedia’s article on the Kelly criterion remarks that “Kelly betting leads to highly volatile short-term outcomes which many people find unpleasant, even if they believe they will do well in the end.”
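    For concreteness, a sketch of the standard Kelly formula for a simple binary bet (a textbook formula, not from the footnote itself):

    ```python
    # Kelly fraction for a binary bet: f* = (b*p - q) / b,
    # where b = net odds on a win, p = win probability, q = 1 - p.
    def kelly_fraction(p: float, b: float) -> float:
        q = 1.0 - p
        return (b * p - q) / b

    # A 60%-probability even-money bet: Kelly stakes 20% of the bankroll
    # each round, which is why short-term bankroll swings are large even
    # when the edge is genuine.
    print(kelly_fraction(0.6, 1.0))  # 0.2
    ```

    ↩︎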

  8. On 2008-01-27, the IEM sent out an email which accidentally listed all recipients in the CC; the list totaled 292 email addresses. Given that many of these traders (like myself) are surely inactive or infrequent, and only a fraction will be active at a given time, this means the 10 or so markets are thinly inhabited.↩︎

  9. The problem is that if a contract is at 10% and you sell 10 contracts, then if the contract actually pays off, you have to come up with the full 100% per contract to pay the other side their winnings. Intrade, to guarantee that payment, credits you the 10% but freezes the other 90% per contract in your account.
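    A toy version of that margin arithmetic, under my reading of the rule (contracts pay 100 points; all numbers illustrative):

    ```python
    # Shorting 10 contracts priced at 10% of a 100-point face value:
    price = 10
    contracts = 10
    premium = price * contracts          # 100 points credited for selling
    frozen = (100 - price) * contracts   # 900 points frozen as collateral
    payout_if_event = 100 * contracts    # 1,000 points owed if it pays off
    print(premium + frozen == payout_if_event)  # True: winners fully covered
    ```

    ↩︎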

  10. This sec­tion first ap­peared on Less­Wrong.­com as “2011 In­trade fee changes, or, In­trade con­sid­ered no longer use­ful for Less­Wrongers” and in­cludes some dis­cus­sion.↩︎

  11. When I submitted my withdrawal request for my balance, I received an email offering to instead set my account to ‘inactive’ status such that I could not trade but would not be charged the fee; if I wanted to trade, I would simply be charged that month’s $5. I declined the offer, but I couldn’t help wondering—why didn’t they simply set all accounts to ‘inactive’ and then let people opt in to the new fee structure? Or at least set ‘inactive’ all accounts which have not engaged in any transactions within X months?

    Re­gard­less, here are my prob­a­bil­i­ties for In­trade end­ing in the next few years:

    In March 2013 (rel­e­vant events post-dat­ing my pre­dic­tions in­clude the US CFTC at­tack­ing In­trade), In­trade an­nounced it was shut­ting down trad­ing and liq­ui­dat­ing all po­si­tions. I prob­a­bly was far too op­ti­mistic.↩︎

  12. I made $0.31 on DEM.2012, $3.65 on REP.2012, and $1.40 on 2012.REP.NOM.PALIN for a to­tal profit of $5.36.↩︎

  13. An aside: there’s not much point in accumulating more than, say, 1000 bitcoins. It’s generally believed that Bitcoin’s ultimate fate will be victory or failure—it’d be very strange if Bitcoin leveled off as a stable permanent alternative currency for only part of the Internet. In such a situation, the difference between 1000 bitcoins and 1500 bitcoins is like the difference to Bill Gates between $60 billion and $65 billion; it matters in some abstract sense, but not even a tiny fraction as much as the difference between $1 and $100 million. Utility is logarithmic in money, as the saying goes.
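    A quick illustration of that logarithmic-utility claim (my own arithmetic, using natural logs):

    ```python
    import math

    # Log-utility: compare the utility gained by each jump in wealth.
    gates_gain = math.log(65e9) - math.log(60e9)  # $60B -> $65B
    poor_gain  = math.log(100e6) - math.log(1)    # $1 -> $100m
    print(round(gates_gain, 2))  # 0.08 log-units
    print(round(poor_gain, 2))   # 18.42 log-units: a vastly larger gain
    ```

    ↩︎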

  14. Alex Tabar­rok, “A Bet is a Tax on Bull­shit”↩︎

  15. The famous neurotransmitter dopamine is intimately involved with feelings of happiness and pleasure (which is why dopamine is affected by most addictions or addictive drugs). It also is involved in learning—make an error and no dopamine for you (Bayer & Glimcher 2005, Neuron):

    The midbrain dopamine neurons are hypothesized to provide a physiological correlate of the reward prediction error signal required by current models of reinforcement learning. We examined the activity of single dopamine neurons during a task in which subjects learned by trial and error when to make an eye movement for a juice reward. We found that these neurons encoded the difference between the current reward and a weighted average of previous rewards, a reward prediction error, but only for outcomes that were better than expected. Thus, the firing rate of midbrain dopamine neurons is quantitatively predicted by theoretical descriptions of the reward prediction error signal used in reinforcement learning models for circumstances in which this signal has a positive value. We also found that the dopamine system continued to compute the reward prediction error even when the behavioral policy of the animal was only weakly influenced by this computation.
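    A minimal sketch of the reward-prediction-error computation the abstract describes (my own toy illustration, not the paper’s model):

    ```python
    # RPE: current reward minus an exponentially-weighted average of past
    # rewards; the running expectation is then nudged toward the new reward.
    def step(expectation: float, reward: float, alpha: float = 0.3):
        rpe = reward - expectation     # prediction error
        expectation += alpha * rpe     # update the running expectation
        return expectation, rpe

    expectation = 0.0
    for reward in [1, 1, 0, 1]:        # juice delivered or withheld per trial
        expectation, rpe = step(expectation, reward)
        print(f"reward={reward}, RPE={rpe:+.2f}, expectation={expectation:.2f}")
    ```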

    ↩︎

  16. For example, no one has actually taken up Kevin’s offer to wager on the outcome of the appeal, while there are dozens of specific probabilities given in an earlier survey.↩︎

  17. Un­for­tu­nately they don’t give any pop­u­la­tion sta­tis­tics so it’s hard for me to in­ter­pret my re­sults:

    Your cal­i­bra­tion score is -3. Cal­i­bra­tion is de­fined as the differ­ence be­tween the per­cent­age av­er­age con­fi­dence rat­ing and the per­cent­age of cor­rect an­swers. A score of zero is per­fect cal­i­bra­tion. Pos­i­tive num­bers in­di­cate over­con­fi­dence and can go up to 100. Neg­a­tive num­bers rep­re­sent un­der­-con­fi­dence and can go down to -100.

    Your dis­crim­i­na­tion score is 4.48. Dis­crim­i­na­tion is de­fined as the differ­ence be­tween the per­cent­age av­er­age con­fi­dence rat­ing for the cor­rect items and the per­cent­age av­er­age con­fi­dence rat­ing for the in­cor­rect items. Higher pos­i­tive num­bers in­di­cate greater dis­crim­i­na­tion and are bet­ter scores.
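    A small sketch of how these two scores can be computed from (confidence %, correct?) pairs, per the definitions above (my own illustration, not the test’s code):

    ```python
    # Calibration = mean confidence (%) minus percent correct.
    # Discrimination = mean confidence on correct items minus mean
    # confidence on incorrect items.
    def calibration_and_discrimination(answers):
        confs = [c for c, _ in answers]
        right = [c for c, ok in answers if ok]
        wrong = [c for c, ok in answers if not ok]
        calibration = sum(confs) / len(confs) - 100 * len(right) / len(answers)
        discrimination = sum(right) / len(right) - sum(wrong) / len(wrong)
        return calibration, discrimination

    # 3 of 4 correct at these confidences: perfectly calibrated (0.0),
    # with 20 points more confidence on the correct answers.
    print(calibration_and_discrimination(
        [(90, True), (70, True), (60, False), (80, True)]))  # (0.0, 20.0)
    ```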

    ↩︎
  18. Specifi­cal­ly, pre­dic­tion #1007. In its pref­ace to the re­sults page, GJP told us:

    Ques­tion 1007 (the “lethal con­fronta­tion” ques­tion) il­lus­trates this point. Many of our best fore­cast­ers got ‘burned’ on this ques­tion be­cause a Chi­nese fish­ing cap­tain killed a South Ko­rean Coast Guard offi­cer late in the fore­cast­ing win­dow—an out­come that the tour­na­men­t’s spon­sors deemed to sat­isfy the cri­te­ria for re­solv­ing the ques­tion as ‘yes’, but one that had lit­tle geopo­lit­i­cal sig­nifi­cance (it did not sig­nify a more as­sertive Chi­nese naval pol­i­cy). These fore­cast­ers had fol­lowed our ad­vice (or their own com­mon sense) by low­er­ing their es­ti­mated like­li­hood of a lethal con­fronta­tion as time elapsed and made their bet­ting de­ci­sions based on this as­sump­tion.

    ↩︎
  19. For example, in the YourMorals.org tests dealing with calibration/bias, I usually do well above average, even for LessWrongers.

    ↩︎
  20. The 2001 anthology of reviews and papers, Principles of Forecasting, is invaluable, although many of the papers are highly technical. Excerpts from Dylan Evans’s Risk Intelligence (in the Wall Street Journal) may be more readable:

    Psychologists have tended to assume that such biases are universal and virtually impossible to avoid. But certain groups of people, such as meteorologists and professional gamblers, have managed to overcome these biases and are thus able to estimate probabilities much more accurately than the rest of us. Are they doing something the rest of us can learn? Can we improve our risk intelligence?

    Sarah Licht­en­stein, an ex­pert in the field of de­ci­sion sci­ence, points to sev­eral char­ac­ter­is­tics of groups that ex­hibit high in­tel­li­gence with re­spect to risk. First, they tend to be com­fort­able as­sign­ing nu­mer­i­cal prob­a­bil­i­ties to pos­si­ble out­comes. Start­ing in 1965, for in­stance, U.S. Na­tional Weather Ser­vice fore­cast­ers have been re­quired to say not just whether or not it will rain the next day, but how likely they think it is in per­cent­age terms. Sure enough, when re­searchers mea­sured the risk in­tel­li­gence of Amer­i­can fore­cast­ers a decade lat­er, they found that it ranked among the high­est ever record­ed, ac­cord­ing to a study in the Jour­nal of the Royal Sta­tis­ti­cal So­ci­ety.

    It helps, too, if the group makes pre­dic­tions only on a nar­row range of top­ics. The ques­tion for weather fore­cast­ers, for ex­am­ple, is al­ways roughly the same: Will it rain or not? Doc­tors, on the other hand, must con­sider all sorts of differ­ent ques­tions: Is this rib bro­ken? Is this growth ma­lig­nant? Will this drug cock­tail work? Stud­ies have found that doc­tors score rather poorly on tests of risk in­tel­li­gence.

    Fi­nal­ly, groups with high risk in­tel­li­gence tend to get prompt and well-de­fined feed­back, which in­creases the chance that they will in­cor­po­rate new in­for­ma­tion into their un­der­stand­ing. For weather fore­cast­ers, it ei­ther rains or it does­n’t. For bat­tle­field com­man­ders, tar­gets are ei­ther dis­abled or not. For doc­tors, on the other hand, pa­tients may not come back, or they may be re­ferred else­where. Di­ag­noses may re­main un­cer­tain.

    …Royal Dutch Shell in­tro­duced just such a pro­gram in the 1970s. Se­nior ex­ec­u­tives had no­ticed that when newly hired ge­ol­o­gists pre­dicted oil strikes at four out of 10 new wells, only one or two ac­tu­ally pro­duced. This over­con­fi­dence cost Royal Dutch Shell mil­lions of dol­lars. In the train­ing pro­gram, the com­pany gave ge­ol­o­gists de­tails of pre­vi­ous ex­plo­rations and asked them for nu­mer­i­cal es­ti­mates of the chances of find­ing oil. The in­ex­pe­ri­enced ge­ol­o­gists were then given feed­back on the num­ber of oil strikes that had ac­tu­ally been made. By the end of the pro­gram, their es­ti­mates roughly matched the ac­tual num­ber of oil strikes.

    …Just by be­com­ing aware of our ten­dency to be over­con­fi­dent or un­der­con­fi­dent in our es­ti­mates, we can go a long way to­ward cor­rect­ing for our most com­mon er­rors. Doc­tors, for in­stance, could pro­vide nu­mer­i­cal es­ti­mates of prob­a­bil­ity when mak­ing di­ag­noses and then get data about which ones turned out to be right. As for the rest of us, we could es­ti­mate the like­li­hood of var­i­ous events in a given week, record our es­ti­mates in nu­mer­i­cal terms, re­view them the next week and thus mea­sure our risk in­tel­li­gence in every­day life. A sim­i­lar tech­nique is used by many suc­cess­ful gam­blers: They keep ac­cu­rate and de­tailed records of their earn­ings and their losses and reg­u­larly re­view their strate­gies in or­der to learn from their mis­takes.

    ↩︎
  21. Mod­i­fied ver­sion of Eliezer Yud­kowsky’s par­ody of the Fate/Stay Night chant.↩︎

  22. The long-shot bias is the overvaluing of events in the 0-5% range or so; it plagues even heavily-traded markets on Intrade. Ron Paul and Michele Bachmann are 2 cases in point—they are covered by the heavily-traded US Presidential contracts, yet they are priced too high, and this has been noted by many:

    Beyond blog posts, a 2004 Wolfers & Zitzewitz paper finds its presence (see also Rothschild 2011):

    In fact, the price differ­ences im­plied a (s­mall) ar­bi­trage op­por­tu­nity that per­sisted for most of sum­mer 2003 and has reap­peared in 2004. Sim­i­lar pat­terns ex­isted for Trade­sports se­cu­ri­ties on other fi­nan­cial vari­ables like crude oil, gold prices and ex­change rates. This find­ing is con­sis­tent with the long-shot bias be­ing more pro­nounced on small­er-s­cale ex­changes.

    This is ap­par­ently due in part to the short­-term pres­sure on pre­dic­tion mar­ket traders; Robin Han­son says:

    “In­trade and IEM don’t usu­ally pay in­ter­est on de­posits, so for long term bets you can win the bet and still lose over­all. The ob­vi­ous so­lu­tion is for them to pay such in­ter­est, but then they’d lose a hid­den tax many cus­tomers don’t no­tice.”
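    To see why such mispricings can persist, a toy expected-value calculation in the spirit of Hanson’s point (all numbers illustrative):

    ```python
    # Shorting a long-shot priced at 5% whose "true" probability is 2%:
    price, true_prob, payout = 5.0, 2.0, 100.0   # in points
    edge = price - true_prob                     # +3 points expected
    collateral = payout - price                  # 95 points frozen to expiry
    interest_forgone = collateral * 0.05         # at 5%/year for 1 year
    print(edge - interest_forgone)               # -1.75: correcting the
                                                 # price doesn't pay
    ```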

    Another reason to use a free-form site like PB.com—you can make (and I have made) predictions about decades or centuries into the far future without worrying about how to earn returns of thousands of percent.↩︎

  23. Go­ing through In­trade to copy over pre­dic­tions to PB.­com, I was struck by how non-liq­uid mar­kets could be left at hi­lar­i­ous prices, prices that make no ra­tio­nal sense since they can’t even rep­re­sent some­one hedg­ing against that out­come be­cause so few shares have been sold; ex­am­ple con­tracts in­clude:

    1. US at­tack­ing North Ko­rea
    2. China at­tack­ing Tai­wan
    3. Japan ac­quir­ing nu­clear weapons
    ↩︎
  24. “It’s Charisma, Stupid”↩︎

  25. George Orwell, “In Front of Your Nose”, 1946:

    To see what is in front of one’s nose needs a constant struggle. One thing that helps toward it is to keep a diary, or, at any rate, to keep some kind of record of one’s opinions about important events. Otherwise, when some particularly absurd belief is exploded by events, one may simply forget that one ever held it. Political predictions are usually wrong. But even when one makes a correct one, to discover why one was right can be very illuminating. In general, one is only right when either wish or fear coincides with reality. If one recognizes this, one cannot, of course, get rid of one’s subjective feelings, but one can to some extent insulate them from one’s thinking and make predictions cold-bloodedly, by the book of arithmetic. In private life most people are fairly realistic. When one is making out one’s weekly budget, two and two invariably make four. Politics, on the other hand, is a sort of sub-atomic or non-Euclidean world where it is quite easy for the part to be greater than the whole or for two objects to be in the same place simultaneously. Hence the contradictions and absurdities I have chronicled above, all finally traceable to a secret belief that one’s political opinions, unlike the weekly budget, will not have to be tested against solid reality.

    ↩︎
  26. Eliezer Yud­kowsky, chap­ter 20, Harry Pot­ter and the Meth­ods of Ra­tio­nal­ity:

    …while I sup­pose it is barely pos­si­ble that per­fectly good peo­ple ex­ist even though I have never met one, it is nonethe­less im­prob­a­ble that some­one would be beaten for fifteen min­utes and then stand up and feel a great surge of kindly for­give­ness for his at­tack­ers. On the other hand it is less im­prob­a­ble that a young child would imag­ine this as the role to play in or­der to con­vince his teacher and class­mates that he is not the next Dark Lord.

    The im­port of an act lies not in what that act re­sem­bles on the sur­face, Mr. Pot­ter, but in the states of mind which make that act more or less prob­a­ble.

    ↩︎
  27. “When fal­si­fi­ca­tion is the only path to truth”; ab­stract:

    Can peo­ple con­sis­tently at­tempt to fal­si­fy, that is, search for re­fut­ing ev­i­dence, when test­ing the truth of hy­pothe­ses? Ex­per­i­men­tal ev­i­dence in­di­cates that peo­ple tend to search for con­firm­ing ev­i­dence. We re­port two novel ex­per­i­ments that show that peo­ple can con­sis­tently fal­sify when it is the only help­ful strat­e­gy. Ex­per­i­ment 1 showed that par­tic­i­pants read­ily fal­si­fied some­body else’s hy­poth­e­sis. Their task was to test a hy­poth­e­sis be­long­ing to an ‘imag­i­nary par­tic­i­pant’ and they knew it was a low qual­ity hy­poth­e­sis. Ex­per­i­ment 2 showed that par­tic­i­pants were able to fal­sify a low qual­ity hy­poth­e­sis be­long­ing to an imag­i­nary par­tic­i­pant more read­ily than their own low qual­ity hy­poth­e­sis. The re­sults have im­por­tant im­pli­ca­tions for the­o­ries of hy­poth­e­sis test­ing and hu­man ra­tio­nal­i­ty.

    One line of thought in evolutionary psychology is that our minds are not evolved for truth-seeking per se, but rather are split between heuristics and effective procedures on the one hand, and argumentation meant to deceive & persuade others on the other; eg. “Why do humans reason? Arguments for an argumentative theory” (Mercier & Sperber 2011). This ties in well with why we are better at falsifying the theories of others—you don’t convince anyone by falsifying your own theories, but you do by falsifying others’ theories.↩︎

  28. From a study of self-prediction and hepatitis B (HBV) vaccination acceptance:

    Half of par­tic­i­pants were as­signed ran­domly to a “self­-pre­dic­tion” in­ter­ven­tion, ask­ing them to pre­dict their fu­ture ac­cep­tance of HBV vac­ci­na­tion. The main out­come mea­sure was sub­se­quent vac­ci­na­tion be­hav­ior. Other mea­sures in­cluded per­ceived bar­ri­ers to HBV vac­ci­na­tion, mea­sured prior to the in­ter­ven­tion. Re­sults: There was a [s­ta­tis­ti­cal­ly-]sig­nifi­cant in­ter­ac­tion be­tween the in­ter­ven­tion and vac­ci­na­tion bar­ri­ers, in­di­cat­ing the effect of the in­ter­ven­tion differed de­pend­ing on per­ceived vac­ci­na­tion bar­ri­ers. Among high­-bar­ri­ers pa­tients, the in­ter­ven­tion [s­ta­tis­ti­cal­ly-]sig­nifi­cantly in­creased vac­ci­na­tion ac­cep­tance. Among low-bar­ri­ers pa­tients, the in­ter­ven­tion did not in­flu­ence vac­ci­na­tion ac­cep­tance. Con­clu­sions: The self­-pre­dic­tion in­ter­ven­tion [s­ta­tis­ti­cal­ly-]sig­nifi­cantly in­creased vac­ci­na­tion ac­cep­tance among “high­-bar­ri­ers” pa­tients, who typ­i­cally have very low vac­ci­na­tion rates.

    ↩︎
  29. Rowe & Wright 2001:

    • In phras­ing ques­tions, use clear and suc­cinct de­fi­n­i­tions and avoid emo­tive terms.

      How a ques­tion is worded can lead to [sub­stan­tial] re­sponse bi­as­es. By chang­ing words or em­pha­sis, one can in­duce re­spon­dents to give dra­mat­i­cally differ­ent an­swers to a ques­tion. For ex­am­ple, Hauser (1975) de­scribes a 1940 sur­vey in which 96% of peo­ple an­swered yes to the ques­tion “do you be­lieve in free­dom of speech?” and yet only 22% an­swered yes to the ques­tion “do you be­lieve in free­dom of speech to the ex­tent of al­low­ing rad­i­cals to hold meet­ings and ex­press their views to the com­mu­ni­ty?” The sec­ond ques­tion is con­sis­tent with the first; it sim­ply en­tails a fuller de­fi­n­i­tion of the con­cept of free­dom of speech. One might there­fore ask which of these an­swers more clearly re­flects the views of the sam­ple. Ar­guably, the more apt rep­re­sen­ta­tion comes from the ques­tion that in­cludes a clearer de­fi­n­i­tion of the con­cept of in­ter­est, be­cause this should en­sure that the re­spon­dents are all an­swer­ing the same ques­tion. Re­searchers on Del­phi per se have shown lit­tle em­pir­i­cal in­ter­est in ques­tion word­ing. Salan­cik, Wenger and Heifer (1971) pro­vide the only ex­am­ple of which we are aware; they stud­ied the effect of ques­tion length on ini­tial pan­elist con­sen­sus and found that one could ap­par­ently ob­tain greater con­sen­sus by us­ing ques­tions that were nei­ther “too short” nor “too long.” This is a gen­er­ally ac­cepted prin­ci­ple for word­ing items on sur­veys: they should be long enough to de­fine the ques­tion ad­e­quately so that re­spon­dents do not in­ter­pret it differ­ent­ly, yet they should not be so long and com­pli­cated that they re­sult in in­for­ma­tion over­load, or so pre­cisely de­fine a prob­lem that they de­mand a par­tic­u­lar an­swer. Al­so, ques­tions should not con­tain emo­tive words or phras­es: the use of the term “rad­i­cals” in the sec­ond ver­sion of the free­dom-of-speech ques­tion, with its po­ten­tially neg­a­tive con­no­ta­tions, might lead to emo­tional rather than rea­soned re­spons­es.

    • Frame ques­tions in a bal­anced man­ner.

      Tver­sky and Kah­ne­man (1974, 1981) pro­vide a sec­ond ex­am­ple of the way in which ques­tion fram­ing may bias re­spons­es. They posed a hy­po­thet­i­cal sit­u­a­tion to sub­jects in which hu­man lives would be lost: if sub­jects were to choose one op­tion, a cer­tain num­ber of peo­ple would defi­nitely die, but if they chose a sec­ond op­tion, then there was a prob­a­bil­ity that more would die, but also a chance that less would die. Tver­sky and Kah­ne­man found that the pro­por­tion of sub­jects choos­ing each of the two op­tions changed when they phrased the op­tions in terms of peo­ple sur­viv­ing in­stead of in terms of dy­ing (i.e., sub­jects re­sponded differ­ently to an op­tion worded “60% will sur­vive” than to one worded “40% will die,” even though these are log­i­cally iden­ti­cal state­ments). The best way to phrase such ques­tions might be to clearly state both death and sur­vival rates (bal­anced), rather than leave half of the con­se­quences im­plic­it. Phras­ing a ques­tion in terms of a sin­gle per­spec­tive, or nu­mer­i­cal fig­ure, may pro­vide an an­chor point as the fo­cus of at­ten­tion, so bi­as­ing re­spons­es.

    ↩︎
  30. It may help to read the dialogue/examples of “Dr. Mal­foy” and “Dr. Pot­ter” in chap­ter 22 of Eliezer Yud­kowsky’s Harry Pot­ter and the Meth­ods of Ra­tio­nal­ity.↩︎

  31. For example, the famous and replicated failure of doctors to correctly apply Bayes’ theorem to cancer rates is reduced when the percentages are translated into frequencies. Rowe & Wright 2001 give this advice:

    • When pos­si­ble, give es­ti­mates of un­cer­tainty as fre­quen­cies rather than prob­a­bil­i­ties or odds.

    Many ap­pli­ca­tions of Del­phi re­quire pan­elists to make ei­ther nu­mer­i­cal es­ti­mates of the prob­a­bil­ity of an event hap­pen­ing in a spec­i­fied time pe­ri­od, or to as­sess their con­fi­dence in the ac­cu­racy of their pre­dic­tions. Re­searchers on be­hav­ioral de­ci­sion mak­ing have ex­am­ined the ad­e­quacy of such nu­mer­i­cal judg­ments. Re­sults from these find­ings, sum­ma­rized by Good­win and Wright (1998), show that some­times judg­ments from di­rect as­sess­ments (what is the prob­a­bil­ity that…?) are in­con­sis­tent with those from in­di­rect meth­ods. In one ex­am­ple of an in­di­rect method, sub­jects might be asked to imag­ine an urn filled with 1,000 col­ored balls (say, 400 red and 600 blue). They would then be asked to choose be­tween bet­ting on the event in ques­tion hap­pen­ing, or bet­ting on a red ball be­ing drawn from the urn (both bets offer­ing the same re­ward). The ra­tio of red to blue balls would then be var­ied un­til a sub­ject was in­differ­ent be­tween the two bets, at which point the re­quired prob­a­bil­ity could be in­ferred. In­di­rect meth­ods of elic­it­ing sub­jec­tive prob­a­bil­i­ties have the ad­van­tage that sub­jects do not have to ver­bal­ize nu­mer­i­cal prob­a­bil­i­ties. Di­rect es­ti­mates of odds (such as 25 to 1, or 1,000 to 1), per­haps be­cause they have no up­per or lower lim­it, tend to be more ex­treme than di­rect es­ti­mates of prob­a­bil­i­ties (which must lie be­tween zero and one). If prob­a­bil­ity es­ti­mates de­rived by differ­ent meth­ods for the same event are in­con­sis­tent, which method should one take as the true in­dex of de­gree of be­lief? One way to an­swer this ques­tion is to use a sin­gle method of as­sess­ment that pro­vides the most con­sis­tent re­sults in re­peated tri­als. In other words, the sub­jec­tive prob­a­bil­i­ties pro­vided at differ­ent times by a sin­gle as­ses­sor for the same event should show a high de­gree of agree­ment, given that the as­ses­sor’s knowl­edge of the event is un­changed. Un­for­tu­nate­ly, lit­tle re­search has been done on this im­por­tant prob­lem. Beach and Phillips (1967) eval­u­ated the re­sults of sev­eral stud­ies us­ing di­rect es­ti­ma­tion meth­ods. Test-retest cor­re­la­tions were all above 0.88, ex­cept for one study us­ing stu­dents as­sess­ing odds, where the re­li­a­bil­ity was 0.66.

    Gigeren­zer (1994) pro­vided em­pir­i­cal ev­i­dence that the un­trained mind is not equipped to rea­son about un­cer­tainty us­ing sub­jec­tive prob­a­bil­i­ties but is able to rea­son suc­cess­fully about un­cer­tainty us­ing fre­quen­cies. Con­sider a gam­bler bet­ting on the spin of a roulette wheel. If the wheel has stopped on red for the last 10 spins, the gam­bler may feel sub­jec­tively that it has a greater prob­a­bil­ity of stop­ping on black on the next spin than on red. How­ev­er, ask the same gam­bler the rel­a­tive fre­quency of red to black on spins of the wheel and he or she may well an­swer 50-50. Since the roulette ball has no mem­o­ry, it fol­lows that for each spin of the wheel, the gam­bler should use the lat­ter, rel­a­tive fre­quency as­sess­ment (50-50) in bet­ting. Kah­ne­man and Lo­vallo (1993) have ar­gued that fore­cast­ers tend to see fore­cast­ing prob­lems as unique when they should think of them as in­stances of a broader class of events. They claim that peo­ple’s nat­ural ten­dency in think­ing about a par­tic­u­lar is­sue, such as the likely suc­cess of a new busi­ness ven­ture, is to take an “in­side” rather than an “out­side” view. Fore­cast­ers tend to pay par­tic­u­lar at­ten­tion to the dis­tin­guish­ing fea­tures of the par­tic­u­lar event to be fore­cast (e.g., the per­sonal char­ac­ter­is­tics of the en­tre­pre­neur) and re­ject analo­gies to other in­stances of the same gen­eral type as su­per­fi­cial. Kah­ne­man and Lo­vallo cite a study by Coop­er, Woo, and Dunkel­berger (1988), which showed that 80% of en­tre­pre­neurs who were in­ter­viewed about their chances of busi­ness suc­cess de­scribed this as 70% or bet­ter, while the over­all sur­vival rate for new busi­ness is as low as 33 per­cent. Gigeren­z­er’s ad­vice, in this con­text, would be to ask the in­di­vid­ual en­tre­pre­neurs to es­ti­mate the pro­por­tion of new busi­nesses that sur­vive (as they might make ac­cu­rate es­ti­mates of this rel­a­tive fre­quen­cy) and use this as an es­ti­mate of their own busi­nesses sur­viv­ing. Re­search has shown that such in­ter­ven­tions to change the re­quired re­sponse mode from sub­jec­tive prob­a­bil­ity to rel­a­tive fre­quency im­prove the pre­dic­tive ac­cu­racy of elicited judg­ments. For ex­am­ple, Sniezek and Buck­ley (1991) gave stu­dents a se­ries of gen­eral knowl­edge ques­tions with two al­ter­na­tive an­swers for each, one of which was cor­rect. They asked stu­dents to se­lect the an­swer they thought was cor­rect and then es­ti­mate the prob­a­bil­ity that it was cor­rect. Their re­sults showed the same gen­eral over­con­fi­dence that Arkes (2001) dis­cuss­es. How­ev­er, when Sniezek and Buck­ley asked re­spon­dents to state how many of the ques­tions they had an­swered cor­rectly of the to­tal num­ber of ques­tions, their fre­quency es­ti­mates were ac­cu­rate. This was de­spite the fact that the same in­di­vid­u­als were gen­er­ally over­con­fi­dent in their sub­jec­tive prob­a­bil­ity as­sess­ments for in­di­vid­ual ques­tions. Good­win and Wright (1998) dis­cuss the use­ful­ness of dis­tin­guish­ing be­tween sin­gle-event prob­a­bil­i­ties and fre­quen­cies. If a ref­er­ence class of his­toric fre­quen­cies is not ob­vi­ous, per­haps be­cause the event to be fore­cast is truly unique, then the only way to as­sess the like­li­hood of the event is to use a sub­jec­tive prob­a­bil­ity pro­duced by judg­men­tal heuris­tics. Such heuris­tics can lead to judg­men­tal over­con­fi­dence, as Arkes (2001) doc­u­ments.
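    The classic cancer-screening problem in frequency format (illustrative numbers in the Gigerenzer style, not taken from the footnote):

    ```python
    # Of 1,000 women: 10 have cancer (1% base rate); 8 of those 10 test
    # positive (80% sensitivity); ~95 of the 990 healthy also test
    # positive (9.6% false-positive rate).
    population = 1000
    sick = 10
    true_positives = 8
    false_positives = round(0.096 * (population - sick))  # ~95

    posterior = true_positives / (true_positives + false_positives)
    print(round(posterior, 3))  # 0.078: far lower than most direct guesses
    ```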

    ↩︎
  32. From a review of research on self-prediction:

    Os­berg and Shrauger (1986) de­ter­mined pre­dic­tion ac­cu­racy by scor­ing an item as a hit if the re­spon­dents pre­dicted the event defi­nitely or prob­a­bly would oc­cur and it did, or if the re­spon­dent pre­dicted that the event defi­nitely or prob­a­bly would not oc­cur and it did not. Re­spon­dents who were in­structed to fo­cus on their own per­sonal dis­po­si­tions pre­dicted [s­ta­tis­ti­cal­ly-]sig­nifi­cantly more of the 55 items cor­rectly (74%) than did re­spon­dents in the con­trol con­di­tion who did not re­ceive in­struc­tions (69%). Re­spon­dents whose in­struc­tions were to fo­cus on per­sonal base rates had higher ac­cu­racy (72%) and re­spon­dents whose in­struc­tions were to fo­cus on pop­u­la­tion base rates had lower ac­cu­racy (66%) than con­trol re­spon­dents, al­though these differ­ences were not sta­tis­ti­cal­ly-sig­nifi­cant.

    ↩︎
  33. From Psy­chol­ogy of In­tel­li­gence Analy­sis:

    Ide­al­ly, in­tel­li­gence an­a­lysts should be able to rec­og­nize what rel­e­vant ev­i­dence is lack­ing and fac­tor this into their cal­cu­la­tions. They should also be able to es­ti­mate the po­ten­tial im­pact of the miss­ing data and to ad­just con­fi­dence in their judg­ment ac­cord­ing­ly. Un­for­tu­nate­ly, this ideal does not ap­pear to be the norm. Ex­per­i­ments sug­gest that “out of sight, out of mind” is a bet­ter de­scrip­tion of the im­pact of gaps in the ev­i­dence.

    This prob­lem has been demon­strated us­ing fault trees, which are schematic draw­ings show­ing all the things that might go wrong with any en­deav­or. Fault trees are often used to study the fal­li­bil­ity of com­plex sys­tems such as a nu­clear re­ac­tor or space cap­sule.

    A fault tree show­ing all the rea­sons why a car might not start was shown to sev­eral groups of ex­pe­ri­enced me­chan­ics.96 The tree had seven ma­jor branch­es–in­suffi­cient bat­tery charge, de­fec­tive start­ing sys­tem, de­fec­tive ig­ni­tion sys­tem, de­fec­tive fuel sys­tem, other en­gine prob­lems, mis­chie­vous acts or van­dal­ism, and all other prob­lem­s–and a num­ber of sub­cat­e­gories un­der each branch. One group was shown the full tree and asked to imag­ine 100 cases in which a car won’t start. Mem­bers of this group were then asked to es­ti­mate how many of the 100 cases were at­trib­ut­able to each of the seven ma­jor branches of the tree. A sec­ond group of me­chan­ics was shown only an in­com­plete ver­sion of the tree: three ma­jor branches were omit­ted in or­der to test how sen­si­tive the test sub­jects were to what was left out.

    If the me­chan­ics’ judg­ment had been fully sen­si­tive to the miss­ing in­for­ma­tion, then the num­ber of cases of fail­ure that would nor­mally be at­trib­uted to the omit­ted branches should have been added to the “Other Prob­lems” cat­e­go­ry. In prac­tice, how­ev­er, the “Other Prob­lems” cat­e­gory was in­creased only half as much as it should have been. This in­di­cated that the me­chan­ics shown the in­com­plete tree were un­able to fully rec­og­nize and in­cor­po­rate into their judg­ments the fact that some of the causes for a car not start­ing were miss­ing. When the same ex­per­i­ment was run with non-me­chan­ics, the effect of the miss­ing branches was much greater.

    As com­pared with most ques­tions of in­tel­li­gence analy­sis, the “car won’t start” ex­per­i­ment in­volved rather sim­ple an­a­lyt­i­cal judg­ments based on in­for­ma­tion that was pre­sented in a well-or­ga­nized man­ner. That the pre­sen­ta­tion of rel­e­vant vari­ables in the ab­bre­vi­ated fault tree was in­com­plete could and should have been rec­og­nized by the ex­pe­ri­enced me­chan­ics se­lected as test sub­jects. In­tel­li­gence an­a­lysts often have sim­i­lar prob­lems. Miss­ing data is nor­mal in in­tel­li­gence prob­lems, but it is prob­a­bly more diffi­cult to rec­og­nize that im­por­tant in­for­ma­tion is ab­sent and to in­cor­po­rate this fact into judg­ments on in­tel­li­gence ques­tions than in the more con­crete “car won’t start” ex­per­i­ment.

    ↩︎
  34. Rowe & Wright 2001:

    • Use co­her­ence checks when elic­it­ing es­ti­mates of prob­a­bil­i­ties.

    As­sessed prob­a­bil­i­ties are some­times in­co­her­ent. One use­ful co­her­ence check is to elicit from the fore­caster not only the prob­a­bil­ity (or con­fi­dence) that an event will oc­cur, but also the prob­a­bil­ity that it will not oc­cur. The two prob­a­bil­i­ties should sum to one. A vari­ant of this tech­nique is to de­com­pose the prob­a­bil­ity of the event not oc­cur­ring into the oc­cur­rence of other pos­si­ble events. If the events are mu­tu­ally ex­clu­sive and ex­haus­tive, then the ad­di­tion rule can be ap­plied, since the sum of the as­sessed prob­a­bil­i­ties should be one. Wright and Whal­ley (1983) found that most un­trained prob­a­bil­ity as­ses­sors fol­lowed the ad­di­tiv­ity ax­iom in sim­ple two-out­come as­sess­ments in­volv­ing the prob­a­bil­i­ties of an event hap­pen­ing and not hap­pen­ing. How­ev­er, as the num­ber of mu­tu­ally ex­clu­sive and ex­haus­tive events in a set in­creased, more fore­cast­ers be­came supra-ad­di­tive, and to a greater ex­tent, in that their as­sessed prob­a­bil­i­ties added up to more than one. Other co­her­ence checks can be used when events are in­ter­de­pen­dent (Good­win and Wright 1998; Wright, et al. 1994).

    There is a de­bate in the lit­er­a­ture as to whether de­com­pos­ing an­a­lyt­i­cally com­plex as­sess­ments into an­a­lyt­i­cally more sim­ple mar­ginal and con­di­tional as­sess­ments of prob­a­bil­ity is worth­while as a means of sim­pli­fy­ing the as­sess­ment task. This de­bate is cur­rently un­re­solved (Wright, Saun­ders and Ay­ton 1988; Wright et al. 1994). Our view is that the best so­lu­tion to prob­lems of in­con­sis­tency and in­co­her­ence in prob­a­bil­ity as­sess­ment is for the poll­ster to show fore­cast­ers the re­sults of such checks and then al­low in­ter­ac­tive res­o­lu­tion be­tween them of de­par­tures from con­sis­tency and co­her­ence. Mac­Gre­gor (2001) con­cludes his re­view of de­com­po­si­tion ap­proaches with sim­i­lar ad­vice.

    When as­sess­ing prob­a­bil­ity dis­tri­b­u­tions (e.g., for the fore­cast range within which an un­cer­tainty qual­ity will lie), in­di­vid­u­als tend to be over­con­fi­dent in that they fore­cast too nar­row a range. Some re­sponse modes fail to coun­ter­act this ten­den­cy. For ex­am­ple, if one asks a fore­caster ini­tially for the me­dian value of the dis­tri­b­u­tion (the value the fore­caster per­ceives as hav­ing a 50% chance of be­ing ex­ceed­ed), this can act as an an­chor. Tver­sky and Kah­ne­man (1974) were the first to show that peo­ple are un­likely to make suffi­cient ad­just­ments from this an­chor when as­sess­ing other val­ues in the dis­tri­b­u­tion. To counter this bi­as, Good­win and Wright (1998) de­scribe the “prob­a­bil­ity method” for elic­it­ing prob­a­bil­ity dis­tri­b­u­tions, an as­sess­ment method that de-em­pha­sizes the use of the me­dian as a re­sponse an­chor. Mc­Clel­land and Bol­ger (1994) dis­cuss over­con­fi­dence in the as­sess­ment of prob­a­bil­ity dis­tri­b­u­tions and point prob­a­bil­i­ties. Wright and Ay­ton (1994) pro­vide a gen­eral overview of psy­cho­log­i­cal re­search on sub­jec­tive prob­a­bil­i­ty. Arkes (2001) lists a num­ber of prin­ci­ples to help fore­cast­ers to coun­ter­act over­con­fi­dence.
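    A minimal sketch of the additivity check described above (my own illustration):

    ```python
    # Check whether elicited probabilities over a mutually exclusive &
    # exhaustive set of outcomes sum to ~1.
    def coherence_check(probs: dict, tol: float = 0.01) -> str:
        total = sum(probs.values())
        if abs(total - 1.0) <= tol:
            return f"coherent (sum={total:.2f})"
        kind = "supra-additive" if total > 1.0 else "sub-additive"
        return f"{kind} (sum={total:.2f}): show the forecaster and re-elicit"

    print(coherence_check({"A wins": 0.5, "B wins": 0.4, "neither": 0.2}))
    # supra-additive (sum=1.10): show the forecaster and re-elicit
    ```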

    ↩︎
  35. Dar­win dis­cusses this fur­ther in the con­text of brain preser­va­tion in his “Sci­ence Fic­tion, Dou­ble Fea­ture, 2: Part 2”; see also my es­say.↩︎

  36. You might get the op­po­site im­pres­sion read­ing ar­ti­cles like this New York Times ar­ti­cle, but con­sider the flip side of large per­cent­age growth in phil­an­thropy­—they must be start­ing off from a small ab­solute base!↩︎

  37. Charles Simonyi is actually the first person to come to mind when I think about ‘weird wealthy American technologist interested in old and long-term information who has already demonstrated philanthropy on a large scale’.↩︎

  38. Walker was the second, due to his Library. Information on his net wealth isn’t too easy to come by; he had ~$6-7 billion in 2000 but fell to ~$300 million, and Business Insider claims “Although he never recovered financially, Walker had enough money—barely—to complete an expensive dream home in Connecticut, which includes a fantastic personal library (featured by Wired in 2008).”↩︎

  39. See Thiel’s es­say “The Op­ti­mistic Thought Ex­per­i­ment: In the long run, there are no good bets against glob­al­iza­tion”↩︎

  40. And I use ‘seem’ advisedly.↩︎

  41. I initially couldn’t find anything on charitable giving by Jobs. Eventually I found a Times interview with Jobs where the reporter says “Jobs had volunteered himself as an advisor to John Kerry’s unsuccessful campaign for the White House. He and his wife, Laurene, had given hundreds of thousands of dollars to Democratic causes over the last few years.” Ars Technica mentions a few others, but conflates Jobs with Apple. A large $150m donation speculated to be Jobs has been confirmed to not be from him, although rumors about a $50m donation to a hospital continue to circulate. (In 2004, Fortune estimated Jobs’s fortune at $2.1 billion.) And in general, absence of evidence is evidence of absence. Meanwhile, Bono praises Apple for its charity:

    Through the sale of (RED) prod­ucts, Ap­ple has been (RED)’s largest con­trib­u­tor to the Global Fund to Fight AIDS, Tu­ber­cu­lo­sis and Malar­i­a—­giv­ing tens of mil­lions of dol­lars that have trans­formed the lives of more than two mil­lion Africans through H.I.V. test­ing, treat­ment and coun­sel­ing. This is se­ri­ous and sig­nifi­cant. And Ap­ple’s in­volve­ment has en­cour­aged other com­pa­nies to step up.

    Isaac­son’s 2011 Steve Jobs bi­og­ra­phy (fin­ished be­fore he died and so in­cludes noth­ing on Job­s’s will) does oc­ca­sion­ally dis­cuss Job­s’s few acts of phil­an­thropy and offers a differ­ent ver­sion of the (RED) con­tri­bu­tions:

    He was not particularly philanthropic. He briefly set up a foundation, but he discovered that it was annoying to have to deal with the person he had hired to run it, who kept talking about “venture” philanthropy and how to “leverage” giving. Jobs became contemptuous of people who made a display of philanthropy or thinking they could reinvent it. Earlier he had quietly sent in a $5,000 check to help launch Larry Brilliant’s Seva Foundation to fight diseases of poverty, and he even agreed to join the board. But when Brilliant brought some board members, including Wavy Gravy and Jerry Garcia, to Apple right after its IPO to solicit a donation, Jobs was not forthcoming. He instead worked on finding ways that a donated Apple II and a VisiCalc program could make it easier for the foundation to do a survey it was planning on blindness in Nepal….His biggest personal gift was to his parents, Paul and Clara Jobs, to whom he gave about $750,000 worth of stock. They sold some to pay off the mortgage on their Los Altos home, and their son came over for the little celebration. “It was the first time in their lives they didn’t have a mortgage,” Jobs recalled. “They had a handful of their friends over for the party, and it was really nice.” Still, they didn’t consider buying a nicer house. “They weren’t interested in that,” Jobs said. “They had a life they were happy with.” Their only splurge was to take a Princess cruise each year. The one through the Panama Canal “was the big one for my dad,” according to Jobs, because it reminded him of when his Coast Guard ship went through on its way to San Francisco to be decommissioned…[Mona Simpson’s novel] depicts Jobs’s quiet generosity to, and purchase of a special car for, a brilliant friend who had degenerative bone disease, and it accurately describes many unflattering aspects of his relationship with Lisa, including his original denial of paternity…Bono got Jobs to do another deal with him in 2006, this one for his Product Red campaign that raised money and awareness to fight AIDS in Africa. Jobs was never much interested in philanthropy, but he agreed to do a special red iPod as part of Bono’s campaign. It was not a wholehearted commitment. He balked, for example, at using the campaign’s signature treatment of putting the name of the company in parentheses with the word “red” in superscript after it, as in (APPLE)RED. “I don’t want Apple in parentheses,” Jobs insisted. Bono replied, “But Steve, that’s how we show unity for our cause.” The conversation got heated (to the F-you stage) before they agreed to sleep on it. Finally Jobs compromised, sort of. Bono could do what he wanted in his ads, but Jobs would never put Apple in parentheses on any of his products or in any of his stores. The iPod was labeled (PRODUCT)RED, not (APPLE)RED.

    In­side Ap­ple: How Amer­i­ca’s Most Ad­mired—and Se­cre­tive—­Com­pany Re­ally Works (Lashin­sky 2012) re­port­edly in­cludes the fol­low­ing choice anec­dote:

    A high­light of the Top 100 for at­ten­dees was an ex­tended Q&A be­tween Jobs and his ex­ec­u­tives. One asked why Jobs him­self was­n’t more phil­an­throp­ic. He re­sponded that he thought giv­ing away money was a waste of time.

    “The Job After Steve Jobs: Tim Cook and Ap­ple; From the mo­ment he be­came CEO of Ap­ple, Tim Cook found him­self in the shadow of his boss”

    Cook’s sec­ond de­ci­sion was to start a char­ity pro­gram, match­ing do­na­tions of up to $10,000, dol­lar for dol­lar an­nu­al­ly. This too was widely em­braced: The lack of an Ap­ple cor­po­rate-match­ing pro­gram had long been a sore point for many em­ploy­ees. Jobs had con­sid­ered match­ing pro­grams par­tic­u­larly in­effec­tive be­cause the con­tri­bu­tions would never amount to enough to make a differ­ence. Some of his friends be­lieved that Jobs would have taken up some causes once he had more time, but Jobs used to say that he was con­tribut­ing to so­ci­ety more mean­ing­fully by build­ing a good com­pany and cre­at­ing jobs. Cook be­lieved firmly in char­i­ty. “My ob­jec­tive—one day—is to to­tally help oth­ers,” he said. “To me, that’s real suc­cess, when you can say, ‘I don’t need it any­more. I’m go­ing to do some­thing else.’”

    Of course, if we really want to rescue Jobs’s reputation, we can still do so. It could be the case that Jobs was very charitable but did so completely anonymously, or perhaps he preferred to reinvest his wealth in gaining more wealth and to donate only after his death; a Buffett-like strategy that—ex post—would seem to be a very wise one given the stock performance of AAPL. Jobs’s death in October 2011 means that this theory is falsifiable sooner than I had expected while writing this essay. Based on Jobs’s previous charitable giving, the general impression I have from the hagiographic press coverage is that Apple itself is Jobs’s charitable gift to the world (a framing which I can’t help but suspect either influenced or was influenced by the man himself). My own general expectation is that he will definitely not donate ~99% of his wealth to charity like Buffett or Gates (80%), probably not >50% (70%), and more likely somewhere in the 0-10% range (60%). As of 2013-01-01, my predictions have been borne out. If any philanthropy comes of Jobs’s Pixar billions, I expect it to be at the behest of his widow, Laurene Powell Jobs, who has long been involved in non-profits; to quote Isaacson again:

    Job­s’s re­la­tion­ship with his wife was some­times com­pli­cated but al­ways loy­al. Savvy and com­pas­sion­ate, Lau­rene Pow­ell was a sta­bi­liz­ing in­flu­ence and an ex­am­ple of his abil­ity to com­pen­sate for some of his selfish im­pulses by sur­round­ing him­self with strong-willed and sen­si­ble peo­ple. She weighed in qui­etly on busi­ness is­sues, firmly on fam­ily con­cerns, and fiercely on med­ical mat­ters. Early in their mar­riage, she co­founded and launched Col­lege Track, a na­tional after-school pro­gram that helps dis­ad­van­taged kids grad­u­ate from high school and get into col­lege. Since then she had be­come a lead­ing force in the ed­u­ca­tion re­form move­ment. Jobs pro­fessed an ad­mi­ra­tion for his wife’s work: “What she’s done with Col­lege Track re­ally im­presses me.” But he tended to be gen­er­ally dis­mis­sive of phil­an­thropic en­deav­ors and never vis­ited her after-school cen­ters.

    She has also tried her hand at lobbying for immigration reform, and continued her regular donations. Attempting to defend the Jobses’ reputations, Laura Arrillaga-Andreessen says:

    If you to­tal up in your mind all of the phil­an­thropic in­vest­ments that Lau­rene has made that the pub­lic knows about, that is prob­a­bly a frac­tion of 1% of what she ac­tu­ally does, and that’s the most I can say.

    ↩︎
  42. “The trouble with Steve Jobs”, Fortune, 2008-03-05:

    Last year the founder of the Stan­ford So­cial In­no­va­tion Re­view called Ap­ple one of “Amer­i­ca’s Least Phil­an­thropic Com­pa­nies.” Jobs had ter­mi­nated all of Ap­ple’s long-s­tand­ing cor­po­rate phil­an­thropy pro­grams within weeks after re­turn­ing to Ap­ple in 1997, cit­ing the need to cut costs un­til profitabil­ity re­bound­ed. But the pro­grams have never been re­stored.

    Un­like Bill Gates—the tech world’s other tow­er­ing fig­ure—Jobs has not shown much in­cli­na­tion to hand over the reins of his com­pany to cre­ate a differ­ent kind of per­sonal lega­cy. While his wife is deeply in­volved in an ar­ray of char­i­ta­ble pro­jects, Jobs’ only se­ri­ous foray into per­sonal phil­an­thropy was short­-lived. In Jan­u­ary 1987, after launch­ing Next, he al­so, with­out fan­fare or pub­lic no­tice, in­cor­po­rated the Steven P. Jobs Foun­da­tion. “He was very in­ter­ested in food and health is­sues and veg­e­tar­i­an­ism,” re­calls Mark Ver­mil­ion, the com­mu­nity affairs ex­ec­u­tive Jobs hired to run it. Ver­mil­ion per­suaded Jobs to fo­cus on “so­cial en­tre­pre­neur­ship” in­stead. But the Jobs foun­da­tion never did much of any­thing, be­sides hir­ing famed graphic de­signer Paul Rand to de­sign its lo­go. (Ex­plains Ver­mil­ion: “He wanted a logo wor­thy of his ex­pec­ta­tions.”) Jobs shut down the foun­da­tion after less than 15 months.

    ↩︎