The Existential Risk of Math Errors

Mathematical mistake/error-rates limit our understanding of rare risks and ability to defend against them
philosophy, transhumanism, statistics, survey, insight-porn
2012-07-20–2019-08-18 finished certainty: likely importance: 10


How empirically certain can we be in any use of mathematical reasoning to make empirical claims? In contrast to errors in many other forms of knowledge such as medicine or psychology, which have enormous literatures classifying and quantifying error rates, rich methods of meta-analysis and pooling expert belief, and much one can say about the probability of any result being true, mathematical error has rarely been examined except as a possibility and a motivating reason for research into formal methods. Little is known beyond anecdotes about how often published proofs are wrong, in what ways they are wrong, the impact of such errors, how errors vary by subfield, what methods decrease (or increase) errors, and so on. Yet mathematics is surely not immune to error, and for all the richness of the subject, mathematicians can usually agree at least informally on what has turned out to be right or wrong1, or good by other criteria like fruitfulness or beauty. Gaifman 2004 claims that errors are common but that any such analysis would be unedifying:

An agent might even have beliefs that logically contradict each other. Mersenne believed that 2^67 − 1 is a prime number, which was proved false in 1903, cf. Bell (1951). [The factorization, discovered by Cole, is: 193,707,721 × 761,838,257,287.]…Now, there is no shortage of deductive errors and of false mathematical beliefs. Mersenne’s is one of the most known in a rich history of mathematical errors, involving very prominent figures (cf. De Millo et al. 1979, 269–270). The explosion in the number of mathematical publications and research reports has been accompanied by a similar explosion in erroneous claims; on the whole, errors are noted by small groups of experts in the area, and many go unheeded. There is nothing philosophically interesting that can be said about such failures.2
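The bracketed factorization is easy to check directly; a minimal sketch in Python (all numbers are from the quote above):

```python
# Cole's 1903 factorization of the Mersenne number 2^67 - 1,
# which Mersenne had believed to be prime.
m67 = 2**67 - 1
print(m67)                                   # 147573952589676412927
print(m67 == 193_707_721 * 761_838_257_287)  # True: M67 is composite
```

A one-line refutation of a belief that stood for over two centuries, which is part of what makes the anecdote so striking.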

I disagree. Quantitative approaches cannot capture everything, but why should we believe mathematics is, unlike so many other fields like medicine, uniquely unquantifiable and ineffably inscrutable? As a non-mathematician looking at mathematics largely as a black box, I think such errors are quite interesting, for several reasons: given the extensive role of mathematics throughout the sciences, errors have serious potential impact; but in collecting all the anecdotes I have found, the impact seems skewed towards errors in quasi-formal proofs but not the actual results; and this may tell us something about what it is that mathematicians do subconsciously when they “do math”, or why conjecture resolution times are exponentially-distributed, or what the role of formal methods ought to be, or what we should think about practically important but unresolved problems like P=NP.

Untrustworthy proofs

“Beware of bugs in the above code; I have only proved it correct, not tried it.”

Donald Knuth

“When you have eliminated the impossible, whatever remains is often more improbable than your having made a mistake in one of your impossibility proofs.”

Steven Kaas

In some respects, there is nothing to be said; in other respects, there is much to be said. discusses a basic issue with : any useful discussion will be rigorous, hopefully with physics and math proofs; but proofs themselves are empirically unreliable. Given that mathematical proofs have long been claimed to be the most reliable form of epistemology humans know and the only way to guarantee truth3, this sets a basic upper bound on how much confidence we can put in any belief, and given the lurking existence of systematic biases, it may even be possible for there to be too much evidence for a claim (). There are other rare risks, from mental diseases4 to hardware errors5 to how to deal with contradictions6, but we’ll look at mathematical error.

Error distribution

“When I asked what it was, he said, ‘It is the probability that the test bomb will ignite the whole atmosphere.’ I decided I would check it myself! The next day when he came for the answers I remarked to him, ‘The arithmetic was apparently correct but I do not know about the formulas for the capture cross sections for oxygen and nitrogen—after all, there could be no experiments at the needed energy levels.’ He replied, like a physicist talking to a mathematician, that he wanted me to check the arithmetic not the physics, and left. I said to myself, ‘What have you done, Hamming, you are involved in risking all of life that is known in the Universe, and you do not know much of an essential part?’ I was pacing up and down the corridor when a friend asked me what was bothering me. I told him. His reply was, ‘Never mind, Hamming, no one will ever blame you.’”

Richard Hamming, 1998

“…of the two major thermonuclear calculations made that summer at Berkeley, they got one right and .”

, 2020

This upper bound on our certainty may force us to disregard certain rare risks because the effect of error on our estimates of existential risks is asymmetric: an error will usually reduce the risk, not increase it. The errors are not distributed in any kind of symmetrical distribution around a mean: an existential risk is, by definition, bumping up against the upper bound on possible damage. If we were trying to estimate, say, average human height, and errors were distributed like a bell curve, then we could ignore them. But if we are calculating the risk of a super-asteroid impact which will kill all of humanity, an error which means the super-asteroid will actually kill humanity twice over is irrelevant because it’s the same thing (we can’t die twice); however, the mirror error—the super-asteroid actually killing half of humanity—matters a great deal!
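The asymmetry can be illustrated with a small Monte Carlo sketch (my illustrative assumptions: damage capped at extinction = 1.0, symmetric multiplicative errors):

```python
import random

random.seed(0)

CAP = 1.0       # damage is capped: we cannot go extinct twice
estimate = 1.0  # a risk estimated to be exactly extinction-level

# Symmetric (lognormal) errors around the estimate...
samples = [estimate * random.lognormvariate(0, 0.5) for _ in range(100_000)]
# ...become one-sided once damage is truncated at the cap:
realized = [min(s, CAP) for s in samples]

# Overshoots are clipped to the cap while undershoots are not,
# so on net, errors reduce expected damage below the nominal estimate.
print(sum(realized) / len(realized) < estimate)  # True
```

Half the errors vanish into the ceiling; the other half strictly reduce the damage, which is the asymmetry the paragraph describes.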

XKCD #809 “Los Alamos”

How big is this upper bound? Mathematicians have often made errors in proofs. But it’s rare for ideas to be accepted for a long time and then rejected. We can divide errors into 2 basic cases corresponding to type I & type II errors:

  1. Mistakes where the theorem is still true, but the proof was incorrect (type I)
  2. Mistakes where the theorem was false, and the proof was also necessarily incorrect (type II)

Before someone comes up with a final answer, a mathematician may have many levels of intuition in formulating & working on the problem, but we’ll consider the final end-product where the mathematician feels satisfied that he has solved it. Case 1 is perhaps the most common case, with innumerable examples; this is sometimes due to mistakes in the proof that anyone would accept as mistakes, but many of these cases are due to changing standards of proof. For example, when David Hilbert discovered errors in Euclid’s proofs which no one noticed before, the theorems were still true, and the gaps were due more to Hilbert being a modern mathematician thinking in terms of formal systems (which of course Euclid did not think in). (David Hilbert himself turns out to be a useful example of the other kind of error: his famous list of problems was accompanied by definite opinions on the outcome of each problem and sometimes timings, several of which were wrong or questionable7.) Similarly, early calculus used ‘infinitesimals’ which were sometimes treated as being 0 and sometimes treated as an indefinitely small non-zero number; this was incoherent and, strictly speaking, practically all of the calculus results were wrong because they relied on an incoherent concept—but of course the results were some of the greatest mathematical work ever conducted8, and when later mathematicians put calculus on a more rigorous footing, they immediately re-derived those results (sometimes with important qualifications), and doubtless as modern math evolves other fields have sometimes needed to go back and clean up the foundations and will in the future.9 Other cases are more straightforward, with mathematicians publishing multiple proofs/patches10 or covertly correcting papers11.
Sometimes they make it into textbooks: realized that his proof for , which is still open, was wrong only after 2 readers saw it in his 1914 textbook The Theory of Numbers and questioned it. Attempts to formalize results into experimentally-verifiable results (in the case of physics-related math) or machine-checked proofs, or at least some sort of software form, sometimes turn up issues with12 accepted13 results14, although not always important ones (eg the correction in Romero & Rubio 2013). Poincaré points out this mathematical version of the in “Intuition and Logic in Mathematics”:

Strange! If we read over the works of the ancients we are tempted to class them all among the intuitionalists. And yet nature is always the same; it is hardly probable that it has begun in this century to create minds devoted to logic. If we could put ourselves into the flow of ideas which reigned in their time, we should recognize that many of the old geometers were in tendency analysts. Euclid, for example, erected a scientific structure wherein his contemporaries could find no fault. In this vast construction, of which each piece however is due to intuition, we may still today, without much effort, recognize the work of a logician.

… What is the cause of this evolution? It is not hard to find. Intuition can not give us rigour, nor even certainty; this has been recognized more and more. Let us cite some examples. We know there exist continuous functions lacking derivatives. Nothing is more shocking to intuition than this proposition which is imposed upon us by logic. Our fathers would not have failed to say: “It is evident that every continuous function has a derivative, since every curve has a tangent.” How can intuition deceive us on this point?

… I shall take as second example the Dirichlet principle on which rest so many theorems of mathematical physics; today we establish it by reasonings very rigorous but very long; heretofore, on the contrary, we were content with a very summary proof. A certain integral depending on an arbitrary function can never vanish. Hence it is concluded that it must have a minimum. The flaw in this reasoning strikes us immediately, since we use the abstract term function and are familiar with all the singularities functions can present when the word is understood in the most general sense. But it would not be the same had we used concrete images, had we, for example, considered this function as an electric potential; it would have been thought legitimate to affirm that electrostatic equilibrium can be attained. Yet perhaps a physical comparison would have awakened some vague distrust. But if care had been taken to translate the reasoning into the language of geometry, intermediate between that of analysis and that of physics, doubtless this distrust would not have been produced, and perhaps one might thus, even today, still deceive many readers not forewarned.

…A first question presents itself. Is this evolution ended? Have we finally attained absolute rigour? At each stage of the evolution our fathers also thought they had reached it. If they deceived themselves, do we not likewise cheat ourselves?

We believe that in our reasonings we no longer appeal to intuition; the philosophers will tell us this is an illusion. Pure logic could never lead us to anything but tautologies; it could create nothing new; not from it alone can any science issue. In one sense these philosophers are right; to make arithmetic, as to make geometry, or to make any science, something else than pure logic is necessary.

Isaac Newton, incidentally, gave two proofs of the same solution to a problem in probability, one via enumeration and the other more abstract; the enumeration was correct, but the other proof was totally wrong, and this was not noticed for a long time, leading Stigler to remark:15

If Newton fooled himself, he evidently took with him a succession of readers more than 250 years later. Yet even they should feel no embarrassment. As once wrote, “Everyone makes errors in probabilities, at times, and big ones.” (Graves, 1889, page 459)

Type I > Type II?

“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”

Gian-Carlo Rota16

Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.

The prevalence of case 1 might lead us to be very pessimistic; case 1, case 2, what’s the difference? We have demonstrated a large error rate in mathematics (and physics is probably even worse off). Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming (1998) attributes to a comment that while editing that “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”. (WP mentions as well that “His first mathematics publication was written…after he discovered an incorrect proof in another paper.”) gives us an example with Hilbert:

Once more let me begin with Hilbert. When the Germans were planning to publish Hilbert’s collected papers and to present him with a set on the occasion of one of his later birthdays, they realized that they could not publish the papers in their original versions because they were full of errors, some of them quite serious. Thereupon they hired a young unemployed mathematician, Olga Taussky-Todd, to go over Hilbert’s papers and correct all mistakes. Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties. At last, on Hilbert’s birthday, a freshly printed set of Hilbert’s collected papers was presented to the Geheimrat. Hilbert leafed through them carefully and did not notice anything.17

So only one of those papers was irreparable, while all the others were correct or fixable? Rota himself experienced this:

Now let us shift to the other end of the spectrum, and allow me to relate another personal anecdote. In the summer of 1979, while attending a philosophy meeting in Pittsburgh, I was struck with a case of detached retinas. Thanks to Joni’s prompt intervention, I managed to be operated on in the nick of time and my eyesight was saved. On the morning after the operation, while I was lying on a hospital bed with my eyes bandaged, Joni dropped in to visit. Since I was to remain in that Pittsburgh hospital for at least a week, we decided to write a paper. Joni fished a manuscript out of my suitcase, and I mentioned to her that the text had a few mistakes which she could help me fix. There followed twenty minutes of silence while she went through the draft. “Why, it is all wrong!” she finally remarked in her youthful voice. She was right. Every statement in the manuscript had something wrong. Nevertheless, after laboring for a while, she managed to correct every mistake, and the paper was eventually published.

There are two kinds of mistakes. There are fatal mistakes that destroy a theory; but there are also contingent ones, which are useful in testing the stability of a theory.

A mathematician of my acquaintance referred me to pg118 of The Axiom of Choice, Jech 1973; he had found the sustained effect of the 5 footnotes humorous:

  1. The result of Problem 11 contradicts the results announced by Levy [1963b]. Unfortunately, the construction presented there cannot be completed.
  2. The transfer to ZF was also claimed by Marek [1966] but the outlined method appears to be unsatisfactory and has not been published.
  3. A contradicting result was announced and later withdrawn by Truss [1970].
  4. The example in Problem 22 is a counterexample to another condition of Mostowski, who conjectured its sufficiency and singled out this example as a test case.
  5. The independence result contradicts the claim of Felgner [1969] that the Cofinality Principle implies the Axiom of Choice. An error has been found by Morris (see Felgner’s corrections to [1969]).

And referred me also to the entries in the index of Fourier Analysis by Tom Körner concerning the problem of the “pointwise convergence of Fourier series”:

Some problems are notorious for provoking repeated false proofs. attracts countless cranks and serious attempts, of course, but also amusing is apparently the Jacobian Conjecture:

The (in)famous Jacobian Conjecture was considered a theorem since a 1939 publication by Keller (who claimed to prove it). Then Shafarevich found a new proof and published it in some conference proceedings paper (in early 1950-ies). This conjecture states that any polynomial map from C^2 to C^2 is invertible if its Jacobian is nowhere zero. In 1960-ies, Vitushkin found a counterexample to all the proofs known to date, by constructing a complex analytic map, not invertible and with nowhere vanishing Jacobian. It is still a main source of embarrassment for Arxiv.org contributors, who publish about 3–5 false proofs yearly. Here is a funny refutation for one of the proofs:

The problem of Jacobian Conjecture is very hard. Perhaps it will take human being another 100 years to solve it. Your attempt is noble, Maybe the Gods of Olympus will smile on you one day. Do not be too disappointed. B. Sagre has the honor of publishing three wrong proofs and C. Chevalley mistakes a wrong proof for a correct one in the 1950’s in his Math Review comments, and I.R. Shafarevich uses Jacobian Conjecture (to him it is a theorem) as a fact…

This look into the proverbial sausage factory should not come as a surprise to anyone taking an Outside View: why wouldn’t we expect any area of intellectual endeavour to have error rates within a few orders of magnitude of any other area? How absurd to think that the rate might be ~0%; but it’s also a little questionable to be as optimistic as Anders Sandberg’s mathematician friend: “he responded that he thought a far smaller number [1%] of papers in math were this flawed.”

Heuristics

Other times, the correct result is known and proven, but many are unaware of the answers19. The famous open problems—those that have been solved, anyway—have a long history of failed proofs (Fermat surely did not prove & may have realized this only after boasting20 and neither did 21). What explains this? The guiding factor that keeps popping up when mathematicians make leaps seems to go under the name of ‘elegance’ or beauty, which is widely considered important222324. This imbalance suggests that mathematicians are quite correct when they say proofs are not the heart of mathematics and that they possess insight into math, a 6th sense for mathematical truth, a nose for aesthetic beauty which correlates with veracity: they disproportionately go after theorems rather than their negations.

Why this is so, I do not know.

Outright Platonism like Gödel apparently believed in seems unlikely—mathematical expertise resembles a complex skill like chess-playing more than it does a sensory modality like vision. Possibly they have well-developed heuristics and short-cuts and they focus on the subsets of results on which those heuristics work well (the drunk searching under the spotlight), or perhaps they do run full rigorous proofs but are doing so subconsciously and merely express themselves ineptly consciously, with omissions and erroneous formulations ‘left as an exercise for the reader’25.

We could try to justify the heuristic paradigm by appealing to as-yet poorly understood aspects of the brain, like our visual cortex: argue that what is going on is that mathematicians are subconsciously doing tremendous amounts of computation (like we do tremendous amounts of computation in a thought as ordinary as recognizing a face), which they are unable to bring up explicitly. So after prolonged introspection and some comparatively simple explicit symbol manipulation or thought, they feel that a conjecture is true, and this is due to a summary of said massive computations.

Perhaps they are checking many instances? Perhaps they are and looking for boundaries? Could there be some sort of “logical probability” where going down possible proof-paths yields probabilistic information about the final target theorem, maybe in some sort of of proof-trees? Do serve to consolidate & prune & memories of incomplete lines of thought, finetuning heuristics or intuitions for future attacks and getting deeper into a problem (perhaps analogous to )? Reading great mathematicians like discuss the heuristics they use on unsolved problems26, they bear some resemblances to computer science techniques. This would be consistent with a preliminary observation about to solve mathematical conjectures: while inference is rendered difficult by the exponential growth in the global population and of mathematicians, the distribution of time-to-solution roughly matches a memoryless exponential distribution (one with a constant chance of solving it in any time period) rather than a more intuitive distribution like a type 1 (where a conjecture gets easier to solve over time, perhaps as related mathematical knowledge accumulates), suggesting a model of mathematical activity in which many independent random attempts are made, each with a small chance of success, and eventually one succeeds. This idea of extensive unconscious computation neatly accords with Poincaré’s account of mathematical creativity, in which after long fruitless effort (preparation), he abandoned the problem for a time and engaged in ordinary activities (), was suddenly struck by an answer or insight, and then verified its correctness consciously.
The existence of an incubation effect seems confirmed by psychological studies, and in particular by the observation that incubation effects increase with the time allowed for incubation & also if the subject does not undertake demanding mental tasks during the incubation period (see Sio & Ormerod 2009), and is consistent with extensive unconscious computation. Some of this computation may happen during sleep; sleep & cognition have long been associated in a murky fashion (“sleep on it”), but it may have to do with reviewing the events of the day & difficult tasks, with relevant reinforced or perhaps more thinking going on. I’ve seen more than one suggestion of this, and mathematician suggests this as well.27 (It’s unclear how many results occur this way; mentions finding one result but never again28; J Thomas mentions one success but one failure by a teacher29; R. W. Thomason dreamed of a dead friend making a clearly false claim and published material based on his disproof of the ghost’s claim30; and reportedly had a useful dream & an early survey of 69 mathematicians yielded 63 nulls, 5 low-quality results, and 1 hit31.)
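A toy simulation of the many-independent-attempts model (my sketch, not the cited analysis; the per-year success probability p is an arbitrary assumption) shows why constant-chance attempts give a memoryless, exponential-like time-to-solution:

```python
import random

random.seed(0)

p = 0.02  # assumed constant per-year chance that some attempt succeeds

def years_to_solve() -> int:
    """Draw a geometric time-to-solution: each year is an independent trial."""
    t = 1
    while random.random() > p:
        t += 1
    return t

times = [years_to_solve() for _ in range(50_000)]
print(sum(times) / len(times))  # mean close to 1/p = 50 years

# Memorylessness: a conjecture already open for 30 years has about the same
# expected remaining wait (~50 years); it has not 'ripened' with age.
residuals = [t - 30 for t in times if t > 30]
print(sum(residuals) / len(residuals))  # also close to 50 years
```

Under a "ripening" model the second number would be smaller than the first; the rough match between them is the signature of the memoryless distribution the text describes.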

Heuristics, however, do not generalize, and fail outside their particular domain. Are we fortunate enough that the domain mathematicians work in is—deliberately or accidentally—just that domain in which their heuristics/intuition succeeds? Sandberg suggests not:

Unfortunately I suspect that the connoisseurship of mathematicians for truth might be local to their domain. I have discussed with friends about how “brittle” different mathematical domains are, and our consensus is that there are definitely differences between logic, geometry and calculus. Philosophers also seem to have a good nose for what works or doesn’t in their domain, but it doesn’t seem to carry over to other domains. Now moving outside to applied domains things get even trickier. There doesn’t seem to be the same “nose for truth” in risk assessment, perhaps because it is an interdisciplinary, messy domain. The cognitive abilities that help detect correct decisions are likely local to particular domains, trained through experience and maybe talent (i.e. some conformity between neural pathways and deep properties of the domain). The only thing that remains is general-purpose intelligence, and that has its own limitations.

advocates for machine-checked proofs and a more rigorous style of proofs similar to , noting a mathematician acquaintance guesses at a broad error rate of 1⁄332 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures33.
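For flavor, this is what a machine-checked proof looks like; a deliberately trivial example in Lean 4 syntax (illustrative only; any proof assistant would serve):

```lean
-- The checker refuses the theorem until the proof term actually
-- type-checks: here, commutativity of addition on the naturals.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The point is that a typo or gap that a human referee might wave through simply fails to compile, which is exactly the property the advocates of this style want.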

We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of total lines of code, and one would expect bug counts to scale accordingly. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive repository in 2007 contained ~323,551,126 lines of code) the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (eg Durán et al 2014, Fonseca et al 2017), and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages34.
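A back-of-the-envelope calculation makes the scale problem concrete (the defect density is an assumed, purely illustrative figure; the line counts are the ones quoted above):

```python
# Assumed, illustrative defect density for shipped code (not a measured value).
defects_per_kloc = 5

dos_kernel_loc = 4_000     # ~4k lines: MS-DOS kernel (figure from the text)
win2003_loc = 50_000_000   # ~50M lines: Windows Server 2003 (from the text)

# Latent-bug estimates if bug count is simply proportional to code size:
print(dos_kernel_loc // 1000 * defects_per_kloc)  # 20
print(win2003_loc // 1000 * defects_per_kloc)     # 250000
```

Whatever the true density, four orders of magnitude more code means four orders of magnitude more expected bugs, which is the "fairly large" of the paragraph above.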

The doom, however, did not manifest, and arguably operating systems & applications are more reliable in the 2000s+ than they were in the 1980–1990s35 (eg. the general disappearance of the Blue Screen of Death). Users may not appreciate this point, but programmers who happen to think one day of just how the sausage of Gmail is made—how many interacting technologies and stacks of formats and protocols are involved—may get the shakes and wonder how it could ever work, much less be working at that moment. The answer is not really clear: it seems to be a combination of abundant computing resources driving down per-line error rates by avoiding optimization, modularization reducing interactions between lines, greater use of testing invoking an adversarial attitude to one’s code, and a light sprinkling of formal methods & static checks36.

While hopeful, it’s not clear how many of these would apply to existential risks: how does one use randomized testing on theories of existential risk, or trade off code clarity for computing performance?

Type I vs Type II

So we might forgive case 1 errors entirely: if a community of mathematicians takes an ‘incorrect’ proof about a particular existential risk and ratifies it (either by verifying the proof subconsciously or seeing what their heuristics say), it not being written out because it would be too tedious37, then we may be more confident in it38 than if we lump the two error rates together. Case 2 errors are the problem, and they can sometimes be systematic. Most dramatically, an entire group of papers with all their results can turn out to be wrong because they made a since-disproved assumption:

In the 1970s and 1980s, mathematicians discovered that framed manifolds with Arf-Kervaire invariant equal to 1—oddball manifolds not surgically related to a sphere—do in fact exist in the first five dimensions on the list: 2, 6, 14, 30 and 62. A clear pattern seemed to be established, and many mathematicians felt confident that this pattern would continue in higher dimensions…Researchers developed what Ravenel calls an entire “cosmology” of conjectures based on the assumption that manifolds with Arf-Kervaire invariant equal to 1 exist in all dimensions of the form 2^n − 2. Many called the notion that these manifolds might not exist the “Doomsday Hypothesis,” as it would wipe out a large body of research. Earlier this year, Victor Snaith of the University of Sheffield in England published a book about this research, warning in the preface, “…this might turn out to be a book about things which do not exist.”

Just weeks after Snaith’s book appeared, Hopkins announced on April 21 that Snaith’s worst fears were justified: that Hopkins, Hill and Ravenel had proved that no manifolds of Arf-Kervaire invariant equal to 1 exist in dimensions 254 and higher. Dimension 126, the only one not covered by their analysis, remains a mystery. The new finding is convincing, even though it overturns many mathematicians’ expectations, Hovey said.39

The parallel postulate is another fascinating example of mathematical error of the second kind; its history is replete with false proofs, even by greats like (on what strike the modern reader as bizarre grounds)40, self-deception, and misunderstandings—Saccheri developed a non-Euclidean geometry flawlessly but concluded it was flawed:

The second possibility turned out to be harder to refute. In fact he was unable to derive a logical contradiction and instead derived many non-intuitive results; for example that triangles have a maximum finite area and that there is an absolute unit of length. He finally concluded that: “the hypothesis of the acute angle is absolutely false; because it is repugnant to the nature of straight lines”. Today, his results are theorems of hyperbolic geometry.

We could look upon Type II errors as having a benevolent aspect: they show both that our existing methods are too weak & informal and that our intuition/heuristics break down there—implying that all previous mathematical effort has been systematically misled in avoiding that area (as empty), and that there is much low-hanging fruit. (Consider how many scores or hundreds of key theorems were proven by the very first mathematicians to work in the non-Euclidean geometries!)

Future implications

Should such widely-believed conjectures as P≠NP41 or the Riemann hypothesis turn out to be false, then because they are assumed by so many existing proofs, entire textbook chapters (and perhaps textbooks) would disappear—and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining: it is not what you don’t know that’s dangerous, but what you know that ain’t so.

See Also

Appendix

Jones 1998

“A credo of sorts”; Vaughan Jones (Truth in Math­e­mat­ics, 1998), pg208–209:

Proofs are in­dis­pens­able, but I would say they are nec­es­sary but not suffi­cient for math­e­mat­i­cal truth, at least truth as per­ceived by the in­di­vid­ual.

To jus­tify this at­ti­tude let me in­voke two ex­pe­ri­ences of cur­rent math­e­mat­ics, which very few math­e­mati­cians to­day have es­caped.

The first is com­puter pro­gram­ming. To write a short pro­gram, say 100 lines of C code, is a rel­a­tively pain­less ex­pe­ri­ence. The de­bug­ging will take longer than the writ­ing, but it will not en­tail sui­ci­dal thoughts. How­ev­er, should an in­ex­pe­ri­enced pro­gram­mer un­der­take to write a slightly longer pro­gram, say 1000 lines, dis­tress­ing re­sults will fol­low. The de­bug­ging process be­comes an emo­tional night­mare in which one will doubt one’s own san­i­ty. One will cer­tainly in­sult the com­piler in words that are in­ap­pro­pri­ate for this es­say. The math­e­mati­cian, hav­ing gone through this tor­ture, can­not but ask: “Have I ever sub­jected the proofs of any of my the­o­rems to such close scruti­ny?” In my case at least the an­swer is surely “no”. So while I do not doubt that my proofs are cor­rect (at least the sig­nifi­cant ones), my be­lief in the re­sults needs bol­ster­ing. Com­pare this with the de­bug­ging process. At the end of de­bug­ging we are happy with our pro­gram be­cause of the con­sis­tency of the out­put it gives, not be­cause we feel we have proved it cor­rec­t—after all we did that at least twenty times while de­bug­ging and we were wrong every time. Why not a twen­ty-first? In fact we are acutely aware that our poor pro­gram has only been tested with a lim­ited set of in­puts and we fully ex­pect more bugs to man­i­fest them­selves when in­puts are used which we have not yet con­sid­ered. If the pro­gram is suffi­ciently im­por­tant, it will be fur­ther de­bugged in the course of time un­til it be­comes se­cure with re­spect to all in­puts. (With much larger pro­grams this will never hap­pen.) So it is with our the­o­rems. Al­though we may have proofs ga­lore and a rich sur­round­ing struc­ture, if the re­sult is at all diffi­cult it is only the test of time that will cause ac­cep­tance of the “truth” of the re­sult.

The sec­ond ex­pe­ri­ence con­cern­ing the need for sup­ple­ments to proof is one which I used to dis­like in­tense­ly, but have come to ap­pre­ci­ate and even search for. It is the sit­u­a­tion where one has two wa­ter­tight, well-de­signed ar­gu­ments—that lead in­ex­orably to op­po­site con­clu­sions. Re­mem­ber that re­search in math­e­mat­ics in­volves a foray into the un­known. We may not know which of the two con­clu­sions is cor­rect or even have any feel­ing or guess. Proof at this point is our only ar­biter. And it seems to have let us down. I have known my­self to be in this sit­u­a­tion for months on end. It in­duces ob­ses­sive and an­ti-so­cial be­hav­iour. Per­haps we have found an in­con­sis­tency in math­e­mat­ics. But no, even­tu­ally some crack is seen in one of the ar­gu­ments and it be­gins to look more and more shaky. Even­tu­ally we kick our­selves for be­ing so ut­terly stu­pid and life goes on. But it was no tool of logic that saved us. The search for a chink in the ar­mour often in­volved many tricks in­clud­ing elab­o­rate thought ex­per­i­ments and per­haps com­puter cal­cu­la­tions. Much struc­tural un­der­stand­ing is cre­at­ed, which is why I now so value this process. One’s feel­ing of hav­ing ob­tained truth at the end is ap­proach­ing the ab­solute. Though I should add that I have been forced to re­verse the con­clu­sion on oc­ca­sions…


  1. Ex­am­ples like the ABC con­jec­ture be­ing the ex­cep­tions that prove the rule.↩︎

  2. Ci­ta­tions:

    ↩︎
  3. As a prag­ma­tist & em­piri­cist, I must have the temer­ity to dis­agree with the likes of Plato about the role of proof: if math­e­mat­i­cal proof truly was so re­li­able, then I would have lit­tle to write about in this es­say. How­ever rig­or­ous logic is, it is still cre­ated & used by fal­li­ble hu­mans. There is no ‘Pla­to­nia’ we can tap into to ob­tain tran­scen­dent truth.↩︎

  4. There are var­i­ous delu­sions (eg. ), s, com­pul­sive ly­ing (), dis­or­ders pro­vok­ing such as the gen­eral symp­tom of ; in a dra­matic ex­am­ple of how the mind is what the brain does, some anosog­nosia can be tem­porar­ily cured by squirt­ing cold wa­ter in an ear; from “The Apol­o­gist and the Rev­o­lu­tion­ary”:

    Take the ex­am­ple of the woman dis­cussed in Lish­man’s Or­ganic Psy­chi­a­try. After a right-hemi­sphere stroke, she lost move­ment in her left arm but con­tin­u­ously de­nied it. When the doc­tor asked her to move her arm, and she ob­served it not mov­ing, she claimed that it was­n’t ac­tu­ally her arm, it was her daugh­ter’s. Why was her daugh­ter’s arm at­tached to her shoul­der? The pa­tient claimed her daugh­ter had been there in the bed with her all week. Why was her wed­ding ring on her daugh­ter’s hand? The pa­tient said her daugh­ter had bor­rowed it. Where was the pa­tien­t’s arm? The pa­tient “turned her head and searched in a be­mused way over her left shoul­der”…In any case, a pa­tient who has been deny­ing paral­y­sis for weeks or months will, upon hav­ing cold wa­ter placed in the ear, ad­mit to paral­y­sis, ad­mit to hav­ing been par­a­lyzed the past few weeks or months, and ex­press be­wil­der­ment at hav­ing ever de­nied such an ob­vi­ous fact. And then the effect wears off, and the pa­tient not only de­nies the paral­y­sis but de­nies ever hav­ing ad­mit­ted to it.

    ↩︎
  5. , Hales 2014:

    As an example, we will calculate the expected number of soft errors in one of the mathematical calculations of Section 1.17. The Atlas Project calculation of the E8 character table was a 77 hour calculation that required 64 gigabytes RAM [Ada07]. Soft error rates are generally measured in units of failures-in-time (FIT). One FIT is defined as one error per 10^9 hours of operation. If we assume a soft error rate of 10^3 FIT per Mbit (which is a typical rate for a modern memory device operating at sea level [Tez04]), then we would expect there to be about 40 soft errors in memory during the calculation.

    This ex­am­ple shows that soft er­rors can be a re­al­is­tic con­cern in math­e­mat­i­cal cal­cu­la­tions. (As added con­fir­ma­tion, the E8 cal­cu­la­tion has now been re­peated about 5 times with iden­ti­cal re­sult­s.)…The soft er­ror rate is re­mark­ably sen­si­tive to el­e­va­tion; a cal­cu­la­tion in Den­ver pro­duces about three times more soft er­rors than the same cal­cu­la­tion on iden­ti­cal hard­ware in Boston…­Soft er­rors are de­press­ing news in the ul­tra­-re­li­able world of proof as­sis­tants. Al­pha par­ti­cles rain on per­fect and im­per­fect soft­ware alike. In fact, be­cause the num­ber of soft er­rors is pro­por­tional to the ex­e­cu­tion time of a cal­cu­la­tion, by be­ing slow and me­thod­i­cal, the prob­a­bil­ity of a soft er­ror dur­ing a cal­cu­la­tion in­side a proof as­sis­tant can be much higher than the prob­a­bil­ity when done out­side.
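    Hales’s back-of-the-envelope figure can be reproduced directly; the inputs below are the ones assumed in the quote (77 hours of runtime, 64 GB of RAM, 10^3 FIT per Mbit):

```python
# Expected soft errors = runtime * memory size * error rate,
# with 1 FIT = 1 failure per 10^9 device-hours (figures from Hales's quote).
hours = 77
mbits = 64 * 1024 * 8        # 64 GB of RAM expressed in megabits
fit_per_mbit = 1e3           # assumed soft-error rate at sea level
expected = hours * mbits * fit_per_mbit / 1e9
print(round(expected, 1))    # ~40 soft errors over the whole calculation
```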

    ↩︎
  6. Most/all math results require their system to be consistent; but this is only one particular philosophical view. Wittgenstein, in the Remarks on the Foundations of Mathematics:

    If a con­tra­dic­tion were now ac­tu­ally found in arith­metic—that would only prove that an arith­metic with such a con­tra­dic­tion in it could ren­der very good ser­vice; and it would be bet­ter for us to mod­ify our con­cept of the cer­tainty re­quired, than to say it would re­ally not yet have been a proper arith­metic.

    Saul Kripke, in Wittgenstein on Rules and Private Language, points out one way to react to such issues:

    A skep­ti­cal so­lu­tion of a philo­soph­i­cal prob­lem be­gins… by con­ced­ing that the skep­tic’s neg­a­tive as­ser­tions are unan­swer­able. Nev­er­the­less our or­di­nary prac­tice or be­lief is jus­ti­fied be­cause-con­trary ap­pear­ances notwith­stand­ing-it need not re­quire the jus­ti­fi­ca­tion the scep­tic has shown to be un­ten­able. And much of the value of the scep­ti­cal ar­gu­ment con­sists pre­cisely in the fact that he has shown that an or­di­nary prac­tice, if it is to be de­fended at all, can­not be de­fended in a cer­tain way.

    ↩︎
  7. Lip­ton lists sev­er­al:

    1. the transcendentality of 2^√2 and e^π: resolved as predicted, but >78 years faster than he predicted.
    2. proof of the consistency of arithmetic: the prediction that arithmetic was consistent and this was provable was falsified (Gödel showing it is unprovable)

    One could add to this Hilbert list: the continuum hypothesis (independent); and the algorithm for solving Diophantines (impossible to give, to the surprise of one mathematician, who said while reviewing one of the papers “Well, that’s not the way it’s gonna go.”). From MathOverflow:

    Hilbert’s 21st prob­lem, on the ex­is­tence of lin­ear DEs with pre­scribed mon­odromy group, was for a long time thought to have been solved by Plemelj in 1908. In fact, Plemelj died in 1967 still be­liev­ing he had solved the prob­lem. How­ev­er, in 1989, Boli­bruch dis­cov­ered a coun­terex­am­ple. De­tails are in the book The Rie­man­n-Hilbert Prob­lem by Anosov and Boli­bruch (Vieweg-Teub­ner 1994), and a nice pop­u­lar re­count­ing of the story is in Ben Yan­del­l’s The Hon­ors Class (A K Pe­ters 2002).

    Lip­ton also pro­vides as ex­am­ples:

    • War­ren Hirsch’s poly­tope con­jec­ture

    • Sub­hash Khot’s con­jec­ture that his Unique Games prob­lem is NP-hard (not fal­si­fied but sub­stan­tially weak­ened)

    • the search for a proof of Eu­clid’s fifth pos­tu­late (cov­ered al­ready)

    • George Pólya’s prime fac­tor­iza­tion con­jec­ture

    • Euler’s gen­er­al­iza­tion of Fer­mat’s last the­o­rem

    • Vir­ginia Rags­dale’s com­bi­na­to­r­ial con­jec­ture, re­lated to a Hilbert prob­lem

    • Erik Zee­man’s knot-ty­ing con­jec­ture; the res­o­lu­tion is too good to not quote:

      After try­ing to prove this for al­most ten years, one day he worked on the op­po­site di­rec­tion, and solved it in hours.

    • a von Neu­mann topo­log­i­cal con­jec­ture

    • con­ven­tional wis­dom in com­plex­ity the­ory “that bound­ed-width pro­grams could not com­pute the ma­jor­ity func­tion, and many other func­tions”

    • dit­to, “Most be­lieved that non­de­ter­min­is­tic log­space (NLOG) is not closed un­der com­ple­ment.”

    • Béla Julesz’s hu­man vi­sion sta­tis­tics con­jec­ture
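
    The Euler item can be checked by direct arithmetic: Lander & Parkin’s 1966 counterexample exhibits four fifth powers summing to a fifth power, refuting Euler’s conjecture that at least five would be needed:

```python
# Lander & Parkin (1966): counterexample to Euler's generalization of
# Fermat's Last Theorem -- only four fifth powers are needed.
lhs = 27**5 + 84**5 + 110**5 + 133**5
rhs = 144**5
print(lhs == rhs)  # True
```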

    ↩︎
  8. John von Neu­mann, “The Math­e­mati­cian” (1947):

    That Eu­clid’s ax­iom­a­ti­za­tion does at some mi­nor points not meet the mod­ern re­quire­ments of ab­solute ax­iomatic rigour is of lesser im­por­tance in this re­spec­t…The first for­mu­la­tions of the cal­cu­lus were not even math­e­mat­i­cally rig­or­ous. An in­ex­act, semi­-phys­i­cal for­mu­la­tion was the only one avail­able for over a hun­dred and fifty years after New­ton! And yet, some of the most im­por­tant ad­vances of analy­sis took place dur­ing this pe­ri­od, against this in­ex­act, math­e­mat­i­cally in­ad­e­quate back­ground! Some of the lead­ing math­e­mat­i­cal spir­its of the pe­riod were clearly not rig­or­ous, like Euler; but oth­ers, in the main, were, like Gauss or Ja­co­bi. The de­vel­op­ment was as con­fused and am­bigu­ous as can be, and its re­la­tion to em­piri­cism was cer­tainly not ac­cord­ing to our present (or Eu­clid­’s) ideas of ab­strac­tion and rigour. Yet no math­e­mati­cian would want to ex­clude it from the fold—that pe­riod pro­duced math­e­mat­ics as first-class as ever ex­ist­ed! And even after the reign of rigour was es­sen­tially re-estab­lished with Cauchy, a very pe­cu­liar re­lapse into semi­-phys­i­cal meth­ods took place with Rie­mann.

    ↩︎
  9. Stephen Wolfram mentions a recent example I hadn’t run into before, in a long discussion of expanding Mathematica to automatically incorporate old papers’ results:

    Of course, there are all sorts of prac­ti­cal is­sues. Newer pa­pers are pre­dom­i­nantly in TeX, so it’s not too diffi­cult to pull out the­o­rems with all their math­e­mat­i­cal no­ta­tion. But older pa­pers need to be scanned, which re­quires math OCR, which has yet to be prop­erly de­vel­oped. Then there are is­sues like whether the­o­rems stated in pa­pers are ac­tu­ally valid. And even whether the­o­rems that were con­sid­ered valid, say, 100 years ago are still con­sid­ered valid to­day. For ex­am­ple, for con­tin­ued frac­tions, there are lots of pre-1950 the­o­rems that were suc­cess­fully proved in their time, but which ig­nore branch cuts, and so would­n’t be con­sid­ered cor­rect to­day. And in the end of course it re­quires lots of ac­tu­al, skilled math­e­mati­cians to guide the cu­ra­tion process, and to en­code the­o­rems. But in a sense this kind of mo­bi­liza­tion of math­e­mati­cians is not com­pletely un­fa­mil­iar; it’s some­thing like what was needed when Zen­tral­blatt was started in 1931, or Math­e­mat­i­cal Re­views in 1941.

    ↩︎
  10. “Desperately Seeking Mathematical Truth”, Melvyn B. Nathanson 2009:

    The history of mathematics is full of philosophically and ethically troubling reports about bad proofs of theorems. For example, the fundamental theorem of algebra states that every polynomial of degree n with complex coefficients has exactly n complex roots. D’Alembert published a proof in 1746, and the theorem became known as “D’Alembert’s theorem”, but the proof was wrong. Gauss published his first proof of the fundamental theorem in 1799, but this, too, had gaps. Gauss’s subsequent proofs, in 1816 and 1849, were OK. It seems to have been hard to determine if a proof of the fundamental theorem of algebra was correct. Why?

    Poincaré was awarded a prize from King Oscar II of Sweden and Norway for a paper on the three-body problem, and his paper was published in Acta Mathematica in 1890. But the published paper was not the prize-winning paper. The paper that won the prize contained serious mistakes, and Poincaré and other mathematicians, most importantly Mittag-Leffler, engaged in a conspiracy to suppress the truth and to replace the erroneous paper with an extensively altered and corrected one.

    The three-body problem is fascinating as it gives us an example of a bad proof by Poincaré & an attempt to cover it up, but also an example of an impossibility proof: Bruns & Poincaré proved in 1887 that the usual approaches could not work, typically interpreted as the 3 or n-body problem being unsolvable. Except in 1906/1909, Sundman provided an (impractical) algorithm using different techniques to solve it. See “The Solution of the n-body Problem” & “A Visit to the Newtonian N-body Problem via Elementary Complex Variables”.↩︎

  11. , Hales 2014:

    Why use computers to verify mathematics? The simple answer is that carefully implemented proof checkers make fewer errors than mathematicians (except J.-P. Serre). Incorrect proofs of correct statements are so abundant that they are impossible to catalogue. Ralph Boas, former executive editor of Math Reviews, once remarked that proofs are wrong “half the time” [Aus08]. Kempe’s claimed proof of the four-color theorem stood for more than a decade before Heawood refuted it [Mac01, p. 115]. “More than a thousand false proofs [of Fermat’s Last Theorem] were published between 1908 and 1912 alone” [Cor10]. Many published theorems are like the hanging chad ballots of the 2000 U.S. presidential election, with scrawls too ambivalent for a clear yea or nay. One mathematician even proposed to me that a new journal is needed that unlike the others only publishes reliable results. Euclid gave us a method, but even he erred in the proof of the very first proposition of the Elements when he assumed without proof that two circles, each passing through the other’s center, must intersect. The concept that is needed to repair the gap in Euclid’s reasoning is an intermediate value theorem. This defect was not remedied until Hilbert’s Foundations of Geometry. Examples of widely accepted proofs of false or unprovable statements show that our methods of proof-checking are far from perfect. Lagrange thought he had a proof of the parallel postulate, but had enough doubt in his argument to withhold it from publication. In some cases, entire schools have become sloppy, such as the Italian school of algebraic geometry or real analysis before the revolution in rigor towards the end of the nineteenth century. Plemelj’s 1908 accepted solution to Hilbert’s 21st problem on the monodromy of linear differential equations was refuted in 1989 by Bolibruch.
Auslander gives the example of a theorem published by Waraszkiewicz in 1937, generalized by Choquet in 1944, then refuted with a counterexample by Bing in 1948 [Aus08]. Another example is the approximation problem for Sobolev maps between two manifolds [Bet91], which contains a faulty proof of an incorrect statement. The corrected theorem appears in [HL03]. Such examples are so plentiful that a Wiki page has been set up to classify them, with references to longer discussions at Math Overflow [Wik11], [Ove09], [Ove10].
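
    The gap in Euclid’s first proposition that Hales mentions is easy to make concrete: two unit circles, each passing through the other’s center, do intersect, but establishing that requires an intermediate value argument rather than the diagram alone. A numerical sketch of the configuration:

```python
import math

# Two unit circles with centers (0,0) and (1,0): each passes through the
# other's center. Euclid's Elements I.1 assumes they intersect; solving
# the two circle equations gives the points (1/2, +-sqrt(3)/2).
x = 0.5
y = math.sqrt(1 - x**2)                 # from x^2 + y^2 = 1
on_first = math.isclose(x**2 + y**2, 1.0)
on_second = math.isclose((x - 1)**2 + y**2, 1.0)
print(on_first and on_second)  # True: the assumed intersection exists
```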

    ↩︎
  12. , Si­mon Colton 2007:

    A more recent example was the discovery that Andrew Wiles’ original proof of Fermat’s Last Theorem was flawed (but not, as it turned out, fatally flawed, as Wiles managed to fix the problem (Singh, 1997))…More recently, Larry Wos has been using Otter to find smaller proofs of theorems than the current ones. To this end, he uses Otter to find more succinct methods than those originally proposed. This often results in detecting double negations and removing unnecessary lemmas, some of which were thought to be indispensable. (Wos, 1996) presents a methodology using a strategy known as resonance to search for elegant proofs with Otter. He gives examples from mathematics and logic, and also argues that this work has implications for other fields such as circuit design.

    (Fleu­riot & Paulson, 1998) have stud­ied the geo­met­ric proofs in New­ton’s Prin­cipia and in­ves­ti­gated ways to prove them au­to­mat­i­cally with the Is­abelle in­ter­ac­tive the­o­rem prover (Paulson, 1994). To do this, they for­mal­ized the Prin­cipia in both Eu­clid­ean geom­e­try and non-s­tan­dard analy­sis. While work­ing through one of the key re­sults (propo­si­tion 11 of book 1, the Ke­pler prob­lem) they dis­cov­ered an anom­aly in the rea­son­ing. New­ton was ap­peal­ing to a cross-mul­ti­pli­ca­tion re­sult which was­n’t true for in­fin­i­tes­i­mals or in­fi­nite num­bers. Is­abelle could there­fore not prove the re­sult, but Fleu­riot man­aged to de­rive an al­ter­na­tive proof of the the­o­rem that the sys­tem found ac­cept­able.

    ↩︎
  13. Colton 2007: “For example, Heawood discovered a flaw in Kempe’s 1879 proof of the four-color theorem, which had been accepted for 11 years.” It would ultimately be proved with a computer in 1976—maybe.↩︎

  14. Hales 2014:

    Theorems that are calculations or enumerations are especially prone to error. Feynman laments, “I don’t notice in the morass of things that something, a little limit or sign, goes wrong…I have mathematically proven to myself so many things that aren’t true” [Fey00, p. 885]. Elsewhere, Feynman describes two teams of physicists who carried out a two-year calculation of the electron magnetic moment and independently arrived at the same predicted value. When experiment disagreed with prediction, the discrepancy was eventually traced to an arithmetic error made by the physicists, whose calculations were not so independent as originally believed [Fey85, p. 117]. Pontryagin and Rokhlin erred in computing stable homotopy groups of spheres. Little’s tables of knots from 1885 contain duplicate entries that went undetected until 1974. In enumerative geometry, in 1848, Steiner counted 7776 plane conics tangent to 5 general plane conics, when there are actually only 3264. One of the most persistent blunders in the history of mathematics has been the misclassification (or misdefinition) of convex Archimedean polyhedra. Time and again, the pseudo rhombic cuboctahedron has been overlooked or illogically excluded from the classification (Figure 21) [Grue11].

    ↩︎
  15. Stigler is also kind in his history of maximum likelihood, emphasizing that while many of the statisticians involved incorrectly proved false claims, they were very productive mistakes.↩︎

  16. “Fine Hall in its golden age: Re­mem­brances of Prince­ton in the early fifties”, In­dis­crete Thoughts↩︎

  17. “Ten Lessons I wish I had been Taught”, Gi­an-Carlo Rota 1996↩︎

  18. There are 2 20th-century mathematicians named Morse, born too late to work with Faraday, and the telegraph inventor Samuel Morse, who, while overlapping with Faraday, has a Wikipedia entry mentioning no work in mathematics; I do not know which Morse may be meant.↩︎

  19. An ex­am­ple of this would be “An En­dur­ing Er­ror”, Branko Grün­baum:

    Mathematical truths are immutable, but mathematicians do make errors, especially when carrying out non-trivial enumerations. Some of the errors are “innocent”—plain mistakes that get corrected as soon as an independent enumeration is carried out. For example, Daublebsky [14] in 1895 found that there are precisely 228 types of configurations (12₃), that is, collections of 12 lines and 12 points, each incident with three of the others. In fact, as found by Gropp [19] in 1990, the correct number is 229. Another example is provided by the enumeration of the uniform tilings of the 3-dimensional space by Andreini [1] in 1905; he claimed that there are precisely 25 types. However, as shown [20] in 1994, the correct number is 28. Andreini listed some tilings that should not have been included, and missed several others—but again, these are simple errors easily corrected…It is surprising how errors of this type escape detection for a long time, even though there is frequent mention of the results. One example is provided by the enumeration of 4-dimensional simple polytopes with 8 facets, by Brückner [7] in 1909. He replaces this enumeration by that of 3-dimensional “diagrams” that he interpreted as Schlegel diagrams of convex 4-polytopes, and claimed that the enumeration of these objects is equivalent to that of the polytopes. However, aside from several “innocent” mistakes in his enumeration, there is a fundamental error: While to all 4-polytopes correspond 3-dimensional diagrams, there is no reason to assume that every diagram arises from a polytope. At the time of Brückner’s paper, even the corresponding fact about 3-polyhedra and 2-dimensional diagrams had not yet been established—this followed only from Steinitz’s characterization of complexes that determine convex polyhedra [45], [46].
In fact, in the case con­sid­ered by Brück­n­er, the as­sump­tion is not only un­jus­ti­fied, but ac­tu­ally wrong: One of Brück­n­er’s poly­topes does not ex­ist, see [25].

    …Poly­he­dra have been stud­ied since an­tiq­ui­ty. It is, there­fore, rather sur­pris­ing that even con­cern­ing some of the poly­he­dra known since that time there is a lot of con­fu­sion, re­gard­ing both ter­mi­nol­ogy and essence. But even more un­ex­pected is the fact that many ex­po­si­tions of this topic com­mit se­ri­ous math­e­mat­i­cal and log­i­cal er­rors. More­over, this hap­pened not once or twice, but many times over the cen­turies, and con­tin­ues to this day in many printed and elec­tronic pub­li­ca­tions; the most re­cent case is in the sec­ond is­sue for 2008 of this jour­nal…With our un­der­stand­ings and ex­clu­sions, there are four­teen con­vex poly­he­dra that sat­isfy the lo­cal cri­te­rion and should be called “Archimedean”, but only thir­teen that sat­isfy the global cri­te­rion and are ap­pro­pri­ately called “uni­form” (or “semi­reg­u­lar”). Rep­re­sen­ta­tives of the thir­teen uni­form con­vex poly­he­dra are shown in the sources men­tioned above, while the four­teenth poly­he­dron is il­lus­trated in Fig­ure 1. It sat­is­fies the lo­cal cri­te­rion but not the global one, and there­fore is—in our ter­mi­nol­o­gy—Archimedean but not uni­form. The his­tory of the re­al­iza­tion that the lo­cal cri­te­rion leads to four­teen poly­he­dra will be dis­cussed in the next sec­tion; it is re­mark­able that this de­vel­op­ment oc­curred only in the 20th cen­tu­ry. This im­plies that prior to the twen­ti­eth cen­tury all enu­mer­a­tions of the poly­he­dra sat­is­fy­ing the lo­cal cri­te­rion were mis­tak­en. Un­for­tu­nate­ly, many later enu­mer­a­tions make the same er­ror.

    ↩︎
  20. Dana Mackenzie, The Universe in Zero Words: The Story of Mathematics as Told through Equations (as quoted by John D. Cook):

    Fermat repeatedly bragged about the n = 3 and n = 4 cases and posed them as challenges to other mathematicians … But he never mentioned the general case, n = 5 and higher, in any of his letters. Why such restraint? Most likely, Mackenzie argues, because Fermat had realized that his “truly wonderful proof” did not work in those cases…Every mathematician has had days like this. You think you have a great insight, but then you go out for a walk, or you come back to the problem the next day, and you realize that your great idea has a flaw. Sometimes you can go back and fix it. And sometimes you can’t.

    ↩︎
  21. From , “Fer­mat’s Last The­o­rem”:

    Much ad­di­tional progress was made over the next 150 years, but no com­pletely gen­eral re­sult had been ob­tained. Buoyed by false con­fi­dence after his proof that pi is tran­scen­den­tal, the math­e­mati­cian Lin­de­mann pro­ceeded to pub­lish sev­eral proofs of Fer­mat’s Last The­o­rem, all of them in­valid (Bell 1937, pp. 464–465). A prize of 100000 Ger­man marks, known as the Wolfskehl Prize, was also offered for the first valid proof (Ball and Cox­eter 1987, p. 72; Barner 1997; Hoff­man 1998, pp. 193–194 and 199).

    A re­cent false alarm for a gen­eral proof was raised by Y. Miyaoka (Cipra 1988) whose proof, how­ev­er, turned out to be flawed. Other at­tempted proofs among both pro­fes­sional and am­a­teur math­e­mati­cians are dis­cussed by vos Sa­vant (1993), al­though vos Sa­vant er­ro­neously claims that work on the prob­lem by Wiles (dis­cussed be­low) is in­valid.

    ↩︎
  22. To take a ran­dom ex­am­ple (which could be mul­ti­plied in­defi­nite­ly); from Gödel and the Na­ture of Math­e­mat­i­cal Truth: A Talk with Re­becca Gold­stein (6.8.2005):

    Einstein told the philosopher of science that he’d known even before the solar eclipse of 1918 supported his general theory of relativity that the theory must be true because it was so beautiful. And Hermann Weyl, who worked on both relativity theory and quantum mechanics, said “My work always tried to unite the true with the beautiful, but when I had to choose one or the other, I usually chose the beautiful.”…Mathematics seems to be the one place where you don’t have to choose, where truth and beauty are always united. One of my all-time favorite books is G. H. Hardy’s A Mathematician’s Apology. Hardy tries to demonstrate to a general audience that mathematics is intimately about beauty. He gives as examples two proofs, one showing that the square root of 2 is irrational, the other showing that there’s no largest prime number. Simple, easily graspable proofs, that stir the soul with wonder.

    ↩︎
  23. Nathanson 2009 claims the op­po­site:

    Many math­e­mati­cians have the op­po­site opin­ion; they do not or can­not dis­tin­guish the beauty or im­por­tance of a the­o­rem from its proof. A the­o­rem that is first pub­lished with a long and diffi­cult proof is highly re­gard­ed. Some­one who, prefer­ably many years lat­er, finds a short proof is “bril­liant.” But if the short proof had been ob­tained in the be­gin­ning, the the­o­rem might have been dis­par­aged as an “easy re­sult.” Erdős was a ge­nius at find­ing bril­liantly sim­ple proofs of deep re­sults, but, un­til re­cent­ly, much of his work was ig­nored by the math­e­mat­i­cal es­tab­lish­ment.

    ↩︎
  24. From “Aes­thet­ics as a Lib­er­at­ing Force in Math­e­mat­ics Ed­u­ca­tion?”, by Nathalie Sin­clair (reprinted in The Best Writ­ing on Math­e­mat­ics 2010, ed. Mircea Piti­ci); pg208:

    There is a long tradition in mathematics of describing proofs and theorems in aesthetic terms, often using words such as ‘elegance’ and ‘depth’. Further, mathematicians have also argued that their subject is more akin to an art than it is to a science (see Littlewood, 1986; Sullivan 1925/1956), and, like the arts, ascribe to mathematics aesthetic goals. For example, the mathematician W. Krull (1930/1987) writes: “the primary goals of the mathematician are aesthetic, and not epistemological” (p. 49). This statement seems contradictory with the oft-cited concern of mathematics with finding or discovering truths, but it emphasises the fact that the mathematician’s interest is in expressing truth, and in doing so in clever, simple, succinct ways.

    While Krull focuses on mathematical expression, the mathematician H. Poincaré (1908/1966) concerns himself with the psychology of mathematical invention, but he too underlines the aesthetic dimension of mathematics, not the logical. In Poincaré’s theory, a large part of a mathematician’s work is done at the subconscious level, where an aesthetic sensibility is responsible for alerting the mathematicians to the most fruitful and interesting of ideas. Other mathematicians have spoken of this special sensibility as well and also in terms of the way it guides mathematicians to choose certain problems. This choice is essential in mathematics given that there exists no external reality against which mathematicians can decide which problems or which branches of mathematics are important (see von Neumann, 1947 [“The Mathematician”]): the choice involves human values and preference—and, indeed, these change over time, as exemplified by the dismissal of geometry by some prominent mathematicians in the early 20th century (see Whiteley, 1999).

    • Lit­tle­wood, 1986: “The math­e­mati­cian’s art of work”; in B. Bol­lobas (ed.), Lit­tle­wood’s mis­cel­lany, Cam­bridge Uni­ver­sity press
    • Sul­li­van 1925/1956: “Math­e­mat­ics as an art”; in J. New­man (ed.), The world of math­e­mat­ics, vol 3 (p 2015–2021)
    ↩︎
  25. From pg 211–212, Sin­clair 2009:

    The sur­vey of math­e­mati­cians con­ducted by Wells (1990) pro­vides a more em­pir­i­cal­ly-based chal­lenge to the in­trin­sic view of the math­e­mat­i­cal aes­thet­ic. Wells ob­tained re­sponses from over 80 math­e­mati­cians, who were asked to iden­tify the most beau­ti­ful the­o­rem from a given set of 24 the­o­rems. (These the­o­rems were cho­sen be­cause they were ‘fa­mous’, in the sense that Wells judged them to be well-known by most math­e­mati­cians, and of in­ter­est to the dis­ci­pline in gen­er­al, rather than to a par­tic­u­lar sub­field.) Wells finds that the math­e­mati­cians var­ied widely in their judg­ments. More in­ter­est­ing­ly, in ex­plain­ing their choic­es, the math­e­mati­cians re­vealed a wide range of per­sonal re­sponses affect­ing their aes­thetic re­sponses to the the­o­rems. Wells effec­tively puts to rest the be­lief that math­e­mati­cians have some kind of se­cret agree­ment on what counts as beau­ti­ful in math­e­mat­ic­s…Bur­ton’s (2004) work fo­cuses on the prac­tices of math­e­mati­cians and their un­der­stand­ing of those prac­tices. Based on ex­ten­sive in­ter­views with a wide range of math­e­mati­cian­s…She points out that math­e­mati­cians range on a con­tin­uum from unim­por­tant to cru­cial in terms of their po­si­tion­ings on the role of the aes­thet­ic, with only 3 of the 43 math­e­mati­cians dis­miss­ing its im­por­tance. For ex­am­ple, one said “Beauty does­n’t mat­ter. I have never seen a beau­ti­ful math­e­mat­i­cal pa­per in my life” (p. 65). An­other math­e­mati­cian was ini­tially dis­mis­sive about math­e­mat­i­cal beauty but lat­er, when speak­ing about the re­view process, said: “If it was a very el­e­gant way of do­ing things, I would be in­clined to for­give a lot of faults” (p. 65).

    ↩︎
  26. Tao left a lengthy comment on a previously linked Lipton post:

    It is possible to gather reasonably convincing support for a conjecture by a variety of means, long before it is actually proven, though many mathematicians are reluctant to argue too strongly based on such support due to the lack of rigour or the risk of embarrassment in hindsight. Examples of support include:

    • Numerical evidence; but one has to be careful in situations where the null hypothesis would also give comparable numerical evidence. The first ten trillion zeroes of zeta on the critical line is, in my opinion, only mild evidence in favour of RH (the null hypothesis may be, for instance, that the zeroes go haywire once log log t becomes sufficiently large); but the numerical data on spacings of zeroes is quite convincing evidence for the GUE hypothesis, in my view. (It is a priori conceivable that the spacings are distributed according to GUE plus another correction that dominates when log log t (say) is large, but this begins to strain Occam’s razor.)
    • Non-trivial special cases. But it depends on how “representative” one believes the special cases to be. For instance, if one can verify low-dimensional cases of a conjecture that is true in high dimensions, this is usually only weak (but not entirely insignificant) evidence, as it is possible that there exist high-dimensional pathologies that sink the conjecture but cannot be folded up into a low-dimensional situation. But if one can do all odd-dimensional cases, and all even-dimensional cases up to dimension 8 (say), then that begins to look more convincing.
    • Proofs of parallel, analogous, or similar conjectures. Particularly if these proofs were non-trivial and led to new insights and techniques. RH in function fields is a good example here; it raises the hope of some sort of grand unified approach to GRH that somehow handles all number fields (or some other general class) simultaneously.
    • Converse of the conjecture is provable, and looks “optimal” somehow. One might be able to construct a list of all obvious examples of objects with property X, find significant difficulty extending the list, and then conjecture that this list is complete. This is a common way to make conjectures, but can be dangerous, as one may simply have a lack of imagination. So this is thin evidence by itself (many false conjectures have arisen from this converse-taking method), but it does carry a little bit of weight once combined with other strands of evidence.
    • Conjecture is ambitious and powerful, and yet is not immediately sunk by the obvious consistency checks. This is vaguely analogous to the concept of a “falsifiable theory” in science. A strong conjecture could have many powerful consequences in a variety of disparate areas of mathematics—so powerful, in fact, that one would not be surprised that they could be disproven with various counterexamples. But surprisingly, when one checks the cases that one does understand quite well, the conjecture holds up. A typical example here might include a very general conjectured identity which, when specialised to various well-understood special cases, becomes a provable identity—but with the identity in each special case being provable by very different methods, and the connection between all the identities being mysterious other than via the conjecture. The general conjecture that the primes behave pseudorandomly after accounting for small moduli is an example of such a conjecture; we usually can’t control how the primes behave, but when we can, the pseudorandomness heuristic lines up perfectly.
    • Attempts at disproof run into interesting obstacles. This one is a bit hard to formalise, but sometimes you can get a sense that attempts to disprove a conjecture are failing not due to one’s own lack of ability, or due to accidental contingencies, but rather due to “enemy activity”; some lurking hidden structure to the problem, corners of which emerge every time one tries to build a counterexample. The question is then whether this “enemy” is stupid enough to be outwitted by a sufficiently clever counterexample, or is powerful enough to block all such attempts. Identifying this enemy precisely is usually the key to resolving the conjecture (or transforming the conjecture into a stronger and better conjecture).
    • Conjecture generalises to a broader conjecture that enjoys support of the types listed above. The twin prime conjecture, by itself, is difficult to support on its own; but when it comes with an asymptotic that one can then verify numerically to high accuracy and is a consequence of the much more powerful prime tuples conjecture (and more generally, the pseudorandomness heuristic for the primes) which is supported both because of its high falsifiability and also its nature as an optimal-looking converse (the only structure to the primes are the “obvious” structures), it becomes much more convincing. Another textbook example is the Poincaré conjecture, which became much more convincing after being interpreted as a special case of geometrisation (which had a lot of support, e.g. the two-dimensional analogue, Haken manifolds, lots of falsifiable predictions, etc.).

    It can be fun (though a little risky, reputation-wise) to debate how strong various pieces of evidence really are, but one soon reaches a point of diminishing returns, as often we are limited by our own ignorance, lack of imagination, or cognitive biases. But we are at least reasonably able to perform relative comparisons of the strength of evidence of two conjectures in the same topic (I guess complexity theory is full of instances of this…).
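    The twin-prime example Tao gives is easy to reproduce at small scale. A minimal sketch (the cutoff N = 10^6, the midpoint-rule integration, and the use of the Hardy–Littlewood constant are my own illustrative choices, not Tao’s): count twin primes below N and compare with the conjectured asymptotic.

    ```python
    # Numerically checking the twin-prime asymptotic: the Hardy-Littlewood
    # prediction is pi_2(N) ~ 2*C_2 * integral from 2 to N of dt/(ln t)^2,
    # with the twin-prime constant C_2 ~ 0.6601618.
    import math

    N = 10**6

    # Sieve of Eratosthenes up to N.
    is_prime = bytearray([1]) * (N + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(N**0.5) + 1):
        if is_prime[i]:
            is_prime[i*i::i] = bytearray(len(range(i*i, N + 1, i)))

    # Count twin-prime pairs (p, p+2) with both members <= N.
    twins = sum(1 for p in range(3, N - 1) if is_prime[p] and is_prime[p + 2])

    # Midpoint-rule estimate of the Hardy-Littlewood integral.
    C2 = 0.6601618158
    steps = 100_000
    h = (N - 2) / steps
    integral = h * sum(1 / math.log(2 + (k + 0.5) * h)**2 for k in range(steps))
    predicted = 2 * C2 * integral

    print(twins, round(predicted))  # the two counts agree to within a few percent
    ```

    This is exactly the kind of “verify numerically to high accuracy” evidence Tao describes: the agreement is no proof, but a failure of the asymptotic at accessible N would have sunk the prime tuples heuristic.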

    ↩︎
  27. pg190–191 of Fascinating Mathematical People, edited by Albers 2011:

    Guy: If I do any mathematics at all I think I do it in my sleep.

    MP: Do you think a lot of mathematicians work that way?

    Guy: I do. Yes. The human brain is a remarkable thing, and we are a long way from understanding how it works. For most mathematical problems, immediate thought and pencil and paper—the usual things one associates with solving mathematical problems—are just totally inadequate. You need to understand the problem, make a few symbols on paper and look at them. Most of us, as opposed to Erdős who would probably give an answer to a problem almost immediately, would then probably have to go off to bed, and, if we’re lucky, when we wake up in the morning, we would already have some insight into the problem. On those rare occasions when I have such insight, I quite often don’t know that I have it, but when I come to work on the problem again, to put pencil to paper, somehow the ideas just seem to click together, and the thing goes through. It is clear to me that my brain must have gone on, in an almost combinatorial way, checking the cases or doing an enormous number of fairly trivial arithmetical computations. It seems to know the way to go. I first noticed this with , which are indeed finite combinatorial problems. The first indication that I was interested in combinatorics—I didn’t know I had the interest, and I didn’t even know there was such a subject as combinatorics—was that I used to compose chess endgames. I would sit up late into the night trying to analyze a position. Eventually I would sink into slumber and wake up in the morning to realize that if I had only moved the pawns over one file the whole thing would have gone through clearly. My brain must have been checking over this finite but moderately large number of possibilities during the night. I think a lot of mathematicians must work that way.

    MP: Have you talked to any other mathematicians about that?

    Guy: No. But in Jacques Hadamard’s book on invention in the mathematical field, he quotes some examples there where it is fairly clear that people do that kind of thing. There was someone earlier this week who was talking about Jean-Paul Serre. He said that if you ask Serre a question he either gives you the answer immediately, or, if he hesitates, and you push him in any way, he will say, “How can I think about the question when I don’t know the answer?” I thought that was a lovely remark. At a much lower level, one should think, “What shape should the answer be?” Then your mind can start checking whether you’re right and how to find some logical sequence to get you where you want to go.

    ↩︎
  28. January 14, 1974, in “Conversations with Gian-Carlo Rota”; as quoted on pg262 of Turing’s Cathedral (2012) by George Dyson:

    Once in my life I had a mathematical dream which proved correct. I was twenty years old. I thought, my God, this is wonderful, I won’t have to work, it will all come in dreams! But it never happened again.

    ↩︎
  29. J Thomas:

    Once after I had spent several days trying to prove a topology theorem, I dreamed about it and woke up with a counterexample. In the dream it just constructed itself, and I could see it. I didn’t have a fever then, though. Later one of my teachers, an old Polish woman, explained her experience. She kept a notebook by her bed so she could write down any insights she got in her sleep. She woke up in the night with a wonderful proof, and wrote it down, and in the morning when she looked at it it was all garbage. “You cannot do math in your sleep. You will have to work.”

    ↩︎
  30. “Higher algebraic K-theory of schemes and of derived categories”, Thomason & Trobaugh 1990:

    The first author must state that his coauthor and close friend, Tom Trobaugh, quite intelligent, singularly original, and inordinately generous, killed himself consequent to endogenous depression. 94 days later, in my dream, Tom’s simulacrum remarked, “The direct limit characterization of perfect complexes shows that they extend, just as one extends a coherent sheaf.” Awaking with a start, I knew this idea had to be wrong, since some perfect complexes have a non-vanishing K0 obstruction to extension. I had worked on this problem for 3 years, and saw this approach to be hopeless. But Tom’s simulacrum had been so insistent, I knew he wouldn’t let me sleep undisturbed until I had worked out the argument and could point to the gap. This work quickly led to the key results of this paper. To Tom, I could have explained why he must be listed as a coauthor.

    ↩︎
  31. Jacques Hadamard, The Psychology of Invention in the Mathematical Field (1945), pg27

    Let us come to mathematicians. One of them, Maillet, started a first inquiry as to their methods of work. One famous question, in particular, was already raised by him: that of the “mathematical dream”, it having been suggested often that the solution of problems that have defied investigation may appear in dreams. Though not asserting the absolute non-existence of “mathematical dreams”, Maillet’s inquiry shows that they cannot be considered as having a serious significance. Only one remarkable observation is reported by the prominent American mathematician, Leonard Eugene Dickson, who can positively assert its accuracy…Except for that very curious case, most of the 69 correspondents who answered Maillet on that question never experienced any mathematical dream (I never did) or, in that line, dreamed of wholly absurd things, or were unable to state precisely the question they happened to dream of. 5 dreamed of quite naive arguments. There is one more positive answer; but it is difficult to take account of it, as its author remains anonymous.

    ↩︎
  32. From Lamport’s 1993 “How to Write a Proof”:

    Anecdotal evidence suggests that as many as a third of all papers published in mathematical journals contain mistakes—not just minor errors, but incorrect theorems and proofs…My information about mathematicians’ errors and embarrassment comes mainly from George Bergman.

    ↩︎
  33. Lamport, 1993 “How to Write a Proof”:

    Some twenty years ago, I decided to write a proof of the Schroeder-Bernstein theorem for an introductory mathematics class. The simplest proof I could find was in Kelley’s classic general topology text [4, page 28]. Since Kelley was writing for a more sophisticated audience, I had to add a great deal of explanation to his half-page proof. I had written five pages when I realized that Kelley’s proof was wrong. Recently, I wanted to illustrate a lecture on my proof style with a convincing incorrect proof, so I turned to Kelley. I could find nothing wrong with his proof; it seemed obviously correct! Reading and rereading the proof convinced me that either my memory had failed, or else I was very stupid twenty years ago. Still, Kelley’s proof was short and would serve as a nice example, so I started rewriting it as a structured proof. Within minutes, I rediscovered the error.

    My interest in proofs stems from writing correctness proofs of algorithms. These proofs are seldom deep, but usually have considerable detail. Structured proofs provided a way of coping with this detail. The style was first applied to proofs of ordinary theorems in a paper I wrote with Martin Abadi [2]. He had already written conventional proofs—proofs that were good enough to convince us and, presumably, the referees. Rewriting the proofs in a structured style, we discovered that almost every one had serious mistakes, though the theorems were correct. Any hope that incorrect proofs might not lead to incorrect theorems was destroyed in our next collaboration [1]. Time and again, we would make a conjecture and write a proof sketch on the blackboard—a sketch that could easily have been turned into a convincing conventional proof—only to discover, by trying to write a structured proof, that the conjecture was false. Since then, I have never believed a result without a careful, structured proof. My skepticism has helped avoid numerous errors.
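    The theorem at issue is constructive enough to execute. As a minimal sketch (my own illustration, not Kelley’s proof or Lamport’s structured version), the classic chain-chasing argument for Schroeder-Bernstein builds the bijection explicitly: trace each element’s chain backwards, and apply f or g⁻¹ depending on where the chain begins.

    ```python
    # Schroeder-Bernstein, executed: given injections f: A -> B and
    # g: B -> A (with f_inv/g_inv returning None off the image), build
    # a bijection h: A -> B by tracing each element's chain backwards.
    def sb_bijection(f, f_inv, g, g_inv):
        def h(a):
            x = a
            while True:
                b = g_inv(x)
                if b is None:        # chain begins in A \ g(B): use f
                    return f(a)
                prev = f_inv(b)
                if prev is None:     # chain begins in B \ f(A): use g^-1
                    return g_inv(a)
                x = prev
        return h

    # Toy instance: A = the naturals, B = the even naturals,
    # f(a) = 2a + 2 (injective, misses 0), g(b) = b (injective).
    f     = lambda a: 2*a + 2
    f_inv = lambda b: (b - 2) // 2 if b >= 2 and b % 2 == 0 else None
    g     = lambda b: b
    g_inv = lambda a: a if a % 2 == 0 else None

    h = sb_bijection(f, f_inv, g, g_inv)
    values = [h(a) for a in range(10)]
    print(values)  # distinct even numbers: h is injective on this sample
    ```

    On this toy instance the backward trace always terminates because values strictly decrease; a fully general implementation would also need a cycle check (elements on cycles can simply use f). The delicacy of getting such case splits right is precisely what made Kelley’s half-page proof wrong and hard to debug by eye.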

    “How to Write a 21st Century Proof”, Lamport 2011:

    My earlier paper on structured proofs described how effective they are at catching errors. It recounted how only by writing such a proof was I able to re-discover an error in a proof of the Schroeder-Bernstein theorem in a well-known topology text [2, page 28]. I recently received email from a mathematician saying that he had tried unsuccessfully to find that error by writing a structured proof. I asked him to send me his proof, and he responded:

    I tried typing up the proof that I’d hand-written, and in the process, I think I’ve found the fundamental error. . . I now really begin to understand what you mean about the power of this method, even if it did take me hours to get to this point!

    It is instructive that, to find the error, he had to re-write his proof to be read by someone else. Eliminating errors requires care.

    ↩︎
  34. , 1996:

    Twenty years ago it was rea­son­able to pre­dict that the size and am­bi­tion of soft­ware prod­ucts would be se­verely lim­ited by the un­re­li­a­bil­ity of their com­po­nent pro­grams. Crude es­ti­mates sug­gest that pro­fes­sion­ally writ­ten pro­grams de­liv­ered to the cus­tomer can con­tain be­tween one and ten in­de­pen­dently cor­rectable er­rors per thou­sand lines of code; and any soft­ware er­ror in prin­ci­ple can have spec­tac­u­lar effect (or worse: a sub­tly mis­lead­ing effect) on the be­hav­iour of the en­tire sys­tem. Dire warn­ings have been is­sued..The ar­gu­ments were suffi­ciently per­sua­sive to trig­ger a sig­nifi­cant re­search effort de­voted to the prob­lem of pro­gram cor­rect­ness. A pro­por­tion of this re­search was based on the ideal of cer­tainty achieved by math­e­mat­i­cal proof.

    ↩︎
  35. Hoare 1996:

    Fortunately, the problem of program correctness has turned out to be far less serious than predicted. A recent analysis by Mackenzie has shown that of several thousand deaths so far reliably attributed to dependence on computers, only ten or so can be explained by errors in the software: most of these were due to a couple of instances of incorrect dosage calculations in the treatment of cancer by radiation. Similarly, predictions of collapse of software due to size have been falsified by continuous operation of real-time software systems now measured in tens of millions of lines of code, and subjected to thousands of updates per year…And aircraft, both civil and military, are now flying with the aid of software measured in millions of lines—though not all of it is safety-critical. Compilers and operating systems of a similar size now number their satisfied customers in millions. So the questions arise: why have twenty years of pessimistic predictions been falsified? Was it due to successful application of the results of the research which was motivated by the predictions? How could that be, when clearly little software has ever been subjected to the rigours of formal proof?

    ↩︎
  36. Hoare 1996:

    Success in the use of mathematics for specification, design and code reviews does not require strict formalisation of all the proofs. Informal reasoning among those who are fluent in the idioms of mathematics is extremely efficient, and remarkably reliable. It is not immune from failure; for example simple misprints can be surprisingly hard to detect by eye. Fortunately, these are exactly the kind of error that can be removed by early tests. More formal calculation can be reserved for the most crucial issues, such as interrupts and recovery procedures, where bugs would be most dangerous, expensive, and most difficult to diagnose by tests…Many more tests should be designed than there will ever be time to conduct; they should be generated as directly as possible from the specification, preferably automatically by computer program. Random selection at the last minute will protect against the danger that under pressure of time the program will be adapted to pass the tests rather than meeting the rest of its specification. There is some evidence that early attention to a comprehensive and rigorous test strategy can improve reliability of a delivered product, even when at the last minute there was no time to conduct the tests before delivery!
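    Hoare’s advice translates directly into code. A minimal sketch (the sorting specification, the program under test, and the case counts are my own illustrative assumptions, not Hoare’s): mechanically generate far more test cases than could ever be run by hand, then pick a random subset at the last minute.

    ```python
    # Spec-driven test generation with last-minute random selection.
    import random

    def spec_holds(inp, out):
        """Specification of sorting: the output is an ordered permutation
        of the input."""
        return (sorted(inp) == sorted(out)
                and all(out[i] <= out[i + 1] for i in range(len(out) - 1)))

    def implementation(xs):
        return sorted(xs)  # stand-in for the program under test

    rng = random.Random(0)
    # Generate many cases directly from the input domain of the spec...
    all_cases = [[rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
                 for _ in range(10_000)]
    # ...and select a random subset at the last minute, so the program
    # cannot have been tuned to pass a fixed, known test suite.
    for case in rng.sample(all_cases, 100):
        assert spec_holds(case, implementation(case))
    print("100 randomly selected spec-derived tests passed")
    ```

    Because the cases are derived from the specification rather than hand-picked, the random draw leaves no fixed target for a deadline-pressured programmer to teach the program to the tests.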

    ↩︎
  37. The missing steps may be quite difficult to fully prove, though; Nathanson 2009:

    There is a lovely but probably apocryphal anecdote about Norbert Wiener. Teaching a class at MIT, he wrote something on the blackboard and said it was ‘obvious.’ One student had the temerity to ask for a proof. Wiener started pacing back and forth, staring at what he had written on the board and saying nothing. Finally, he left the room, walked to his office, closed the door, and worked. After a long absence he returned to the classroom. ‘It is obvious’, he told the class, and continued his lecture.

    ↩︎
  38. What conditions count as full scrutiny by the math community may not be too clear; Nathanson 2009 trenchantly mocks math talks:

    Social pressure often hides mistakes in proofs. In a seminar lecture, for example, when a mathematician is proving a theorem, it is technically possible to interrupt the speaker in order to ask for more explanation of the argument. Sometimes the details will be forthcoming. Other times the response will be that it’s “obvious” or “clear” or “follows easily from previous results.” Occasionally speakers respond to a question from the audience with a look that conveys the message that the questioner is an idiot. That’s why most mathematicians sit quietly through seminars, understanding very little after the introductory remarks, and applauding politely at the end of a mostly wasted hour.

    ↩︎
  39. “Mathematicians solve 45-year-old Kervaire invariant puzzle”, Erica Klarreich 2009↩︎

  40. “Why Did Lagrange ‘Prove’ the Parallel Postulate?”, Grabiner 2009:

    It is true that Lagrange never did publish it, so he must have realized there was something wrong. In another version of the story, told by , who claims to have been there (though the minutes do not list his name), everybody there could see that something was wrong, so Lagrange’s talk was followed by a moment of complete silence [2, p. 84]. Still, Lagrange kept the manuscript with his papers for posterity to read.

    Why work on it at all?

    The historical focus on the fifth postulate came because it felt more like the kind of thing that gets proved. It is not self-evident, it requires a diagram even to explain, so it might have seemed more as though it should be a theorem. In any case, there is a tradition of attempted proofs throughout the Greek and then Islamic and then eighteenth-century mathematical worlds. Lagrange follows many eighteenth-century mathematicians in seeing the lack of a proof of the fifth postulate as a serious defect in Euclid’s Elements. But Lagrange’s criticism of the postulate in his manuscript is unusual. He said that the assumptions of geometry should be demonstrable “just by the principle of contradiction”—the same way, he said, that we know the axiom that the whole is greater than the part [32, p. 30R]. The theory of parallels rests on something that is not self-evident, he believed, and he wanted to do something about this.

    What was the strange approach, alien to the modern mind, that Lagrange used?

    Recall that Lagrange said in this manuscript that axioms should follow from the principle of contradiction. But, he added, besides the principle of contradiction, “There is another principle equally self-evident,” and that is Leibniz’s principle of sufficient reason. That is: nothing is true “unless there is a sufficient reason why it should be so and not otherwise” [42, p. 31; italics added]. This, said Lagrange, gives as solid a basis for mathematical proof as does the principle of contradiction [32, p. 30V]. But is it legitimate to use the principle of sufficient reason in mathematics? Lagrange said that we are justified in doing this, because it has already been done. For example, Archimedes used it to establish that equal weights at equal distances from the fulcrum of a lever balance. Lagrange added that we also use it to show that three equal forces acting on the same point along lines separated by a third of the circumference of a circle are in equilibrium [32, pp. 31R–31V]…The modern reader may object that Lagrange’s symmetry arguments are, like the uniqueness of parallels, equivalent to Euclid’s postulate. But the logical correctness, or lack thereof, of Lagrange’s proof is not the point. (In this manuscript, by the way, Lagrange went on to give an analogous proof—also by the principle of sufficient reason—that between two points there is just one straight line, because if there were a second straight line on one side of the first, we could then draw a third straight line on the other side, and so on [32, pp. 34R–34V]. Lagrange, then, clearly liked this sort of argument.)

    …Why did philosophers conclude that space had to be infinite, homogeneous, and the same in all directions? Effectively, because of the principle of sufficient reason. For instance, in 1600 Giordano Bruno argued that the universe must be infinite because there is no reason to stop at any point; the existence of an infinity of worlds is no less reasonable than the existence of a finite number of them. Descartes used similar reasoning in his Principles of Philosophy: “We recognize that this world. . . has no limits in its extension. . . . Wherever we imagine such limits, we . . . imagine beyond them some indefinitely extended space” [28, p. 104]. Similar arguments were used by other seventeenth-century authors, including Newton. Descartes identified space and the extension of matter, so geometry was, for him, about real physical space. But geometric space, for Descartes, had to be Euclidean…Descartes, some 50 years before Newton published his first law of motion, was a co-discoverer of what we call linear inertia: that in the absence of external influences a moving body goes in a straight line at a constant speed. Descartes called this the first law of nature, and for him, this law follows from what we now recognize as the principle of sufficient reason. Descartes said, “Nor is there any reason to think that, if [a part of matter] moves. . . and is not impeded by anything, it should ever by itself cease to move with the same force” [30, p. 75]…Leibniz, by contrast, did not believe in absolute space. He not only said that spatial relations were just the relations between bodies, he used the principle of sufficient reason to show this. If there were absolute space, there would have to be a reason to explain why two objects would be related in one way if East is in one direction and West in the opposite direction, and related in another way if East and West were reversed [24, p. 147]. Surely, said Leibniz, the relation between two objects is just one thing! But Leibniz did use arguments about symmetry and sufficient reason—sufficient reason was his principle, after all. Thus, although Descartes and Leibniz did not believe in empty absolute space and Newton did, they all agreed that what I am calling the Euclidean properties of space are essential to physics.

    …In his 1748 essay “Reflections on Space and Time”, Euler argued that space must be real; it cannot be just the relations between bodies as the Leibnizians claim [10]. This is because of the principles of mechanics—that is, Newton’s first and second laws. These laws are beyond doubt, because of the “marvelous” agreement they have with the observed motions of bodies. The inertia of a single body, Euler said, cannot possibly depend on the behavior of other bodies. The conservation of uniform motion in the same direction makes sense, he said, only if measured with respect to immovable space, not to various other bodies. And space is not in our minds, said Euler; how can physics—real physics—depend on something in our minds?…in his Critique of Pure Reason of 1781, Kant placed space in the mind nonetheless. We order our perceptions in space, but space itself is in the mind, an intuition of the intellect. Nevertheless, Kant’s space turned out to be Euclidean too. Kant argued that we need the intuition of space to prove theorems in geometry. This is because it is in space that we make the constructions necessary to prove theorems. And what theorem did Kant use as an example? The sum of the angles of a triangle is equal to two right angles, a result whose proof requires the truth of the parallel postulate [26, “Of space,” p. 423]…Lagrange himself is supposed to have said that spherical trigonometry does not need Euclid’s parallel postulate [4, pp. 52–53]. But the surface of a sphere, in the eighteenth-century view, is not non-Euclidean; it exists in 3-dimensional Euclidean space [20, p. 71]. The example of the sphere helps us see that the eighteenth-century discussion of the parallel postulate’s relationship to the other postulates is not really about what is logically possible, but about what is true of real space.

    The final step:

    Johann Lambert was one of the mathematicians who worked on the problem of Postulate 5. Lambert explicitly recognized that he had not been able to prove it, and considered that it might always have to remain a postulate. He even briefly suggested a possible geometry on a sphere with an imaginary radius. But Lambert also observed that the parallel postulate is related to the law of the lever [20, p. 75]. He said that a lever with weightless arms and with equal weights at equal distances is balanced by a force in the opposite direction at the center equal to the sum of the weights, and that all these forces are parallel. So either we are using the parallel postulate, or perhaps, Lambert thought, some day we could use this physical result to prove the parallel postulate…These men did not want to do mechanics, as, say, Newton had done. They wanted to show not only that the world was this way, but that it necessarily had to be. A modern philosophical critic, Helmut Pulte, has said that Lagrange’s attempt to “reduce” mechanics to analysis strikes us today as “a misplaced endeavour to mathematize. . . an empirical science, and thus to endow it with infallibility” [39, p. 220]. Lagrange would have responded, “Right! That’s just exactly what we are all doing.”

    ↩︎
  41. Supposing P=NP:

    Much of CS theory would disappear. In my own research some of Ken’s and my “best” results would survive, but many would be destroyed. The Karp-Lipton Theorem is gone in this world. Ditto all “dichotomy” results between P and NP-complete, and for P = #P, Jin-Yi’s similar work. Many barrier results, such as oracle theorems and natural proofs, lose their main motivation, while much fine structure in hardness-versus-randomness tradeoffs would be blown up. The PCP Theorem and all the related work is gone. Modern cryptography could survive if the algorithm were galactic, but otherwise would be in trouble. I am currently teaching Complexity Theory at Tech using the textbook by Sanjeev Arora and Boaz Barak…Most of the 573 pages of Arora-Barak would be gone:

    • Delete all of chapter 3 on NP.
    • Delete all of chapter 5 on the polynomial hierarchy.
    • Delete most of chapter 6 on circuits.
    • Delete all of chapter 7 on probabilistic computation.
    • Mark as dangerous chapter 9 on cryptography.
    • Delete most of chapter 10 on quantum computation—who would care about Shor’s algorithm then?
    • Delete all of chapter 11 on the PCP theorem.

    I will stop here. Most of the initial part of the book is gone. The same for much of Homer-Selman, and basically all of the “Reducibility and Completeness” CRC chapter.

    ↩︎