Spaced Repetition for Efficient Learning

Efficient memorization using the spacing effect: literature review of widespread applicability, tips on use & what it’s good for.
psychology, Haskell, bibliography
2009-03-11–2019-05-17 · finished · certainty: highly likely · importance: 9


Spaced repetition is a centuries-old psychological technique for efficient memorization & practice of skills: instead of attempting to memorize by ‘cramming’, one spaces out each review, with increasing durations as one learns the item, and leaves the scheduling to software. Because of the greater efficiency of its slow but steady approach, spaced repetition can scale to memorizing hundreds of thousands of items (while crammed items are almost immediately forgotten) and is especially useful for foreign languages & medical studies.

I re­view what this tech­nique is use­ful for, some of the large re­search lit­er­a­ture on it and the test­ing effect (up to ~2013, pri­mar­i­ly), the avail­able soft­ware tools and use pat­terns, and mis­cel­la­neous ideas & ob­ser­va­tions on it.

One of the most fruitful areas of computing is making up for human frailties. Computers do arithmetic perfectly because we can’t1. They remember terabytes because we’d forget. They make the best calendars, because they always check what there is to do today. Even if we do not remember exactly, merely remembering a reference can be just as good. That is the point of reading a manual or textbook all the way through: not to remember everything that is in it for later, but to later remember that something is in it (and in skimming them, you learn the right words to search for when you actually need to know more about a particular topic).

We use any number of such prostheses2, but there are always more to be discovered. They’re worth looking for because they are so valuable: a shovel is much more effective than your hand, but a power shovel is orders of magnitude better than both - even if it requires training and expertise to use.

Spacing effect

“You can get a good deal from re­hearsal,
If it just has the proper dis­per­sal.
You would just be an ass,
To do it en masse,
Your re­mem­ber­ing would turn out much wor­sal.”

Ul­rich Neisser3

My current favorite prosthesis is the class of software that exploits the spacing effect, a centuries-old observation in cognitive psychology, to achieve results in studying or memorization much better than conventional student techniques; it is, alas, obscure4.

The spacing effect essentially says that if you have a question (“What is the fifth letter in this random sequence you learned?”), and you can only study it, say, 5 times, then your memory of the answer (‘e’) will be strongest if you spread your 5 tries out over a long period of time - days, weeks, and months. One of the worst things you can do is blow your 5 tries within a day or two. You can think of the ‘forgetting curve’ as being like a chart of a radioactive half-life: each review bumps your memory up in strength 50% of the chart, say, but review doesn’t do much in the early days because the memory simply hasn’t decayed much! (Why does the spacing effect work, on a biological level? There are clear neurochemical differences between massed and spaced in animal models, with spacing (>1 hour) but not massing enhancing long-term memory formation5, but the why and wherefore - that’s an open question; see the concept of memory consolidation or the sleep studies.) A graphical representation of the forgetting curve:

Stahl et al 2010; CNS Spec­trums
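The half-life analogy can be made concrete with a toy calculation. This is only a sketch: it assumes recall probability decays as a pure exponential with a fixed half-life, which is a simplification and not something fitted to the data in the chart, and the names `retention` and `days_until` are made up for illustration.

```python
import math

def retention(days_elapsed: float, half_life_days: float) -> float:
    """Probability of recall after `days_elapsed`, assuming pure exponential decay."""
    return 2.0 ** (-days_elapsed / half_life_days)

def days_until(threshold: float, half_life_days: float) -> float:
    """Days until recall probability falls to `threshold` (0 < threshold <= 1)."""
    return -half_life_days * math.log2(threshold)

if __name__ == "__main__":
    # With a (hypothetical) 2-day half-life, recall halves every 2 days.
    for day in (0, 1, 2, 4, 8):
        print(day, round(retention(day, half_life_days=2.0), 3))
```

Under this assumption, a review scheduled a few hours after learning catches the memory when it has barely moved down the curve, which is the sense in which early reviews are mostly wasted.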

Even better, it’s known that active recall is a far better method of learning than simply passively being exposed to information.6 Spacing also scales to huge quantities of information; gambler/financier Edward O. Thorp harnessed “spaced learning” when he was a physics grad student “in order to be able to work longer and harder”7, and Roger Craig set multiple records on the quiz show Jeopardy! 2010–2011 in part thanks to using Anki to memorize chunks of a collection of >200,000 past questions8; a later Jeopardy winner, Arthur Chu, also used spaced repetition9. Med school students (who have become a major demographic for SRS due to the extremely large amounts of factual material they are expected to memorize during medical school) usually have thousands of cards, especially if using pre-made decks (more feasible for medicine due to fairly standardized curriculums & general lack of time to make custom cards). Foreign-language learners can easily reach 10-30,000 cards; one Anki user reports a deck of >765k automatically-generated cards filled with Japanese audio samples from many sources (“Youtube videos, video games, TV shows, etc”).

A graphic might help; imag­ine here one can afford to re­view a given piece of in­for­ma­tion a few times (one is a busy per­son). By look­ing at the odds we can re­mem­ber the item, we can see that cram­ming wins in the short term, but un­ex­er­cised mem­o­ries de­cay so fast that after not too long spac­ing is much su­pe­ri­or:

Wired (o­rig­i­nal, Woz­ni­ak?); massed vs spaced (al­ter­na­tive)
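The same comparison can be played out numerically. The sketch below is a toy model under loudly stated assumptions, not an empirical one: recall decays exponentially, and each review multiplies the half-life by an amount proportional to how much had been forgotten at review time (so reviewing a barely-decayed memory gains little). The function `simulate`, the `gain` parameter, and both schedules are invented for illustration.

```python
def simulate(review_days, query_day, initial_half_life=1.0, gain=4.0):
    """Return recall probability on `query_day` after reviews on `review_days`.

    Toy assumptions (not an empirical model): the item is learned on day 0,
    recall decays as 2 ** (-elapsed / half_life), and each review multiplies
    the half-life by 1 + gain * (1 - recall_at_review), so reviewing an item
    that has barely decayed adds little.
    """
    half_life = initial_half_life
    last_seen = 0.0
    for day in sorted(review_days):
        if day > query_day:
            break  # reviews after the query day cannot affect it
        recall = 2.0 ** (-(day - last_seen) / half_life)
        half_life *= 1.0 + gain * (1.0 - recall)
        last_seen = day
    return 2.0 ** (-(query_day - last_seen) / half_life)

crammed = [0.2, 0.4, 0.6, 0.8, 1.0]  # five reviews within the first day
spaced = [1, 3, 7, 15, 30]           # five reviews at expanding gaps

for query_day in (2, 60):
    print(query_day,
          round(simulate(crammed, query_day), 2),
          round(simulate(spaced, query_day), 2))
```

With these made-up parameters the crammed schedule is slightly ahead on day 2 and far behind on day 60, which is the qualitative shape of the graphic; the exact numbers mean nothing.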

It’s more dra­matic if we look at a video vi­su­al­iz­ing de­cay of a cor­pus of mem­ory with ran­dom re­view vs most-re­cent re­view vs spaced re­view.

If you’re so good, why aren’t you rich

Most peo­ple find the con­cept of pro­gram­ming ob­vi­ous, but the do­ing im­pos­si­ble.10

Of course, the lat­ter strat­egy (cram­ming) is pre­cisely what stu­dents do. They cram the night be­fore the test, and a month later can’t re­mem­ber any­thing. So why do peo­ple do it? (I’m not in­no­cent my­self.) Why is spaced rep­e­ti­tion so dread­fully un­pop­u­lar, even among the peo­ple who try it on­ce?11

Scumbag Brain meme: knows everything when cramming the night before the test / and forgets everything a month later

Because it does work. Sort of. Cramming is a trade-off: you trade a strong memory now for weak memory later. (Very weak12.) And tests are usually of all the new material, with occasional old questions, so this strategy pays off! That’s the damnable thing about it - its memory longevity & quality are, in sum, less than that of spaced repetition, but cramming delivers its goods now13. So cramming is a rational, if short-sighted, response, and even SRS software recognizes its utility & supports it to some degree14. (But as one might expect, if the testing is continuous and incremental, then the learning tends to also be long-lived15; I do not know if this is because that kind of testing is a disguised accidental spaced repetition system, or because the students/subjects simply study/act differently in response to small-stakes exams.) In addition to this short-term advantage, there’s an ignorance of the advantages of spacing and a subjective illusion that the gains persist1617 (cf. Son & Simon 201218, Mulligan & Peterson 2014, Bjork et al 2013); from Kornell 2009’s study of GRE vocab (emphasis added):

Across ex­per­i­ments, spac­ing was more effec­tive than mass­ing for 90% of the par­tic­i­pants, yet after the first study ses­sion, 72% of the par­tic­i­pants be­lieved that mass­ing had been more effec­tive than spac­ing….When they do con­sider spac­ing, they often ex­hibit the il­lu­sion that massed study is more effec­tive than spaced study, even when the re­verse is true (Dun­losky & Nel­son, 1994; Ko­r­nell & Bjork, 2008a; Si­mon & Bjork 2001; Zech­meis­ter & Shaugh­nessy, 1980).

As one would expect if the testing and spacing effects are real things, students who naturally test themselves and study well in advance of exams tend to have higher GPAs.19 If we interpret questions as tests, we are not surprised to see that 1-on-1 tutoring works better than regular teaching and that tutored students answer orders of magnitude more questions20.

This short-term perspective is not a good thing in the long term, of course. Knowledge builds on knowledge; one is not learning independent bits of trivia. Richard Hamming recalls in “You and Your Research”:

You observe that most great scientists have tremendous drive. I worked for ten years with John Tukey at Bell Labs. He had tremendous drive. One day about three or four years after I joined, I discovered that John Tukey was slightly younger than I was. John was a genius and I clearly was not. Well I went storming into Bode’s office and said, “How can anybody my age know as much as John Tukey does?” He leaned back in his chair, put his hands behind his head, grinned slightly, and said, “You would be surprised Hamming, how much you would know if you worked as hard as he did that many years.” I simply slunk out of the office!

What Bode was saying was this: “Knowledge and productivity are like compound interest.” Given two people of approximately the same ability and one person who works 10% more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity - it is very much like compound interest. I don’t want to give you a rate, but it is a very high rate. Given two people with exactly the same ability, the one person who manages day in and day out to get in one more hour of thinking will be tremendously more productive over a lifetime. I took Bode’s remark to heart; I spent a good deal more of my time for some years trying to work a bit harder and I found, in fact, I could get more work done.
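Hamming declines to give a rate, but the arithmetic behind the claim is easy to sketch. Assume, purely for illustration, that knowledge compounds continuously at a rate proportional to effort; then someone working 10% more pulls ahead by a factor of exp(0.1 × rate × years). The base rates below are hypothetical, as are the function names.

```python
import math

def advantage(extra_effort: float, base_rate: float, years: float) -> float:
    """Ratio of the harder worker's knowledge to the baseline worker's.

    Assumes continuous compounding: the baseline grows as exp(base_rate * years),
    the harder worker as exp((1 + extra_effort) * base_rate * years).
    """
    return math.exp(extra_effort * base_rate * years)

def years_to_double(extra_effort: float, base_rate: float) -> float:
    """Years until the harder worker has twice the baseline's knowledge."""
    return math.log(2.0) / (extra_effort * base_rate)

if __name__ == "__main__":
    for base_rate in (0.1, 0.2, 0.5):  # hypothetical annual growth rates
        print(base_rate, round(years_to_double(0.10, base_rate), 1))
```

Under this assumption, “more than twice” within a working lifetime requires a fairly high base rate, which fits Hamming's remark that “it is a very high rate”.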

Knowl­edge needs to ac­cu­mu­late, and flash­cards with spaced rep­e­ti­tion can aid in just that ac­cu­mu­la­tion, fos­ter­ing steady re­view even as the num­ber of cards and in­tel­lec­tual pre­req­ui­sites mounts into the thou­sands.

This long-term focus may explain why explicit spaced repetition is an uncommon studying technique: the pay-off is distant & counterintuitive, the cost of self-control near & vivid. It doesn’t help that it’s pretty difficult to figure out when one should review - the optimal point is when you’re just about to forget about it, but that’s the kicker: if you’re just about to forget about it, how are you supposed to remember to review it? You only remember to review what you remember, and what you already remember isn’t what you need to review!21

The paradox is resolved by letting a computer handle all the calculations. We can thank Hermann Ebbinghaus for investigating in such tedious detail that we can, in fact, program a computer to calculate both the forgetting curve and optimal set of reviews22. This is the insight behind spaced repetition software: ask the same question over and over, but over increasing spans of time. You start with asking it once every few days, and soon the human remembers it reasonably well. Then you expand intervals out to weeks, then months, and then years. Once the memory is formed and dispatched to long-term memory, it needs but occasional exercise to remain hale and hearty23 - I remember well the large dinosaurs made of cardboard for my 4th or 5th birthday, or the tunnel made out of boxes, even though I recollect them once or twice a year at most.
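The core bookkeeping is small enough to sketch. What follows is a minimal illustration and not the algorithm of SuperMemo, Anki, Mnemosyne or any other real program: it assumes a fixed multiplier on success and a reset on failure, and the `Card` fields, the `review` function, and the default values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Card:
    question: str
    answer: str
    interval_days: float = 1.0  # current gap between reviews
    due_day: float = 0.0        # day on which the card is next shown

def review(card: Card, today: float, remembered: bool,
           multiplier: float = 2.5, first_interval: float = 1.0) -> Card:
    """Reschedule `card` after a review on day `today`.

    Illustrative rule (an assumption, not any particular program's algorithm):
    a successful recall multiplies the interval; a lapse resets it.
    """
    if remembered:
        card.interval_days *= multiplier
    else:
        card.interval_days = first_interval
    card.due_day = today + card.interval_days
    return card

if __name__ == "__main__":
    card = Card("fifth letter of the sequence?", "e")
    day = 0.0
    for _ in range(6):  # six successful reviews in a row
        review(card, day, remembered=True)
        day = card.due_day
        print(round(day, 1))
```

Even with this crude rule, a handful of successful reviews pushes the gap from days to weeks to months, which is why the daily workload stays manageable as the deck grows.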

Literature review

But don’t take my word for it - Nul­lius in verba! We can look at the sci­ence. Of course, if you do take my word for it, you prob­a­bly just want to read about how to use it and all the nifty things you can do, so I sug­gest you skip all the way down to that sec­tion. Every­one else, we start at the be­gin­ning:

Background: testing works!

“If you read a piece of text through twenty times, you will not learn it by heart so easily as if you read it ten times while attempting to recite from time to time and consulting the text when your memory fails.” –Francis Bacon, Novum Organum

The testing effect is the established psychological observation that the mere act of testing someone’s memory will strengthen the memory (regardless of whether there is feedback). Since spaced repetition is just testing on particular days, we ought to establish that testing works better than regular review or study, and that it works outside of memorizing random dates in history. To cover a few papers:

  1. Al­len, G.A., Mahler, W.A., & Es­tes, W.K. (1969). “Effects of re­call tests on long-term re­ten­tion of paired as­so­ciates”. Jour­nal of Ver­bal Learn­ing and Ver­bal Be­hav­ior, 8, 463-470

    1 test re­sults in mem­o­ries as strong a day later as study­ing 5 times; in­ter­vals im­prove re­ten­tion com­pared to massed pre­sen­ta­tion.

  2. Karpicke & Roedi­ger (2003). “The Crit­i­cal Im­por­tance of Re­trieval for Learn­ing”

    In learn­ing Swahili vo­cab­u­lary, stu­dents were given vary­ing rou­tines of test­ing or study­ing or test­ing and study­ing; this re­sulted in sim­i­lar scores dur­ing the learn­ing phase. Stu­dents were asked to pre­dict what per­cent­age they’d re­mem­ber (av­er­age: 50% over all group­s). One week lat­er, the stu­dents who tested re­mem­bered ~80% of the vo­cab­u­lary ver­sus ~35% for non-test­ing stu­dents. Some stu­dents were tested or stud­ied more than oth­ers; di­min­ish­ing re­turns set in quickly once the mem­ory had formed the first day. Stu­dents re­ported rarely test­ing them­selves and not test­ing al­ready learned items.

    Lesson: again, test­ing im­proves mem­ory com­pared to study­ing. Al­so, no stu­dent knows this.

  3. Roedi­ger & Karpicke (2006a). “Test-En­hanced Learn­ing: Tak­ing Mem­ory Tests Im­proves Long-Term Re­ten­tion”

    Stu­dents were tested (with no feed­back) on read­ing com­pre­hen­sion of a pas­sage over 5 min­utes, 2 days, and 1 week. Study­ing beat test­ing over 5 min­utes, but nowhere else; stu­dents be­lieved study­ing su­pe­rior to test­ing over all in­ter­vals. At 1 week, test­ing scores were ~60% ver­sus ~40%.

    Lesson: test­ing im­proves mem­ory com­pared to study­ing. Every­one (teach­ers & stu­dents) ‘knows’ the op­po­site.

  4. Karpicke & Roedi­ger (2006a). “Ex­pand­ing re­trieval pro­motes short­-term re­ten­tion, but equal in­ter­val re­trieval en­hances long-term re­ten­tion”

    Gen­eral sci­en­tific prose com­pre­hen­sion; from Roedi­ger & Karpicke 2006b: “After 2 days, ini­tial test­ing pro­duced bet­ter re­ten­tion than restudy­ing (68% vs. 54%), and an ad­van­tage of test­ing over restudy­ing was also ob­served after 1 week (56% vs. 42%).”

  5. Roedi­ger & Karpicke (2006b).

    Lit­er­a­ture re­view; 7 stud­ies be­fore 1941 demon­strat­ing test­ing im­proves re­ten­tion, and 6 after­wards. See also the re­views “Spac­ing Learn­ing Events Over Time: What the Re­search Says” & “Us­ing spac­ing to en­hance di­verse forms of learn­ing: Re­view of re­cent re­search and im­pli­ca­tions for in­struc­tion”, Car­pen­ter et al 2012.

  6. Agar­wal et al 2008, “Ex­am­in­ing the Test­ing Effect with Open- and Closed-Book Tests”

    As with #2, the purer forms of test­ing (in this case, open-book ver­sus closed-book test­ing) did bet­ter over the long run, and stu­dents were de­luded about what worked best.

  7. Bangert-Drowns et al 1991. “Effects of fre­quent class­room test­ing”

    Meta-analy­sis of 35 stud­ies (1929-1989) vary­ing tests dur­ing school se­mes­ters. 29 found ben­e­fits; 5 found neg­a­tives; 1 null re­sult. Meta-s­tudy found large ben­e­fits to test­ing even on­ce, then di­min­ish­ing re­turns.

  8. Cook 2006, “Im­pact of self­-assess­ment ques­tions and learn­ing styles in We­b-based learn­ing: a ran­dom­ized, con­trolled, crossover trial”; fi­nal scores were higher when the doc­tors (res­i­dents) learned with ques­tions.

  9. Johnson & Kiviniemi 2009 (“This study examined the effectiveness of compulsory, mastery-based, weekly reading quizzes as a means of improving exam and course performance. Completion of reading quizzes was related to both better exam and course performance.”); see also McDaniel et al 2012.

  10. Met­sä­muuro­nen 2013, “Effect of Re­peated Test­ing on the De­vel­op­ment of Sec­ondary Lan­guage Pro­fi­ciency”

  11. Meyer & Lo­gan 2013, “Tak­ing the Test­ing Effect Be­yond the Col­lege Fresh­man: Ben­e­fits for Life­long Learn­ing”; ver­i­fies test­ing effect in older adults has sim­i­lar effect size as younger

  12. Larsen & But­ler 2013, “Test-en­hanced learn­ing”

(One might be tempted to object that testing works for some learning styles, perhaps verbal styles. This is an unsupported assertion inasmuch as the experimental literature on learning styles is poor and the existing evidence mixed that there are such things as learning styles.24)

Subjects

The above stud­ies often used pairs of words or words them­selves. How well does the test­ing effect gen­er­al­ize?

Ma­te­ri­als which ben­e­fited from test­ing:

  • for­eign vo­cab­u­lary (eg. Karpicke & Roedi­ger 2003, Cepeda et al 2009, Fritz et al 200725, de la Rou­viere 2012)
  • GRE materials (like vocab, Kornell 2009); prose passages on general scientific topics (Karpicke & Roediger, 2006a; Pashler et al, 2003)
  • trivia (Mc­Daniel & Fisher 1991)
  • el­e­men­tary & mid­dle school lessons with sub­jects such as bi­o­graph­i­cal ma­te­r­ial and sci­ence (Gates 1917; Spitzer 193926 and Vlach & Sand­hofer 201227, re­spec­tive­ly)
  • Agar­wal et al (2008): short­-an­swer tests su­pe­rior on text­book pas­sages
  • his­tory text­books; re­ten­tion bet­ter with ini­tial short­-an­swer test rather than mul­ti­ple choice (Nungester & Duchas­tel 1982)
  • La­Porte & Voss (1975) also found bet­ter re­ten­tion com­pared to mul­ti­ple-choice or recog­ni­tion prob­lems
  • : 6 months after test­ing, test­ing beat study­ing in re­ten­tion of a his­tory pas­sage
  • Duchas­tel (1981): free re­call de­ci­sively beat short­-an­swer & mul­ti­ple choice for read­ing com­pre­hen­sion of a his­tory pas­sage
  • Glover (1989): free recall self-test beat recognition or cued recall; subject matter was the labels for parts of flowers
  • Kang, Mc­Der­mott, and Roedi­ger (2007): prose pas­sages; ini­tial short an­swer test­ing pro­duced su­pe­rior re­sults 3 days later on both mul­ti­ple choice and short an­swer tests
  • Leem­ing (2002): tests in 2 psy­chol­ogy cours­es, in­tro­duc­tory & mem­o­ry/learn­ing; “80% vs. 74% for the in­tro­duc­tory psy­chol­ogy course and 89% vs. 80% for the learn­ing and mem­ory course”28

This cov­ers a pretty broad range of what one might call ‘de­clar­a­tive’ knowl­edge. Ex­tend­ing test­ing to other fields is more diffi­cult and may re­duce to ‘write many fre­quent analy­ses, not large ones’ or ‘do lots of small ex­er­cises’, what­ever those might mean in those fields:

A third issue, which relates to the second, is whether our proposal of testing is really appropriate for courses with complex subject matters, such as the philosophy of Spinoza, Shakespeare’s comedies, or creative writing. Certainly, we agree that most forms of objective testing would be difficult in these sorts of courses, but we do believe the general philosophy of testing (broadly speaking) would hold - students should be continually engaged and challenged by the subject matter, and there should not be merely a midterm and final exam (even if they are essay exams). Students in a course on Spinoza might be assigned specific readings and thought-provoking essay questions to complete every week. This would be a transfer-appropriate form of weekly ‘testing’ (albeit with take-home exams). Continuous testing requires students to continuously engage themselves in a course; they cannot coast until near a midterm exam and a final exam and begin studying only then.29

Downsides

Test­ing does have some known flaws:

  1. in­ter­fer­ence in re­call - abil­ity to re­mem­ber tested items dri­ves out abil­ity to re­mem­ber sim­i­lar untested items

    Most/all stud­ies were in lab­o­ra­tory set­tings and found rel­a­tively small effects:

    In sum, al­though var­i­ous types of re­call in­ter­fer­ence are quite real (and quite in­ter­est­ing) phe­nom­e­na, we do not be­lieve that they com­pro­mise the no­tion of test-en­hanced learn­ing. At worst, in­ter­fer­ence of this sort might dampen pos­i­tive test­ing effects some­what. How­ev­er, the pos­i­tive effects of test­ing are often so large that in most cir­cum­stances they will over­whelm the rel­a­tively mod­est in­ter­fer­ence effects.

  2. mul­ti­ple choice tests can ac­ci­den­tally lead to ‘neg­a­tive sug­ges­tion effects’ where hav­ing pre­vi­ously seen a false­hood as an item on the test makes one more likely to be­lieve it.

    This is mit­i­gated or elim­i­nated when there’s quick feed­back about the right an­swer (see But­ler & Roedi­ger 2008 “Feed­back en­hances the pos­i­tive effects and re­duces the neg­a­tive effects of mul­ti­ple-choice test­ing”). So­lu­tion: don’t use mul­ti­ple choice; in­fe­rior in test­ing abil­ity to free re­call or short an­swers, any­way.

Nei­ther prob­lem seems ma­jor.

Distributed

A lot de­pends on when you do all your test­ing. Above we saw some ben­e­fits to test­ing a lot the mo­ment you learn some­thing, but the same num­ber of tests could be spread out over time, to give us the spac­ing effect or spaced rep­e­ti­tion. There are hun­dreds of stud­ies in­volv­ing the spac­ing effect:

Al­most unan­i­mously they find spac­ing out tests is su­pe­rior to massed test­ing when the fi­nal test/mea­sure­ment is con­ducted days or years later30, al­though the mech­a­nism is­n’t clear31. Be­sides all the pre­vi­ously men­tioned stud­ies, we can throw in:

The re­search lit­er­a­ture fo­cuses ex­ten­sively on the ques­tion of what kind of spac­ing is best and what this im­plies about mem­o­ry: a spac­ing that has sta­tic fixed in­ter­vals or a spac­ing which ex­pands? This is im­por­tant for un­der­stand­ing mem­ory and build­ing mod­els of it, and would be help­ful for in­te­grat­ing spaced rep­e­ti­tion into class­rooms (for ex­am­ple, Kel­ley & What­son 2013’s 10 min­utes study­ing / 10 min­utes break sched­ule, re­peat­ing the same ma­te­r­ial 3 times, de­signed to trig­ger LTM for­ma­tion on that block of ma­te­ri­al?) But for prac­ti­cal pur­pos­es, this is un­in­ter­est­ing; to sum it up, there are many stud­ies point­ing each way, and what­ever differ­ence in effi­ciency ex­ists, is min­i­mal. Most ex­ist­ing soft­ware fol­lows Su­per­Memo in us­ing an ex­pand­ing spac­ing al­go­rithm, so it’s not worth wor­ry­ing about; as Mnemosyne de­vel­oper Pe­ter Bi­en­st­man says, it’s not clear the more com­plex al­go­rithms re­ally help32, and the Anki de­vel­op­ers were con­cerned about the com­plex­i­ty, diffi­culty of reim­ple­ment­ing SM’s pro­pri­etary al­go­rithms, lack of sub­stan­tial gains, & larger er­rors SM3+ risks at­tempt­ing to be more op­ti­mal. So too here.
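For concreteness, here is what the two kinds of schedule under debate look like when the number of reviews and the total time span are held equal. This is only an illustration: the doubling `ratio` and both function names are assumptions, not any study's actual protocol or any program's algorithm.

```python
from typing import List

def fixed_schedule(n_reviews: int, total_days: float) -> List[float]:
    """Review days with equal gaps, the last review falling on `total_days`."""
    gap = total_days / n_reviews
    return [gap * (i + 1) for i in range(n_reviews)]

def expanding_schedule(n_reviews: int, total_days: float,
                       ratio: float = 2.0) -> List[float]:
    """Review days whose gaps grow geometrically by `ratio`, same end day."""
    weights = [ratio ** i for i in range(n_reviews)]
    first_gap = total_days / sum(weights)
    days, day = [], 0.0
    for weight in weights:
        day += first_gap * weight
        days.append(day)
    return days

if __name__ == "__main__":
    print(fixed_schedule(5, 31.0))
    print(expanding_schedule(5, 31.0))
```

The expanding schedule front-loads its reviews while the memory is new and fragile; the fixed one spends the same five reviews evenly. The studies cited here disagree about which placement retains more.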

For those in­ter­est­ed, 3 of the stud­ies that found fixed spac­ings bet­ter than ex­pand­ing:

  1. Car­pen­ter, S. K., & De­Losh, E. L. (2005). “Ap­pli­ca­tion of the test­ing and spac­ing effects to name learn­ing”. Ap­plied Cog­ni­tive Psy­chol­ogy, 19, 619-63633

  2. Lo­gan, J. M. (2004). Spaced and ex­panded re­trieval effects in younger and older adults. Un­pub­lished doc­toral dis­ser­ta­tion, Wash­ing­ton Uni­ver­si­ty, St. Louis, MO

    This the­sis is in­ter­est­ing inas­much as Lo­gan found that young adults did con­sid­er­ably worse with an ex­pand­ing spac­ing after a day.

  3. Karpicke & Roedi­ger, 2006a

The fixed vs ex­pand­ing is­sue aside, a list of ad­di­tional generic stud­ies find­ing ben­e­fits to spaced vs massed:

Generality of spacing effect

We have al­ready seen that spaced rep­e­ti­tion is effec­tive on a va­ri­ety of aca­d­e­mic fields and medi­ums. Be­yond that, spac­ing effects can be found in:

  • var­i­ous “do­mains (e.g., learn­ing per­cep­tual mo­tor tasks or learn­ing lists of words)”42 such as spa­tial43
  • “across species (e.g., rats, pi­geons, and hu­mans [or or , and sea slugs, Carew et al 1972 & ])”
  • “across age groups [in­fancy44, child­hood45, adult­hood46, the el­derly47] and in­di­vid­u­als with differ­ent mem­ory im­pair­ments”
  • “and across re­ten­tion in­ter­vals of sec­onds48 [to days49] to months” (we have al­ready seen stud­ies us­ing years)

The do­mains are lim­it­ed, how­ev­er. Cepeda et al 2006:

[Moss 1995, re­view­ing 120 ar­ti­cles] con­cluded that longer ISIs fa­cil­i­tate learn­ing of ver­bal in­for­ma­tion (e.g., spelling50) and mo­tor skills (e.g., mir­ror trac­ing); in each case, over 80% of stud­ies showed a dis­trib­uted prac­tice ben­e­fit. In con­trast, only one third of in­tel­lec­tual skill (e.g., math com­pu­ta­tion) stud­ies showed a ben­e­fit from dis­trib­uted prac­tice, and half showed no effect from dis­trib­uted prac­tice.

…[­Dono­van and Ra­do­se­vich (1999)] The largest effect sizes were seen in low rigor stud­ies with low com­plex­ity tasks (e.g., ro­tary pur­suit, typ­ing, and peg re­ver­sal), and re­ten­tion in­ter­val failed to in­flu­ence effect size. The only in­ter­ac­tion Dono­van and Ra­do­se­vich ex­am­ined was the in­ter­ac­tion of ISI and task do­main. It is im­por­tant to note that task do­main mod­er­ated the dis­trib­uted prac­tice effect; de­pend­ing on task do­main and lag, an in­crease in ISI ei­ther in­creased or de­creased effect size. Over­all, Dono­van and Ra­do­se­vich found that in­creas­ingly dis­trib­uted prac­tice re­sulted in larger effect sizes for ver­bal tasks like free re­call, for­eign lan­guage, and ver­bal dis­crim­i­na­tion, but these tasks also showed an in­verse-U func­tion, such that very long lags pro­duced smaller effect sizes. In con­trast, in­creased lags pro­duced smaller effect sizes for skill tasks like typ­ing, gym­nas­tics, and mu­sic per­for­mance.

Skills like gymnastics and music performance raise an important point about the testing effect and spaced repetition: they maintain memories or skills; they do not increase them beyond what was already learned. If one is a gifted amateur when one starts reviewing, one remains a gifted amateur. Ericsson covers what is necessary to improve and attain new expertise: deliberate practice51. From “The Role of Deliberate Practice in the Acquisition of Expert Performance”:

The view that merely en­gag­ing in a suffi­cient amount of prac­tice—re­gard­less of the struc­ture of that prac­tice—leads to max­i­mal per­for­mance, has a long and con­tested his­to­ry. In their clas­sic stud­ies of Morse Code op­er­a­tors, Bryan and Har­ter (, ) iden­ti­fied plateaus in skill ac­qui­si­tion, when for long pe­ri­ods sub­jects seemed un­able to at­tain fur­ther im­prove­ments. How­ev­er, with ex­tended efforts, sub­jects could re­struc­ture their skill to over­come plateaus…Even very ex­pe­ri­enced Morse Code op­er­a­tors could be en­cour­aged to dra­mat­i­cally in­crease their per­for­mance through de­lib­er­ate efforts when fur­ther im­prove­ments were re­quired…­More gen­er­al­ly, Thorndike (1921) ob­served that adults per­form at a level far from their max­i­mal level even for tasks they fre­quently carry out. For in­stance, adults tend to write more slowly and il­leg­i­bly than they are ca­pa­ble of do­ing…The most cited con­di­tion [for op­ti­mal learn­ing and im­prove­ment of per­for­mance] con­cerns the sub­jects’ mo­ti­va­tion to at­tend to the task and ex­ert effort to im­prove their per­for­mance…The sub­jects should re­ceive im­me­di­ate in­for­ma­tive feed­back and knowl­edge of re­sults of their per­for­mance…In the ab­sence of ad­e­quate feed­back, effi­cient learn­ing is im­pos­si­ble and im­prove­ment only min­i­mal even for highly mo­ti­vated sub­jects. Hence mere rep­e­ti­tion of an ac­tiv­ity will not au­to­mat­i­cally lead to im­prove­ment in, es­pe­cial­ly, ac­cu­racy of per­for­mance…In con­trast to play, de­lib­er­ate prac­tice is a highly struc­tured ac­tiv­i­ty, the ex­plicit goal of which is to im­prove per­for­mance. Spe­cific tasks are in­vented to over­come weak­ness­es, and per­for­mance is care­fully mon­i­tored to pro­vide cues for ways to im­prove it fur­ther. We claim that de­lib­er­ate prac­tice re­quires effort and is not in­her­ently en­joy­able.

Motor skills

It should be noted that re­views con­flict on how much spaced rep­e­ti­tion ap­plies to mo­tor skills; Lee & Gen­ovese 1988 find ben­e­fits, while Adams 1987 and ear­lier do not. The differ­ence may be that sim­ple mo­tor tasks ben­e­fit from spac­ing as sug­gested by Shea & Mor­gan 1979 (ben­e­fits to a ran­dom­ized/­spaced sched­ule), while com­plex ones where the sub­ject is al­ready op­er­at­ing at his lim­its do not ben­e­fit, sug­gested by Wulf & Shea 2002. Stam­baugh 2009 men­tions some di­ver­gent stud­ies:

The contextual interference hypothesis (Shea and Morgan 1979, Battig 1966 [“Facilitation and interference” in Acquisition of skill]) predicted the blocked condition would exhibit superior performance immediately following practice (acquisition) but the random condition would perform better at delayed retention testing. This hypothesis is generally consistent in laboratory motor learning studies (e.g. Lee & Magill 1983, Brady 2004), but less consistent in applied studies of sports skills (with a mix of positive & negative e.g. Landin & Hebert 1997, Hall et al 1994, Regal 2013) and fine-motor skills (Ollis et al 2005, Ste-Marie et al 2004).

Some of the pos­i­tive spaced rep­e­ti­tion stud­ies (from Son & Si­mon 2012):

Per­haps even prior to the em­pir­i­cal work on cog­ni­tive learn­ing and the spac­ing effect, the ben­e­fits of spaced study had been ap­par­ent in an ar­ray of mo­tor learn­ing tasks, in­clud­ing maze learn­ing (Culler 1912), type­writ­ing (Pyle 1915), archery (Lash­ley 1915), and javelin throw­ing (Mur­phy 1916; see Ruch 1928, for a larger re­view of the mo­tor learn­ing tasks which reap ben­e­fits from spac­ing; see also Moss 1996, for a more re­cent re­view of mo­tor learn­ing tasks). Thus, as in the cog­ni­tive lit­er­a­ture, the study of prac­tice dis­tri­b­u­tion in the mo­tor do­main is long es­tab­lished (see re­views by Adams 1987; Schmidt and Lee 2005), and most in­ter­est has cen­tered around the im­pact of vary­ing the sep­a­ra­tion of learn­ing tri­als of mo­tor skills in learn­ing and re­ten­tion of prac­ticed skills. Lee and Gen­ovese (1988) con­ducted a re­view and meta-analy­sis of stud­ies on dis­tri­b­u­tion of prac­tice, and they con­cluded that mass­ing of prac­tice tends to de­press both im­me­di­ate per­for­mance and learn­ing, where learn­ing is eval­u­ated at some re­moved time from the prac­tice pe­ri­od. Their main find­ing was, as in the cog­ni­tive lit­er­a­ture, that learn­ing was rel­a­tively stronger after spaced than after massed prac­tice (although see Am­mons 1988; Christina and Shea 1988; Newell et al. 1988 for crit­i­cisms of the re­view)…Prob­a­bly the most widely cited ex­am­ple is Bad­de­ley and Long­man’s (1978) study con­cern­ing how op­ti­mally to teach postal work­ers to type. They had learn­ers prac­tice once a day or twice a day, and for ses­sion lengths of ei­ther 1 or 2 h at a time. The main find­ings were that learn­ers took the fewest cu­mu­la­tive hours of prac­tice to achieve a per­for­mance cri­te­rion in their typ­ing when they were in the most dis­trib­uted prac­tice con­di­tion. This find­ing pro­vides clear ev­i­dence for the ben­e­fits of spac­ing prac­tice for en­hanc­ing learn­ing. 
How­ev­er, as has been pointed out (Newell et al. 1988; Lee and Wishart 2005), there is also trade-off to be con­sid­ered in that the to­tal elapsed time (num­ber of days) be­tween the be­gin­ning of prac­tice and reach­ing cri­te­rion was sub­stan­tially longer for the most spaced con­di­tion….The same ba­sic re­sults have been re­peat­edly demon­strated in the decades since (see re­views by Mag­ill and Hall 1990; Lee and Si­mon 2004), and with a wide va­ri­ety of mo­tor tasks in­clud­ing differ­ent bad­minton serves (Goode and Mag­ill 1986), ri­fle shoot­ing (Boyce and Del Rey 1990), a pre-estab­lished skill, base­ball bat­ting (Hall et al. 1994), learn­ing differ­ent logic gate con­fig­u­ra­tions (Carl­son et al. 1989; Carl­son and Yaure 1990), for new users of au­to­mated teller ma­chines (Jamieson and Rogers 2000), and for solv­ing math­e­mat­i­cal prob­lems as might ap­pear in a class home­work (Rohrer and Tay­lor 2007; Le Blanc and Si­mon 2008; Tay­lor and Rohrer 2010).

In this vein, it’s interesting to note that interleaving may be helpful for tasks with a mental component as well: Hatala et al 2003, Helsdingen et al 2011, and according to Huang et al 2013, the rates at which Xbox video game players advance in skill match nicely the predictions from distributed practice: players who play 4–8 matches a week (distributed) advance more in skill per match than players who play more, but advance more slowly per week than players who play many more matches (massed). (See also Stafford & Haasnoot 2016.)

Abstraction

Another potential objection is to argue52 that spaced repetition inherently hinders any kind of abstract learning and thought because related materials are not being shown together - allowing for comparison and inference - but days or months apart. Ernst A. Rothkopf: “Spacing is the friend of recall, but the enemy of induction” (Kornell & Bjork 2008, p. 585). This is plausible based on some of the early studies53 but the recent studies I know of directly examining the issue found that spaced repetition helped abstraction as well as general recall:

  1. Ko­r­nell & Bjork 2008a, “Learn­ing con­cepts and cat­e­gories: Is spac­ing the ‘en­emy of in­duc­tion’?” Psy­cho­log­i­cal Sci­ence, 19, 585-592

  2. Vlach, H. A., Sand­hofer, C. M., & Ko­r­nell, N. (2008). “The spac­ing effect in chil­dren’s mem­ory and cat­e­gory in­duc­tion”. Cog­ni­tion, 109, 163-167

  3. Ken­ney 2009. “The Spac­ing Effect in In­duc­tive Learn­ing”

  4. Ko­r­nell, N., Castel, A. D., Eich, T. S., & Bjork, R. A. (2010). “Spac­ing as the friend of both mem­ory and in­duc­tion in younger and older adults”. Psy­chol­ogy and Ag­ing, 25, 498-503

  5. Zulkiply et al 2011

  6. Vlach & Sandhofer 2012, Child Development

  7. Zulkiply 2012, “The spacing effect in inductive learning”

  8. McDaniel et al 2013, “Effects of Spaced versus Massed Training in Function Learning”

  9. Verkoeijen & Bouwmeester 2014

  10. Rohrer et al 2014: 1, 2; Rohrer et al 2019: “A randomized controlled trial of interleaved mathematics practice”

  11. Vlach et al 2014, “Equal spac­ing and ex­pand­ing sched­ules in chil­dren’s cat­e­go­riza­tion and gen­er­al­iza­tion”

  12. Gluck­man et al, “Spac­ing Si­mul­ta­ne­ously Pro­motes Mul­ti­ple Forms of Learn­ing in Chil­dren’s Sci­ence Cur­ricu­lum”

Review summary

To bring it all to­gether with the gist:

  • test­ing is effec­tive and comes with min­i­mal neg­a­tive fac­tors

  • ex­pand­ing spac­ing is roughly as good as or bet­ter than (wide) fixed in­ter­vals, but ex­pand­ing is more con­ve­nient and the de­fault

  • test­ing (and hence spac­ing) is best on in­tel­lec­tu­al, highly fac­tu­al, ver­bal do­mains, but may still work in many low-level do­mains

  • the re­search fa­vors ques­tions which force the user to use their mem­ory as much as pos­si­ble; in de­scend­ing or­der of pref­er­ence:

    1. free re­call
    2. short an­swers
    3. mul­ti­ple-choice
    4. Cloze dele­tion
    5. recog­ni­tion
  • the re­search lit­er­a­ture is com­pre­hen­sive and most ques­tions have been an­swered - some­where.

  • the most com­mon mis­takes with spaced rep­e­ti­tion are

    1. for­mu­lat­ing poor ques­tions and an­swers
    2. as­sum­ing it will help you learn, as op­posed to main­tain and pre­serve what one al­ready learned54. (It’s hard to learn from cards, but if you have learned some­thing, it’s much eas­ier to then de­vise a set of flash­cards that will test your weak points.)

Using it

One doesn’t need to use SuperMemo, of course; there are plenty of free alternatives. I like Mnemosyne (homepage) myself: it is easy to use, has a free mobile client, and has a long track record of development and reliability (I’ve used it since ~2008). But the Anki SRS is also popular, and has the advantages of being more feature-rich and having a larger & more active community (and possibly better support for East Asian language material and a better but proprietary mobile client).

OK, but what does one do with it? It’s a sur­pris­ingly diffi­cult ques­tion, ac­tu­al­ly. It’s akin to “the tyranny of the blank page” (or blank wik­i); now that I have all this power - a me­chan­i­cal golem that will never for­get and never let me for­get what­ever I chose to - what do I choose to re­mem­ber?

How Much To Add

The most difficult task, beyond that of just persisting until the benefits become clear, is deciding what’s valuable enough to add in. In a 3 year period, one can expect to spend “30-40 seconds”55 on any given item. The long run theoretical predictions are a little hairier. Given a single item, the formula for daily time spent on it (in minutes) is $\text{Time} = \frac{1}{500} \times \text{year}^{-1.5} + \frac{1}{30000}$. During our 20th year, we would spend $\frac{1}{500} \times 20^{-1.5} + \frac{1}{30000}$, or ≈5.57e-5 minutes a day. This is the average daily time, so to recover the annual time spent, we simply multiply by the number of days in a year (365.25). Suppose we were interested in how much time a flashcard would cost us over 20 years. The average daily time changes every year (the graph looks like an exponential decay, remember), so we have to run the formula for each year and sum them all; in Haskell:

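-- minutes spent on one card over years 1-20: average daily minutes in each year, times days per year
sum $ map (\year -> ((1/500 * year**(-(1.5))) + 1/30000) * 365.25) [1..20]
-- ≈ 1.8291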

Which evaluates to 1.8 minutes. (This may seem too small, but one doesn’t spend much time in the first year and the time drops off quickly56.) Anki user muflax’s statistics put his per-card time at 71s, for example. But maybe Piotr Woźniak was being optimistic or we’re bad at writing flashcards, so we’ll more than double it, to 5 minutes. That’s our key rule of thumb that lets us decide what to learn and what to forget: if, over your lifetime, you will spend more than 5 minutes looking something up or will lose more than 5 minutes as a result of not knowing something, then it’s worthwhile to memorize it with spaced repetition. 5 minutes is the line that divides trivia from useful data.57 (There might seem to be thousands of flashcards that meet the 5 minute rule. That’s fine. Spaced repetition can accommodate tens of thousands of cards. See the next section.)
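A minimal sketch of the same arithmetic as reusable functions, assuming the coefficients of the one-liner above. The names `dailyMinutes`, `lifetimeMinutes`, `thresholdMinutes` and `worthMemorizing` are illustrative labels only and do not come from any SRS program.

```haskell
-- Sketch of the per-card cost estimate. Coefficients are copied from the
-- one-liner above; all names here are illustrative assumptions.

-- | Average minutes per day spent on one card during a given year of its life.
dailyMinutes :: Double -> Double
dailyMinutes year = 1/500 * year ** (-1.5) + 1/30000

-- | Total minutes spent on one card over its first n years.
lifetimeMinutes :: Int -> Double
lifetimeMinutes n =
  sum [ dailyMinutes (fromIntegral y) * 365.25 | y <- [1 .. n] ]

-- | The padded review budget used in the rule of thumb.
thresholdMinutes :: Double
thresholdMinutes = 5

-- | The 5-minute rule: memorize an item only if not knowing it would cost
-- more lifetime minutes than the review budget.
worthMemorizing :: Double -> Bool
worthMemorizing minutesLostOtherwise = minutesLostOtherwise > thresholdMinutes
```

In GHCi, `lifetimeMinutes 20` comes out at roughly 1.83, and `worthMemorizing 12` is `True` while `worthMemorizing 2` is `False`.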

To a lesser extent, one might wonder when one is in a hurry, should one learn something with spaced repetition or with massed? How far away should the tests or deadlines be before abandoning spaced repetition? It’s hard to compare since one would need specific regimens to compare for the crossover point, but for massed repetition, the average time after memorization at which one has a 50% chance of remembering the memorized item seems to be 3-5 days.58 Since with spaced repetition there would be 2 or 3 repetitions in that period, presumably one would do better than 50% in recalling an item. 5 minutes and 5 days seems like a memorable enough rule of thumb: ‘don’t use spaced repetition if you need it sooner than 5 days or it’s worth less than 5 minutes’.

Overload

One com­mon ex­pe­ri­ence of new users to spaced rep­e­ti­tion is to add too much stuff - triv­i­al­i­ties and things they don’t re­ally care about. But they soon learn the curse of . If they don’t ac­tu­ally want to learn the ma­te­r­ial they put in, they will soon stop do­ing the daily re­views - which will cause re­views to pile up, which will be fur­ther dis­cour­ag­ing, and so they stop. At least with phys­i­cal fit­ness there is­n’t a pre­cisely dis­may­ing num­ber in­di­cat­ing how far be­hind you are! But if you have too lit­tle at the be­gin­ning, you’ll have few rep­e­ti­tions per day, and you’ll see lit­tle ben­e­fit from the tech­nique it­self - it looks like bor­ing flash card re­view.

What to add

I find one of the best uses for Mnemosyne is, be­sides the clas­sic use of mem­o­riz­ing aca­d­e­mic ma­te­r­ial such as ge­og­ra­phy or the pe­ri­odic ta­ble or for­eign vo­cab­u­lary or Bible/Ko­ran verses or the avalanche of med­ical school facts, to add in words from 59 and , mem­o­rable quotes I see60, per­sonal in­for­ma­tion such as birth­days (or li­cense plates, a prob­lem for me be­fore), and so on. Quo­tid­ian us­es, but all valu­able to me. With a di­ver­sity of flash­cards, I find my daily re­view in­ter­est­ing. I get all sorts of ques­tions - now I’m try­ing to see whether a Haskell frag­ment is syn­tac­ti­cally cor­rect, now I’m pro­nounc­ing Ko­rean and lis­ten­ing to the an­swer, now I’m try­ing to find the Ukraine on a map, now I’m en­joy­ing some po­et­ry, fol­lowed by a few quotes from Less­Wrong quote threads, and so on. Other peo­ple use it for many other things; one ap­pli­ca­tion that im­presses me for its sim­ple util­ity is mem­o­riz­ing names & faces of stu­dents al­though learn­ing mu­si­cal notes is also not bad.

The workload

On av­er­age, when I’m study­ing a new top­ic, I’ll add 3-20 ques­tions a day. Com­bined with my par­tic­u­lar mem­o­ry, I usu­ally re­view about 90 or 100 items a day (out of the to­tal >18,300). This takes un­der 20 min­utes, which is not too bad. (I ex­pect the time is ex­panded a bit by the fact that early on, my for­mat­ting guide­lines were still be­ing de­vel­oped, and I had­n’t the full panoply of cat­e­gories I do now - so every so often I must stop and edit cat­e­gories.)

If I haven’t been study­ing some­thing re­cent­ly, the ex­po­nen­tial de­cay­ing of re­views slowly drops the daily re­view. For ex­am­ple, in March 2011, I was­n’t study­ing many things, so for 2011-03-24–2011-03-26, my sched­uled daily re­views are 73, 83, and 74; after that, it’ll prob­a­bly drop down into the 60s, and then after an­other week or two, into the 50s and so on un­til it hits the min­i­mum plateau which will slowly shrink over years. (I haven’t gone long enough with­out dump­ing cards in to know what that might be.) By Feb­ru­ary 2012, the daily re­views are in the 40s or some­times 50s for sim­i­lar rea­sons, but the grad­ual shrink­age will con­tin­ue. We can see this vivid­ly, and we can even see a sort of ana­logue of the orig­i­nal for­get­ting curve, if we ask Mnemosyne 2.0 to graph the num­ber of cards to re­view per day for the next year up to Feb­ru­ary 2013 (as­sum­ing no ad­di­tions or missed re­views etc.):

A wildly vary­ing but clearly de­creas­ing graph of pre­dicted cards per day

If Mnemosyne weren’t us­ing spaced rep­e­ti­tion, it would be hard to keep up with 18,300+ flash­cards. But be­cause it is us­ing spaced rep­e­ti­tion, keep­ing up is easy.

Nor is 18.3k extraordinary. Many users have decks in the 6-7k range, Mnemosyne developer Peter Bienstman has >8.5k & Patrick Kenny >27k, Hugh Chen has a 73k+ deck, and in #anki, they tell me of one user who triggered bugs with his >200k deck. 200,000 may be a bit much, but for regular humans, some amount smaller seems possible - it’s interesting to compare SRS decks to the feat of or to the Muslim title of hafiz, one who has memorized the ~80,000 words of the Koran, or the stricter ‘hafid’, one who had memorized the Koran and 100,000 hadiths as well. Other forms of memory are still more powerful.61 (I suspect that spaced repetition is involved in one of the few well-documented cases of “hyperthymesia”, Jill Price: reading Wired, she has ordinary fallible powers of memorization for surprise demands with no observed anatomical differences and is restricted to “her own personal history and certain categories like television and airplane crashes”; further, she is a packrat with obsessive-compulsive traits who keeps >50,000 pages of detailed diaries perhaps due to a childhood trauma & associates daily events nigh-involuntarily with past events. Marcus says the other instances of hyperthymesia resemble Price.)

When to review

When should one re­view? In the morn­ing? In the evening? Any old time? The stud­ies demon­strat­ing the spac­ing effect do not con­trol or vary the time of day, so in one sense, the an­swer is: it does­n’t mat­ter - if it did mat­ter, there would be con­sid­er­able vari­ance in how effec­tive the effect is based on when a par­tic­u­lar study had its sub­jects do their re­views.

So one re­views at what­ever time is con­ve­nient. Con­ve­nience makes one more likely to stick with it, and stick­ing with it over­pow­ers any tem­po­rary im­prove­ment.

If one is not sat­is­fied with that an­swer, then on gen­eral con­sid­er­a­tions, one ought to re­view be­fore bed­time & sleep. seems to be re­lat­ed, and is known to pow­er­fully in­flu­ence what mem­o­ries en­ter long-term mem­o­ry, of ma­te­r­ial learned close to bed­time and in­creas­ing cre­ativ­ity; in­ter­rupt­ing sleep with­out affect­ing to­tal sleep time or qual­ity still dam­ages mem­ory for­ma­tion in mice62. So re­view­ing be­fore bed­time would be best. (Other men­tal ex­er­cises show im­prove­ment when trained be­fore bed­time; for ex­am­ple, dual n-back.) One pos­si­ble mech­a­nism is that it may be that the ex­pectancy of fu­ture re­views/tests is enough to en­cour­age mem­ory con­sol­i­da­tion dur­ing sleep; so if one re­views and goes to bed, pre­sum­ably the ex­pectancy is stronger than if one re­viewed at break­fast and had an event­ful day and for­got en­tirely about the re­viewed flash­cards. (See also the cor­re­la­tion be­tween time of study­ing & GPA in Hartwig & Dun­losky 2012.) Neural growth may be re­lat­ed; from Stahl 2010:

Recent advances in our understanding of the neurobiology underlying normal human memory formation have revealed that learning is not an event, but rather a process that unfolds over time.18,[Squire 2003, Fundamental Neuroscience],20 Thus, it is not surprising that learning strategies that repeat materials over time enhance their retention.20,21,22,23,24,26

…T­hou­sands of new cells are gen­er­ated in this re­gion every day, al­though many of these cells die within weeks of their cre­ation.31 The sur­vival of den­tate gyrus neu­rons has been shown to be en­hanced in an­i­mals when they are placed into learn­ing sit­u­a­tions.16-20 An­i­mals that learn well re­tain more den­tate gyrus neu­rons than do an­i­mals that do not learn well. Fur­ther­more, 2 weeks after test­ing, an­i­mals trained in dis­crete spaced in­ter­vals over a pe­riod of time, rather than in a sin­gle pre­sen­ta­tion or a ‘massed trial’ of the same in­for­ma­tion, re­mem­ber bet­ter.16-20 The pre­cise mech­a­nism that links neu­ronal sur­vival with learn­ing has not yet been iden­ti­fied. One the­ory is that the hip­pocam­pal neu­rons that pref­er­en­tially sur­vive are the ones that are some­how ac­ti­vated dur­ing the learn­ing process.16-2063 The dis­tri­b­u­tion of learn­ing over a pe­riod of time may be more effec­tive in en­cour­ag­ing neu­ronal sur­vival by al­low­ing more time for changes in gene ex­pres­sion and pro­tein syn­the­sis that ex­tend the life of neu­rons that are en­gaged in the learn­ing process.

…Trans­fer­ring mem­ory from the en­cod­ing stage, which oc­curs dur­ing alert wake­ful­ness, into con­sol­i­da­tion must thus oc­cur at a time when in­ter­fer­ence from on­go­ing new mem­ory for­ma­tion is re­duced.17,18 One such time for this trans­fer is dur­ing sleep, es­pe­cially dur­ing non-rapid eye move­ment sleep, when the hip­pocam­pus can com­mu­ni­cate with other brain ar­eas with­out in­ter­fer­ence from new ex­pe­ri­ences.32,33,34 Maybe that is why some de­ci­sions are bet­ter made after a good night’s rest and also why pulling an al­l-nighter, study­ing with sleep de­pri­va­tion, may al­low you to pass an exam an hour later but not re­mem­ber the ma­te­r­ial a day lat­er.

Prospects: extended flashcards

Let’s step back for a mo­ment. What are all our flash­cards, small and large, do­ing for us? Why do I have a pair of flash­cards for the word ‘anent’ among many oth­ers? I can just look it up.

But look ups take time compared to already knowing something. (Let’s ignore the previously discussed 5 minute rule.) If we think about this abstractly in a computer science context, we might recognize it as an old concept in algorithms & optimization discussions - the space-time tradeoff. We trade off lookup time against limited skull space.

Consider the sort of factual data already given as examples - we might one day need to know the average annual rainfall in Honolulu or Austin, but it would require too much space to memorize such data for all capitals. There are millions of English words, but in practice any more than 100,000 is excessive. More surprising is a sort of procedural knowledge. An extreme form of space-time tradeoffs in computers is when a computation is replaced by pre-calculated constants. We could take a math function and calculate its output for each possible input. Usually such a lookup table of input to output is really large. Think about how many entries would be in such a table for all possible integer multiplications between 1 and 1 billion. But sometimes the table is really small (like binary Boolean functions) or small (like trigonometric tables) or large but still useful (s usually start in the gigabytes and easily reach terabytes).

Given an in­fi­nitely large lookup table, we could re­place com­pletely the skill of, say, ad­di­tion or mul­ti­pli­ca­tion by the lookup table. No com­pu­ta­tion. The space-time trade­off taken to the ex­treme of the space side of the con­tin­u­um. (We could go the other way and de­fine mul­ti­pli­ca­tion or ad­di­tion as the slow com­pu­ta­tion which does­n’t know any specifics like the - as if every time you wanted to add you had to count on 4 fin­ger­s.)
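The two ends of that continuum can be sketched in a few lines. This is only an illustration under my own assumptions: the names `slowMultiply`, `timesTable` and `multiplyWith` are hypothetical, the arguments are assumed non-negative, and `Data.Map.Strict` (from the `containers` package shipped with GHC) stands in for "memory".

```haskell
import qualified Data.Map.Strict as Map

-- | The "time" extreme: multiplication as repeated addition, knowing no
-- facts in advance. Assumes non-negative arguments.
slowMultiply :: Int -> Int -> Int
slowMultiply a b = sum (replicate b a)

-- | The "space" extreme: every product up to a limit, computed once and stored.
timesTable :: Int -> Map.Map (Int, Int) Int
timesTable limit =
  Map.fromList [ ((a, b), a * b) | a <- [0 .. limit], b <- [0 .. limit] ]

-- | Use a memorized fact when the table has one, otherwise compute slowly.
multiplyWith :: Map.Map (Int, Int) Int -> Int -> Int -> Int
multiplyWith table a b = Map.findWithDefault (slowMultiply a b) (a, b) table
```

A table up to 12 holds 169 entries (counting zero), and `multiplyWith (timesTable 12) 13 20` has to fall back on the slow computation because that fact was never stored.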

So sup­pose we were chil­dren who wanted to learn mul­ti­pli­ca­tion. SRS and Mnemosyne can’t help be­cause mul­ti­pli­ca­tion is not a spe­cific fac­toid? The space-time trade­off shows us that we can de-pro­ce­du­ral­ize mul­ti­pli­ca­tion and turn it partly into fac­toids. It would­n’t be hard for us to write a quick script or macro to gen­er­ate, say, 500 ran­dom cards which ask us to mul­ti­ply AB by XY, and im­port them to Mnemosyne.64

After all, which is your mind going to do - get good at multiplying 2 numbers (generate on-demand), or memorize 500 different multiplication problems? From my experience with multiple subtle variants on a card, the mind gives up after just a few and falls back on a problem-solving approach - which is exactly what one wants to exercise, in this case. Congratulations; you have done the impossible.
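The "quick script" for 500 multiplication cards might look something like the following sketch. Several things here are assumptions rather than anything specified above: the toy linear congruential generator (a stand-in for a real random library, so that only the base library is needed), the restriction to two-digit factors, the tab-separated question/answer layout, and every function name.

```haskell
-- | Toy linear congruential generator; an assumption standing in for a
-- proper random-number library.
nextSeed :: Int -> Int
nextSeed s = (1664525 * s + 1013904223) `mod` 4294967296

-- | Endless stream of pseudo-random two-digit numbers (10 to 99).
twoDigits :: Int -> [Int]
twoDigits = map (\s -> 10 + (s `div` 65536) `mod` 90) . drop 1 . iterate nextSeed

-- | One card as a (question, answer) pair.
card :: Int -> Int -> (String, String)
card a b = (show a ++ " * " ++ show b ++ " = ?", show (a * b))

-- | The first n cards generated from a seed.
cards :: Int -> Int -> [(String, String)]
cards seed n = take n (pairUp (twoDigits seed))
  where
    pairUp (a : b : rest) = card a b : pairUp rest
    pairUp _              = []

-- | One card per line, question and answer separated by a tab.
toTsv :: [(String, String)] -> String
toTsv = unlines . map (\(q, a) -> q ++ "\t" ++ a)
```

Something like `writeFile "multiplication.txt" (toTsv (cards 2009 500))` would then write the deck to a (hypothetical) file for import.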

From a software engineering point of view, we might want to modify or improve the cards, and 500 snippets of text would be a tad hard to update. So coolest would be a ‘dynamic card’. Add a markup type like <eval src=""> , and then Mnemosyne feeds the src argument straight into the Python interpreter, which returns a tuple of the question text and the answer text. The question text is displayed to the user as usual, the user thinks, requests the answer, and grades himself. In Anki, Javascript is supported directly by the application in HTML <script> tags (currently inline only but Anki could presumably import libraries by default), for example for kinds of syntax highlighting, so any kind of dynamic card could be written that one wants.

So for mul­ti­pli­ca­tion, the dy­namic card would get 2 ran­dom in­te­gers, print a ques­tion like x * y = ? and then print the re­sult as the an­swer. Every so often you would get a new mul­ti­pli­ca­tion ques­tion, and as you get bet­ter at mul­ti­pli­ca­tion, you see it less often - ex­actly as you should. Still in a math vein, you could gen­er­ate vari­ants on for­mu­las or pro­grams where one ver­sion is the cor­rect one and the oth­ers are sub­tly wrong; I do this by hand with my pro­gram­ming flash­cards (e­spe­cially if I make an er­ror do­ing ex­er­cis­es, that sig­nals a finer point to make sev­eral flash­cards on), but it can be done au­to­mat­i­cal­ly. kpreid de­scribes one tool of his:

I have writ­ten a pro­gram (in the form of a web page) which does a spe­cial­ized form of this [gen­er­at­ing ‘dam­aged for­mu­las’]. It has a set of gen­er­a­tors of for­mu­las and dam­aged for­mu­las, and presents you with a list con­tain­ing sev­eral for­mu­las of the same type (e.g. ∫ 2x dx = x^2 + C) but with one dam­aged (e.g. ∫ 2x dx = 2x^2 + C).
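kpreid’s program itself is not reproduced here; the following is only a guess at the general shape of a ‘damaged formula’ generator, restricted to one formula family (the power rule) and one kind of damage (wrongly keeping the coefficient). The `Formula` type and all function names are hypothetical.

```haskell
-- | A formula as text, plus a flag recording whether it is the true one.
data Formula = Formula { formulaText :: String, isCorrect :: Bool }
  deriving (Show, Eq)

-- | Left-hand side shared by both variants: the integral of n*x^(n-1).
lhs :: Int -> String
lhs n = "∫ " ++ show n ++ "x^" ++ show (n - 1) ++ " dx"

-- | Power rule, for n >= 2: the integral of n*x^(n-1) is x^n + C.
correct :: Int -> Formula
correct n = Formula (lhs n ++ " = x^" ++ show n ++ " + C") True

-- | Damaged variant: the coefficient is wrongly carried into the answer.
damaged :: Int -> Formula
damaged n = Formula (lhs n ++ " = " ++ show n ++ "x^" ++ show n ++ " + C") False

-- | A quiz list: one formula per exponent, with the one at index `bad` damaged.
quiz :: Int -> [Int] -> [Formula]
quiz bad ns =
  [ if i == bad then damaged n else correct n | (i, n) <- zip [0 ..] ns ]
```

`quiz 1 [2, 3, 4]` yields three formulas of the same type of which exactly the second is wrong; choosing `bad` at review time would make it a dynamic card.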

This approach generalizes to anything you can generate random problems of or have large databases of examples of. Khan Academy apparently does something like this in associating large numbers of (algorithmically-generated?) problems with each of its little modules and tracking retention of the skill in order to decide when to do further review of that module. For example, maybe you are studying Go and are interested in learning . Those are things that can be generated by computer Go programs, or fetched from places like GoProblems.com. For even more examples, Go is rotationally invariant - the best move remains the same regardless of which way the board is oriented and since there is no canonical direction for the board (like in chess) a good player ought to be able to play the same no matter how the board looks - so each specific example can be mirrored in 3 other ways. Or one could test one’s ability to ‘read’ a board by writing a dynamic card which takes each example board/problem and adds some random pieces as long as some go-playing program like says the best move hasn’t changed because of the added noise.
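Multiplying one Go problem into several is mechanical. A minimal sketch, assuming a board is stored as a list of equal-length rows of characters (a representation I am inventing for illustration); `rotations` gives the 3 other orientations mentioned above, and `orientations` goes a step further by also counting reflections, which is my own extension.

```haskell
import Data.List (nub, transpose)

-- | A square board as rows of characters, e.g. '.' empty, 'X' black, 'O' white.
type Board = [String]

-- | Rotate a quarter turn clockwise.
rotate :: Board -> Board
rotate = map reverse . transpose

-- | Mirror left-to-right.
mirror :: Board -> Board
mirror = map reverse

-- | The 3 other rotations of a position.
rotations :: Board -> [Board]
rotations = take 3 . drop 1 . iterate rotate

-- | All distinct orientations (rotations and reflections), original first.
orientations :: Board -> [Board]
orientations b =
  nub (take 4 (iterate rotate b) ++ take 4 (iterate rotate (mirror b)))
```

A position with no symmetry of its own has 8 distinct orientations; a symmetric one has fewer, which is why `nub` removes the duplicates.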

One could learn an awful lot of things this way. Programming languages could be learned this way - someone learning Haskell could take all the functions listed in the Prelude or his Haskell textbook, and ask to generate random arguments for the functions and ask the interpreter ghci what the function and its arguments evaluate to. Games other than go, like chess, may work (a live example being Chess Tempo & Listudy, and see the experience of Dan Schmidt; or ). A fair bit of mathematics. If the dynamic card has Internet access, it can pull down fresh questions from an or just a website; this functionality could be quite useful in a foreign language learning context with every day bringing a fresh sentence to translate or another exercise.
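For the programming-language case, a much-simplified sketch: instead of random arguments and a call out to ghci, it pairs a small hand-picked table of Prelude list functions with fixed arguments and lets the program itself compute the answers. The selection of functions, the arguments, and the names `functions`, `arguments` and `functionQuiz` are all assumptions.

```haskell
-- | A few Prelude list functions to quiz on, with the text shown on the card.
functions :: [(String, [Int] -> [Int])]
functions =
  [ ("reverse", reverse)
  , ("take 2", take 2)
  , ("drop 2", drop 2)
  , ("map (*2)", map (* 2))
  , ("filter even", filter even)
  ]

-- | Fixed sample arguments; a fuller version would generate these randomly.
arguments :: [[Int]]
arguments = [[1, 2, 3], [4, 5, 6, 7], [9]]

-- | Every function applied to every argument, as (question, answer) pairs.
functionQuiz :: [(String, String)]
functionQuiz =
  [ (name ++ " " ++ show xs ++ " = ?", show (f xs))
  | (name, f) <- functions
  , xs <- arguments
  ]
```

This gives 15 cards, for instance the question `take 2 [1,2,3] = ?` with the answer `[1,2]`.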

With some NLP soft­ware, one could write dy­namic flash­cards which test all sorts of things: if one con­fuses verbs, the pro­gram could take a tem­plate like “$PRONOUN $VERB $PARTICLE $OBJECT % {right: ca­resse, wrong: ca­ress­es}” which yields flash­cards like “Je ca­resses le chat” or “Tu ca­resse le chat” and one would have to de­cide whether it was the cor­rect con­ju­ga­tion. (The dy­nam­i­cism here would help pre­vent mem­o­riz­ing spe­cific sen­tences rather than the un­der­ly­ing con­ju­ga­tion.) In full gen­er­al­i­ty, this would prob­a­bly be diffi­cult, but sim­pler ap­proaches like tem­plates may work well enough. Jack Kin­sel­la:
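The template idea needs no NLP at all in its simplest form. A sketch under narrow assumptions: one regular verb, three pronouns, and a hard-coded “le chat”; the table `forms` and the names `fill` and `conjugationCards` are hypothetical, and this is a toy table rather than a conjugator.

```haskell
import Data.List (nub)

-- | Present-tense forms of the regular French verb "caresser" for three
-- pronouns.
forms :: [(String, String)]
forms = [("Je", "caresse"), ("Tu", "caresses"), ("Elle", "caresse")]

-- | Fill the template "$PRONOUN $VERB $PARTICLE $OBJECT" with a fixed object.
fill :: String -> String -> String
fill pronoun verb = unwords [pronoun, verb, "le", "chat"]

-- | Every pronoun paired with every distinct verb form; the answer records
-- whether that pairing is the correct conjugation.
conjugationCards :: [(String, Bool)]
conjugationCards =
  [ (fill pronoun verb, verb == right)
  | (pronoun, right) <- forms
  , verb <- nub (map snd forms)
  ]
```

The 6 resulting cards include “Je caresses le chat” marked wrong and “Tu caresses le chat” marked right; swapping in other objects or verbs per review is what would keep one from memorizing the specific sentences.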

I wish there were dy­namic SRS decks for lan­guage learn­ing (or other dis­ci­plines). Such decks would count the num­ber of times you have re­viewed an in­stance of an un­der­ly­ing gram­mat­i­cal rule or an in­stance of a par­tic­u­lar piece of vo­cab­u­lary, for ex­am­ple its sin­gu­lar/­plu­ral/third per­son con­ju­ga­tion/da­tive form. These so­phis­ti­cated decks would present users with fresh ex­am­ple sen­tences on every re­view, thereby pre­vent­ing users from re­mem­ber­ing spe­cific an­swers and com­pelling them to learn the process of ap­ply­ing the gram­mat­i­cal rule afresh. More­over, these decks would keep users en­ter­tained through nov­elty and would present users with tacit learn­ing op­por­tu­ni­ties through ro­tat­ing vo­cab­u­lary used in non-essen­tial parts of the ex­am­ple sen­tence. Such a sys­tem, with mul­ti­ple-level re­view ro­ta­tion, would not only pre­vent against over­fit learn­ing, but also in­crease the to­tal amount of knowl­edge learned per min­ute, an effi­ciency I’d gladly in­vest in.

Even though these things seem like ‘skills’ and not ‘data’!

Popularity

As of 2011-05-02:

Met­ric Mnemosyne Mnemod­odo iSRS AnyMemo
Home­page Alexa 383k 27.5m 112k 1,766k65
ML/­fo­rum mem­bers 461 4129/215 129
Ubuntu in­stalls 7k 9k
De­bian in­stalls 164 364
Arch votes 85 96
iPhone rat­ings Un­re­leased66 193 69
An­droid rat­ings 20 703 836
An­droid in­stalls 100-500 10-50k 50-100k

Su­per­Memo does­n’t fall un­der the same rat­ings, but it has sold in the hun­dreds of thou­sands over its 2 decades:

Biedalak is CEO of Su­per­Memo World, which sells and li­censes Woz­ni­ak’s in­ven­tion. To­day, Su­per­Memo World em­ploys just 25 peo­ple. The ven­ture cap­i­tal never came through, and the com­pany never moved to Cal­i­for­nia. About 50,000 copies of Su­per­Memo were sold in 2006, most for less than $30. Many more are thought to have been pi­rat­ed.67

It seems safe to es­ti­mate the com­bined mar­ket-share of Anki, Mnemosyne, iSRS and other SRS apps at some­where un­der 50,000 users (mak­ing due al­lowance for users who in­stall mul­ti­ple times, those who in­stall and aban­don it, etc.). Rel­a­tively few users seem to have mi­grated from Su­per­Memo to those newer pro­grams, so it seems fair to sim­ply add that 50k to the other 50k and con­clude that the world­wide pop­u­la­tion is some­where around (but prob­a­bly un­der) 100,000.

Where was I going with this?

Nowhere, really. Mnemosyne/SR software in general is just one of my favorite tools: it’s based on a famous effect68 discovered by science, and it exploits it elegantly69 and usefully. It’s a testament to the Enlightenment ideal of improving humanity through reason and overcoming our human flaws; the idea of SR is seductive in its mathematical rigor70. In this age where so often the ideal of ‘self-improvement’ and progress are decried, and gloom are espoused by even the common people, it’s really nice to just have a small example like this in one’s daily life, an example not yet so prosaic and boring as the lightbulb.

See Also

In the course of us­ing Mnemosyne, I’ve writ­ten a num­ber of scripts to gen­er­ate repet­i­tively vary­ing cards.

  • mnemo.hs will take any newline-delimited chunk of text, like a poem, and generate every possible Cloze deletion; that is, an ABC poem will become 3 questions: _BC/ABC, A_C/ABC, AB_/ABC

  • mnemo2.hs works as above, but is more lim­ited and is in­tended for long chunks of text where mnemo.hs would cause a com­bi­na­to­r­ial ex­plo­sion of gen­er­ated ques­tions; it gen­er­ates a sub­set: for ABCD, one gets __CD/ABCD, A__D/ABCD, and AB__/ABCD (it re­moves 2 lines, and it­er­ates through the list).

  • mnemo3.hs is in­tended for date or name-based ques­tions. It’ll take in­put like “Barack Obama is %47%.” and spit out some ques­tions based on this: “Barack Obama is _7./47”, “Barack Obama is 4_./47” etc.

  • mnemo4.hs is intended for long lists of items. If one wants to memorize the list of US Presidents, the natural questions for flashcards go something like “Who was the 3rd president?/Thomas Jefferson”, “Thomas Jefferson was the _rd president./3”, “Who was president after John Adams?/Thomas Jefferson”, “Who was president before James Madison?/Thomas Jefferson”.

    You note there’s rep­e­ti­tion if you do this for each pres­i­dent - one asks the or­di­nal po­si­tion of the item both ways (item -> po­si­tion, po­si­tion -> item), what pre­cedes it, and what suc­ceeds it. mnemo4.hs au­to­mates this, given a list. In or­der to be gen­er­al, the word­ing is a bit odd, but it’s bet­ter than writ­ing it all out by hand! (Ex­am­ple out­put is in the to the source code).
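The scripts themselves are not shown here, so the following is not their source; it is a fresh sketch of the behaviour described in the list above, with hypothetical names. `cloze 1` blanks one line at a time (the first script’s description), `cloze 2` slides a 2-line blank through the text (the second’s), and `listCards` asks position and neighbour questions in the spirit of the fourth.

```haskell
-- | Blank out n consecutive lines at every position. Each card is
-- (text with blanks, full text as the answer).
cloze :: Int -> [String] -> [([String], [String])]
cloze n ls =
  [ (take i ls ++ replicate n "_" ++ drop (i + n) ls, ls)
  | i <- [0 .. length ls - n]
  ]

-- | Position and neighbour questions for an ordered list of items.
listCards :: [String] -> [(String, String)]
listCards items =
  concat [ [ ("What is item #" ++ show i ++ "?", x)
           , ("What position is " ++ x ++ "?", show i) ]
         | (i, x) <- numbered ]
    ++ concat [ [ ("What comes after " ++ a ++ "?", b)
                , ("What comes before " ++ b ++ "?", a) ]
              | (a, b) <- zip items (drop 1 items) ]
  where
    numbered = zip [1 :: Int ..] items
```

For the lines A, B, C, `cloze 1` produces the questions _BC, A_C and AB_; for A, B, C, D, `cloze 2` produces __CD, A__D and AB__. The generic wording in `listCards` is deliberately odd for the same reason given above: it has to work for any list.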

The reader might well be cu­ri­ous by this point what my Mnemosyne data­base looks like. I use Mnemosyne quite a bit, and as of 2020-02-02, I have 16,149 (ac­tive) cards in my deck. Said cu­ri­ous reader may find my cards & me­dia at gwern.cards (52M; Mnemosyne 2.x for­mat).

The Mnemosyne project has been col­lect­ing user-sub­mit­ted spaced rep­e­ti­tion sta­tis­ti­cal data for years. The full dataset as of 2014-01-27 is avail­able for down­load by any­one who wishes to an­a­lyze it.


  1. “One does not learn com­put­ing by us­ing a hand cal­cu­la­tor, but one can for­get arith­metic.” Perlis 1982↩︎

  2. Listing other neuroprosthetics is hard. It’s an interesting idea, but as proponents of like have found, it’s easier to feel that externalism is meaningful than to nail down a clear definition which separates a neuroprosthetic or part of one’s mind from a random tool you like or find useful. Consider whether a pencil and paper are a neuroprosthetic: clearly they are not for a child learning to write, who must carefully compose the words in his mind and put them down one after another, but it is not so clear for an adult who has been writing all his life and can doodle or write down thoughts without thinking about them and may even be surprised at what they happened to write.

    I like this de­fi­n­i­tion: “a neu­ro­pros­thetic is any­thing whose re­sults you use with­out fur­ther thought”. So in the clas­sic ex­am­ple, when Otto needs to go some­where, he never thinks “I am an am­ne­siac who stores lo­ca­tions in my notepad, and I must look up the lo­ca­tion” - he just looks up the lo­ca­tion. A good heuris­tic would be any­thing whose de­struc­tion leaves one feel­ing lost, slow, stu­pid, or ig­no­rant.

    By this stan­dard, I can think of only a few tools I use with­out no­tice­able thought:

    • key­bind­ings such as win­dow man­ager short­cuts, in par­tic­u­lar short­cuts for Google search­es; on oc­ca­sion, Prompt gets in­scrutably wedged, lock­ing it. When this hap­pens, I have to restart X be­cause I Google every­thing and the key­bind­ing is so en­grained that not us­ing it is un­bear­able. It would be like try­ing to write with your weak hand.
    • Google Calendar and PredictionBook: it is incredible how many followups or reminders or regularly happening tasks I can put into Google Calendar or PB. I have outsourced many habits or thoughts to them, and I no longer think of it as anything special. If either were gone, I would feel frightened - what events were passing, what beliefs falsified, what opportunities opening up (or closing!) that I had suddenly become ignorant of?
    • Evernote, for a similar reason; many of my memories have ceased to be things like “octopuses see too fast to watch TV and so only HDTV or works for them; I read this in Orion Magazine” and become things like “octopus TV Evernote”, and if I want to know what it was about octopuses & TV, well, I’ll have to look it up in Evernote. Mnemosyne plays a similar role for me, but there the memories are much clearer on their own because of the spaced repetition.
    • my web­site Gw­ern.net; I’ve had to say many times that I don’t know what I think about some­thing, but what­ever that is, it’s on my web­site. (A more ex­treme form of the Ever­note/M­nemosyne neu­ro­pros­thet­ic.) A com­menter once wrote that read­ing Gw­ern.net felt like he was crawl­ing around in my head. He was more right than he re­al­ized.
    ↩︎
  3. as quoted in “Re­trieval prac­tice and the main­te­nance of knowl­edge”, Bjork 1988↩︎

  4. From “Close the Book. Re­call. Write It Down: That old study method still works, re­searchers say. So why don’t pro­fes­sors preach it?”;

    Two psy­chol­ogy jour­nals have re­cently pub­lished pa­pers show­ing that this strat­egy works, the lat­est find­ings from a decades-old body of re­search. When stu­dents study on their own, “ac­tive re­call” - recita­tion, for in­stance, or flash­cards and other self­-quizzing - is the most effec­tive way to in­scribe some­thing in long-term mem­o­ry. Yet many col­lege in­struc­tors are only dimly fa­mil­iar with that re­search…

    From “The Spac­ing Effect: A Case Study in the Fail­ure to Ap­ply the Re­sults of Psy­cho­log­i­cal Re­search” (Demp­ster 1988), whose ti­tle alone sum­ma­rizes the sit­u­a­tion (see also Kel­ley 2007, Mak­ing Minds: What’s Wrong with Ed­u­ca­tion - and What Should We Do About It?):

    Sec­ond, it [the spac­ing effect] is re­mark­ably ro­bust. In many cas­es, two spaced pre­sen­ta­tions are about twice as effec­tive as two massed pre­sen­ta­tions (e.g., Hintz­man, 1974; Melton, 1970), and the differ­ence be­tween them in­creases as the fre­quency of rep­e­ti­tion in­creases (Un­der­wood, 1970)…

    The spac­ing effect was known as early as 1885 when Ebbing­haus pub­lished the re­sults of his sem­i­nal work on mem­o­ry. With him­self as the sub­ject, Ebbing­haus found that for a sin­gle 12-syl­la­ble se­ries, 68 im­me­di­ately suc­ces­sive rep­e­ti­tions had the effect of mak­ing pos­si­ble an er­ror­less recital after seven ad­di­tional rep­e­ti­tions on the fol­low­ing day. How­ev­er, the same effect was achieved by only 38 dis­trib­uted rep­e­ti­tions spread over 3 days. On the ba­sis of this and other re­lated find­ings, Ebbing­haus con­cluded that ‘with any con­sid­er­able num­ber of rep­e­ti­tions a suit­able dis­tri­b­u­tion of them over a space of time is de­cid­edly more ad­van­ta­geous than the mass­ing of them at a sin­gle time’ (Eb­bing­haus, 1885/1913. p. 89)

    Son & Si­mon 2012:

    Fur­ther­more, even after ac­knowl­edg­ing the ben­e­fits of spac­ing, chang­ing teach­ing prac­tices proved to be enor­mously diffi­cult. De­laney et al (2010) wrote: “Anec­do­tal­ly, high school teach­ers and col­lege pro­fes­sors seem to teach in a lin­ear fash­ion with­out rep­e­ti­tion and give three or four non­cu­mu­la­tive ex­ams.” (p. 130). Fo­cus­ing on the math do­main, where one might ex­pect a very easy-to-re­view-and-to-space strat­e­gy, Rohrer (2009) points out that math­e­mat­ics text­books usu­ally present top­ics in a non-spaced, non-mixed fash­ion. Even much ear­lier, Vash (1989) had writ­ten: “Ed­u­ca­tion pol­icy set­ters know per­fectly well that [spaced prac­tice] works bet­ter [than massed prac­tice]. They don’t care. It is­n’t tidy. It does­n’t let teach­ers teach a unit and dust off their hands quickly with a nice sense of ‘Well, that’s done.’” (p. 1547).

    • Rohrer, D. (2009). “The effects of spac­ing and mix­ing prac­tice prob­lems”. Jour­nal for Re­search in Math­e­mat­ics Ed­u­ca­tion, 40, 4-17
    • Vash, C. L. (1989). “The spac­ing effect: A case study in the fail­ure to ap­ply the re­sults of psy­cho­log­i­cal re­search”. Amer­i­can Psy­chol­o­gist, 44, 1547 (a com­ment on Demp­ster’s ar­ti­cle?)

    From Psy­chol­o­gy: An In­tro­duc­tion:

    In one prac­ti­cal demon­stra­tion of the spac­ing effect, showed that re­ten­tion of for­eign lan­guage vo­cab­u­lary was greatly en­hanced if prac­tice ses­sions were spaced far apart. For ex­am­ple, “Thir­teen re­train­ing ses­sions spaced at 56 days yielded re­ten­tion com­pa­ra­ble to 26 ses­sions spaced at 14 days.” In other words, sub­jects could use half as many study ses­sions, if the study ses­sions were spread over a time pe­riod four times as long.

    ↩︎
  5. “Synap­tic ev­i­dence for the effi­cacy of spaced learn­ing”, Kra­mar et al 2012 (“Take your time: Neu­ro­bi­ol­ogy sheds light on the su­pe­ri­or­ity of spaced vs. massed learn­ing”):

    The su­pe­ri­or­ity of spaced vs. massed train­ing is a fun­da­men­tal fea­ture of learn­ing. Here, we de­scribe unan­tic­i­pated tim­ing rules for the pro­duc­tion of long-term po­ten­ti­a­tion (LTP) in adult rat hip­pocam­pal slices that can ac­count for one tem­po­ral seg­ment of the spaced tri­als phe­nom­e­non. Suc­ces­sive bouts of nat­u­ral­is­tic theta burst stim­u­la­tion of field CA1 affer­ents markedly en­hanced pre­vi­ously sat­u­rated LTP if spaced apart by 1 h or longer, but were with­out effect when shorter in­ter­vals were used. Analy­ses of F-act­in-en­riched spines to iden­tify po­ten­ti­ated synapses in­di­cated that the added LTP ob­tained with de­layed theta trains in­volved re­cruit­ment of synapses that were “missed” by the first stim­u­la­tion bout. Sin­gle spine glu­ta­mate-uncaging ex­per­i­ments con­firmed that less than half of the spines in adult hip­pocam­pus are primed to un­dergo plas­tic­ity un­der base­line con­di­tions, sug­gest­ing that in­trin­sic vari­abil­ity among in­di­vid­ual synapses im­poses a repet­i­tive pre­sen­ta­tion re­quire­ment for max­i­miz­ing the per­cent­age of po­ten­ti­ated con­nec­tions. We pro­pose that a com­bi­na­tion of lo­cal diffu­sion from ini­tially mod­i­fied spines cou­pled with much later mem­brane in­ser­tion events dic­tate that the rep­e­ti­tions be widely spaced. Thus, the synap­tic mech­a­nisms de­scribed here pro­vide a neu­ro­bi­o­log­i­cal ex­pla­na­tion for one com­po­nent of a poorly un­der­stood, ubiq­ui­tous as­pect of learn­ing.

    ↩︎
  6. There are many stud­ies to the effect that ac­tive re­call is best. Here’s one re­cent study, “Re­trieval Prac­tice Pro­duces More Learn­ing than Elab­o­ra­tive Study­ing with Con­cept Map­ping”, Karpicke 2011 (cov­ered in Sci­ence Daily and the NYT):

    Ed­u­ca­tors rely heav­ily on learn­ing ac­tiv­i­ties that en­cour­age elab­o­ra­tive study­ing, while ac­tiv­i­ties that re­quire stu­dents to prac­tice re­triev­ing and re­con­struct­ing knowl­edge are used less fre­quent­ly. Here, we show that prac­tic­ing re­trieval pro­duces greater gains in mean­ing­ful learn­ing than elab­o­ra­tive study­ing with con­cept map­ping. The ad­van­tage of re­trieval prac­tice gen­er­al­ized across texts iden­ti­cal to those com­monly found in sci­ence ed­u­ca­tion. The ad­van­tage of re­trieval prac­tice was ob­served with test ques­tions that as­sessed com­pre­hen­sion and re­quired stu­dents to make in­fer­ences. The ad­van­tage of re­trieval prac­tice oc­curred even when the cri­te­r­ial test in­volved cre­at­ing con­cept maps. Our find­ings sup­port the the­ory that re­trieval prac­tice en­hances learn­ing by re­trieval-spe­cific mech­a­nisms rather than by elab­o­ra­tive study process­es. Re­trieval prac­tice is an effec­tive tool to pro­mote con­cep­tual learn­ing about sci­ence.

    From “Forget What You Know About Good Study Habits”, New York Times:

    Cog­ni­tive sci­en­tists do not deny that hon­est-to-good­ness cram­ming can lead to a bet­ter grade on a given ex­am. But hur­riedly jam-pack­ing a brain is akin to speed-pack­ing a cheap suit­case, as most stu­dents quickly learn - it holds its new load for a while, then most every­thing falls out­….When the neural suit­case is packed care­fully and grad­u­al­ly, it holds its con­tents for far, far longer. An hour of study tonight, an hour on the week­end, an­other ses­sion a week from now: such so-called spac­ing im­proves later re­call, with­out re­quir­ing stu­dents to put in more over­all study effort or pay more at­ten­tion, dozens of stud­ies have found.

    “The idea is that for­get­ting is the friend of learn­ing”, said Dr. Ko­r­nell. “When you for­get some­thing, it al­lows you to re­learn, and do so effec­tive­ly, the next time you see it.”

    That’s one rea­son cog­ni­tive sci­en­tists see test­ing it­self - or prac­tice tests and quizzes - as a pow­er­ful tool of learn­ing, rather than merely as­sess­ment. The process of re­triev­ing an idea is not like pulling a book from a shelf; it seems to fun­da­men­tally al­ter the way the in­for­ma­tion is sub­se­quently stored, mak­ing it far more ac­ces­si­ble in the fu­ture.

    In one of his own ex­per­i­ments, Dr. Roedi­ger and Jeffrey Karpicke, who is now at Pur­due Uni­ver­si­ty, had col­lege stu­dents study sci­ence pas­sages from a read­ing com­pre­hen­sion test, in short study pe­ri­ods. When stu­dents stud­ied the same ma­te­r­ial twice, in back­-to-back ses­sions, they did very well on a test given im­me­di­ately after­ward, then be­gan to for­get the ma­te­r­i­al. But if they stud­ied the pas­sage just once and did a prac­tice test in the sec­ond ses­sion, they did very well on one test two days lat­er, and an­other given a week lat­er.

    ↩︎
  7. The Math­e­mat­ics of Gam­bling, Thorp 1984, “Sec­tion Two: The Wheels”, Chap­ter 4, pg43-44:

    It was the spring of 1955. I was fin­ish­ing my sec­ond year of grad­u­ate physics at U.C.L.A…I changed my field of study from physics to math­e­mat­ic­s…I at­tended classes and stud­ied from 50 to 60 hours a week, gen­er­ally in­clud­ing Sat­ur­days and Sun­days. I had read about the psy­chol­ogy of learn­ing in or­der to be able to work longer and hard­er. I found that “spaced learn­ing” worked well: study for an hour, then take a break of at least ten min­utes (show­er, meal, tea, er­rands, etc.). One Sun­day after­noon about 3 p.m., I came to the co-op din­ing room for a tea break…My head was bub­bling with physics equa­tions, and sev­eral of my good friends were sit­ting around chat­ting.

    ↩︎
  8. From Fi­nal Jeop­ardy: Man Vs. Ma­chine and the Quest to Know Every­thing, by Stephen Bak­er, pg 214:

    The pro­gram he put to­gether tested him on cat­e­gories, gauged his strengths (sciences, NFL foot­ball) and weak­nesses (fash­ion, Broad­way shows), and then di­rected him to­ward the prepa­ra­tion most likely to pay off in his own match. To patch these holes in his knowl­edge, Craig used a free on­line tool called Anki, which pro­vides elec­tronic flash cards for hun­dreds of fields of study, from Japan­ese vo­cab­u­lary to Eu­ro­pean mon­archs. The pro­gram, in Craig’s words, is based on psy­cho­log­i­cal re­search on ‘the for­get­ting curve’. It helps peo­ple find holes in their knowl­edge and de­ter­mines how often they need those ar­eas to be re­viewed to keep them in mind. In go­ing over world cap­i­tals, for ex­am­ple, the sys­tem learns quickly that a user like Craig knows Lon­don, Paris, and Rome, so it might spend more time re­in­forc­ing the cap­i­tal of, say, Kaza­khstan. (And what would be the Kazakh cap­i­tal? ‘As­tana’, Craig said in a flash. ‘It used to be Al­maty, but they moved it.’)

    ↩︎
  9. “Our In­ter­view With Jeop­ardy! Cham­pion Arthur Chu”:

    [Chu:] …Jeop­ardy! is aimed at the sort of av­er­age TV view­er, so they’re not go­ing to ask things that are point­lessly ob­scure…So I used a pro­gram called Anki which uses a method called “spaced rep­e­ti­tion.” It keeps track of where you’re do­ing well or poor­ly, and pushes you to study the flash­cards you don’t know as well, un­til you de­velop an even knowl­edge base about a par­tic­u­lar sub­ject, and I just made flash­cards for those spe­cific things. I mem­o­rized all the world cap­i­tals, it was­n’t that hard once I had the flash­cards and was us­ing them every day. I mem­o­rized the US State Nick­names (they’re on Wikipedi­a), mem­o­rized the ba­sic im­por­tant facts about the 44 US Pres­i­dents. I re­ally fo­cused on those. But there’s a lot more stuff to know. I went on Jeop­ardy! know­ing that there was stuff I did­n’t know. For in­stance, every­one laughs about sports - but I also knew that [s­ports clues] were the least likely to come up in Dou­ble Jeop­ardy and Fi­nal Jeop­ardy and be very im­por­tant. So I de­cided I should­n’t sweat it too much, I should just rec­og­nize that I did­n’t know them and let that go, as long as I can get the high value clues. So that was how I pre­pared.

    ↩︎
  10. Alan J. Perlis, “Epigrams on Programming” (1982)↩︎

  11. Web de­vel­oper Per­sol writes in Au­gust 2012:

    I ac­tu­ally wrote a site that did this [spaced rep­e­ti­tion] a few months ago. I had about 4000 users who had ac­tu­ally gone through a com­plete ses­sion…As guessed, the prob­lem is that I could­n’t get peo­ple to start form­ing it as a habit. There is no im­me­di­ate pay­back. Less than 20 peo­ple out of 4000 did more than one ses­sion…Ad­di­tion­al­ly, there are at least 18 com­peti­tors. Here’s the list I made at the time. Very few seem to be suc­cess­ful. I shut the site down about a month ago. There are nu­mer­ous free com­peti­tors which don’t have any great an­noy­ances. I would­n’t sug­gest start­ing an­other of these sites un­less you fig­ured out an effec­tive way to “gam­ify” it.

    …~4000 peo­ple fin­ished a ses­sion. Many more ‘tried’ than 4000…I just could­n’t de­ter­mine which users were bots that reg­is­tered ran­domly vs users that did­n’t fin­ish the first ses­sion.

    • Tried: lots (but un­known)
    • Fin­ished 1 ses­sion: ~4000
    • Fin­ished >1 ses­sion: ~20 [0.5%]
    ↩︎
  12. “Play it Again: The Mas­ter Psy­chophar­ma­col­ogy Pro­gram as an Ex­am­ple of In­ter­val Learn­ing in Bite-Sized Por­tions”, Stahl et al 2010:

    Since Ebbing­haus’ time, a vo­lu­mi­nous amount of re­search has con­firmed this sim­ple but im­por­tant fact: the re­ten­tion of new in­for­ma­tion de­grades rapidly un­less it is re­viewed in some man­ner. A mod­ern ex­am­ple of this loss of knowl­edge with­out rep­e­ti­tion is a study of car­diopul­monary re­sus­ci­ta­tion (CPR) skills that demon­strated rapid de­cay in the year fol­low­ing train­ing. By 3 years post-train­ing only 2.4% were able to per­form CPR suc­cess­ful­ly.6 An­other re­cent study of physi­cians tak­ing a tu­to­r­ial they rated as very good or ex­cel­lent showed mean knowl­edge scores in­creas­ing from 50% be­fore the tu­to­r­ial to 76% im­me­di­ately after­ward. How­ev­er, score gains were only half as great 3-8 days later and in­cred­i­bly, there was no [s­ta­tis­ti­cal­ly-]sig­nifi­cant knowl­edge re­ten­tion mea­sur­able at all at 55 days.7 Sim­i­lar re­sults have been re­ported by us in fol­low-up stud­ies of knowl­edge re­ten­tion from con­tin­u­ing med­ical ed­u­ca­tion pro­grams.1 [S­tahl SM, Davis RL. Best Prac­tices for Med­ical Ed­u­ca­tors. Carls­bad, CA: NEI Press; 2009]

    …This may be due to the fact that lec­tures with as­signed read­ing are the eas­i­est for teach­ers. Al­so, med­ical learn­ing is rarely mea­sured im­me­di­ately after a lec­ture or after read­ing new ma­te­r­ial for the first time and then mea­sured again a few days or weeks lat­er, so that the low re­ten­tion rates of this ap­proach may not be widely ap­pre­ci­at­ed.1,4 No won­der for­mal med­ical ed­u­ca­tion con­fer­ences with­out en­abling or prac­tice-re­in­forc­ing strate­gies ap­pear to have rel­a­tively lit­tle im­pact on prac­tice and health­care out­comes.8,9,10

    ↩︎
  13. One study looking at cramming is “Cramming: A barrier to student success, a way to beat the system or an effective learning strategy?”, Vacha et al 1993; from the abstract:

    Tested the hy­poth­e­sis that cram­ming is an in­effec­tive study strat­egy by ex­am­in­ing the weekly study di­aries of 166 un­der­grad­u­ates. All Ss also com­pleted an end-of-se­mes­ter ques­tion­naire mea­sur­ing study habits. Ss were clas­si­fied in the fol­low­ing study pat­terns: ide­al, con­fi­dent, zeal­ous, or cram­mer. Con­trary to the hy­poth­e­sis, re­sults sug­gest that cram­ming is an effec­tive ap­proach, most wide­spread in courses us­ing take-home es­say ex­am­i­na­tions and ma­jor re­search pa­pers. Cram­mers’ grades were as good as or bet­ter than those of Ss us­ing other strate­gies; the longer Ss were in col­lege, the more likely it was that they crammed. Cram­mers stud­ied more hours than most stu­dents and were as in­ter­ested in their courses as other stu­dents.

    Note that there is no mea­sure of long-term re­ten­tion, sug­gest­ing that peo­ple who only care about grades are ra­tio­nally choos­ing to cram.↩︎

  14. Anki has its Cram Mode and Mnemosyne 2.0 has a cramming plugin. When an SRS doesn’t have explicit support, it’s always possible to ‘game’ the algorithm by setting one’s scores artificially low, so the SR algorithm thinks you are stupid and need to do a lot of repetitions.↩︎
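
To make the ‘gaming’ trick concrete, here is a minimal sketch of an SM-2-style update rule. It is a simplified illustration under stated assumptions, not Anki’s or Mnemosyne’s actual scheduler: the `Card` fields, the `review` and `intervals` functions, and the grade sequences are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews so far
    interval: int = 0       # days until the next review
    easiness: float = 2.5   # SM-2's starting easiness factor


def review(card: Card, grade: int) -> Card:
    """Apply one simplified SM-2-style update for a self-assessed grade in 0..5."""
    if not 0 <= grade <= 5:
        raise ValueError("grade must be between 0 and 5")
    if grade < 3:
        # A failing grade restarts the card: it comes back tomorrow,
        # and the easiness factor is left unchanged.
        return Card(repetitions=0, interval=1, easiness=card.easiness)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval * card.easiness)
    # Passing grades nudge the easiness factor, floored at 1.3.
    delta = 0.1 - (5 - grade) * (0.08 + (5 - grade) * 0.02)
    easiness = max(1.3, card.easiness + delta)
    return Card(card.repetitions + 1, interval, easiness)


def intervals(grades):
    """Return the interval (in days) scheduled after each successive grade."""
    card, scheduled = Card(), []
    for grade in grades:
        card = review(card, grade)
        scheduled.append(card.interval)
    return scheduled


honest = intervals([5, 5, 5, 5])   # intervals grow with each success
gamed = intervals([2, 2, 2, 2])    # artificially low grades: stuck at 1 day
```

With honest top grades the gaps between reviews stretch out quickly, while the deliberately low grades bring the same card back every day, which is the cramming behavior described in this footnote.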

  15. “Ex­am­in­ing the ex­am­in­ers: Why are we so bad at as­sess­ing stu­dents?”, New­stead 2002:

    Con­way, Co­hen and Stan­hope (1992) looked at long term mem­ory for the in­for­ma­tion pre­sented on a psy­chol­ogy course. They found that some types of in­for­ma­tion, es­pe­cially that re­lat­ing to re­search meth­ods, were re­mem­bered bet­ter than oth­ers. But in a fol­low up analy­sis, they found that the type of as­sess­ment used had an effect on mem­o­ry. In essence, ma­te­r­ial as­sessed by con­tin­u­ous as­sess­ment was more likely to be re­mem­bered than in­for­ma­tion as­sessed by ex­ams.

    ↩︎
  16. Stahl 2010:

    For example, simple restudying allows the learner to reexperience all of the material but actually produces poor long-term retention.26,35 Why do students keep studying the original materials? Certainly if this is their only choice, then restudying is a necessary tactic. Another answer may be that repeated studying falsely inflates students’ confidence in their ability to remember in the future because they sense that they understand it now, and they and their instructors may be unaware of the many studies that show poor retention on delayed testing after this form of repetition.25,26,35

    ↩︎
  17. From Ko­r­nell et al 2010:

    Con­trary to the mass­ing-aid­s-in­duc­tion hy­poth­e­sis, fi­nal test per­for­mance was con­sis­tently and con­sid­er­ably su­pe­rior in the spaced con­di­tion. A large ma­jor­ity of par­tic­i­pants, how­ev­er, judged mass­ing to be more effec­tive than spac­ing, de­spite mak­ing the judg­ment after tak­ing the test.

    …Metacognitive judgments - that is, judgments about one’s own memory and cognition - are often based on feelings of fluency (e.g., see Benjamin, Bjork, & Schwartz, 1998; Rhodes & Castel, 2008). Because massing naturally leads to feelings of fluency and increases short-term task performance during learning, learners frequently rate spacing as less effective than massing, even when their performance shows the opposite pattern (Baddeley & Longman 1978; Kornell & Bjork, 2008; Simon & Bjork, 2001; Zechmeister & Shaughnessy, 1980). Averaged across Kornell and Bjork’s (2008) experiments, for example, more than 80% of participants rated massing as equally or more effective than spacing, whereas only 15% of participants actually performed better in the massed condition than in the spaced condition.

    …Such an illusion was apparent in the induction condition. Contrary to previous research, however, participants gave higher ratings for spacing than massing during repetition learning (see, e.g., Simon & Bjork, 2001; Zechmeister & Shaughnessy, 1980). This outcome may have occurred because of a process of habituation: Six presentations and a total of 30 s spent studying a single painting may have come to seem inefficient and pointless. Thus, there appears to be a turning point in metacognitive ratings based on fluency: As fluency increases, metacognitive ratings increase up to a point, but as fluency continues to increase and encoding or retrieval becomes too easy, metacognitive ratings may begin to decrease.

    …In advance of their research, Kornell and Bjork (2008) were convinced that such inductive learning would benefit from massing, yet their results showed the opposite. Undaunted, we remained convinced that spacing would be more beneficial for repetition learning than for inductive learning - especially for older adults, given their overall declines in episodic memory. The current results disconfirmed our expectations once again. If our intuitions are erroneous, despite our years spent proving and praising the spacing effect - including roughly 40 years’ worth contributed by Robert A. Bjork - those of the average student are surely mistaken as well (as the inaccuracy of the participants’ metacognitive ratings suggests). We have, perhaps, fallen victim to the illusion that making learning easy makes learning effective, rather than recognizing that spacing is a desirable difficulty (Bjork 1994) that enhances inductive learning as well as repetition learning well into old age.

    ↩︎
  18. From Son & Si­mon 2012:

    Thus, while spac­ing may boost learn­ing, it may be thought to be rel­a­tively in­effi­cient in terms of study time. As we dis­cuss lat­er, this feel­ing of in­effi­ciency may be one of the rea­sons that spac­ing is not the more pop­u­lar strat­e­gy. In­ter­est­ing­ly, in that same study (Bad­de­ley & Long­man 1978; and see also Pirolli & An­der­son 1985 and Wood­worth & Schlos­berg 1954 [Ex­per­i­men­tal Psy­chol­ogy]), there was ev­i­dence of such a thing as la­bor­ing in vain. That is, ex­ceed­ing a cer­tain num­ber of hours of prac­tice a day (more than ap­prox­i­mately 2 h) led to no in­creases in learn­ing, as might be ex­pect­ed. Re­lated to the de­fi­cien­t-pro­cess­ing the­ory men­tioned above, these re­sults are cru­cial in un­der­stand­ing in­tu­itively how the spac­ing effect works: We sim­ply get burnt out. These data are also anal­o­gous to the cog­ni­tive lit­er­a­ture on over­learn­ing, which shows that while con­tin­u­ous study over long pe­ri­ods of time might seem ben­e­fi­cial (and even feel good) in the short­-term, the ben­e­fits dis­ap­pear soon after­wards (Rohrer et al. 2005; Rohrer and Tay­lor 2006)…In the above-de­scribed Bad­de­ley and Long­man’s (1978) study, for ex­am­ple, after postal work­ers prac­ticed typ­ing in ei­ther massed or spaced study ses­sions, they had to in­di­cate how sat­is­fied they were with the train­ing. Re­sults showed that while spac­ing led to the best learn­ing, it was the least liked. Sim­i­lar­ly, Si­mon & Bjork (2001) found that peo­ple pre­ferred the mass­ing strat­egy on a mo­tor learn­ing task.

    ↩︎
  19. “Study strate­gies of col­lege stu­dents: Are self­-test­ing and sched­ul­ing re­lated to achieve­ment?”, Hartwig & Dun­losky 2012:

    Pre­vi­ous stud­ies, such as those by Ko­r­nell and Bjork (Psy­cho­nomic Bul­letin & Re­view, 14:219-224, 2007) and Karpicke, But­ler, and Roedi­ger (Mem­ory, 17:471-479, 2009), have sur­veyed col­lege stu­dents’ use of var­i­ous study strate­gies, in­clud­ing self­-test­ing and reread­ing. These stud­ies have doc­u­mented that some stu­dents do use self­-test­ing (but largely for mon­i­tor­ing mem­o­ry) and reread­ing, but the re­searchers did not as­sess whether in­di­vid­ual differ­ences in strat­egy use were re­lated to stu­dent achieve­ment. Thus, we sur­veyed 324 un­der­grad­u­ates about their study habits as well as their col­lege grade point av­er­age (GPA). Im­por­tant­ly, the sur­vey in­cluded ques­tions about self­-test­ing, sched­ul­ing one’s study, and a check­list of strate­gies com­monly used by stu­dents or rec­om­mended by cog­ni­tive re­search. Use of self­-test­ing and reread­ing were both pos­i­tively as­so­ci­ated with GPA. Sched­ul­ing of study time was also an im­por­tant fac­tor: Low per­form­ers were more likely to en­gage in late-night study­ing than were high per­form­ers; mass­ing (vs. spac­ing) of study was as­so­ci­ated with the use of fewer study strate­gies over­all; and all stu­dents-but es­pe­cially low per­form­er­s-were dri­ven by im­pend­ing dead­lines. Thus, self­-test­ing, reread­ing, and sched­ul­ing of study play im­por­tant roles in re­al-world stu­dent achieve­ment.

    (See also Dunlosky et al 2013.) Note the self-testing correlation excludes flashcards, a result that both the authors and I found surprising. The sleep connection is interesting, given the hypothesized link between stronger memory formation & studying before a good night’s sleep - you can hardly get a good night’s sleep if you are cramming late into the night (correlated with lower grades) but you can if you do so at a reasonable time in the evening (in time to get a solid night).

    See also Susser & Mc­Cabe 2012:

    Lab­o­ra­tory stud­ies have demon­strated the long-term mem­ory ben­e­fits of study­ing ma­te­r­ial in mul­ti­ple dis­trib­uted ses­sions as op­posed to one massed ses­sion, given an iden­ti­cal amount of over­all study time (i.e., the spac­ing effect). The cur­rent study goes be­yond the lab­o­ra­tory to in­ves­ti­gate whether un­der­grad­u­ates know about the ad­van­tage of spaced study, to what ex­tent they use it in their own study­ing, and what fac­tors might in­flu­ence its uti­liza­tion. Re­sults from a we­b-based sur­vey in­di­cated that par­tic­i­pants (n = 285) were aware of the ben­e­fits of spaced study and would use a higher level of spac­ing un­der ideal com­pared to re­al­is­tic cir­cum­stances. How­ev­er, self­-re­ported use of spac­ing was in­ter­me­di­ate, sim­i­lar to mass­ing and sev­eral other study strate­gies, and ranked well be­low com­monly used strate­gies such as reread­ing notes. Sev­eral fac­tors were en­dorsed as im­por­tant in the de­ci­sion to dis­trib­ute study time, in­clud­ing the per­ceived diffi­culty of an up­com­ing ex­am, the amount of ma­te­r­ial to learn, how heav­ily an exam is weighed in the course grade, and the value of the ma­te­r­i­al. Fur­ther, level of metacog­ni­tive self­-reg­u­la­tion and use of elab­o­ra­tion strate­gies were as­so­ci­ated with higher rates of spaced study.

    ↩︎
  20. An­a­lytic Cul­ture in the US In­tel­li­gence Com­mu­ni­ty: An Ethno­graphic Study, John­ston 2005, pg89:

    To in­ves­ti­gate the in­ten­sity of in­struc­tional in­ter­ac­tions, Art Graesser and Na­talie Per­son 1994 com­pared ques­tion­ing and an­swer­ing in class­rooms with those in tu­to­r­ial set­tings.5 They found that class­room groups of stu­dents ask about three ques­tions an hour and that any sin­gle stu­dent in a class­room asks about 0.11 ques­tions per hour. In con­trast, they found that stu­dents in in­di­vid­ual tu­to­r­ial ses­sions asked 20-30 ques­tions an hour and were re­quired to an­swer 117-146 ques­tions per hour. Re­views of the in­ten­sity of in­ter­ac­tion that oc­curs in tech­nol­o­gy-based in­struc­tion have found even more ac­tive stu­dent re­sponse lev­els. [J. D. Fletcher, Tech­nol­o­gy, the Colum­bus Effect, and the Third Rev­o­lu­tion in Learn­ing.]

    Al­though Graesser & Per­son 1994 also found that sheer num­ber of ques­tions was not nec­es­sar­ily im­por­tant, sug­gest­ing or per­haps bad ques­tion ask­ing.↩︎

  21. “SuperMemo is based on the insight that there is an ideal moment to practice what you’ve learned. Practice too soon and you waste your time. Practice too late and you’ve forgotten the material and have to relearn it. The right time to practice is just at the moment you’re about to forget. Unfortunately, this moment is different for every person and each bit of information. Imagine a pile of thousands of flash cards. Somewhere in this pile are the ones you should be practicing right now. Which are they?” Gary Wolf, “Want to Remember Everything You’ll Ever Learn? Surrender to This Algorithm”, Wired↩︎
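
The “which are they?” question in this quote is, mechanically, just a filter over the pile. The sketch below assumes each card already carries a due date computed by some scheduler; the dictionary layout, the `due_cards` function, and the example cards are hypothetical, not SuperMemo’s data model.

```python
from datetime import date, timedelta


def due_cards(pile, today):
    """Return the cards whose scheduled review date has arrived, most overdue first."""
    due = [card for card in pile if card["due"] <= today]
    return sorted(due, key=lambda card: card["due"])


today = date(2012, 1, 10)
pile = [
    {"front": "capital of Kazakhstan", "due": today - timedelta(days=2)},
    {"front": "capital of France", "due": today + timedelta(days=90)},
    {"front": "capital of Italy", "due": today},
]

# Only the overdue card and the one due today are worth practicing now.
todays_reviews = [card["front"] for card in due_cards(pile, today)]
```

The hard part, which this sketch leaves out, is choosing each due date so that it falls near the moment of forgetting; the filter itself is trivial once that is done.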

  22. “Make no mis­take about it: Com­put­ers process num­bers - not sym­bols. We mea­sure our un­der­stand­ing (and con­trol) by the ex­tent to which we can arith­me­tize an ac­tiv­i­ty.” Perlis, ibid.↩︎

  23. This exponential expansion is how an SR program can handle continual input of cards: if cards were scheduled at fixed intervals, like every other day, review would soon become quite impossible - I have >18000 items in Mnemosyne, but I don’t have time to review 9000 questions a day!↩︎
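
The arithmetic behind this footnote can be sketched as follows. This is a toy model under loud assumptions: every card is treated as if learned on day 0, the expanding schedule simply doubles a 2-day interval each time, and no card is ever forgotten. The `review_days` function and the constants are illustrative, not how Mnemosyne schedules.

```python
def review_days(horizon_days, first_interval, factor):
    """Days since learning on which one card is reviewed, up to the horizon.

    factor == 1 keeps the interval fixed; factor > 1 expands it each time.
    """
    days, day, interval = [], 0, first_interval
    while day + interval <= horizon_days:
        day += interval
        days.append(day)
        interval *= factor
    return days


CARDS = 18000      # roughly the collection size mentioned in this footnote
HORIZON = 365      # look at one year of reviews

fixed = review_days(HORIZON, first_interval=2, factor=1)      # every other day
expanding = review_days(HORIZON, first_interval=2, factor=2)  # 2, 4, 8, ... days apart

# Average daily workload over the year for the whole collection.
fixed_per_day = CARDS * len(fixed) / HORIZON
expanding_per_day = CARDS * len(expanding) / HORIZON
```

Under these assumptions the fixed schedule costs close to 9,000 reviews a day, while the doubling schedule averages a few hundred, because each card is seen only a handful of times in the year.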

  24. See the 2008 meta-analy­sis, “Learn­ing Styles: Con­cepts and Ev­i­dence” (APS press re­lease); from the ab­stract:

    …in or­der to demon­strate that op­ti­mal learn­ing re­quires that stu­dents re­ceive in­struc­tion tai­lored to their pu­ta­tive learn­ing style, the ex­per­i­ment must re­veal a spe­cific type of in­ter­ac­tion be­tween learn­ing style and in­struc­tional method: Stu­dents with one learn­ing style achieve the best ed­u­ca­tional out­come when given an in­struc­tional method that differs from the in­struc­tional method pro­duc­ing the best out­come for stu­dents with a differ­ent learn­ing style. In other words, the in­struc­tional method that proves most effec­tive for stu­dents with one learn­ing style is not the most effec­tive method for stu­dents with a differ­ent learn­ing style.

    Our re­view of the lit­er­a­ture dis­closed am­ple ev­i­dence that chil­dren and adults will, if asked, ex­press pref­er­ences about how they pre­fer in­for­ma­tion to be pre­sented to them. There is also plen­ti­ful ev­i­dence ar­gu­ing that peo­ple differ in the de­gree to which they have some fairly spe­cific ap­ti­tudes for differ­ent kinds of think­ing and for pro­cess­ing differ­ent types of in­for­ma­tion. How­ev­er, we found vir­tu­ally no ev­i­dence for the in­ter­ac­tion pat­tern men­tioned above, which was judged to be a pre­con­di­tion for val­i­dat­ing the ed­u­ca­tional ap­pli­ca­tions of learn­ing styles. Al­though the lit­er­a­ture on learn­ing styles is enor­mous, very few stud­ies have even used an ex­per­i­men­tal method­ol­ogy ca­pa­ble of test­ing the va­lid­ity of learn­ing styles ap­plied to ed­u­ca­tion. More­over, of those that did use an ap­pro­pri­ate method, sev­eral found re­sults that flatly con­tra­dict the pop­u­lar mesh­ing hy­poth­e­sis.

    We con­clude there­fore, that at pre­sent, there is no ad­e­quate ev­i­dence base to jus­tify in­cor­po­rat­ing learn­ing-styles as­sess­ments into gen­eral ed­u­ca­tional prac­tice. Thus, lim­ited ed­u­ca­tion re­sources would bet­ter be de­voted to adopt­ing other ed­u­ca­tional prac­tices that have a strong ev­i­dence base, of which there are an in­creas­ing num­ber. How­ev­er, given the lack of method­olog­i­cally sound stud­ies of learn­ing styles, it would be an er­ror to con­clude that all pos­si­ble ver­sions of learn­ing styles have been tested and found want­i­ng; many have sim­ply not been tested at all.

    ↩︎
  25. Fritz, C. O., Mor­ris, P. E., Ac­ton, M., Etkind, R., & Voelkel, A. R (2007). “Com­par­ing and com­bin­ing ex­pand­ing re­trieval prac­tice and the key­word mnemonic for for­eign vo­cab­u­lary learn­ing”. Ap­plied Cog­ni­tive Psy­chol­ogy, 21, 499-526.↩︎

  26. From Balota et al 2006, de­scrib­ing Spitzer 1939, “Stud­ies in re­ten­tion”:

    Spitzer (1939) incorporated a form of expanded retrieval in a study designed to assess the ability of sixth graders to learn science facts. Impressively, Spitzer tested over 3600 students in Iowa - the entire sixth-grade population of 91 elementary schools at the time. The students read two articles, one on peanuts and the other on bamboo, and were given a 25-item multiple choice test to assess their knowledge (such as ‘To which family of plants does bamboo belong?’). Spitzer tested a total of nine groups, manipulating both the timing of the test (administered immediately or after various delays) and the number of identical tests students received (one to three). Spitzer did not incorporate massed or equal interval retrieval conditions, but he had at least two groups that were tested on an expanding schedule of retrieval, in which the intervals between tests were separated by the passage of time (in days) rather than by intervening to-be-learned information. For example, in one of the groups, the first test was given immediately, the second test was given seven days after the first test, and the third test was given 63 days after the second test. Thus, in essence, this group was tested on a 0-7-63 day expanding retrieval schedule. Spitzer compared performance of the expanded retrieval group to a group given a single test 63 days after reading the original article. On the first (immediate) test, the expanded retrieval group correctly answered 53% of the questions. After 63 days and two previous tests, their score was still an impressive 43%. The single test group correctly answered only 25% of the original items after 63 days, giving the expanded retrieval group an 18% retention advantage. This is quite impressive, given that this large benefit remained after a 63-day retention interval. Similar beneficial effects were found in a group tested on a 0-1-21 day expanded retrieval schedule compared to a group given a single test after 21 days. Of course, this study does not decouple the effects of testing from spacing or expansion, but the results do clearly indicate considerable learning and retention using the expanded repeated testing procedure. Spitzer concluded that ‘…examinations are learning devices and should not be considered only as tools for measuring achievement of pupils’ (p. 656, italics added)

    ↩︎
  27. Vlach & Sandhofer 2012:

    The spac­ing effect de­scribes the ro­bust find­ing that long-term learn­ing is pro­moted when learn­ing events are spaced out in time, rather than pre­sented in im­me­di­ate suc­ces­sion. Stud­ies of the spac­ing effect have fo­cused on mem­ory processes rather than for other types of learn­ing, such as the ac­qui­si­tion and gen­er­al­iza­tion of new con­cepts. In this study, early el­e­men­tary school chil­dren (5-7 year-olds; N = 36) were pre­sented with sci­ence lessons on one of three sched­ules: massed, clumped, and spaced. The re­sults re­vealed that spac­ing lessons out in time re­sulted in higher gen­er­al­iza­tion per­for­mance for both sim­ple and com­plex con­cepts. Spaced learn­ing sched­ules pro­mote sev­eral types of learn­ing, strength­en­ing the im­pli­ca­tions of the spac­ing effect for ed­u­ca­tional prac­tices and cur­ricu­lum.

    ↩︎
  28. See also Balch 2006, who com­pared spac­ing & massed in an in­tro­duc­tory psy­chol­ogy course as well.↩︎

  29. Roedi­ger & Karpicke 2006b again.↩︎

  30. Balota et al 2006 re­view:

    No feedback or correction was given to subjects if they made errors or omitted answers. Landauer & Bjork 1978 found that the expanding-interval schedule produced better recall than equal-interval testing on a final test at the end of the session, and equal-interval testing, in turn, produced better recall than did initial massed testing. Thus, despite the fact that massed testing produced nearly errorless performance during the acquisition phase, the other two schedules produced better retention on the final test given at the end of the session. However, the difference favoring the expanding retrieval schedule over the equal-interval schedule was fairly small at around 10%. In research following up Landauer and Bjork’s (1978) original experiments, practically all studies have found that spaced schedules of retrieval (whether equal-interval or expanding schedules) produce better retention on a final test given later than do massed retrieval tests given immediately after presentation (e.g., Cull, 2000; Cull, Shaughnessy, & Zechmeister, 1996), although exceptions do exist. For example, in Experiments 3 and 4 of Cull et al (1996), massed testing produced performance as good as equal-interval testing on a 5-5-5 schedule, but most other experiments have found that any spaced schedule of testing (either equal-interval or expanding) is better than a massed schedule for performance on a delayed test. However, whether expanding schedules are better than equal-interval schedules for long-term retention - the other part of Landauer and Bjork’s interesting findings - remains an open question. Balota, Duchek, and Logan (in press) have provided a thorough consideration of the relevant evidence and have shown that it is mixed at best, and that most researchers have found no difference between the two schedules of testing. That is, performance on a final test at the end of a session often shows no difference in performance between equal-interval and expanding retrieval schedules.

    Cull, for those cu­ri­ous (Cull, W. L. (2000). “Un­tan­gling the ben­e­fits of mul­ti­ple study op­por­tu­ni­ties and re­peated test­ing for cued re­call”. Ap­plied Cog­ni­tive Psy­chol­ogy, 14, 215-235):

    Cull (2000) com­pared ex­panded re­trieval to equal in­ter­val spaced re­trieval in a se­ries of four ex­per­i­ments de­signed to mimic typ­i­cal teach­ing or study strate­gies en­coun­tered by stu­dents. He ex­am­ined the role of test­ing ver­sus sim­ply restudy­ing the ma­te­ri­al, feed­back, and var­i­ous re­ten­tion in­ter­vals on fi­nal test per­for­mance. Paired as­so­ciates (an un­com­mon word paired with a com­mon word, such as bairn-print) were pre­sented in a man­ner sim­i­lar to the flash­card tech­niques stu­dents often use to learn vo­cab­u­lary words. The in­ter­vals be­tween re­trieval at­tempts of to-be-learned in­for­ma­tion ranged from min­utes in some ex­per­i­ments to days in oth­ers. In­ter­est­ing­ly, across four ex­per­i­ments, Cull did not find any ev­i­dence of an ad­van­tage of an ex­panded con­di­tion over a uni­form spaced con­di­tion (i.e., no [sub­stan­tial] ex­panded re­trieval effec­t), al­though both con­di­tions con­sis­tently pro­duced large ad­van­tages over massed pre­sen­ta­tions. He con­cluded that dis­trib­uted test­ing of any kind, ex­panded or equal in­ter­val, can be an effec­tive learn­ing aid for teach­ers to pro­vide for their stu­dents.

    ↩︎
  31. The Balota et al 2006 re­view offers a syn­the­sis of cur­rent the­o­ries on how massed and spaced differ, based on :

    Ac­cord­ing to en­cod­ing vari­abil­ity the­o­ry, per­for­mance on a mem­ory test is de­pen­dent upon the over­lap be­tween the con­tex­tual in­for­ma­tion avail­able at the time of test and the con­tex­tual in­for­ma­tion avail­able dur­ing en­cod­ing. Dur­ing massed study, there is rel­a­tively lit­tle time for con­tex­tual el­e­ments to fluc­tu­ate be­tween pre­sen­ta­tions and so this con­di­tion pro­duces the high­est per­for­mance in an im­me­di­ate mem­ory test, when the test con­text strongly over­laps with the same con­tex­tual in­for­ma­tion en­coded dur­ing both of the massed pre­sen­ta­tions. In con­trast, when there is spac­ing be­tween the items, there is time for fluc­tu­a­tion to take place be­tween the pre­sen­ta­tions dur­ing study, and hence there is an in­creased like­li­hood of hav­ing mul­ti­ple unique con­texts en­cod­ed. Be­cause a de­layed test will also al­low fluc­tu­a­tion of con­text, it is bet­ter to have mul­ti­ple unique con­texts en­cod­ed, as in the spaced pre­sen­ta­tion for­mat, as op­posed to a sin­gle en­coded con­text, as in the massed pre­sen­ta­tion for­mat.

    Storm et al 2010 did 3 ex­per­i­ments on read­ing com­pre­hen­sion:

    On a test 1 week lat­er, re­call was en­hanced by the ex­pand­ing sched­ule, but only when the task be­tween suc­ces­sive re­trievals was highly in­ter­fer­ing with mem­ory for the pas­sage. These re­sults sug­gest that the ex­tent to which learn­ers ben­e­fit from ex­pand­ing re­trieval prac­tice de­pends on the de­gree to which the to-be-learned in­for­ma­tion is vul­ner­a­ble to for­get­ting.

    ↩︎
  32. From Mnemosyne’s Prin­ci­ples page:

    The Mnemosyne al­go­rithm is very sim­i­lar to SM2 used in one of the early ver­sions of Su­per­Me­mo. There are some mod­i­fi­ca­tions that deal with early and late rep­e­ti­tions, and also to add a small, healthy dose of ran­dom­ness to the in­ter­vals. Su­per­memo now uses SM11. How­ev­er, we are a bit skep­ti­cal that the huge com­plex­ity of the newer SM al­go­rithms pro­vides for a sta­tis­ti­cally rel­e­vant ben­e­fit. But, that is one of the facts we hope to find out with our data col­lec­tion. We will only make mod­i­fi­ca­tions to our al­go­rithms based on com­mon sense or if the data tells us that there is a sta­tis­ti­cally rel­e­vant rea­son to do so.

    ↩︎
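Footnote 32 describes Mnemosyne's scheduler as close to SM2. For readers who have not seen that rule, the following is a minimal Python sketch of SM-2 as it is publicly described by SuperMemo's author. It is not Mnemosyne's code: the adjustments for early and late repetitions and the added randomness mentioned on the Principles page are left out. The `Card` type and `review` function are names invented for this illustration. Two details on which implementations differ are settled here by assumption: a failed review leaves the easiness untouched, and the freshly updated easiness is the one that multiplies the interval.

```python
import math
from dataclasses import dataclass


@dataclass
class Card:
    """Scheduling state for one flashcard (hypothetical type for this sketch)."""
    repetitions: int = 0    # consecutive successful reviews so far
    interval_days: int = 0  # gap in days until the next review
    easiness: float = 2.5   # SM-2 "E-factor": starts at 2.5, never below 1.3


def review(card: Card, grade: int) -> Card:
    """Return the card's new state after a review graded 0..5 (5 = perfect)."""
    if not 0 <= grade <= 5:
        raise ValueError("grade must be between 0 and 5")

    if grade < 3:
        # Failed recall: restart the interval sequence from the beginning.
        # Assumption: the easiness is left unchanged on failure.
        return Card(repetitions=0, interval_days=1, easiness=card.easiness)

    # Easiness update from the published SM-2 description, floored at 1.3.
    miss = 5 - grade
    easiness = max(1.3, card.easiness + 0.1 - miss * (0.08 + miss * 0.02))

    repetitions = card.repetitions + 1
    if repetitions == 1:
        interval = 1
    elif repetitions == 2:
        interval = 6
    else:
        # Later intervals grow multiplicatively; fractions are rounded up.
        interval = math.ceil(card.interval_days * easiness)

    return Card(repetitions=repetitions, interval_days=interval, easiness=easiness)
```

Under these assumptions, a new card graded 5 three times in a row is scheduled at 1, 6, and then 17 days, which is the expanding pattern the literature in the surrounding footnotes is testing.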
  33. Balota et al 2006:

    Car­pen­ter and De­Losh (2005, Exp. 2) have re­cently in­ves­ti­gated face-name learn­ing un­der massed, ex­panded (1-3-5), and equal in­ter­val (3-3-3) con­di­tions. This study also in­volved study and study and test pro­ce­dures dur­ing the ac­qui­si­tion phase. Car­pen­ter and De­Losh found a large effect of spac­ing, but no ev­i­dence of a ben­e­fit of ex­panded over equal in­ter­val prac­tice. In fact, Car­pen­ter and De­Losh re­ported a re­li­able ben­e­fit of the equal in­ter­val con­di­tion over the ex­panded re­trieval con­di­tion.

    ↩︎
  34. Balota et al 2006 again:

    Rea and Modigliani (1985) tested the effectiveness of expanded retrieval in a third-grade classroom setting. In separate conditions, students were given new multiplication problems or spelling words to learn. The problem or word was presented audiovisually once and then tested on either a massed retrieval schedule of 0-0-0-0 or an expanding schedule of 0-1-2-4, in which the intervals involved being tested on old items or learning new items. After each test trial for a given item, the item was re-presented in its entirety so students received feedback on what they were learning. Performance during the learning phase was at 100% for both spelling words and multiplication facts. On an immediate final retention test, Rea and Modigliani found a performance advantage for all items - math and spelling - practiced on an expanding schedule compared to the massed retrieval schedule. They suggested, as have others, that spacing combined with the high success rate inherent in the expanded retrieval schedule produced better retention than massed retrieval practice. However, as in Spitzer’s study, Rea and Modigliani did not test an appropriate equal interval spacing condition. Hence, their finding that expanded retrieval is superior to massed retrieval in third graders could simply reflect the superiority of spaced versus massed rehearsal - in other words, the spacing effect.

    ↩︎
  35. .↩︎

  36. Balota et al 2006; >1 is rare in psy­chol­o­gy, see “One Hun­dred Years of So­cial Psy­chol­ogy Quan­ti­ta­tively De­scribed”, Bond et al 2003↩︎

  37. Rohrer & Tay­lor 2006↩︎

  38. Balota et al 2006:

    …long-term re­ten­tion of in­for­ma­tion has been demon­strated over sev­eral days in some cases (e.g., Camp et al, 1996). For ex­am­ple, in the lat­ter study, Camp et al em­ployed an ex­pand­ing re­trieval strat­egy to train 23 in­di­vid­u­als with mild to mod­er­ate AD to re­fer to a daily cal­en­dar as a cue to re­mem­ber to per­form var­i­ous per­sonal ac­tiv­i­ties (e.g., take med­ica­tion). Fol­low­ing a base­line phase to de­ter­mine whether sub­jects would spon­ta­neously use the cal­en­dar, spaced re­trieval train­ing was im­ple­mented by re­peat­edly ask­ing the sub­ject the ques­tion, ‘How are you go­ing to re­mem­ber what to do each day?’ at ex­pand­ing time in­ter­vals. The re­sults in­di­cated that 20/23 sub­jects did learn the strat­egy (i.e., to look at the cal­en­dar) and re­tained it over a 1-week pe­ri­od.

    ↩︎
  39. Rohrer & Tay­lor 2006 warns us, though, about many of the other math stud­ies:

    In one meta-analy­sis by Dono­van and Ra­do­se­vich (1999), for in­stance, the size of the spac­ing effect de­clined sharply as con­cep­tual diffi­culty of the task in­creased from low (e.g. ro­tary pur­suit) to av­er­age (e.g. word list re­call) to high (e.g. puz­zle). By this find­ing, the ben­e­fits of spaced prac­tise may be muted for many math­e­mat­ics tasks.

    ↩︎
  40. What is especially nice about this study is that not only did it use high-quality (intelligent & motivated) college students, but the conditions were also relatively controlled: both groups had the same homework (so equal testing effect), and, as in Rohrer & Taylor 2006/2007, only the distribution varied:

    The course topics, textbook, handouts, reading assignments, and graded assignments (with the exception of quiz, homework, and participation points) were identical for the treatment and control groups. The listing of homework assignments in the syllabus differed between groups. The control group was assigned daily homework related to the topic(s) presented that day in class. Peterson (1971) calls this the vertical model for assigning mathematics homework. The treatment group was assigned homework in accordance with a distributed organizational pattern that combines practice on current topics and reinforcement of previously covered topics. Under the distributed model, approximately 40% of the problems on a given topic were assigned the day the topic was first introduced, with an additional 20% assigned on the next lesson and the remaining 40% of problems on the topic assigned on subsequent lessons (Hirsch et al, 1983). In Hirsch’s research and in this study, after the initial homework assignment, problem(s) representing a given topic resurfaced on the 2nd, 4th, 7th, 12th, and 21st lesson. Consequently, treatment group homework for lesson one consisted of only one topic; homework for lessons two and three consisted of two topics; and homework for lesson four through six consisted of three topics. This pattern continued as new topics were added and was applied to all non-exam, non-laboratory lessons. As shown by Tables 1 and 2, the same homework problems were assigned to both groups with only the pattern of assignment differing. Because of the nature of the distributed practice model, homework for the treatment group contained fewer problems (relative to the control group) early in the semester with the number of problems increasing as the semester progressed. Later in the semester, homework for the treatment group contained more problems (relative to the control group)….The USAFA routinely collects study time data. After each exam, a large sample of cadets (at least 60% of the course population) anonymously reported the amount of time (in minutes) spent studying for the exam. Time spent studying was approximately equal for both groups (see Table 5). Descriptive data reveals that, for both the treatment and control group, study time for the third exam was at least 16% greater than study time for any other exam. Study time for the final exam was at least 68% greater than study time for any of the hourly exams (see Table 5)

    …The treatment produced an effect size (f²) of 0.013 on the first exam, 0.029 on the second exam, 0.035 on the fourth exam, and 0.040 on the final course percentage grade. Although the effect sizes appear to be small, the treatment group outscored the control group in every case. A mean difference of 5.13 percentage points on the first, second, and fourth exam translates to an advantage of about a third of a letter grade for students in the treatment group. In addition, higher minimum scores earned by the treatment group may indicate that the distributed practice treatment served to eliminate the extremely low scores (refer to Table 3)….Oddly, the distributed practice treatment did not produce a [statistically-]significant effect on final exam scores. One possible cause for the disparity was the USAFA policy exempting the top performers from the final exam. Of the 16 exempted students, 11 were from the treatment group with only 5 from the control group.

    ↩︎
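The distributed homework pattern quoted in footnote 40 can be checked with a short Python sketch. It reads "resurfaced on the 2nd, 4th, 7th, 12th, and 21st lesson" as counting from the lesson that introduces the topic, and it assumes that lesson n introduces exactly one topic, numbered n. That one-topic-per-lesson simplification is an assumption of this sketch, as are the names `RESURFACE_ON` and `topics_for_lesson`.

```python
from typing import List

# Lessons (counted from a topic's introduction = 1) on which its problems appear.
RESURFACE_ON = (1, 2, 4, 7, 12, 21)


def topics_for_lesson(lesson: int) -> List[int]:
    """Topics with homework problems in `lesson`, assuming lesson n introduces topic n."""
    topics = []
    for nth in RESURFACE_ON:
        topic = lesson - (nth - 1)  # the topic that is on its nth lesson today
        if topic >= 1:
            topics.append(topic)
    return sorted(topics)


if __name__ == "__main__":
    for lesson in range(1, 8):
        print(lesson, topics_for_lesson(lesson))
```

Under this reading, lesson one covers one topic, lessons two and three cover two, and lessons four through six cover three, matching the counts given in the quote. The 40%/20%/40% split of each topic's problems is not modeled here.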
  41. Balch 2006 ab­stract:

    Two in­tro­duc­tory psy­chol­ogy classes (N = 145) par­tic­i­pated in a coun­ter­bal­anced class­room ex­per­i­ment that demon­strated the spac­ing effect and, by anal­o­gy, the ben­e­fits of dis­trib­uted study. After hear­ing words pre­sented twice in ei­ther a massed or dis­trib­uted man­ner, par­tic­i­pants re­called the words and scored their re­call pro­to­cols, re­li­ably re­mem­ber­ing more dis­trib­uted than massed words. Posttest scores on a mul­ti­ple-choice quiz cov­er­ing points il­lus­trated by the ex­per­i­ment av­er­aged about twice the com­pa­ra­ble pretest scores, in­di­cat­ing the effec­tive­ness of the ex­er­cise in con­vey­ing con­tent. Stu­dents’ sub­jec­tive rat­ings sug­gested that the ex­per­i­ment helped con­vince them of the ben­e­fits of dis­trib­uted study.

    ↩︎
  42. See ↩︎

  43. Com­mins, S., Cun­ning­ham, L., Har­vey, D., and Wal­sh, D. (2003). “Massed but not spaced train­ing im­pairs spa­tial mem­ory”. Be­hav­ioural Brain Re­search 139, 215-223↩︎

  44. Gal­luc­cio & Rovee-Col­lier 2006, “Nonuni­form effects of re­in­state­ment within the time win­dow”. Learn­ing and Mo­ti­va­tion, 37, 1-17.↩︎

  45. See the pre­vi­ous sec­tions for many us­ing chil­dren; one pre­vi­ously uncited is Top­pino 1993, “The spac­ing effect in preschool chil­dren’s free re­call of pic­tures and words”; but Top­pino et al 2009 adds some in­ter­est­ing qual­i­fiers to spaced rep­e­ti­tion in the young:

    Preschool­ers, el­e­men­tary school chil­dren, and col­lege stu­dents ex­hib­ited a spac­ing effect in the free re­call of pic­tures when learn­ing was in­ten­tion­al. When learn­ing was in­ci­den­tal and a shal­low pro­cess­ing task re­quir­ing lit­tle se­man­tic pro­cess­ing was used dur­ing list pre­sen­ta­tion, young adults still ex­hib­ited a spac­ing effect, but chil­dren con­sis­tently failed to do so. Chil­dren, how­ev­er, did man­i­fest a spac­ing effect in in­ci­den­tal learn­ing when an elab­o­rate se­man­tic pro­cess­ing task was used.

    ↩︎
  46. An­other pre­vi­ously uncited study: Glen­berg, A. M. (1979), “Com­po­nen­t-levels the­ory of the effects of spac­ing of rep­e­ti­tions on re­call and recog­ni­tion”. Mem­ory & Cog­ni­tion, 7, 95-112.↩︎

  47. See Ko­r­nell et al 2010; Si­mone et al 2012 shows the spac­ing ben­e­fits but re­duced in mag­ni­tude in its 56-74 year old sub­jects, sim­i­lar to Jack­son et al 2012 and Mad­dox 2013↩︎

  48. Mammarella, N., Russo, R., & Avons, S. E. (2002). “Spacing effects in cued-memory tasks for unfamiliar faces and nonwords”. Memory & Cognition, 30, 1238-1251↩︎

  49. Childers, J. B., & Tomasello, M. (2002). “Two-year-olds learn novel nouns, verbs, and conventional actions from massed or distributed exposures”. Developmental Psychology, 38, 967-978↩︎

  50. eg. Fish­man et al 1968↩︎

  51. The famous ‘10,000 hours of practice’ figure may not be as true or important as Ericsson and publicizers like Malcolm Gladwell imply, given the high of expertise against time, and results from sports showing smaller time investments (see also Hambrick’s corpus cutting ‘deliberate practice’ down to size); Ericsson also absurdly denies the powerful role of genetics and the necessary condition of having talent, but the insight of ‘deliberate practice’ helping talented people probably is real. One may be able to get away with 3,000 hours rather than 10,000, but one isn’t going to do that with mindless repetition or no repetitions.↩︎

  52. Gen­tner, D., Loewen­stein, J., & Thomp­son, L. (2003). “Learn­ing and trans­fer: A gen­eral role for ana­log­i­cal en­cod­ing”. Jour­nal of Ed­u­ca­tional Psy­chol­ogy, 95, 393-40↩︎

  53. From Ko­r­nell et al 2010:

    The ben­e­fits of spac­ing seem to di­min­ish or dis­ap­pear when to-be-learned items are not re­peated ex­actly (Ap­ple­ton-K­napp, Bjork, & Wick­ens, 2005)…a num­ber of stud­ies have shown that mass­ing, rather than spac­ing, pro­motes in­duc­tive learn­ing. These stud­ies have gen­er­ally em­ployed rel­a­tively sim­ple per­cep­tual stim­uli that fa­cil­i­tate ex­per­i­men­tal con­trol (Gag­né, 1950; Gold­stone, 1996; Kurtz & Hov­land, 1956; [Whit­man J. R., & Gar­ner, W. R. (1963). “Con­cept learn­ing as a func­tion of the form of in­ter­nal struc­ture”. Jour­nal of Ver­bal Learn­ing & Ver­bal Be­hav­ior, 2, 195-202]).

    ↩︎
  54. High er­ror rates - in­di­cat­ing one did­n’t ac­tu­ally learn the card con­tents in the first place - seem to be con­nected to fail­ures of the spac­ing effect; there’s some ev­i­dence that peo­ple nat­u­rally choose to mass study when they don’t yet know the ma­te­r­i­al.↩︎

  55. “Su­per­Memo as a new tool in­creas­ing the pro­duc­tiv­ity of a pro­gram­mer. A case study: pro­gram­ming in Ob­ject Win­dows”↩︎

  56. The 20 years look like this (note the ): [0.742675, 0.27044575182838654, 0.15275979054767388, 0.10348750000000001, 7.751290630254386e-2, 6.187922936397532e-2, 5.161829250474865e-2, 4.445884397854832e-2, 3.923055555555555e-2, 3.5275438307530015e-2, 3.219809429218694e-2, 2.9748098818459235e-2, 2.7759942051635768e-2, 2.6120309801216147e-2, 2.474928593068675e-2, 2.35890625e-2, 2.2596898475825956e-2, 2.1740583401051353e-2, 2.0995431241707652e-2, 2.0342238287817983e-2]↩︎

  57. modulo things where knowing it is useful even if you don’t need it often - it can be a brick in a pyramid of knowledge; cf. page 3 of Wolf:

    The prob­lem of for­get­ting might not tor­ment us so much if we could only con­vince our­selves that re­mem­ber­ing is­n’t im­por­tant. Per­haps the things we learn - words, dates, for­mu­las, his­tor­i­cal and bi­o­graph­i­cal de­tails - don’t re­ally mat­ter. Facts can be looked up. That’s what the In­ter­net is for. When it comes to learn­ing, what re­ally mat­ters is how things fit to­geth­er. We mas­ter the sto­ries, the schemas, the frame­works, the par­a­digms; we re­hearse the lin­go; we swim in the epis­teme.

    The dis­ad­van­tage of this com­fort­ing no­tion is that it’s false. “The peo­ple who crit­i­cize mem­o­riza­tion - how happy would they be to spell out every let­ter of every word they read?” asks Robert Bjork, chair of UCLA’s psy­chol­ogy de­part­ment and one of the most em­i­nent mem­ory re­searchers. After all, Bjork notes, chil­dren learn to read whole words through in­tense prac­tice, and every time we en­ter a new field we be­come chil­dren again. “You can’t es­cape mem­o­riza­tion,” he says. “There is an ini­tial process of learn­ing the names of things. That’s a stage we all go through. It’s all the more im­por­tant to go through it rapid­ly.” The hu­man brain is a mar­vel of as­so­cia­tive pro­cess­ing, but in or­der to make as­so­ci­a­tions, data must be loaded into mem­o­ry.

    ↩︎
  58. See Stephen R. Schmidt’s web­page “The­o­ries of For­get­ting”, which cites ‘Wood­worth & Schlos­beg (1961)’ when pre­sent­ing a log graph of var­i­ous stud­ies’ for­get­ting curves.↩︎

  59. which neatly ad­dresses the is­sue of such mail­ing lists be­ing use­less (‘who learns a word after just one ex­po­sure?’).↩︎

  60. Mnemosyne in this case con­sti­tutes both a way to learn the quotes so I can use them, and a ; just the other day I had 3 or 4 ap­po­site quotes for an es­say be­cause I had en­tered them into Mnemosyne months or years ago.↩︎

  61. It’s well known that any speaker of a lan­guage un­der­stands many more words than they will ever use or be able to ex­plic­itly gen­er­ate, that their “read­ing vo­cab­u­lary” ex­ceeds their “writ­ing vo­cab­u­lary”; less well-known is that on many prob­lems, one can guess at well above ran­dom rates even while feel­ing un­sure & ig­no­rant, ne­ces­si­tat­ing psy­chol­o­gists to em­ploy forced-choice par­a­digms. Even less known is the ca­pac­ity of or “im­plicit mem­ory”; this mem­ory can ap­ply to things like rec­og­niz­ing im­ages or text or mu­sic, typ­ing, puz­zle solv­ing, etc. An­drew Druck­er, in , em­ploys vi­sual mem­ory to cal­cu­late ; he cites as prece­dent Stand­ing 1973:

    In one of the most wide­ly-cited stud­ies on recog­ni­tion mem­o­ry, Stand­ing showed par­tic­i­pants an epic 10,000 pho­tographs over the course of 5 days, with 5 sec­onds’ ex­po­sure per im­age. He then tested their fa­mil­iar­i­ty, es­sen­tially as de­scribed above. The par­tic­i­pants showed an 83% suc­cess rate, sug­gest­ing that they had be­come fa­mil­iar with about 6,600 im­ages dur­ing their or­deal. Other vol­un­teers, trained on a smaller col­lec­tion of 1,000 im­ages se­lected for vivid­ness, had a 94% suc­cess rate.

    One some­times sees peo­ple ar­gue that some­thing is in­se­cure or unguess­able or free from pos­si­ble placebo effect be­cause it in­volves too many ob­jects to ex­plic­itly mem­o­rize, but as these ex­am­ples make clear, recog­ni­tion mem­ory can hap­pen quickly and store sur­pris­ingly large amounts of in­for­ma­tion. This could be used for au­then­ti­ca­tion (see for ex­am­ple Bo­ji­nov et al 2012; HN dis­cus­sion) or mes­sage since recog­ni­tion mem­ory could be ex­ploited as a sort of se­cure com­mu­ni­ca­tion sys­tem. Two par­ties can share a set of 20,000 pho­tographs (10,000 pairs); to send a mes­sage, have a mes­sen­ger spend 5 days on 10,000 picked ones; and then to re­ceive it, ask him to rec­og­nize which pho­to­graph he saw in each of the 10,000 pairs. The sub­ject not only does not know what the bi­nary mes­sage is or what means, he can’t even pro­duce it since he can­not re­mem­ber the pho­tographs!

    At an 80% ac­cu­racy rate, we can even cal­cu­late how many bits of in­for­ma­tion can be en­trusted to the mes­sen­ger us­ing ; a cal­cu­la­tion gives 5.8 kilo­bits as the up­per lim­it: if p = 0.2 (based on the 80% suc­cess rate), then . So we see that was right after all: the se­curest way to send a mes­sage is through a dis­trans mes­sen­ger. (The down­side is that the im­plicit recog­ni­tion mem­ory de­cays con­sid­er­ably; see Lan­dauer 1986 for ad­justed es­ti­mates.)↩︎
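The arithmetic in footnote 61 can be sketched in Python under two stated assumptions. First, Standing's "about 6,600 images" is taken to be a guessing correction for a two-alternative forced choice: if a fraction k of images is truly recognized and the rest are guessed at 50%, then the success rate is k + (1 − k)/2. Second, the messenger is modeled as a binary symmetric channel with error rate p, whose capacity is 1 − H(p) bits per pair, where H is the binary entropy. Both readings, and the function names, are assumptions of this sketch rather than the calculation used in the text.

```python
import math


def items_known(success_rate: float, n_items: int) -> float:
    """Items truly recognized, correcting a two-choice score for lucky guesses.

    success = k + (1 - k) / 2  implies  k = 2 * success - 1.
    """
    return (2 * success_rate - 1) * n_items


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, taken as 0 at p = 0 and p = 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity_bits(error_rate: float, n_uses: int) -> float:
    """Upper bound on reliably sent bits over n uses of a binary symmetric channel."""
    return n_uses * (1 - binary_entropy(error_rate))


if __name__ == "__main__":
    print(items_known(0.83, 10_000))       # Standing's 10,000-photograph group
    print(items_known(0.94, 1_000))        # the 1,000 vivid images
    print(bsc_capacity_bits(0.2, 10_000))  # 10,000 pairs at 80% accuracy
```

The guessing correction reproduces the quoted 6,600 figure. The channel-capacity bound, however, comes out near 2.8 kilobits for p = 0.2 and 10,000 pairs, not the 5.8 kilobits given in the text, which suggests that figure rests on a different formula or different inputs than the ones assumed here.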

  62. In this vein, I am re­minded of what a for­mer told me:

    I’ve been polypha­sic for about a year. (Not any­more; kills my mem­o­ry.)…Anki reps, most­ly. I found that I could do proper re­view ses­sions for about 2-3 days and would hit an im­pen­e­tra­ble wall. I could­n’t learn a sin­gle new card and had to­tal brain fog un­til I got 3 hours more sleep. That, how­ev­er, would re­set my adap­ta­tion. The whole effect is a bit less pro­nounced on Every­man, but not much. It is how­ever eas­ier to add sleep when you al­ready have a core. I did­n’t no­tice any other ma­jor men­tal im­pair­ment after the ini­tial sleep de­pri­va­tion.

    ↩︎
  63. For a more re­cent re­view, see Philips et al 2013.↩︎

  64. Pre­sum­ably one would im­me­di­ately give them all some high grade like 5 to avoid sud­denly hav­ing a daily load of 500 cards for a while.↩︎

  65. Smaller is bet­ter.↩︎

  66. “For Mnemosyne 2.x, Ull­rich is work­ing on an offi­cial Mnemosyne iPhone client which will have very easy sync­ing.”↩︎

  67. Wired↩︎

  68. See page 4, Wolf 2008:

    The spac­ing effect was one of the proud­est lab-derived dis­cov­er­ies, and it was in­ter­est­ing pre­cisely be­cause it was not ob­vi­ous, even to pro­fes­sional teach­ers. The same year that Neisser re­volt­ed, Robert Bjork, work­ing with Thomas Lan­dauer of Bell Labs, pub­lished the re­sults of two ex­per­i­ments in­volv­ing nearly 700 un­der­grad­u­ate stu­dents. Lan­dauer and Bjork were look­ing for the op­ti­mal mo­ment to re­hearse some­thing so that it would later be re­mem­bered. Their re­sults were im­pres­sive: The best time to study some­thing is at the mo­ment you are about to for­get it. And yet - as Neisser might have pre­dicted - that in­sight was use­less in the real world.

    ↩︎
  69. When I first read of Su­per­Me­mo, I had al­ready taken a class in and was rea­son­ably fa­mil­iar with Ebbing­haus’s for­get­ting curve - so my re­ac­tion to its method­ol­ogy was Hux­ley’s: “How ex­tremely stu­pid not to have thought of that!”↩︎

  70. See page 7, Wolf 2008:

    And yet now, as I grin broadly and wave to the gawk­ers, it oc­curs to me that the cold ra­tio­nal­ity of his ap­proach may be only a sur­face fea­ture and that, when linked to gen­uine re­wards, even the chill­i­est of sys­tems can have a cer­tain vis­ceral ap­peal. By pro­ject­ing the achieve­ment of ex­treme mem­ory back along the for­get­ting curve, by prov­ably link­ing the dis­tant fu­ture - when we will know so much - to the few min­utes we de­vote to study­ing to­day, Woz­niak has found a way to con­di­tion his tem­pera­ment along with his mem­o­ry. He is mak­ing the fu­ture no­tice­able. He is try­ing not just to learn many things but to warm the process of learn­ing it­self with a draft of utopian ec­sta­sy.

    ↩︎