Spaced-repetition (Link Bibliography)

“Spaced-repetition” links:

  1. Prediction-markets#predictionbook-nights

  2. 1988-bjork.pdf






  8. 1993-bahrick.pdf: Harry P. Bahrick, Lorraine E. Bahrick, Audrey S. Bahrick, Phyllis E. Bahrick (1993-09-01; spaced-repetition):

    In a 9-year longitudinal investigation, 4 subjects learned and relearned 300 English-foreign language word pairs. Either 13 or 26 relearning sessions were administered at intervals of 14, 28, or 56 days. Retention was tested for 1, 2, 3, or 5 years after training terminated. The longer intersession intervals slowed down acquisition slightly, but this disadvantage during training was offset by substantially higher retention. 13 retraining sessions spaced at 56 days yielded retention comparable to 26 sessions spaced at 14 days. The retention benefit due to additional sessions was independent of the benefit due to spacing, and both variables facilitated retention of words regardless of difficulty level and of the consistency of retrieval during training. The benefits of spaced retrieval practice to long-term maintenance of access to academic knowledge areas are discussed.




  12. #when-to-review






  18. 1984-thorp-themathematicsofgambling-ch4.pdf

  19. 2011-qs-rogercraigwinsjeopardy.html#comment-3004








  27. 1982-perlis.pdf: Alan J. Perlis (1982-09-01; cs):

    [130 epigrams on computer science and technology, published in 1982, for ACM’s SIGPLAN journal, by noted computer scientist and programming language researcher Alan Perlis. The epigrams are a series of short, programming-language-neutral, humorous statements about computers and programming, distilling lessons he had learned over his career, which are widely quoted.]

    8. A programming language is low level when its programs require attention to the irrelevant….19. A language that doesn’t affect the way you think about programming, is not worth knowing….54. Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.

    15. Everything should be built top-down, except the first time….30. In programming, everything we do is a special case of something more general—and often we know it too quickly….31. Simplicity does not precede complexity, but follows it….58. Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it….65. Make no mistake about it: Computers process numbers—not symbols. We measure our understanding (and control) by the extent to which we can arithmetize an activity….56. Software is under a constant tension. Being symbolic it is arbitrarily perfectible; but also it is arbitrarily changeable.

    1. One man’s constant is another man’s variable. 34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.

    36. The use of a program to prove the 4-color theorem will not change mathematics—it merely demonstrates that the theorem, a challenge for a century, is probably not important to mathematics.

    39. Re graphics: A picture is worth 10K words—but only those to describe the picture. Hardly any sets of 10K words can be adequately described with pictures.

    48. The best book on programming for the layman is Alice in Wonderland; but that’s because it’s the best book on anything for the layman.

    77. The cybernetic exchange between man, computer and algorithm is like a game of musical chairs: The frantic search for balance always leaves one of the 3 standing ill at ease….79. A year spent in artificial intelligence is enough to make one believe in God….84. Motto for a research laboratory: What we work on today, others will first think of tomorrow.

    91. The computer reminds one of Lon Chaney—it is the machine of a thousand faces.

    7. It is easier to write an incorrect program than understand a correct one….93. When someone says “I want a programming language in which I need only say what I wish done”, give him a lollipop….102. One can’t proceed from the informal to the formal by formal means.

    100. We will never run out of things to program as long as there is a single program around.

    108. Whenever 2 programmers meet to criticize their programs, both are silent….112. Computer Science is embarrassed by the computer….115. Most people find the concept of programming obvious, but the doing impossible. 116. You think you know when you can learn, are more sure when you can write, even more when you can teach, but certain when you can program. 117. It goes against the grain of modern education to teach children to program. What fun is there in making plans, acquiring discipline in organizing thoughts, devoting attention to detail and learning to be self-critical?


  29. 2012-persol-srssitecomparison.pdf


  31. 1985-mckenna.pdf

  32. Douglas S. Bell, Charles E. Harless, Jerilyn K. Higa, Elizabeth L. Bjork, Robert A. Bjork, Mohsen Bazargan, Carol M. Mangione (2008; psychology/illusion-of-depth):

    Background: The time course of physicians’ knowledge retention after learning activities has not been well characterized. Understanding the time course of retention is critical to optimizing the reinforcement of knowledge.

    Design: Educational follow-up experiment with knowledge retention measured at 1 of 6 randomly assigned time intervals (0-55 days) after an online tutorial covering 2 American Diabetes Association guidelines.

    Participants: Internal and family medicine residents.

    Measurements: Multiple-choice knowledge tests, subject characteristics including critical appraisal skills, and learner satisfaction.

    Results: Of 197 residents invited, 91 (46%) completed the tutorial and were randomized; of these, 87 (96%) provided complete follow-up data. Ninety-two percent of the subjects rated the tutorial as “very good” or “excellent.” Mean knowledge scores increased from 50% before the tutorial to 76% among those tested immediately afterward. Score gains were only half as great at 3–8 days and no significant retention was measurable at 55 days. The shape of the retention curve corresponded with a 1/4-power transformation of the delay interval. In multivariate analyses, critical appraisal skills and participant age were associated with greater initial learning, but no participant characteristic significantly modified the rate of decline in retention.

    Conclusions: Education that appears successful from immediate posttests and learner evaluations can result in knowledge that is mostly lost to recall over the ensuing days and weeks. To achieve longer-term retention, physicians should review or otherwise reinforce new learning after as little as 1 week.
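
    The 1/4-power shape reported above can be made concrete with a rough sketch. It assumes (this is an illustration, not the study's fitted model) that the score gain falls linearly in the fourth root of the delay, starts at the 26-point immediate gain (76% minus 50%), and reaches zero at 55 days. The function name and its parameters are hypothetical.

```python
def retained_gain(days, initial_gain=26.0, zero_day=55.0):
    """Score gain (percentage points) still retained after `days`.

    Assumption for illustration: the gain declines linearly in days ** 0.25
    and hits zero at `zero_day`. This is not the model fitted in the study.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    fraction = 1.0 - (days ** 0.25) / (zero_day ** 0.25)
    return initial_gain * max(fraction, 0.0)


# Print the assumed retained gain at a few delays mentioned in the abstract.
for d in (0, 3, 8, 55):
    print(d, round(retained_gain(d), 1))
```

    Under these assumptions the gain at 3 to 8 days comes out at roughly half the immediate gain, which is in line with the abstract's description; the steep early drop followed by a long shallow tail is what the fourth-root transform produces.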




  36. 1998-davis.pdf

  37. 1995-davis.pdf





  42. 2011-mccabe.pdf

  43. Benjamin C. Storm, Robert Bjork, Jennifer C. Storm (2010; psychology/illusion-of-depth):

    Retrieving information from memory makes that information more recallable in the future than it otherwise would have been. Optimizing retrieval practice has been assumed, on the basis of evidence and arguments tracing back to Landauer and Bjork (1978), to require an expanding-interval schedule of successive retrievals, but recent findings suggest that expanding retrieval practice may be inferior to uniform-interval retrieval practice when memory is tested after a long retention interval. We report three experiments in which participants read educational passages and were then repeatedly tested, without feedback, after an expanding or uniform sequence of intervals. On a test 1 week later, recall was enhanced by the expanding schedule, but only when the task between successive retrievals was highly interfering with memory for the passage. These results suggest that the extent to which learners benefit from expanding retrieval practice depends on the degree to which the to-be-learned information is vulnerable to forgetting.


  45. 2006-roediger.pdf



  48. 1994-bjork.pdf

  49. 2012-son.pdf

  50. 1978-baddeley.pdf

  51. 1985-pirolli.pdf

  52. 2014-mulligan.pdf

  53. 2013-bjork.pdf

  54. Louis Deslauriers, Logan S. McCarty, Kelly Miller, Kristina Callaghan, Greg Kestin (2019-09-04; psychology/illusion-of-depth):

    Despite active learning being recognized as a superior method of instruction in the classroom, a major recent survey found that most college STEM instructors still choose traditional teaching methods. This article addresses the long-standing question of why students and faculty remain resistant to active learning. Comparing passive lectures with active learning using a randomized experimental approach and identical course materials, we find that students in the active classroom learn more, but they feel like they learn less. We show that this negative correlation is caused in part by the increased cognitive effort required during active learning. Faculty who adopt active learning are encouraged to intervene and address this misperception, and we describe a successful example of such an intervention.

    We compared students’ self-reported perception of learning with their actual learning under controlled conditions in large-enrollment introductory college physics courses taught using (1) active instruction (following best practices in the discipline) and (2) passive instruction (lectures by experienced and highly rated instructors). Both groups received identical class content and handouts, students were randomly assigned, and the instructor made no effort to persuade students of the benefit of either method. Students in active classrooms learned more (as would be expected based on research), but their perception of learning, while positive, was lower than that of their peers in passive environments. This suggests that attempts to evaluate instruction based on students’ perceptions of learning could inadvertently promote inferior (passive) pedagogical methods. For instance, a superstar lecturer could create such a positive feeling of learning that students would choose those lectures over active learning. Most importantly, these results suggest that when students experience the increased cognitive effort associated with active learning, they initially take that effort to signify poorer learning. That disconnect may have a detrimental effect on students’ motivation, engagement, and ability to self-regulate their own learning. Although students can, on their own, discover the increased value of being actively engaged during a semester-long course, their learning may be impaired during the initial part of the course. We discuss strategies that instructors can use, early in the semester, to improve students’ response to being actively engaged in the classroom.

    [Keywords: scientific teaching, undergraduate education, evidence-based teaching, Constructivism]


  56. 1994-dunlosky.pdf

  57. 2001-simon.pdf


  59. 2012-hartwig.pdf


  61. 2012-susser.pdf


  63. 1994-graesser.pdf

  64. Richard Hamming (1986-03-07):

    [Transcript of a talk by mathematician and manager Richard Hamming about what he had learned about computers and how to do effective research (republished in expanded form as Art of Doing Science and Engineering: Learning to Learn; 1995 video). It is one of the most famous and most-quoted such discussions ever.]

    At a seminar in the Bell Communications Research Colloquia Series, Dr. Richard W. Hamming, a Professor at the Naval Postgraduate School in Monterey, California and a retired Bell Labs scientist, gave a very interesting and stimulating talk, ‘You and Your Research’ to an overflow audience of some 200 Bellcore staff members and visitors at the Morris Research and Engineering Center on March 7, 1986. This talk centered on Hamming’s observations and research on the question “Why do so few scientists make substantial contributions and so many are forgotten in the long run?” From his more than 40 years of experience, 30 of which were at Bell Laboratories, he has made a number of direct observations, asked very pointed questions of scientists about what, how, and why they did things, studied the lives of great scientists and great contributions, and has done introspection and studied theories of creativity. The talk is about what he has learned in terms of the properties of the individual scientists, their abilities, traits, working habits, attitudes, and philosophy.

  65. #the-workload



  68. #using-it

  69. 1969-allen.pdf



  72. Henry L. Roediger III, Jeffrey D. Karpicke (2006-09-01):

    A powerful way of improving one’s memory for material is to be tested on that material. Tests enhance later retention more than additional study of the material, even when tests are given without feedback. This surprising phenomenon is called the testing effect, and although it has been studied by cognitive psychologists sporadically over the years, today there is a renewed effort to learn why testing is effective and to apply testing in educational settings. In this article, we selectively review laboratory studies that reveal the power of testing in improving retention and then turn to studies that demonstrate the basic effects in educational settings. We also consider the related concepts of dynamic testing and formative assessment as other means of using tests to improve learning. Finally, we consider some negative consequences of testing that may occur in certain circumstances, though these negative effects are often small and do not cancel out the large positive effects of testing. Frequent testing in the classroom may boost educational achievement at all levels of education.




  76. 1991-bangertdrowns.pdf

  77. 2006-cook.pdf

  78. Bethany C. Johnson, Marc T. Kiviniemi (2009):

    Assigned textbook readings are a common requirement in undergraduate courses, but students often do not complete reading assignments or do not do so until immediately before an exam. This may have detrimental effects on learning and course performance. Regularly scheduled quizzes on reading material may increase completion of reading assignments and therefore course performance. This study examined the effectiveness of compulsory, mastery-based, weekly reading quizzes as a means of improving exam and course performance. Completion of reading quizzes was related to both better exam and course performance. The discussion includes recommendations for the use of quizzes in undergraduate courses.

  79. 2012-mcdaniel.pdf: “Using quizzes to enhance summative-assessment performance in a web-based class: An experimental study”⁠, Mark A. McDaniel, Kathleen M. Wildman, Janis L. Anderson


  81. 2013-meyer.pdf

  82. 2013-larsen.pdf: “Chapter 38: Test-Enhanced Learning”⁠, Douglas P. Larsen, Andrew C. Butler

  83. 2021-yang.pdf: Chunliang Yang, Liang Luo, Miguel A. Vadillo, Ronjun Yu, David R. Shanks (2021-03-08; spaced-repetition):

    Testing (class quizzing) yields a variety of learning benefits, even though learners, instructors, and policymakers tend to lack full metacognitive insight into the virtues of testing. The current meta-analysis finds a reliable advantage of testing over other strategies in facilitating learning of factual knowledge, concept comprehension, and knowledge application in the classroom. Overall, testing is not only an assessment of learning but also an assessment for learning.

    Over the last century hundreds of studies have demonstrated that testing is an effective intervention to enhance long-term retention of studied knowledge and facilitate mastery of new information, compared with restudying and many other learning strategies (eg., concept mapping), a phenomenon termed the testing effect. How robust is this effect in applied settings beyond the laboratory?

    The current review integrated 48,478 students’ data, extracted from k = 222 independent studies, to investigate the magnitude, boundary conditions, and psychological underpinnings of test-enhanced learning in the classroom. The results show that overall testing (quizzing) raises student academic achievement to a medium extent (g = 0.499). The magnitude of the effect is modulated by a variety of factors, including learning strategy in the control condition, test format consistency, material matching, provision of corrective feedback, number of test repetitions, test administration location and timepoint, treatment duration, and experimental design.

    The documented findings support 3 theories to account for the classroom testing effect: additional exposure, transfer-appropriate processing, and motivation. In addition to their implications for theory development, these results have practical importance for enhancing teaching practice and guiding education policy and highlight important directions for future research.

    [Keywords: meta-analysis, motivation, academic achievement, testing effect, transfer-appropriate processing]
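
    For readers unfamiliar with the effect-size unit quoted above, here is a minimal sketch of how Hedges' g is computed for a single two-group comparison. The function name and the example numbers are made up for illustration; the meta-analysis itself pools many such estimates with weighting that is not shown here.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Bias-corrected standardized mean difference between two groups."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the treatment and control groups.
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd          # Cohen's d
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    return d * correction


# Hypothetical class data: quizzed group vs. restudy group, 30 students each.
example = hedges_g(mean_t=75.0, mean_c=70.0, sd_t=10.0, sd_c=10.0, n_t=30, n_c=30)
print(round(example, 3))
```

    With these invented inputs the two groups differ by half a pooled standard deviation, which is the same order of magnitude as the g = 0.499 reported in the review.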





  88. 1991-mcdaniel.pdf


  90. 1939-spitzer.pdf

  91. Haley A. Vlach, Catherine M. Sandhofer (2012-05-22):

    The spacing effect describes the robust finding that long-term learning is promoted when learning events are spaced out in time, rather than presented in immediate succession. Studies of the spacing effect have focused on memory processes rather than on other types of learning, such as the acquisition and generalization of new concepts. In this study, early elementary school children (5–7 year-olds; n = 36) were presented with science lessons on one of three schedules: massed, clumped, and spaced. The results revealed that spacing lessons out in time resulted in higher generalization performance for both simple and complex concepts. Spaced learning schedules promote several types of learning, strengthening the implications of the spacing effect for educational practices and curriculum.

    [Keywords: spacing effect, distributed learning, learning and memory, generalization, cognitive development, educational curriculum and practices]

  92. 1982-nungester.pdf

  93. 1975-laporte.pdf

  94. 1981-duchastel

  95. 1981-duchastel.pdf

  96. 1989-glover.pdf

  97. 2007-kang.pdf


  99. 2008-butler.pdf

  100. Nicholas J. Cepeda, Harold Pashler, Edward Vul, John T. Wixted, Doug Rohrer (2006):

    The authors performed a meta-analysis of the distributed practice effect to illuminate the effects of temporal variables that have been neglected in previous reviews. This review found 839 assessments of distributed practice in 317 experiments located in 184 articles. Effects of spacing (consecutive massed presentations vs. spaced learning episodes) and lag (less spaced vs. more spaced learning episodes) were examined, as were expanding interstudy interval (ISI) effects. Analyses suggest that ISI and retention interval operate jointly to affect final-test retention; specifically, the ISI producing maximal retention increased as retention interval increased. Areas needing future research and theoretical implications are discussed.

  101. 1928-ruch.pdf


  103. 1989-dempster.pdf

  104. 2010-delaney.pdf

  105. 1999-donovan.pdf: “Previous Meta-Analytic Review”, Roy Hessinger




  109. David A. Balota, Janet M. Duchek, Jessica M. Logan (2015):

    The spacing effect is one of the most ubiquitous findings in learning and memory. Performance on a variety of tasks is better when the repetition of the to-be-learned information is distributed as opposed to massed in presentation. This observation was first formalized in Jost’s law, which states that “if two associations are of equal strength but of different age, a new repetition has a greater value for the older one” (McGeoch, 1943). Spacing effects occur across domains (eg., learning perceptual motor tasks vs. learning lists of words), across species (eg., rats, pigeons, and humans), across age groups and individuals with different memory impairments, and across retention intervals of seconds to months (see Cepeda et al 2006; Crowder 1976; Dempster 1996, for reviews).

    In this light, it is interesting that spacing effects have not received much attention in Cognitive Psychology textbooks. In fact, in our sampling of 7 such textbooks, only one had a section dedicated to this topic, while virtually all cognitive textbooks discussed mnemonic techniques such as the pegword or method of loci. Given the power and simplicity of implementing spaced practice, we clearly hope this changes in the future.


  111. 1978-landauer.pdf

  112. 2000-cull.pdf

  113. 1963-peterson.pdf

  114. 1977-glenberg.pdf








  122. 2007-rohrer.pdf


  124. 2005-seabrook.pdf

  125. 1967-keppel.pdf

  126. 1981-bloom.pdf



  129. 1987-bahrick.pdf


  131. 1989-gilmore-memoryaginganddementia.pdf


  133. 2009-goverover.pdf: Yael Goverover, Frank G. Hillary, Nancy Chiaravalloti, Juan Carlos Arango-Lasprilla, John DeLuca (2009-05-20; spaced-repetition):

    The present study examined the utility of using spaced learning trials (when trials are distributed over time) versus massed learning trials (consecutive learning trials) in the acquisition of everyday functional tasks.

    In a within-subjects design, 20 participants with multiple sclerosis (MS) and 18 healthy controls (HC) completed 2 route learning tasks and 2 paragraph reading tasks. One task in each area was presented in the “spaced” condition, in which the task was presented to the participants 3 times with a 5-minute break between each trial, and the second task in each area was presented in the “massed” condition, in which the task was presented 3 consecutive times to the participants. The dependent variables consisted of recall and recognition of the paragraphs and routes both immediately and 30 minutes following initial learning.

    Results showed that for paragraph learning, the spaced condition statistically-significantly enhanced memory performance for this task relative to the massed condition. However, this effect was not demonstrated in the route learning task. Thus, the spacing effect can be beneficial to enhance recall and performance of activities of daily living for individuals with MS; however, this effect held for verbal task stimuli but not for visual task stimuli.

    It will be important during future investigations to better characterize the factors that maximize the spacing effect.

    [Keywords: memory, activities of daily living, cognitive rehabilitation, multiple sclerosis, spacing effect]


  135. 1997-revak.pdf: “Florida Journal of Educational Research”


  137. Kristin H. Mayfield, Philip N. Chase (2002):

    This study compared three different methods of teaching five basic algebra rules to college students. All methods used the same procedures to teach the rules and included four 50-question review sessions interspersed among the training of the individual rules. The differences among methods involved the kinds of practice provided during the four review sessions. Participants who received cumulative practice answered 50 questions covering a mix of the rules learned prior to each review session. Participants who received a simple review answered 50 questions on one previously trained rule. Participants who received extra practice answered 50 extra questions on the rule they had just learned. Tests administered after each review included new questions for applying each rule (application items) and problems that required novel combinations of the rules (problem-solving items). On the final test, the cumulative group outscored the other groups on application and problem-solving items. In addition, the cumulative group solved the problem-solving items significantly faster than the other groups. These results suggest that cumulative practice of component skills is an effective method of training problem solving.

  138. 2013-patac.pdf


  140. 2009-kerfoot-2.pdf


  142. 2009-kerfoot.pdf



  145. 2013-gyorki.pdf

  146. Carol-Anne E. Moulton, Adam Dubrowski, Helen Macrae, Brent Graham, Ethan Grober, Richard Reznick (2006):

    Objective: Surgical skills laboratories have become an important venue for early skill acquisition. The principles that govern training in this novel educational environment remain largely unknown; the commonest method of training, especially for continuing medical education (CME), is a single multihour event. This study addresses the impact of an alternative method, where learning is distributed over a number of training sessions. The acquisition and transfer of a new skill to a life-like model is assessed.

    Methods: Thirty-eight junior surgical residents, randomly assigned to either massed (1 day) or distributed (weekly) practice regimens, were taught a new skill (microvascular anastomosis). Each group spent the same amount of time in practice. Performance was assessed pretraining, immediately post-training, and 1 month post-training. The ultimate test of anastomotic skill was assessed with a transfer test to a live, anesthetized rat. Previously validated computer-based and expert-based outcome measures were used. In addition, clinically relevant outcomes were assessed.

    Results: Both groups showed immediate improvement in performance, but the distributed group performed significantly better on the retention test in most outcome measures (time, number of hand movements, and expert global ratings; all P values <0.05). The distributed group also outperformed the massed group on the live rat anastomosis in all expert-based measures (global ratings, checklist score, final product analysis, competency for OR; all P values <0.05).

    Conclusions: Our current model of training surgical skills using short courses (for both CME and structured residency curricula) may be suboptimal. Residents retain and transfer skills better if taught in a distributed manner. Despite the greater logistical challenge, we need to restructure training schedules to allow for distributed practice.

  147. 2014-spruit.pdf

  148. 2006-balch.pdf




  152. 2015-maas.pdf


  154. Jeremiah Blocki, Saranga Komanduri, Lorrie Cranor, Anupam Datta (2014-10-06):

    We report on a user study that provides evidence that spaced repetition and a specific mnemonic technique enable users to successfully recall multiple strong passwords over time. Remote research participants were asked to memorize 4 Person-Action-Object (PAO) stories where they chose a famous person from a drop-down list and were given machine-generated random action-object pairs. Users were also shown a photo of a scene and asked to imagine the PAO story taking place in the scene (e.g., Bill Gates—swallowing—bike on a beach). Subsequently, they were asked to recall the action-object pairs when prompted with the associated scene-person pairs following a spaced repetition schedule over a period of 127+ days. While we evaluated several spaced repetition schedules, the best results were obtained when users initially returned after 12 hours and then in 1.5× increasing intervals: 77% of the participants successfully recalled all 4 stories in 10 tests over a period of 158 days. Much of the forgetting happened in the first test period (12 hours): 89% of participants who remembered their stories during the first test period successfully remembered them in every subsequent round. These findings, coupled with recent results on naturally rehearsing password schemes, suggest that 4 PAO stories could be used to create usable and strong passwords for 14 sensitive accounts following this spaced repetition schedule, possibly with a few extra upfront rehearsals. In addition, we find that there is an interference effect across multiple PAO stories: the recall rate of 100% (resp. 90%) for participants who were asked to memorize 1 PAO story (resp. 2 PAO stories) is statistically-significantly better than the recall rate for participants who were asked to memorize 4 PAO stories. These findings yield concrete advice for improving constructions of password management schemes and future user studies.
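
    As a rough illustration of what an expanding rehearsal schedule of this kind looks like, the sketch below assumes each gap between tests is 1.5× the previous gap, starting from a 12-hour gap. That is only one possible reading of the abstract, the function and parameter names are hypothetical, and the day counts it prints are not expected to reproduce the study's reported 158-day span, since the abstract does not spell out the exact test days.

```python
def expanding_schedule(first_gap_days=0.5, factor=1.5, n_tests=10):
    """Return cumulative test times (in days) for geometrically growing gaps.

    Assumption for illustration: gap k equals first_gap_days * factor ** k.
    """
    times, elapsed, gap = [], 0.0, first_gap_days
    for _ in range(n_tests):
        elapsed += gap       # wait out the current gap, then test
        times.append(elapsed)
        gap *= factor        # the next gap is longer by the chosen factor
    return times


schedule = expanding_schedule()
print([round(t, 2) for t in schedule])
```

    The point of the sketch is the shape: early tests cluster within the first few days, where the abstract says most forgetting happened, while later tests are weeks apart.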

  155. Manuel Blum, Santosh Vempala (2017-07-05):

    What can humans compute in their heads? We are thinking of a variety of Crypto Protocols, games like Sudoku, Crossword Puzzles, Speed Chess, and so on. The intent of this paper is to apply the ideas and methods of theoretical computer science to better understand what humans can compute in their heads. For example, can a person compute a function in their head so that an eavesdropper with a powerful computer—who sees the responses to random input—still cannot infer responses to new inputs? To address such questions, we propose a rigorous model of human computation and associated measures of complexity. We apply the model and measures first and foremost to the problem of (1) humanly computable password generation, and then consider related problems of (2) humanly computable “one-way functions” and (3) humanly computable “pseudorandom generators”.

    The theory of Human Computability developed here plays by different rules than standard computability, and this takes some getting used to. For reasons to be made clear, the polynomial versus exponential time divide of modern computability theory is irrelevant to human computation. In human computability, the step-counts for both humans and computers must be more concrete. Specifically, we restrict the adversary to at most 10^24 (Avogadro number of) steps. An alternate view of this work is that it deals with the analysis of algorithms and counting steps for the case that inputs are small as opposed to the usual case of inputs large-in-the-limit.

  156. 2015-colbran.pdf: “The impact of student-generated digital flashcards on student learning of constitutional law”⁠, Stephen Colbran

  157. 2003-commins.pdf


  159. Carla Margulies, Tim Tully, Josh Dubnau (2005):

    Unlike most organ systems, which have evolved to maintain homeostasis, the brain has been selected to sense and adapt to environmental stimuli by constantly altering interactions in a gene network that functions within a larger neural network. This unique feature of the central nervous system provides a remarkable plasticity of behavior, but also makes experimental investigations challenging. Each experimental intervention ramifies through both gene and neural networks, resulting in unpredicted and sometimes confusing phenotypic adaptations. Experimental dissection of mechanisms underlying behavioral plasticity ultimately must accomplish an integration across many levels of biological organization, including genetic pathways acting within individual neurons, neural network interactions which feed back to gene function, and phenotypic observations at the behavioral level. This dissection will be more easily accomplished for model systems such as Drosophila, which, compared with mammals, have relatively simple and manipulable nervous systems and genomes. The evolutionary conservation of behavioral phenotype and the underlying gene function ensures that much of what we learn in such model systems will be relevant to human cognition. In this essay, we have not attempted to review the entire Drosophila memory field. Instead, we have tried to discuss particular findings that provide some level of intellectual synthesis across three levels of biological organization: behavior, neural circuitry and biochemical pathways. We have attempted to use this integrative approach to evaluate distinct mechanistic hypotheses, and to propose critical experiments that will advance this field.

  160. Randolf Menzel, Gisela Manz, Rebecca Menzel, Uwe Greggers (2001-07):

    Conditioning the proboscis extension reflex of harnessed honeybees (Apis mellifera) is used to study the effect temporal spacing between successive conditioning trials has on memory. Retention is monitored at two long-term intervals corresponding to early (1 and 2 d after conditioning) and late long-term memory (3 and 4 d). The acquisition level is varied by using different conditioned stimuli (odors, mechanical stimulation, and temperature increase at the antenna), varying strengths of the unconditioned stimulus (sucrose), and various numbers of conditioning trials. How learning trials are spaced is the dominant factor both for acquisition and retention, and although longer intertrial intervals lead to better acquisition and higher retention, the level of acquisition per se does not determine the spacing effect on retention. Rather, spaced conditioning leads to higher memory consolidation both during acquisition and later, between the early and long-term memory phases. These consolidation processes can be selectively inhibited by blocking protein synthesis during acquisition.

  161. 1972-carew.pdf

  162. ⁠, Michael A. ⁠, Jasmine Ide, Sarah E. Masters, Thomas J. Carew (2002-01):

    In Aplysia, three distinct phases of memory for sensitization can be dissociated based on their temporal and molecular features. A single training trial induces short-term memory (STM, lasting <30 min), whereas five trials delivered at 15-min intervals induces both intermediate-term memory (ITM, lasting >90 min) and long-term memory (LTM, lasting >24 h). Here, we explore the interaction of amount and pattern of training in establishing ITM and LTM by examining memory for sensitization after different numbers of trials (each trial = one tail shock) and different patterns of training (massed vs. spaced). Under spaced training patterns, two trials produced STM exclusively, whereas four or five trials each produced both ITM and LTM. Three spaced trials failed to induce LTM but did produce an early decaying form of ITM (E-ITM) that was statistically-significantly shorter and weaker in magnitude than the late-decaying ITM (L-ITM) observed after four to five trials. In addition, E-ITM was induced after three trials with both massed and spaced patterns of training. However, L-ITM and LTM after four to five trials require spaced training: Four or five massed trials failed to induce LTM and produced only E-ITM. Collectively, our results indicate that in addition to three identified phases of memory for sensitization—STM, ITM, and LTM—a unique temporal profile of memory, E-ITM, is revealed by varying either the amount or pattern of training.

  163. 2006-galluccio.pdf

  164. 1993-toppino.pdf

  165. 2009-toppino.pdf


  167. 2012-simone.pdf



  170. 2002-mammarella.pdf

  171. 2002-childers.pdf





  176. 1993-ericsson.pdf: ⁠, K. Anders Ericsson, Ralf T. Krampe, Clemens Tesch-Römer (1993-07; psychology/writing):

    The theoretical framework presented in this article explains expert performance as the end result of individuals’ prolonged efforts to improve performance while negotiating motivational and external constraints. In most domains of expertise, individuals begin in their childhood a regimen of effortful activities (deliberate practice) designed to optimize improvement. Individual differences, even among elite performers, are closely related to assessed amounts of deliberate practice. Many characteristics once believed to reflect innate talent are actually the result of intense practice extended for a minimum of 10 yrs. Analysis of expert performance provides unique evidence on the potential and limits of extreme environmental adaptation and learning.

  177. 1897-bryan.pdf: ⁠, William Lowe Bryan, Noble Harter (1897; psychology):

    Studied individual differences in telegraphic writing. A preliminary study was conducted, in which operators were cross-examined on aspects of psychological or physiological importance. On the basis of this, a study was undertaken on 60 Ss, who were asked to write a sentence requiring attention. There were constant differences required in the times for a given character. Further tests were made, and schools were requested to provide typical curves of improvement. Results reveal that there were distinct specialties in telegraphy. The rate of receiving varied greatly, and exceeded sending rate. Both external and subjective disturbances affected inexperienced operators. The best age to learn telegraphy was 18–30 yrs. The variations in the value of a character depended on its place in the sentence. Homotaxic variation was an inverse measure of skill, while the inflection variation increased with expertise.

  178. 1899-william.pdf: ⁠, William Lowe Bryan, Noble Harter (1899-01-01; psychology):

    Investigated the different stages involved in learning telegraphy. One S was tested each week on: (1) rate of receiving letters not making words, (2) rate of receiving letters making words, but not sentences, and (3) rate of receiving letters making words and sentences. Results indicate that a hierarchy of psycho-physical habits was required to receive the telegraphic language. From an early period, letter, word and higher habits made gains together, but not equally. No plateau appeared between the learning of letters and words; the first one occurred after the learning of words. Later, there was a second ascent, representing the acquisition of higher language habits. Effective speed was largely dependent upon the mastery of these habits, which led to greater accuracy in detail. Concluded that the rate of progress depended partly on the rate of mental and nervous processes, but far more on how much was included in each process.

  179. 1921-thorndike-educationalpsychology-v2-thepsychologyoflearning.pdf#page=188




  183. 1983-lee.pdf

  184. 2004-brady.pdf

  185. 1997-landin.pdf


  187. 2005-ollis.pdf

  188. 2004-stemarie.pdf

  189. 1912-culler.pdf


  191. 1915-lashley.pdf

  192. 1916-murphy.pdf

  193. 1987-adams.pdf


  195. 1988-lee.pdf

  196. 1988-ammons.pdf

  197. 1988-christina.pdf

  198. 1988-newell.pdf


  200. 2004-lee.pdf: “Chapter 2: Contextual interference”⁠, Timothy D. Lee, Dominic A. Simon

  201. 1986-goode.pdf

  202. 1994-hall.pdf

  203. 1990-carlson.pdf

  204. 1989-carlson.pdf

  205. 2000-jamieson.pdf



  208. 1993-landin.pdf

  209. 2003-hatala.pdf





  214. 2005-appletonknapp.pdf

  215. 1950-gagne.pdf


  217. 1956-kurtz.pdf

  218. 2008-kornell.pdf




  222. 2011-zulkiply.pdf: “Spacing and induction: Application to exemplars presented as auditory and visual text”⁠, Norehan Zulkiply, John McLean, Jennifer S. Burt, Debra Bath


  224. 2012-zulkiply.pdf


  226. ⁠, Peter P. J. L. Verkoeijen, Samantha Bouwmeester (2014):

    Inductive learning takes place when people learn a new concept or category by observing a variety of exemplars. Kornell and Bjork (2008) asked participants to learn new painting styles either by presenting different paintings of the same artist consecutively (massed presentation) or by mixing paintings of different artists (spaced presentation). In their second experiment, Kornell and Bjork (2008) showed with a final style recognition test, that spacing resulted in better inductive learning than massing. Also, by using this style recognition test, they ruled out the possibility that spacing merely resulted in a better memory for the labels of the newly learned painting styles. The findings from Kornell and Bjork’s (2008) second experiment are important because they show that the benefit of spaced learning generalizes to complex learning tasks and outcomes, and that it is not confined to rote memory learning. However, the findings from Kornell and Bjork’s (2008) second experiment have never been replicated. In the present study we performed an exact and high-powered replication of Kornell and Bjork’s (2008) second experiment with a Web-based sample. Such a replication contributes to establish the reliability of the original finding and hence to more conclusive evidence of the spacing effect in inductive learning. The findings from the present replication attempt revealed a medium-sized advantage of spacing over massing in inductive learning, which was comparable to the original effect in the experiment by Kornell and Bjork (2008). Also, the 95% confidence intervals (CI) of the effect sizes from both experiments overlapped considerably. Hence, the findings from the present replication experiment and the original experiment clearly reinforce each other.

  227. 2014-rohrer-1.pdf

  228. 2014-rohrer-2.pdf

  229. 2019-rohrer.pdf: “A randomized controlled trial of interleaved mathematics practice”⁠, Doug Rohrer, Robert F. Dedrick, Marissa K. Hartwig, Chi-Ngai Cheung

  230. 2014-vlach.pdf: “Equal spacing and expanding schedules in children’s categorization and generalization”⁠, Haley A. Vlach, Catherine M. Sandhofer, Robert A. Bjork


  232. #downsides





  237. 2012-muflax-dreamingofaworldundone.html.maff









  246. 2012-chessdata-perfectpitchspacedrepetition.webm




  250. 2010-seamon.pdf: ⁠, John G. Seamon, Paawan J. Punjabi, Emily A. Busch (2020-04-23; spaced-repetition):

    At age 58, JB [John Basinger] began memorizing Milton’s epic poem Paradise Lost. 9 years and thousands of study hours later, he completed this process in 2001 and recalled from memory all 12 books of this 10,565-line poem over a 3-day period. Now 74, JB continues to recite this work. We tested his memory accuracy by cueing his recall with two lines from the beginning or middle of each book and asking JB to recall the next 10 lines. JB is an exceptional memoriser of Milton, both in our laboratory tests in which he did not know the specific tests or procedures in advance, and in our analysis of a videotaped, prepared performance. Consistent with deliberate practice theory, JB achieved this remarkable ability by deeply analysing the poem’s structure and meaning over lengthy repetitions. Our findings suggest that exceptional memorizers such as JB are made, not born, and that cognitive expertise can be demonstrated even in later adulthood.

    [Keywords: Exceptional memory, Prose memory, Age and memory]

  251. ⁠, Andrew Drucker (2011):

    In this informal article, I’ll describe the “recognition method”—a simple, powerful technique for memorization and mental calculation. Compared to traditional memorization techniques, which use elaborate encoding and visualization processes [1], the recognition method is easy to learn and requires relatively little effort…The method works: using it, I was able to mentally multiply two random 10-digit numbers, by the usual grade-school algorithm, on my first attempt! I have a normal, untrained memory, and the task would have been impossible by a direct approach. (I can’t claim I was speedy: I worked slowly and carefully, using about 7 hours plus rest breaks. I practiced twice with 5-digit numbers beforehand.)

    …It turns out that ordinary people are incredibly good at this task [recognizing whether a photograph has been seen before]. In one of the most widely-cited studies on recognition memory, Standing [2] showed participants an epic 10,000 photographs over the course of 5 days, with 5 seconds’ exposure per image. He then tested their familiarity, essentially as described above. The participants showed an 83% success rate, suggesting that they had become familiar with about 6,600 images during their ordeal. Other volunteers, trained on a smaller collection of 1,000 images selected for vividness, had a 94% success rate.
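
    The step from an 83% success rate to "about 6,600 images" is a correction for guessing. A minimal sketch, assuming the test was a two-alternative forced choice (so a participant who remembers nothing still scores 50% by chance); the function name `items_retained` and its parameters are hypothetical, chosen only for this illustration:

```python
def items_retained(n_shown: int, hit_rate: float, n_alternatives: int = 2) -> float:
    """Estimate how many items were genuinely remembered from forced-choice accuracy.

    Assumes each item is either known (always answered correctly) or unknown
    (answered by pure guessing among n_alternatives options).
    """
    chance = 1.0 / n_alternatives
    # Observed accuracy = k + (1 - k) * chance, solved here for the known fraction k.
    fraction_known = (hit_rate - chance) / (1.0 - chance)
    return n_shown * fraction_known


# 10,000 photographs at 83% accuracy
print(round(items_retained(10_000, 0.83)))  # 6600
```

    Under the same assumption, the 94% success rate on 1,000 vivid images corresponds to about 880 images retained.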

  252. 1973-standing.pdf: ⁠, Lionel Standing (1973-05-01; spaced-repetition):

    Four experiments are reported which examined memory capacity and retrieval speed for pictures and for words. Single-trial learning tasks were employed throughout, with memory performance assessed by forced-choice recognition, recall measures or choice reaction-time tasks. The main experimental findings were: (1) memory capacity, as a function of the amount of material presented, follows a general power law with a characteristic exponent for each task; (2) pictorial material obeys this power law and shows an overall superiority to verbal material. The capacity of recognition memory for pictures is almost limitless, when measured under appropriate conditions; (3) when the recognition task is made harder by using more alternatives, memory capacity stays constant and the superiority of pictures is maintained; (4) picture memory also exceeds verbal memory in terms of verbal recall; comparable recognition/recall ratios are obtained for pictures, words and nonsense syllables; (5) verbal memory shows a higher retrieval speed than picture memory, as inferred from reaction-time measures. Both types of material obey a power law, when reaction-time is measured for various sizes of learning set, and both show very rapid rates of memory search.

    From a consideration of the experimental results and other data it is concluded that the superiority of the pictorial mode in recognition and free recall learning tasks is well established and cannot be attributed to methodological artifact.







  259. ⁠, Jessica D. Payne, Matthew A. Tucker, Jeffrey M. Ellenbogen, Erin J. Wamsley, Matthew P. Walker, Daniel L. Schacter, Robert Stickgold (2012-02-08):

    Numerous studies have examined sleep’s influence on a range of hippocampus-dependent declarative memory tasks, from text learning to spatial navigation. In this study, we examined the impact of sleep, wake, and time-of-day influences on the processing of declarative information with strong semantic links (semantically related word pairs) and information requiring the formation of novel associations (unrelated word pairs). Participants encoded a set of related or unrelated word pairs at either 9am or 9pm, and were then tested after an interval of 30 min, 12 hr, or 24 hr. The time of day at which subjects were trained had no effect on training performance or initial memory of either word pair type. At 12 hr retest, memory overall was superior following a night of sleep compared to a day of wakefulness. However, this performance difference was a result of a pronounced deterioration in memory for unrelated word pairs across wake; there was no sleep-wake difference for related word pairs. At 24 hr retest, with all subjects having received both a full night of sleep and a full day of wakefulness, we found that memory was superior when sleep occurred shortly after learning rather than following a full day of wakefulness. Lastly, we present evidence that the rate of deterioration across wakefulness was statistically-significantly diminished when a night of sleep preceded the wake period compared to when no sleep preceded wake, suggesting that sleep served to stabilize the memories against the deleterious effects of subsequent wakefulness. Overall, our results demonstrate that 1) the impact of 12 hr of waking interference on memory retention is strongly determined by word-pair type, 2) sleep is most beneficial to memory 24 hr later if it occurs shortly after learning, and 3) sleep does in fact stabilize declarative memories, diminishing the negative impact of subsequent wakefulness.





  264. DNB-FAQ#sleep


  266. ⁠, Helene M. Sisti, Arnold L. Glass, Tracey J. Shors (2007):

    Information that is spaced over time is better remembered than the same amount of information massed together. This phenomenon, known as the spacing effect, was explored with respect to its effect on learning and neurogenesis in the adult dentate gyrus of the hippocampal formation. Because the cells are generated over time and because learning enhances their survival, we hypothesized that training with spaced trials would rescue more new neurons from death than the same number of massed trials. In the first experiment, animals trained with spaced trials in the Morris water maze outperformed animals trained with massed trials, but there was not a direct effect of trial spacing on cell survival. Rather, animals that learned well retained more cells than animals that did not learn or learned poorly. Moreover, performance during acquisition correlated with the number of cells remaining in the dentate gyrus after training. In the second experiment, the time between blocks of trials was increased. Consequently, animals trained with spaced trials performed as well as those trained with massed, but remembered the location better two weeks later. The strength of that memory correlated with the number of new cells remaining in the hippocampus. Together, these data indicate that learning, and not mere exposure to training, enhances the survival of cells that are generated 1 wk before training. They also indicate that learning over an extended period of time induces a more persistent memory, which then relates to the number of cells that reside in the hippocampus.

  267. ⁠, Jonathan L. C. Lee (2009):

    The retrieval of a memory places it into a plastic state, the result of which is that the memory can be disrupted or even enhanced by experimental treatment. This phenomenon has been conceptualised within a framework of memories being reactivated and then reconsolidated in repeated rounds of cellular processing. The reconsolidation phase has been seized upon as crucial for the understanding of memory stability and, more recently, as a potential therapeutic target in the treatment of disorders such as post-traumatic stress and drug addiction. However, little is known about the reactivation process, or what might be the adaptive function of retrieval-induced plasticity. Reconsolidation has long been proposed to mediate memory updating, but only recently has this hypothesis been supported experimentally. Here, the adaptive function of memory reconsolidation is explored in more detail, with a strong emphasis on its role in updating memories to maintain their relevance.

  268. 2010-oneill.pdf



  271. 1980-glenberg.pdf

  272. 1991-toppino.pdf


  274. 2013-philips.pdf: “Pattern and predictability in memory formation: From molecular mechanisms to clinical relevance”⁠, Gary T. Philips, Ashley M. Kopec, Thomas J. Carew



  277. 2001-maquet.pdf











  288. ⁠, Waleed Khan (2020-12-06):

    Smash Training is a spaced-repetition training web-app I created to help my progression with Super Smash Bros. Ultimate. I released it on May 16, 2020 on Reddit to warm reception. As of December 2020, it receives 150–200 monthly users. I’d rank it as my most successful project! In this article, I discuss the choices I made for this project. (The source code is available).

    …I decided that I wanted to build a spaced-repetition training app, rather than reuse a general-purpose spaced-repetition flash-card system such as Anki, because the project would benefit from domain-specific knowledge. For example:

    • Exercises have large numbers of variants, such as “short-hop” vs “full-hop”, or “facing left” vs “facing right”, which should be tracked separately.
    • Many of the exercises have natural dependencies on others: they shouldn’t be attempted unless a certain underlying fundamental skill has been mastered.
    • Exercises to train one character don’t necessarily confer the same skill for other characters. Some exercises may only be applicable to some characters.

    …Stronglifts has you note down how many repetitions of the exercise you succeeded at (out of five). However, the Smash Training paradigm is different, and has you repeat the exercise for a length of time and rate your accuracy.
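
    The three domain-specific requirements listed above (separately tracked variants, prerequisite skills, and per-character progress) can be sketched as a small data model. This is an illustration only, not the Smash Training source code; every name here (`Exercise`, `is_available`, `character_a`, the exercise names) is hypothetical:

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Exercise:
    name: str
    variant_axes: tuple = ()   # e.g. (("short-hop", "full-hop"), ("facing left", "facing right"))
    prerequisites: tuple = ()  # names of exercises that must be mastered first
    characters: tuple = ()     # empty tuple means "applies to every character"

    def variants(self):
        """Every combination of variant options; each one would be scheduled separately."""
        return list(product(*self.variant_axes))


def is_available(exercise, character, mastered):
    """Check whether `character` may attempt `exercise`.

    `mastered` is a set of (character, exercise_name) pairs, so skill with
    one character never counts toward another.
    """
    if exercise.characters and character not in exercise.characters:
        return False
    return all((character, name) in mastered for name in exercise.prerequisites)


basic = Exercise("basic jump", variant_axes=(("facing left", "facing right"),))
advanced = Exercise(
    "jump attack",
    variant_axes=(("short-hop", "full-hop"), ("facing left", "facing right")),
    prerequisites=("basic jump",),
)

print(len(advanced.variants()))  # 4 separately tracked variants
print(is_available(advanced, "character_a", {("character_a", "basic jump")}))  # True
print(is_available(advanced, "character_b", {("character_a", "basic jump")}))  # False
```

    Keying progress on the (character, exercise) pair is one simple way to express that training one character does not confer the skill on another; a general-purpose flash-card system has no built-in notion of such dependencies.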


  314. mnemo.hs

  315. mnemo2.hs

  316. mnemo3.hs

  317. mnemo4.hs






  323. ⁠, Andy Matuschak, Michael Nielsen (2019-10):

    [Long writeup by Andy Matuschak and Michael Nielsen on an experiment in integrating spaced repetition systems with a tutorial on quantum computing, Quantum Country: Quantum Computing For The Very Curious. By combining explanation with spaced testing, a notoriously thorny subject may be learned more easily and then actually remembered, with such a system demonstrating a possible ‘tool for thought’. Early results indicate users do indeed remember the quiz answers, and feedback has been positive.]

    Part I: Memory systems

    • Introducing the mnemonic medium
    • The early impact of the prototype mnemonic medium
    • Expanding the scope of memory systems: what types of understanding can they be used for?
    • Improving the mnemonic medium: making better cards
    • Two cheers for mnemonic techniques
    • How important is memory, anyway?
    • How to invent Hindu-Arabic numerals?

    Part II: Exploring tools for thought more broadly:

    • Mnemonic video
    • Why isn’t there more work on tools for thought today?
    • Questioning our basic premises
      • What if the best tools for thought have already been discovered?
      • Isn’t this what the tech industry does? Isn’t there a lot of ongoing progress on tools for thought?
      • Why not work on AGI or BCI instead?
    • Executable books
      • Serious work and the aspiration to canonical content
      • Stronger emotional connection through an inverted writing structure

    Summary and Conclusion

    … in Quantum Country an expert writes the cards, an expert who is skilled not only in the subject matter of the essay, but also in strategies which can be used to encode abstract, conceptual knowledge. And so Quantum Country provides a much more scalable approach to using memory systems to do abstract, conceptual learning. In some sense, Quantum Country aims to expand the range of subjects users can comprehend at all. In that, it has very different aspirations to all prior memory systems.

    More generally, we believe memory systems are a far richer space than has previously been realized. Existing memory systems barely scratch the surface of what is possible. We’ve taken to thinking of Quantum Country as a memory laboratory. That is, it’s a system which can be used both to better understand how memory works, and also to develop new kinds of memory system. We’d like to answer questions such as:

    • What are new ways memory systems can be applied, beyond the simple, declarative knowledge of past systems?
    • How deep can the understanding developed through a memory system be? What patterns will help users deepen their understanding as much as possible?
    • How far can we raise the human capacity for memory? And with how much ease? What are the benefits and drawbacks?
    • Might it be that one day most human beings will have a regular memory practice, as part of their everyday lives? Can we make it so memory becomes a choice; is it possible to in some sense solve the problem of memory?


  377. 1998-arthur.pdf

  386. ⁠, Erik Hoel (2020-07-19):

    Understanding of the evolved biological function of sleep has advanced considerably in the past decade. However, no equivalent understanding of dreams has emerged. Contemporary neuroscientific theories generally view dreams as epiphenomena, and the few proposals for their biological function are contradicted by the phenomenology of dreams themselves. Now, the recent advent of deep neural networks (DNNs) has finally provided the novel conceptual framework within which to understand the evolved function of dreams. Notably, all DNNs face the issue of overfitting as they learn, which is when performance on one data set increases but the network’s performance fails to generalize (often measured by the divergence of performance on training vs. testing data sets). This ubiquitous problem in DNNs is often solved by modelers via “noise injections” in the form of noisy or corrupted inputs. The goal of this paper is to argue that the brain faces a similar challenge of overfitting, and that nightly dreams evolved to combat the brain’s overfitting during its daily learning. That is, dreams are a biological mechanism for increasing generalizability via the creation of corrupted sensory inputs from stochastic activity across the hierarchy of neural structures. Sleep loss, specifically dream loss, leads to an overfitted brain that can still memorize and learn but fails to generalize appropriately. Herein this “overfitted brain hypothesis” is explicitly developed and then compared and contrasted with existing contemporary neuroscientific theories of dreams. Existing evidence for the hypothesis is surveyed within both neuroscience and deep learning, and a set of testable predictions are put forward that can be pursued both in vivo and in silico.

  387. 2016-mazza.pdf: ⁠, Stéphanie Mazza, Emilie Gerbier, Marie-Paule Gustin, Zumrut Kasikci, Olivier Koenig, Thomas C. Toppino, Michel Magnin (2016-08-16; spaced-repetition):

    Both repeated practice and sleep improve long-term retention of information. The assumed common mechanism underlying these effects is memory reactivation, either on-line and effortful or off-line and effortless.

    In the study reported here, we investigated whether sleep-dependent memory consolidation could help to save practice time during relearning. During two sessions occurring 12 hr apart, 40 participants practiced foreign vocabulary until they reached a perfect level of performance. Half of them learned in the morning and relearned in the evening of a single day. The other half learned in the evening of one day, slept, and then relearned in the morning of the next day. Their retention was assessed 1 week later and 6 months later. We found that interleaving sleep between learning sessions not only reduced the amount of practice needed by half but also ensured much better long-term retention.

    Sleeping after learning is definitely a good strategy, but sleeping between two learning sessions is a better strategy.

    [Keywords: learning, sleep-wake cycle, relearning, sleep-dependent memory consolidation, repeated practice]

    Figure 1: Overall results. The graph in (a) shows the mean number of correct translations (out of 16 possible) during the first and the last practice trials in the learning session (pair trials) and relearning session (list trials) and during the cued-recall task after 1 week and 6 months. Results are presented separately for the wake, sleep, and control groups. The relearning session in the control experiment consisted of only the first list trial. Error bars represent 95% confidence intervals. The box-and-whiskers plots in (b) indicate the number of pair trials necessary for the wake group and the sleep group to attain the performance criterion in the learning session and the number of list trials necessary for them to attain the performance criterion in the relearning session. The left and right edges of the boxes represent the boundaries of the first and third quartiles, respectively, and the lines down the center of the boxes represent the medians. The left and right ends of the whiskers represent the minimum and maximum scores, respectively. Asterisks indicate statistically-significant differences between groups (* p < 0.01).

    Figure 2: Individual list-trial scores. The left and middle graphs show, respectively, the individual scores of members of the sleep and wake groups for each list trial in the relearning session. The maximum score was 16. The symbols enclosed in the dashed box indicate the successive scores for those participants in the wake group who still needed to continue after all of the participants in the sleep group had reached the criterion. The graph on the right shows individual scores of members of the sleep and wake subgroups for each list trial; the subgroups were matched on their performance in the first list trial. The arrows indicate the point at which all the participants in a given group reached the criterion.

    Figure 3: Change in individual scores. Individual participants’ number of correct translations on the first list trial of the relearning session and at the delayed testing at 1 week is graphed separately for the wake and the sleep groups. The gray shaded area in each graph represents the remaining list trials in the relearning session. The dashed lines connect the two scores for each participant.

  388. ⁠, Jaap M. J. Murre, Joeri Dros (2015-01-25):

    We present a successful replication of Ebbinghaus’ classic forgetting curve from 1880 based on the method of savings. One subject spent 70 hours learning lists and relearning them after 20 min, 1 hour, 9 hours, 1 day, 2 days, or 31 days. The results are similar to Ebbinghaus’ original data. We analyze the effects of serial position on forgetting and investigate what mathematical equations present a good fit to the Ebbinghaus forgetting curve and its replications. We conclude that the Ebbinghaus forgetting curve has indeed been replicated and that it is not completely smooth but most probably shows a jump upwards starting at the 24 hour data point.
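
    The "method of savings" measures retention by how much faster a list is relearned than it was first learned. A minimal sketch, assuming the common textbook definition (time saved divided by original learning time); the paper itself may apply further corrections, and the function name and the example timings below are hypothetical, not data from the study:

```python
def savings(original_seconds: float, relearning_seconds: float) -> float:
    """Fraction of the original learning time saved when relearning a list.

    1.0 means the list was still fully known; 0.0 means relearning took as
    long as learning it the first time.
    """
    if original_seconds <= 0:
        raise ValueError("original learning time must be positive")
    return (original_seconds - relearning_seconds) / original_seconds


# Hypothetical timings: 1000 s to learn a list, 600 s to relearn it later.
print(savings(1000, 600))  # 0.4
```

    Plotting this fraction against the retention interval (20 min, 1 hour, 9 hours, and so on) gives the forgetting curve that the replication compares with Ebbinghaus’ original data.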

  389. ⁠, Janet Metcalfe (2017-01):

    Although error avoidance during learning appears to be the rule in American classrooms, laboratory studies suggest that it may be a counterproductive strategy, at least for neurologically typical students.

    Experimental investigations indicate that errorful learning followed by corrective feedback is beneficial to learning. Interestingly, the beneficial effects are particularly salient when individuals strongly believe that their error is correct: Errors committed with high confidence are corrected more readily than low-confidence errors. Corrective feedback, including analysis of the reasoning leading up to the mistake, is crucial.

    Aside from the direct benefit to learners, teachers gain valuable information from errors, and error tolerance encourages students’ active, exploratory, generative engagement. If the goal is optimal performance in high-stakes situations, it may be worthwhile to allow and even encourage students to commit and correct errors while they are in low-stakes learning situations rather than to assiduously avoid errors at all costs.

    [Keywords: errorless learning, generation effect, hypercorrection effect, feedback, after-action review (AAR), error management training (EMT), formative assessment, reconsolidation, prediction error]

    • Introduction
    • Encouraging Versus Discouraging Errors In The Classroom
    • Error Generation And Memory For Correct Responses In The Lab
    • Confidence In Errors
    • Exceptions
    • Implications Of The Hypercorrection Effect
    • Theories Of Why Errors Enhance Learning
    • Secondary Benefits Of Encouraging Errors
    • Origin Of The Idea That Errorless Learning Is A Good Thing
    • Emotional Consequences Of Errors
    • Using Errors To Improve Learning
    • Conclusion