Dual N-Back FAQ

A compendium of DNB, WM, IQ information up to 2015
DNB, psychology, experiments, R, survey, Bayes, IQ
2009-03-25–2019-12-05 in progress certainty: unlikely importance: 7

Dual N-Back is a kind of cognitive training exercise intended to expand your working memory (WM), and hopefully your intelligence (IQ1).

The theory originally went that novel2 cognitive processes tend to overlap and seem to go through the prefrontal cortex. As it happens, WM predicts and correlates with IQ3 and may use the same neural networks4, suggesting that WM might be IQ5. WM is known to be trainable, and so improving WM would hopefully improve IQ. And N-back is a family of tasks which stress attention and WM.

Later research found that performance and improvement on N-back seem to correlate better with IQ than with classic measures of WM like reciting lists of numbers, raising the question of whether N-back works via increasing WM, by improving self-control, by improving manipulation of WM contents (rather than WM's size), or by somehow training IQ directly.6 Performance on DNB has complicated correlations with performance on other tests of working memory or IQ, so it's not clear what it is tapping into. (And the link between WM and performance on IQ tests has been disputed; high WM as measured by OSPAN does not correlate well with performance on hard Raven's questions7, and the validity of single tests of WM training has been questioned8.)

Brain Workshop offers many modes, some far more elaborate than simple Dual N-back; no research has been done on them, so little can be said about what they are good for, what they train, or what improvements they may offer; Jaeggi 2010 seemed to find Single N-back better than Dual N-back. Some of the more elaborate modes seem to focus heavily on shifting the correct response among various modalities - not just sound, but left/right, e.g. - and so stress task-switching; there are results that task-switching can be trained and that it transfers9, but how useful this is and how well the BW modes train this are unknown.

The Argument

Working memory is important for learning and also just general intelligence.10 It's not too hard to see why working memory could be so important. Working memory boils down to 'how much stuff you can think about at the same time'.

Imagine a poor programmer who has suffered brain damage and has only enough working memory for 1 definition at a time. How could he write anything? To write a correct program, he needs to know simultaneously 2 things - what a variable, say, contains, and what is valid input for a program. But unfortunately, our programmer can know that the variable foo contains a string with the input, or he can know that the function processInput uses a string, but he can't remember these 2 things simultaneously! He will deadlock forever, unsure either what to do with this foo, or unsure what exactly processInput was supposed to work on.

More seri­ous­ly, work­ing mem­ory can be use­ful since it allows one to grasp more of the struc­ture of some­thing at any one time. Com­men­ta­tors on pro­gram­ming often write that one of the great chal­lenges of pro­gram­ming (be­sides the chal­lenge of accept­ing & deal­ing with the real­ity that a com­puter really is just a mind­less rule-fol­low­ing machine), is that pro­gram­ming requires one to keep in mind dozens of things and cir­cum­stances - any one of which could com­pletely bol­lix things up. Focus is absolutely essen­tial. One of the char­ac­ter­is­tics of great pro­gram­mers is their appar­ent omni­science. Obses­sion grants them this abil­ity to know what they are actu­ally doing:

“With programmers, it’s especially hard. Productivity depends on being able to juggle a lot of little details in short term memory all at once. Any kind of interruption can cause these details to come crashing down. When you resume work, you can’t remember any of the details (like local variable names you were using, or where you were up to in implementing that search algorithm) and you have to keep looking these things up, which slows you down a lot until you get back up to speed.” –Joel Spolsky, “Where do These People Get Their (Unoriginal) Ideas?”

“Several friends mentioned hackers’ ability to concentrate - their ability, as one put it, to ‘tune out everything outside their own heads.’ I’ve certainly noticed this. And I’ve heard several hackers say that after drinking even half a beer they can’t program at all. So maybe hacking does require some special ability to focus. Perhaps great hackers can load a large amount of context into their head, so that when they look at a line of code, they see not just that line but the whole program around it. John McPhee wrote that Bill Bradley’s success as a basketball player was due partly to his extraordinary peripheral vision. ‘Perfect’ eyesight means about 47° of vertical peripheral vision. Bill Bradley had 70; he could see the basket when he was looking at the floor. Maybe great hackers have some similar inborn ability. (I cheat by using a very dense language, which shrinks the court.) This could explain the disconnect over cubicles. Maybe the people in charge of facilities, not having any concentration to shatter, have no idea that working in a cubicle feels to a hacker like having one’s brain in a blender.” –Paul Graham, “Great Hackers”

It’s surprising, but bugs have a close relationship to the number of lines of code - no matter whether the language is as low-level as assembler or as high-level as Haskell (humorously, Norris’ number); is this because each line takes up a similar amount of working and short-term memory and there’s only so much memory to go around?11

The Silver Bullet

It’s not all that obvious, but just about every productivity innovation in computing is about either cutting down on how much a programmer needs to know (e.g. garbage collection), or making it easier for him to shuffle things in and out of his ‘short term memory’. Why are some commentators so focused12 on having multiple monitors? For that matter, why are there real studies showing surprisingly large productivity boosts from simply adding a second monitor?13 It’s not like the person is any different afterwards. And arguably multiple or larger monitors come with damaging overheads14.

Or, why do some programming teachers think touch-typing is one of the few skills programmers must know (along with reading)?15 Why does one Unix guru regret not learning to touch-type?16 Typing hardly seems very important - it’s what you say, not how you say it. The compiler doesn’t care if you typed the source code in at 30WPM or 120WPM, after all.

I love being able to type that with­out look­ing! It’s empow­er­ing, being able to type almost as fast as you can think. Why would you want it any other way?

The thing is, multiple monitors, touch-typing, speed-reading17 - they’re all about making the external world part of your mind. What’s the real difference between having a type signature in your short-term memory or prominently displayed in your second monitor? What’s the real difference between writing a comment in your mind or touch-typing it as fast as you create it?

WM problems

Just some speed. Just some time. And the more visible that type signature is, the faster you can type out that comment, the larger your ‘memory’ gets. And the larger your memory is, the more intelligent/productive you can be. (Think of this as the extended-mind thesis as applied to programming!) Great programmers often1819 talk vaguely about ‘keeping a system in your head’ or ‘having a model’, and hate distractions20, saying they destroy one’s carefully developed thoughts; I think what they are talking about is trying to store all the relevant details inside their short-term or working memory. Learning programming has a correlation with WM.21 (Once you start looking, you see this everywhere. Games, for example.22) Or in bug rates - WM has been proposed as the reason why small or large chunks of programs have proportionally more errors than medium-sized chunks23. It remains to be seen whether programming tools designed with an eye to memory will be helpful, though.

But as great as things like garbage col­lec­tion & touch-typ­ing & mul­ti­ple mon­i­tors are (I am a fan & user of the fore­go­ing), they are still imper­fect sub­sti­tutes. Would­n’t it be bet­ter if one could just improve one’s short-term/working mem­ory direct­ly? It might be more effec­tive, and cer­tainly would be more portable!


Unfortunately, in general, IQ/Gf and memory don’t seem to be trainable. Many apparent effects are swamped by exercise or nutrition or by simple practice. And when practice does result in gains on tasks or expensive games24, said benefits often do not transfer; many popular ‘brain games’ & exercises fail this criterion or at least have not been shown to transfer252627, even brainy skilled exercises like music28 or chess29 or memory competitions30. Catch-22 summed it up:

…General Dreedle wants his [pilots] to spend as much time on the skeet-shooting range as the facilities and their flight schedule would allow. Shooting skeet eight hours a month was excellent training for them. It trained them to shoot skeet.

Indeed, the general history of attempts to increase IQ in any children or adults remains essentially what it was when Arthur Jensen wrote his 1969 paper - a history of failure. The exceptions prove the rule by either applying to narrow groups with specific deficits or working only before birth, like iodine supplementation. (See also Algernon’s law: if there were an easy fitness-increasing way to make us smarter, evolution would have already used it.)

But hope springs eternal, and there are possible exceptions. The one this FAQ focuses on is Dual N-back, a variant on an old working-memory test.

One of the nice things about N-back is that while it may or may not improve your IQ, it may help you in other ways. WM training helps alcoholics reduce their consumption31 and increases patience in recovering stimulant addicts (cocaine & methamphetamine)32. The self-discipline or willpower of students correlates better with grades than even IQ33, WM correlates with grades and lower behavioral problems34 & WM out-predicts grades 6 years later in 5-year-olds & 2 years later in older children35. WM training has been shown to help children with ADHD36 and also preschoolers without ADHD37; Lucas 2008 found behavior improvements at a summer camp. Another intervention using a miscellany of ‘reasoning’ games with young (7-9 years old) poor children found Forwards Digit Span (but not Backwards) and IQ gains, with no gain for the subjects playing games requiring “rapid visual detection and rapid motor responses”38, but it’s worth remembering that IQ scores are unreliable in childhood39 or perhaps, as an adolescent brain-imaging study indicates40, they simply are much more malleable at that point. (WM training in teenagers doesn’t seem much studied but given their issues, may help; see “Beautiful Brains” or “The Trouble With Teens”.)

There are many kinds of WM train­ing. One review worth read­ing is “Does work­ing mem­ory train­ing work? The promise and chal­lenges of enhanc­ing cog­ni­tion by train­ing work­ing mem­ory” (Mor­ri­son & Chein 2011); “Is Work­ing Mem­ory Train­ing Effec­tive?” (Ship­stead, Redick, & Engle 2012) dis­cusses the mul­ti­ple method­olog­i­cal diffi­cul­ties of design­ing WM train­ing exper­i­ments (at least, they are diffi­cult if you want to show gen­uine improve­ments which trans­fer to non-WM skill­s).


The original N-back test simply asked that you remember a single stream of letters, and signal if any letters were precisely, say, 2 positions apart. ‘A S S R’ wouldn’t merit a signal, but ‘A S A R’ would, since there are ‘A’ characters exactly 2 positions away from each other. The program would give you another letter, you would signal or not, and so on. This is simple enough once you understand it, but is a little hard to explain. It may be best to read the Brain Workshop tutorial, or watch a video.
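The match rule above can be sketched in a few lines. This is a toy illustration of the rule, not Brain Workshop's actual code; the function name is my own:

```python
def nback_signals(stream, n=2):
    """Return the turn indices at which a correct player would signal:
    exactly those turns whose letter equals the letter n turns earlier."""
    return [i for i in range(n, len(stream))
            if stream[i] == stream[i - n]]

# 'A S S R' has no letter repeated at distance 2; 'A S A R' does (the two A's).
nback_signals(["A", "S", "S", "R"])  # -> []
nback_signals(["A", "S", "A", "R"])  # -> [2]
```

The same function covers any n, so raising the level only changes the lookback distance, not the rule.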

Dual N-back

In 2003, Susanne Jaeggi and her team began studies using a variant of N-back which tried to increase the burden on each turn - remembering multiple things instead of just 1. The abstract describes the reason why:

With ref­er­ence to sin­gle tasks, acti­va­tion in the pre­frontal cor­tex (PFC) com­monly increases with incre­men­tal mem­ory load, whereas for dual tasks it has been hypoth­e­sized pre­vi­ously that activ­ity in the PFC decreases in the face of exces­sive pro­cess­ing demands, i.e., if the capac­ity of the work­ing mem­o­ry’s cen­tral exec­u­tive sys­tem is exceed­ed. How­ev­er, our results show that dur­ing both sin­gle and dual tasks, pre­frontal acti­va­tion increases con­tin­u­ously as a func­tion of mem­ory load. An increase of pre­frontal acti­va­tion was observed in the dual tasks even though pro­cess­ing demands were exces­sive in the case of the most diffi­cult con­di­tion, as indi­cated by behav­ioral accu­racy mea­sures. The hypoth­e­sis con­cern­ing the decrease in pre­frontal acti­va­tion could not be sup­ported and was dis­cussed in terms of moti­va­tion fac­tors.41

In this ver­sion, called “dual N-back” (to dis­tin­guish it from the clas­sic sin­gle N-back), one is still play­ing a turn-based game. In the Brain Work­shop ver­sion, you are pre­sented with a 3x3 grid in which every turn, a block appears in 1 of the 9 spaces and a let­ter is spo­ken aloud. (There are any num­ber of vari­ants: the NATO pho­netic alpha­bet, piano keys, etc. And Brain Work­shop has any num­ber of mod­es, like ‘Arith­metic N-back’ or ‘Quin­tu­ple N-back’.)


In 1-back, the task is to cor­rectly answer whether the let­ter is the same as the pre­vi­ous round, and whether the posi­tion is the same as the pre­vi­ous round. It can be both, mak­ing 4 pos­si­ble responses (po­si­tion, sound, posi­tion+­sound, & nei­ther).

This stresses work­ing mem­ory since you need to keep in mind 4 things simul­ta­ne­ous­ly: the posi­tion and let­ter of the pre­vi­ous turn, and the posi­tion and let­ter of the cur­rent turn (so you can com­pare the cur­rent let­ter with the old let­ter and the cur­rent posi­tion with the old posi­tion). Then on the next turn you need to imme­di­ately for­get the old posi­tion & let­ter (which are now use­less) and remem­ber the new posi­tion and let­ter. So you are con­stantly remem­ber­ing and for­get­ting and com­par­ing.
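The per-turn comparison just described can be written out directly (a hypothetical sketch; the function name and the (position, letter) tuple encoding are my own, not Brain Workshop's):

```python
def dual_1back_response(prev, cur):
    """prev and cur are (position, letter) pairs for the previous and
    current turn; return which of the 4 possible responses is correct."""
    pos_match = prev[0] == cur[0]
    let_match = prev[1] == cur[1]
    if pos_match and let_match:
        return "position+sound"
    if pos_match:
        return "position"
    if let_match:
        return "sound"
    return "neither"

# Same square, different letter -> position match only; and vice versa.
dual_1back_response((4, "C"), (4, "H"))  # -> "position"
dual_1back_response((4, "C"), (7, "C"))  # -> "sound"
```

After each comparison the previous pair is discarded and the current pair takes its place, which is the constant remembering-and-forgetting the paragraph describes.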


But 1-back is pretty easy. The turns come fast enough that you could easily keep the letters in your phonological loop and lighten the load on your working memory. Indeed, after 10 rounds or so of 1-back, I mastered it - I now get 100%, unless I forget for a second that it’s 1-back and not 2-back (or I simply lose my concentration completely). Most people find 1-back very easy to learn, although a bit challenging at first since the pressure is constant (games and tests usually have some slack or rest periods).

The next step up is a doozy: 2-back. In 2-back, you do the same thing as 1-back but as the name sug­gests, you are instead match­ing against 2 turns ago. So before you would be look­ing for repeated let­ters - ‘AA’ - but now you need to look for sep­a­rated let­ters - ‘ABA’. And of course, you can’t for­get so quick­ly, since you still need to match against some­thing like ‘ABABA’.

2-back stresses your working memory even more, as now you are remembering 6 things, not 4: 2 turns ago, the previous turn, and the current turn - all of which have 2 salient features. At 6 items, we’re also in the mid-range of estimates of working-memory capacity:

Working memory is generally considered to have limited capacity. The earliest quantification of the capacity limit associated with short-term memory was the “magical number seven” introduced by Miller (1956). He noticed that the memory span of young adults was around seven elements, called chunks, regardless of whether the elements were digits, letters, words, or other units. Later research revealed that span does depend on the category of chunks used (e.g., span is around seven for digits, around six for letters, and around five for words), and even on features of the chunks within a category…. Several other factors also affect a person’s measured span, and therefore it is difficult to pin down the capacity of short-term or working memory to a number of chunks. Nonetheless, Cowan (2001) has proposed that working memory has a capacity of about four chunks in young adults (and fewer in children and old adults).

And even if there are only a few things to remember, the number of responses you have to choose between goes up exponentially with how many ‘modes’ there are, so Triple N-back has not ⅓ more possible responses than Dual N-back, but more than twice as many: if m is the number of modes, then the number of possible responses is 2^m − 1 (the −1 is there because one can match nothing in every mode, but that’s boring and requires no choice or thought), so DNB has 3 possible responses42, while TNB has 743, Quadruple N-back 1544, and Quintuple N-back 3145!
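The response counts follow directly from the formula; a response is any non-empty subset of the m modalities matching, and this small sketch just evaluates that:

```python
def n_responses(m):
    """Number of possible responses with m modalities: every non-empty
    subset of modalities can match, i.e. 2**m - 1."""
    return 2 ** m - 1

# Dual, Triple, Quadruple, Quintuple N-back respectively.
[n_responses(m) for m in (2, 3, 4, 5)]  # -> [3, 7, 15, 31]
```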

Worse, the tem­po­ral gap between ele­ments is deeply con­fus­ing. It’s par­tic­u­larly bad when there’s rep­e­ti­tion involved - if the same square is selected twice with the same let­ter, you might wind up for­get­ting both!

So 2-back is where the chal­lenge first really man­i­fests. After about 20 games I started to get the hang of it. (It helped to play a few games focus­ing only on one of the stim­uli, like the let­ters; this helps you get used to the ‘reach­ing back’ of 2-back.)

Personal reflection on results

Have I seen any ben­e­fits yet? Not real­ly. Thus far it’s like med­i­ta­tion: I haven’t seen any spe­cific improve­ments, but it’s been inter­est­ing just to explore con­cen­tra­tion - I’ve learned that my abil­ity to focus is much less than I thought it was! It is very sober­ing to get 30% scores on some­thing as triv­ial as 1-back and strain to reach D2B, and even more sober­ing to score 60% and min­utes later score 20%. Besides the intrin­sic inter­est of chang­ing one’s brain through a sim­ple exer­cise - med­i­ta­tion is equally inter­est­ing for how one’s mind refuses to coop­er­ate with the sim­ple work of med­i­tat­ing, and I under­stand that there are even vivid hal­lu­ci­na­tions at the higher lev­els - N-back might func­tion as a kind of men­tal cal­is­then­ics. Few peo­ple exer­cise and stretch because they find the activ­i­ties intrin­si­cally valu­able, but they serve to fur­ther some other goal; some peo­ple jog because they just enjoy run­ning, but many more jog so they can play soc­cer bet­ter or live longer. I am young, and it’s good to explore these sorts of cal­is­then­ics while one has a long life ahead of one; then one can reap the most ben­e­fits.


N-back train­ing is some­times referred to sim­ply as ‘N-back­ing’, and par­tic­i­pants in such train­ing are called ‘N-back­ers’. Almost every­one uses the Free, fea­ture­ful & portable pro­gram Brain Work­shop, abbre­vi­ated “BW” (but see the soft­ware sec­tion for alter­na­tives).

There are many vari­ants of N-back train­ing. A 3-let­ter acronym end­ing in ‘B’ spec­i­fies one of the pos­si­bil­i­ties. For exam­ple, ‘D2B’ and ‘D6B’ both refer to a dual N-back task, but in the for­mer the depth of recall is 2 turns, while in the lat­ter one must remem­ber back 6 rounds; the ‘D’, for ‘Dual’, indi­cates that each round presents 2 stim­uli (usu­ally the posi­tion of the square, and a spo­ken let­ter).

But one can add fur­ther stim­uli: spo­ken let­ter, posi­tion of square, and color of square. That would be ‘Triple N-back’, and so one might speak of how one is doing on ‘T4B’.

One can go fur­ther. Spo­ken let­ter, posi­tion, col­or, and geo­met­ric shape. This would be ‘Quad N-back’, so one might dis­cuss one’s per­for­mance on ‘Q3B’. (It’s unclear how to com­pare the var­i­ous mod­es, but it seems to be much harder to go from D2B to T3B than to go from D2B to D3B.)

Past QNB, there is Pen­tu­ple N-back (PNB) which was added in Brain Work­shop 4.7 (video demon­stra­tion). The 5th modal­ity is added by a sec­ond audio chan­nel - that is, now sounds are in stereo.
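The shorthand decomposes mechanically into a stimulus count and a recall depth. The helper below is hypothetical (not part of Brain Workshop or community tooling), just to make the convention explicit:

```python
# Hypothetical mapping from the abbreviation's first letter to the number
# of simultaneous stimuli: Single, Dual, Triple, Quadruple, Pentuple.
PREFIXES = {"S": 1, "D": 2, "T": 3, "Q": 4, "P": 5}

def parse_mode(abbrev):
    """Parse e.g. 'D6B' into (stimuli_per_turn, n-back depth)."""
    assert abbrev[-1] == "B", "abbreviations end in 'B' for 'back'"
    return PREFIXES[abbrev[0]], int(abbrev[1:-1])

parse_mode("D2B")  # -> (2, 2): dual task, 2 turns of recall
parse_mode("T4B")  # -> (3, 4): triple task, 4 turns of recall
```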

Other abbreviations are in common use: ‘WM’ for ‘working memory’, ‘Gf’ for ‘fluid intelligence’, and ‘g’ for the general intelligence factor measured by IQ tests.

Notes from the author

N-back in general

To those whose time is lim­it­ed: you may wish to stop read­ing here. If you seek to improve your life, and want the great­est ‘bang for the buck’, you are well-ad­vised to look else­where.

Meditation, for example, is easier, faster, and ultra-portable. Typing training will directly improve your facility with a computer, a valuable skill for this modern world. Spaced-repetition memorization techniques offer unparalleled advantages to students. Nootropics are the epitome of ease (just swallow!), and their effects are much more easily assessed - one can even run double-blind experiments on oneself, impossible with dual N-back. Other supplements like melatonin can deliver benefits incommensurable with DNB - what is the cognitive value of another number in working memory thanks to DNB compared to a good night’s sleep thanks to melatonin? Modest changes to one’s diet and environs can fundamentally improve one’s well-being. Even basic training in speed-reading, with the crudest techniques, can pay large dividends if one is below a basic reading level like 200 WPM & still subvocalizing. And all of these can start paying off immediately.

DNB, on the other hand, requires a minimum of 15 hours before one can expect genuine somatic improvements. The task itself is unproven - the Jaeggi studies are suggestive, not definitive (and there are contrary results). Programs for DNB training rely essentially on guesswork as they explore the large design-space; there are no data on what features are essential, what sort of presentation is optimal, or even how long or when to train for. The task itself is unenjoyable. It can be wearying, difficult & embarrassing. It can be one too many daily tasks, a straw which breaks the camel’s back, and a distraction from whatever activity would be most valuable for one46 and one ought to be doing instead.

So why then do I per­se­vere with DNB?

I do it because I find it fas­ci­nat­ing. Fas­ci­nat­ing that WM can be so large a part of IQ; fas­ci­nat­ing that it can be increased by an appar­ently triv­ial exer­cise. I’m fas­ci­nated that there are mea­sur­able gross changes in brain activ­ity & chem­istry & com­po­si­tion47 - that the effects are not purely ‘men­tal’ or place­bo. I’m fas­ci­nated by how the sequence of posi­tions and let­ters can at some times appear in my mind with bound­less lucid­i­ty, yet at other times I grope con­fused in a men­tal murk unsure of even what the last position/letter was - even though I can rise from my com­puter and go about nor­mal activ­i­ties nor­mal­ly; or with how time can stretch and com­press dur­ing N-back­ing48. I’m fas­ci­nated by how a sin­gle increase in n-level can ren­der the task night­mar­ishly diffi­cult when I just fin­ished n-1 at 90 or 100%. I’m fas­ci­nated by how sac­cad­ing, another appar­ently triv­ial exer­cise, can reli­ably boost my score by 10 or 20%, and how my mind seems to be fagged after just a few rounds but recov­ers within min­utes. I’m equally fas­ci­nated by the large lit­er­a­ture on WM: what it is, what’s it good for, how it can be manip­u­lat­ed, etc.

I do not think that DNB is ter­ri­bly prac­ti­cal - but inter­est­ing? Very.

Reading this FAQ

Brian: “Look, you’ve got it all wrong! You don’t need to follow me, you don’t need to follow anybody! You’ve got to think for yourselves! You’re all individuals!”

The Crowd: “Yes! We’re all indi­vid­u­als!”49

This FAQ is almost solely my own work. I’ve striven to make it fair, to incor­po­rate most of the rel­e­vant research, and to not omit things. But inevitably I will have made errors or impor­tant omis­sions. You must read this skep­ti­cal­ly.

You must read this skeptically also because the N-back community formed around the mailing list is a community. That means it is prone to all the biases and issues of a community. One would expect a community formed around a technique or practice to be made up only of people who find value in it; any material (like this FAQ or included testimonials) is automatically suspect due to biases such as selection bias. Imagine if scientists published only papers which showed new results, and no papers reporting failure to replicate! Why would any N-backer hang around who had discovered that DNB was not useful or a fraud? Certainly the fans would not thank him. (Eliezer Yudkowsky has an excellent essay called “Evaporative Cooling of Group Beliefs” on this topic; fortunately, the damage caused by a dual n-back community would be limited, in comparison to some other examples of evaporative cooling like mind-control victims.)

Finally, you must read skeptically because this is about psychology. Psychology is notorious for being one of the hardest scientific fields to get solid results in, because everybody is WEIRD and different. As one of my professors joked, “if you have 2 psychology papers reporting the same result, one of them is wrong”; there are many issues with taking a psychology study at face value (to which I have devoted an appendix, “Flaws in mainstream science (and psychology)”). It’s very tempting to engage in post hoc ergo propter hoc reasoning, but you mustn’t. Everybody is different; your positive (or negative) result could be due to a placebo effect, it could be thanks to that recent shift in your sleep schedule for the better50, or that nap you took51, it could be the exercise you’re getting52, it could be a mild depression lifting (or setting in), it could be a calcium or iodine deficiency53, other factors545556, variation in motivation, etc.

N-back training

Should I do multiple daily sessions, or just one?

Most users seem to go for one long N-back session, pointing out that a single long session exercises one’s focus. Others do one session in the morning and one in the evening so they can focus better on each one. There is some scientific support for the idea that evening sessions are better than morning sessions, though; see Kuriyama 2008 on how practice before bedtime was more effective than after waking up.

If you break up ses­sions into more than 2, you’re prob­a­bly wast­ing time due to over­head, and may not be get­ting enough exer­cise in each ses­sion to really strain your­self like you need to.


The simplest mental strategy, and perhaps the most common, is to mentally rehearse a list of the last n items: each round, forget the oldest item and remember the newest in its place. This begins to break down at higher levels - if one is repeating the list mentally, the repetition can just take too long.
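This rehearsal strategy amounts to a fixed-size queue: keep the last n items and drop the oldest each turn. A sketch of that bookkeeping (an illustration of the strategy, not any N-backer's actual code), using a bounded deque:

```python
from collections import deque

def play_rehearsal(stream, n):
    """Yield (current_item, item_from_n_turns_ago) pairs for each turn
    on which a comparison is possible (i.e. once n items are queued)."""
    last_n = deque(maxlen=n)  # the oldest item falls off automatically
    for item in stream:
        if len(last_n) == n:
            yield item, last_n[0]  # compare current vs. n turns back
        last_n.append(item)

# For 2-back on A S A R: turn 3 compares A vs A, turn 4 compares R vs S.
list(play_rehearsal(["A", "S", "A", "R"], 2))  # -> [("A", "A"), ("R", "S")]
```

The breakdown at high n is visible here too: the queue itself is cheap, but a human rehearsing it must re-recite all n items every turn, which is what takes too long.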

Surcer writes up a list of strate­gies for differ­ent lev­els in his “My Sys­tem, let’s share strate­gies” thread.

Are strategies good or bad?

Peo­ple fre­quently ask and dis­cuss whether they should use some sort of strat­e­gy, and if so, what.

A num­ber of N-back­ers adopt an ‘intu­ition’ strat­e­gy. Rather than explic­itly rehears­ing sequences of let­ters (‘f-up, h-mid­dle; f-up, h-mid­dle; g-down, f-up…’), they sim­ply think very hard and wait for a feel­ing that they should press ‘a’ (au­dio match), or ‘l’ (lo­ca­tion match). Some, like SwedishChef can be quite vocif­er­ous about it:

The chal­lenges are in help­ing peo­ple under­stand that dual-n-back is NOT about remem­ber­ing n num­ber of visual and audi­tory stim­uli. It’s about devel­op­ing a new men­tal process that intu­itively rec­og­nizes when it has seen or heard a stim­uli n times ago.

Ini­tial­ly, most stu­dents of dual n-back want to remem­ber n items as fast as they can so they can con­quer the dual-n-back hill. They use their own already devel­oped tech­niques to help them remem­ber. They may try to hold the images in their head men­tally and review them every time a new image is added and say the sounds out loud and review the sounds every time a new sound is added. This is NOT what we want. We want the brain to learn a new process that intu­itively rec­og­nizes if an item and sound was shown 3 back or 4 back. It’s sort of like play­ing a new type of musi­cal instru­ment.

I’ve helped some stu­dents on the site try to under­stand this. It’s not about how much you can remem­ber, it’s about learn­ing a new process. In the­o­ry, this new process trans­lates into a bet­ter work­ing mem­o­ry, which helps you make con­nec­tions bet­ter and faster.

Other N-back­ers think that intu­ition can’t work, or at least does­n’t very well:

I don’t believe that much in the “intu­itive” method. I mean, sure, you can intu­itively remem­ber you heard the same let­ter or saw the square at the same posi­tion a few times ago, but I fail to see how you can “feel” it was exactly 6 or 7 times ago with­out some kind of “active” remem­ber­ing. –Gaël DEEST

I totally agree with Gaël about the intu­itive method not hold­ing much water…­For me a lot of times the intu­itive method can be totally unre­li­able. You’ll be doing 5-back one game and a few games later your fail­ing mis­er­ably at 3-back­..y­our score all over the place. Plus, intu­itive-wise, it’s best to play the same n-back level over and over because then you train your intu­ition…and that does­n’t seem right. –MikeM (same thread)

Few N-back­ers have sys­tem­at­i­cally tracked intu­itive ver­sus strate­gic play­ing; Dark­Alrx reports on his blog the results of his exper­i­ment, and while he con­sid­ers them pos­i­tive, oth­ers find them incon­clu­sive, or like Pheonex­ia, even unfa­vor­able for the intu­itive approach:

Look­ing at your graphs and the over­all drop in your per­for­mance, I think it’s clear that intu­itive does­n’t work. On your score sheet, the first pic­ture, using the intu­itive method over 38 days of TNB train­ing in 44 days your aver­age n-back increased by less than .25. You were per­form­ing much bet­ter before. With your neu­ro­ge­n­e­sis exper­i­ment, your aver­age n-back actu­ally decreased.

Jaeggi herself was more moderate in ~2008:

I would NOT recommend you [train the visual and auditory task separately] if you want to train the dual-task (the one we used in our study). The reason is that the combination of both modalities is an entirely different task than doing both separately! If you do the task separately, I assume you use some "rehearsal strategies", e.g. you repeat the letters or positions for yourself. In the dual-task version however, these strategies might be more difficult to apply (since you have to do 2 things simultaneously…), and that is exactly what we want… We don't want to train strategies, we want to train processes. Processes that then might help you in the performance of other, non-trained tasks (and that is our ultimate goal). So, it is not important to reach a 7- or 8-back… It is important to fully focus your attention on the task as well as possible.

I can assure you, it is a very tough training regimen…. You can't divert your attention even 1 second (I'm sure you have noticed…). But eventually, you will see that you get better at it and maybe you notice that you are better able to concentrate on certain things, to remember things more easily, etc. (hopefully).

(Unfortunately, doubt has been cast on this advice by the apparent effectiveness of single n-back in Jaeggi 2010. If single (visual/position) n-back is effective in increasing IQ, then maybe training just audio or just visual is actually a good idea.)

this is a question i am being asked a lot and unfortunately, i don't really know whether i can help with that. i can only tell you what we tell (or rather not tell) our participants and what they tell us. so, first of all, we don't tell people at all what strategy to use - it is up to them. thing is, there are some people that tell us what you describe above, i.e. some of them tell us that it works best if they don't use a strategy at all and just "let the squares/letters flow by". but of course, many participants also use more conscious strategies like rehearsing or grouping items together. but again - we let people chose their strategies themselves! ref

But it may make no difference. Even if you are engaged in a complex mnemonic-based strategy, you're still working your memory. And strategies may simply not work; quoting from Jaeggi's 2008 paper:

By this account, one reason for having obtained transfer between working memory and measures of Gf is that our training procedure may have facilitated the ability to control attention. This ability would come about because the constant updating of memory representations with the presentation of each new stimulus requires the engagement of mechanisms to shift attention. Also, our training task discourages the development of simple task-specific strategies that can proceed in the absence of controlled allocation of attention.

Even if strategies do work, they may not be a good idea; quoting from Jaeggi 2010:

We also proposed that it is important that participants only minimally learn task-specific strategies in order to prevent specific skill acquisition. We think that besides the transfer to matrix reasoning, the improvement in the near transfer measure provides additional evidence that the participants trained on task-underlying processes rather than relying on material-specific strategies.

So even if a trick lets you jump from 3-back to 5-back, Brain Workshop will simply keep escalating the difficulty until you are challenged again. It's not the level you reach that matters, but the work you do.

And the flashing right/wrong feedback?

A matter of preference, although those in favor of disabling the visual feedback (SHOW_FEEDBACK = False) seem to be slightly more vocal or numerous. Brain Twister apparently doesn't give feedback. Jaeggi says:

the gaming literature also disagrees on this issue - there are different ways to think about this: whereas feedback after each trial gives you immediate feedback whether you did right or wrong, it can also be distracting as you are constantly monitoring (and evaluating) your performance. we decided that we wanted people to fully and maximally concentrate on the task itself and thus chose the approach to only give feedback at the end of the run. however, we have newer versions of the task for kids in which we give some sort of feedback (points) for each trial. thus - i can't tell you what the optimal way is - i guess there are interindividual differences and preferences as well.

Jonathan Toomim writes:

When I was doing visual psychophysics research, I heard from my labmates that this question has been investigated empirically (at least in the context of visual psychophysics), and that the consensus in the field is that using feedback reduces immediate performance but improves learning rates. I haven't looked up the research to confirm their opinion, but it sounds plausible to me. I would also expect it to apply to Brain Workshop. The idea, as I see it, is that feedback reduces performance because, when you get an answer wrong and you know it, your brain goes into an introspective mode to analyze the reason for the error and (hopefully) correct it, but while in this mode your brain will be distracted from the task at hand and will be more likely to miss subsequent trials.

How can I do better on N-back?

Focus harder. Play more. Sleep well, and eat healthily. Use natural lighting57. Space out practice. The less stressed you are, the better you can do.


Penner et al 2012

This study compared a high intensity working memory training (45 minutes, 4 times per week for 4 weeks) with a distributed training (45 minutes, 2 times per week for 8 weeks) in middle-aged, healthy adults… Our results indicate that the distributed training led to increased performance in all cognitive domains when compared to the high intensity training and the control group without training. The most significant differences revealed by interaction contrasts were found for verbal and visual working memory, verbal short-term memory and mental speed.

This is reminiscent of sleep's involvement in other forms of memory and cognitive change, and of Kuriyama 2008.


Curtis Warren noticed that when he underwent a 4-day routine of practicing more than 4 hours a day, he jumped an entire level even on quad N-back58:

For example, over the past week I have been trying a new training routine. My goal was to increase my intelligence as quickly as possible. To that end, over the past 4 days I've done a total of roughly 360 sessions @ 2 seconds per trial (≈360 minutes of training). I had to rest on Wednesday, and I'm resting again today (I only plan on doing about 40 trials today). But I intend to finish off the week by doing 100 sessions on Saturday and another 100 on Sunday. Or more, if I can manage it.

But he cautions us that besides being a considerable time investment, it may only work for him:

The point is, while I can say without a doubt that this schedule has been effective for me, it might not be effective for you. Are the benefits worth the amount of work needed? Will you even notice an improvement? Is this healthy? These are all factors which depend entirely upon the individual actually doing the training.

Raman started DNB training, and in his first 30 days, he "took breaks every 5 days or so, and was doing about 20-30 session[s] each day and n-back wise I made good gains (from 2 to 7 touching 9 on the way)"; he kept a journal on the mailing list about the experience, with daily updates.

Alas, neither Raman nor Warren took an IQ or digit-span test before starting, so they can only report DNB level increases & subjective assessments.

The research does suggest that diminishing returns do not set in with training regimes of 10 or 15 minutes a day; for example, Nutley 2011 trained 4-year-olds in WM exercises, Gf (NVR) exercises, or both:

…These analyses took into account that the groups differed in the amount of training received, full dose for NVR or WM groups or half dose for the CB group (Table 3). Even though the pattern is not consistent across all tests (see Figure 2), this is interpreted as confirmation of the linear dose effect that was expected to be seen. Our results suggest that the amount of transfer to non-trained tasks within the trained construct was roughly proportionate to the amount of training on that construct. A similar finding, with transfer proportional to amount of training, was reported by Jaeggi et al. (2008). This has possible implications for the design of future cognitive training paradigms and suggests that the training should be intensive enough to lead to significant transfer and that training more than one construct does not entail any advantages in itself. The training effect presumably reaches asymptote, but where this occurs is for future studies to determine. It is probably important to ensure that participants spend enough time on each task in order to see clinically significant transfer, which may be difficult when increasing the number of tasks being trained. This may be one of the explanations for the lack of transfer seen in the (training six tasks in 10 minutes).

Plateauing, or, am I wasting time if I can’t get past 4-back?

Some people start n-backing with great vigor and rapidly ascend levels until suddenly they stop improving and panic, wondering if something is wrong with them. Not at all! Reaching a high level is a good thing, and doing so in just a few weeks is all the more impressive, since most members take much longer than, say, 2 weeks to reach good scores on D4B. In fact, if you look at the reports in the Group survey, most reports are of plateauing at D4B or D5B months in.

The crucial thing about N-back is simply that you are stressing your working memory. The actual level doesn't matter very much, just whether you can barely manage it; it is somewhat like lifting weights, in that regard. From Jaeggi 2008:

The finding that the transfer to Gf remained even after taking the specific training effect into account seems to be counterintuitive, especially because the specific training effect is also related to training time. The reason for this capacity might be that participants with a very high level of n at the end of the training period may have developed very task specific strategies, which obviously boosts n-back performance, but may prevent transfer because these strategies remain too task-specific (5, 20). The averaged n-back level in the last session is therefore not critical to predicting a gain in Gf; rather, it seems that working at the capacity limit promotes transfer to Gf.

Mailing list members report benefits even if they have plateaued at 3 or 4-back; see the benefits section.

One commonly reported tactic for breaking a plateau is to deliberately advance a level (or increase modalities) and practice hard on that extra-difficult task, the idea being that this will spur adaptation and make one capable of the lower level.

Do breaks undo my work?

Some people have wondered whether not n-backing for a day, week, month, or other extended period undoes all their hard work, and hence whether n-backing is even useful in the long term.

Multiple group members have pointed to long gaps in their training, sometimes multiple months up to a year, which did not change their scores significantly (immediately after the break, scores may dip a level or a few percentage points in accuracy, but they quickly rise to the old level). Some members have ceased n-backing for 2 or 3 years, and found their scores dropped by only 2-4 levels - far from 1 or 2-back. (Pontus Granström, on the other hand, took a break for several months and fell for a long period from D8B-D9B to D6B-D7B; he speculates it might reflect a lack of motivation.) huhwhat/Nova fell 5 levels from D9B but recovered quickly:

I've been training with n-back on and off, mostly off, for the past few years. I started about 3 years ago and was able to get up to 9-n back, but on average I would be doing around 6 or 7 n back. Then I took a break for a few years. Now after coming back, even though I have had my fair share of partying, boxing, light drugs, even polyphasic sleep, on my first few tries I was able to get back up to 5-6, and a week into it I am back at getting up to 9 n back.

This anecdotal evidence is supported by at least one WM-training letter, Chrabaszcz 2010:

Figure 1b illustrates the degree to which training transferred to an ostensibly different (and untrained) measure of verbal working memory compared to a no-contact control group. Not only did training significantly increase verbal working memory, but these gains persisted 3 months following the cessation of training!

Similarly, Dahlin 2008 found WM training gains which were durable over more than a year:

The authors investigated immediate training gains, transfer effects, and 18-month maintenance after 5 weeks of computer-based training in updating of information in working memory in young and older subjects. Trained young and older adults improved significantly more than controls on the criterion task (letter memory), and these gains were maintained 18 months later. Transfer effects were in general limited and restricted to the young participants, who showed transfer to an untrained task that required updating (3-back)…

I heard 12-back is possible

Some users have reported being able to go all the way up to 12-back; Ashirgo regularly plays at D13B, but the highest at other modes seems to be T9B and Q6B.

Ashirgo offers an 8-point scheme for accomplishing such feats:

  1. 'Be focused at all cost. Fluid intelligence itself is sometimes called "the strength of focus".
  2. You had better not rehearse the last position/sound. It will eventually decrease your performance! I mean "step by step" rehearsal: it will slow you down and distract you. The only rehearsal allowed should be nearly unconscious and "effortless" (you will soon realize its meaning :)
  3. Both points 1 & 2 thus imply that you must be focused on the most current stimulus as strongly as you can. Nevertheless, you cannot forget about the previous stimuli. How to do that? You should hold an image of them (image, picture, drawing, whatever you like) in your mind. Notice that you still do not rehearse anything that way.
  4. Consider dividing the stream of data (n) into smaller parts. 6-back will then be two 3-backs, for instance.
  5. Follow the square with your eyes as it changes its position.
  6. Just turn on the Jaeggi mode with all the options to ensure your task is closest to the original version.
  7. Consider doing more than 20 trials. I am on my way to doing no less than 30 today. It may also help.
  8. You may lower the difficulty by reducing the fall-back and advance levels from >75 and =<90 to 70 and 85 respectively (for instance).'
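Point 8 refers to the staircase mechanic behind adaptive n-back: after each session, accuracy is compared against an advance threshold and a fallback threshold to decide the next n. A minimal sketch of such a rule in Python; the function name and the 70/85 defaults are illustrative (Ashirgo's relaxed settings rather than Brain Workshop's actual defaults):

```python
def next_level(n, percent_correct, fallback=70, advance=85):
    """Staircase rule (sketch): raise n on high accuracy,
    drop back on low accuracy, otherwise hold steady."""
    if percent_correct >= advance:
        return n + 1                    # challenged too little: level up
    if percent_correct < fallback and n > 1:
        return n - 1                    # struggling: level down
    return n                            # in the sweet spot: stay put
```

Lowering both thresholds, as suggested, means falling back less often and advancing more easily - hence an easier overall progression.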

What’s some relevant research?

Training WM tasks has yielded a literature of mixed results - for every positive, there's a negative, it seems. The following sections of positive and null results illustrate that, as do the papers themselves; from Nutley 2011:

However, there are some studies using several WM tasks to train that have also shown transfer effects to reasoning tasks (Klingberg, Fernell, Olesen, Johnson, Gustafsson, Dahlström, Gillberg, Forssberg & Westerberg, 2005; Klingberg, Forssberg & Westerberg, 2002), while other WM training studies have failed to show such transfer (Dahlin, Neely, Larsson, Backman & Nyberg, 2008; Holmes, Gathercole, Place, Dunning, Hilton & Elliott, 2009; Thorell, Lindqvist, Bergman Nutley, Bohlin & Klingberg, 2009). Thus, it is still unclear under which conditions effects of WM training transfer to Gf.

Other intervention studies have included training of attention or executive functions. Rueda and colleagues trained attention in a sample of 4- and 6-year-olds and found significant gains in intelligence (as measured with the Kaufman Brief Intelligence Test) in the 4-year-olds but only a tendency in the group of 6-year-olds (Rueda, Rothbart, McCandliss, Saccomanno & Posner, 2005). A large training study with 11,430 participants revealed practically no transfer after a 6-week intervention (10 min/day, 3 days a week) of a broader range of tasks including reasoning and planning or memory, visuo-spatial skills, mathematics and attention (). However, this study lacked control in sample selection and compliance. In summary, it is still an open question to what extent Gf can be improved by targeted training.

Working memory training, including variants on dual n-back, has been shown to physically change/increase the distribution of white matter in the brain59.

Physical changes have been linked to WM training and n-backing. For example, Olesen PJ, Westerberg H, Klingberg T (2004), "Increased prefrontal and parietal activity after training of working memory", Nature Neuroscience 7:75-79; about this study, Kuriyama writes:

"Olesen et al. (2004) presented progressive evidence obtained by functional magnetic resonance imaging that repetitive training improves spatial WM performance [both accuracy and response time (RT)] associated with increased cortical activity in the middle frontal gyrus and the superior and inferior parietal cortices. Such a finding suggests that training-induced improvement in WM performance could be based on neural plasticity, similar to that for other skill-learning characteristics."

Westerberg 2007, "Changes in cortical activity after training of working memory - a single-subject analysis":

"…Practice on the WM tasks gradually improved performance and this effect lasted several months. The effect of practice also generalized to improve performance on a non-trained WM task and a reasoning task. After training, WM-related brain activity was significantly increased in the middle and inferior frontal gyrus. The changes in activity were not due to activations of any additional area that was not activated before training. Instead, the changes could best be described by small increases in the extent of the area of activated cortex. The effect of training of WM is thus in several respects similar to the changes in the functional map observed in primate studies of skill learning, although the physiological effect in WM training is located in the prefrontal association cortex."

Executive functions, including working memory and inhibition, are of central importance to much of human behavior. Interventions intended to improve executive functions might therefore serve an important purpose. Previous studies show that working memory can be improved by training, but it is unknown if this also holds for inhibition, and whether it is possible to train executive functions in preschoolers. In the present study, preschool children received computerized training of either visuo-spatial working memory or inhibition for 5 weeks. An active control group played commercially available computer games, and a passive control group took part in only pre- and posttesting. Children trained on working memory improved significantly on trained tasks; they showed training effects on non-trained tests of spatial and verbal working memory, as well as transfer effects to attention. Children trained on inhibition showed a significant improvement over time on two out of three trained task paradigms, but no significant improvements relative to the control groups on tasks measuring working memory or attention. In neither of the two interventions were there effects on non-trained inhibitory tasks. The results suggest that working memory training can have significant effects also among preschool children. The finding that inhibition could not be improved by either one of the two training programs might be due to the particular training program used in the present study or possibly indicate that executive functions differ in how easily they can be improved by training, which in turn might relate to differences in their underlying psychological and neural processes.

A neural network underlying attentional control involves the anterior cingulate in addition to lateral prefrontal areas. An important development of this network occurs between 3 and 7 years of age. We have examined the efficiency of attentional networks across age and after 5 days of attention training (experimental group) compared with different types of no training (control groups) in 4-year-old and 6-year-old children. Strong improvement in executive attention and intelligence was found from ages 4 to 6 years. Both 4- and 6-year-olds showed more mature performance after the training than did the control groups. This finding applies to behavioral scores of the executive attention network as measured by the attention network test, event-related potentials recorded from the scalp during attention network test performance, and intelligence test scores. We also documented the role of the temperamental factor of effortful control and the DAT1 gene in individual differences in attention. Overall, our data suggest that the executive attention network appears to develop under strong genetic control, but that it is subject to educational interventions during development.

Behavioural findings indicate that the core executive functions of inhibition and working memory are closely linked, and neuroimaging studies indicate overlap between their neural correlates. There has not, however, been a comprehensive study, including several inhibition tasks and several working memory tasks, performed by the same subjects. In the present study, 11 healthy adult subjects completed separate blocks of 3 inhibition tasks (a stop task, a go/no-go task and a flanker task), and 2 working memory tasks (one spatial and one verbal). Activation common to all 5 tasks was identified in the right inferior frontal gyrus, and, at a lower threshold, also the right middle frontal gyrus and right parietal regions (BA 40 and BA 7). Left inferior frontal regions of interest (ROIs) showed a significant conjunction between all tasks except the flanker task. The present study could not pinpoint the specific function of each common region, but the parietal region identified here has previously been consistently related to working memory storage and the right inferior frontal gyrus has been associated with inhibition in both lesion and imaging studies. These results support the notion that inhibitory and working memory tasks involve common neural components, which may provide a neural basis for the interrelationship between the two systems.

Recent functional neuroimaging evidence suggests a bottleneck between learning new information and remembering old information. In two behavioral experiments and one functional MRI (fMRI) experiment, we tested the hypothesis that learning and remembering compete when both processes happen within a brief period of time. In the first behavioral experiment, participants intentionally remembered old words displayed in the foreground, while incidentally learning new scenes displayed in the background. In line with a memory competition, we found that remembering old information was associated with impaired learning of new information. We replicated this finding in a subsequent fMRI experiment, which showed that this behavioral effect was coupled with a suppression of learning-related activity in visual and medial temporal areas. Moreover, the fMRI experiment provided evidence that left mid-ventrolateral prefrontal cortex is involved in resolving the memory competition, possibly by facilitating rapid switching between learning and remembering. Critically, a follow-up behavioral experiment in which the background scenes were replaced with a visual target detection task provided indications that the competition between learning and remembering was not merely due to attention. This study not only provides novel insight into our capacity to learn and remember, but also clarifies the neural mechanisms underlying flexible behavior.

(There's also a worthwhile blog article on this one: "Training The Mind: Transfer Across Tasks Requiring Interference Resolution".)

"How distractible are you? The answer may lie in your working memory capacity"

  • Jennifer C. McVay, Michael J. Kane (2009). "Conducting the train of thought: Working memory capacity, goal neglect, and mind wandering in an executive-control task". Journal of Experimental Psychology: Learning, Memory, and Cognition, 35 (1), 196-204. DOI: 10.1037/a0014104:

On the basis of the executive-attention theory of working memory capacity (WMC; e.g., M. J. Kane, A. R. A. Conway, D. Z. Hambrick, & R. W. Engle, 2007), the authors tested the relations among WMC, mind wandering, and goal neglect in a sustained attention to response task (SART; a go/no-go task). In 3 SART versions, making conceptual versus perceptual processing demands, subjects periodically indicated their thought content when probed following rare no-go targets. SART processing demands did not affect mind-wandering rates, but mind-wandering rates varied with WMC and predicted goal-neglect errors in the task; furthermore, mind-wandering rates partially mediated the WMC-SART relation, indicating that WMC-related differences in goal neglect were due, in part, to variation in the control of conscious thought.

  • "Working memory capacity and its relation to general intelligence"; Andrew R.A. Conway et al; Trends in Cognitive Sciences, Vol. 7, December 2003

    Several recent latent variable analyses suggest that (working memory capacity) accounts for at least one-third and perhaps as much as one-half of the variance in (intelligence). What seems to be important about WM span tasks is that they require the active maintenance of information in the face of concurrent processing and interference and therefore recruit an executive attention-control mechanism to combat interference. Furthermore, this ability seems to be mediated by portions of the prefrontal cortex.
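To put that estimate in correlational terms: if WMC explains one-third to one-half of the variance in intelligence, the implied latent correlation is the square root of that share (the standard r² = variance-explained relationship). A quick back-of-the-envelope check; the numbers are just Conway et al's bounds, not new data:

```python
import math

# Conway et al's bounds on the variance in intelligence shared with WMC
for share in (1/3, 1/2):
    r = math.sqrt(share)  # correlation implied by that variance share
    print(f"{share:.2f} of variance -> r ~= {r:.2f}")
```

That works out to latent correlations of roughly 0.58 to 0.71 - a strong but clearly non-identical WM-intelligence relationship, consistent with the rest of this FAQ.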


Jaeggi 2005

"Capacity Limitations in Human Cognition: Behavioural and Biological Contributions"; Jaeggi thesis:

…Experiment 6 and 7 finally tackle the issue whether capacity limitations are trait-like, i.e., fixed, or whether it is possible to extend these limitations with training and whether generalized effects on other domains can be observed. In the last section, all the findings are integrated and discussed, and further issues remaining to be investigated are pointed out.

…In this experiment [6], the effects of a 10-day training of an adaptive version of an n-back dual task were studied. The adaptive version should depend very directly on the actual performance of the participant: not being too easy, but also not too difficult; always providing a sense of achievement in the participant in order to keep motivation high. Comparing pre and post measures, effects on the task itself were evaluated, but also effects on other WM measures, and on a measure of fluid intelligence.

…As stated before, this study [7] was conducted in order to replicate and extend the findings of Experiment 6: I was primarily interested to see whether an asymptotic curve regarding performance would be reached after nearly twice the training sessions used in Experiment 6, and further, whether generalized and differential effects on various cognitive tasks could be obtained with this training. Therefore, more tasks were included compared to Experiment 6, covering many aspects of WM (i.e., verbal tasks, visuospatial tasks), executive functions, as well as control tasks not used in Experiment 6 in order to investigate whether the WM training has a selective effect on tasks which are related to the concept of WM and executive functions with no effect on these control tasks. With respect to fluid intelligence, a more appropriate task than the APM, i.e., the 'Bochumer Matrizentest' (BOMAT; Hossiep, Turck, & Hasella, 1999) was used, which has the advantage that full parallel-versions are available and that the task was explicitly developed in order not to yield ceiling effects in student samples. The experiment was carried out together with Martin Buschkuehl and Daniela Blaser; the latter writing her Master thesis on the topic.

Jaeggi 2008

"Improving fluid intelligence with training on working memory", Jaeggi et al 2008 (supplement; all the data in Jaeggi 2005 was used in this as well); this article was widely covered (eg. "Brain-Training To Improve Memory Boosts Fluid Intelligence" or "Forget Brain Age: Researchers Develop Software That Makes You Smarter") and sparked most people's interest in the topic. The abstract:

Fluid intelligence (Gf) refers to the ability to reason and to solve new problems independently of previously acquired knowledge. Gf is critical for a wide variety of cognitive tasks, and it is considered one of the most important factors in learning. Moreover, Gf is closely related to professional and educational success, especially in complex and demanding environments. Although performance on tests of Gf can be improved through direct practice on the tests themselves, there is no evidence that training on any other regimen yields increased Gf in adults. Furthermore, there is a long history of research into cognitive training showing that, although performance on trained tasks can increase dramatically, transfer of this learning to other tasks remains poor. Here, we present evidence for transfer from training on a demanding working memory task to measures of Gf. This transfer results even though the trained task is entirely different from the intelligence test itself. Furthermore, we demonstrate that the extent of gain in intelligence critically depends on the amount of training: the more training, the more improvement in Gf. That is, the training effect is dosage-dependent. Thus, in contrast to many previous studies, we conclude that it is possible to improve Gf without practicing the testing tasks themselves, opening a wide range of applications.

Brain Workshop includes a special 'Jaeggi mode' which replicates almost exactly the settings described for the "Brain Twister" software used in the study.

No study is definitive, of course, but Jaeggi 2008 is still one of the major studies that must be cited in any DNB discussion. There are some issues - not as many subjects as one would like, and the researchers (quoted in the Wired article) obviously don't know if the WM or Gf gains are durable; more technical issues, like the administered Gf IQ tests being speeded and thus possibly reduced in validity, have been raised by Moody and others.

Qiu 2009

“Study on Improv­ing Fluid Intel­li­gence through Cog­ni­tive Train­ing Sys­tem Based on Gabor Stim­u­lus”, 2009 First Inter­na­tional Con­fer­ence on Infor­ma­tion Sci­ence and Engi­neer­ing, abstract:

Gen­eral fluid intel­li­gence (Gf) is a human abil­ity to rea­son and solve new prob­lems inde­pen­dently of pre­vi­ously acquired knowl­edge and expe­ri­ence. It is con­sid­ered one of the most impor­tant fac­tors in learn­ing. One of the issues which aca­d­e­mic peo­ple con­cen­trates on is whether Gf of adults can be improved. Accord­ing to the Dual N-back work­ing mem­ory the­ory and the char­ac­ter­is­tics of visual per­cep­tual learn­ing, this paper put for­ward cog­ni­tive train­ing pat­tern based on Gabor stim­uli. A total of 20 under­grad­u­ate stu­dents at 24 years old par­tic­i­pated in the exper­i­ment, with ten train­ing ses­sions for ten days. Through using Raven’s Stan­dard Pro­gres­sive Matri­ces as the eval­u­a­tion method to get and ana­lyze the exper­i­men­tal results, it was proved that train­ing pat­tern can improve fluid intel­li­gence of adults. This will pro­mote a wide range of appli­ca­tions in the field of adult intel­lec­tual edu­ca­tion.

Discussion and criticism of this Chinese6061 paper took place in 2 threads; the SPM was administered in 25 minutes, which, while not as fast as Jaeggi 2008, is still not the normal length. An additional anomaly is that, according to the final graph, the control group's IQ dropped massively in the post-test (driving much of the measured improvement). I tried to contact the 4 authors in May, June, July & September 2012; they eventually replied with data.

polar (June 2009)

A group member, polar, conducted a small experiment at the university where he was a student; his results seemed to show an improvement. As polar would be the first to admit, the attrition in subjects (few to begin with), the relatively short training time, and whatnot make the statistical power of his study weak.

Jaeggi 2010

“The rela­tion­ship between n-back per­for­mance and matrix rea­son­ing - impli­ca­tions for train­ing and trans­fer”, Jaeggi et al (coded as Jaeggi2 in meta-analy­sis); abstract:

…In the first study, we demon­strated that dual and sin­gle n-back task per­for­mances are approx­i­mately equally cor­re­lated with per­for­mance on two differ­ent tasks mea­sur­ing Gf, whereas the cor­re­la­tion with a task assess­ing work­ing mem­ory capac­ity was small­er. Based on these results, the sec­ond study was aimed on test­ing the hypoth­e­sis that train­ing on a sin­gle n-back task yields the same improve­ment in Gf as train­ing on a dual n-back task, but that there should be less trans­fer to work­ing mem­ory capac­i­ty. We trained two groups of stu­dents for four weeks with either a sin­gle or a dual n-back inter­ven­tion. We inves­ti­gated trans­fer effects on work­ing mem­ory capac­ity and Gf com­par­ing the two train­ing groups’ per­for­mance to con­trols who received no train­ing of any kind. Our results showed that both train­ing groups improved more on Gf than con­trols, thereby repli­cat­ing and extend­ing our prior results.

The 2 stud­ies mea­sured Gf using Raven’s APM and the BOMAT. In both stud­ies, the tests were admin­is­tered speeded to 10 or 15 min­utes as in Jaeggi 2008. The exper­i­men­tal groups saw aver­age gains of 1 or 2 addi­tional cor­rect answers on the BOMAT and APM. It’s worth not­ing that the Sin­gle N-Back was done with a visual modal­ity (and the DNB with the stan­dard visual & audio).

Fol­lowup work:

  • Schnei­ders et al 2012 trained audio WM and found no trans­fer to visual WM tasks; unfor­tu­nate­ly, they did not mea­sure any far trans­fer tasks like RAPM/BOMAT.
  • Beavon 2012 reports n = 47, exper­i­men­tals trained on sin­gle n-back & con­trols on “com­bined ver­bal tasks Define­time and Who wants to be a mil­lion­aire (Mil­lion­aire)”; no improve­ments on “STM span and atten­tion, short term audi­tory mem­ory span and divided atten­tion, and WM as oper­a­tionalised through the Wood­cock­-John­son III: Tests of cog­ni­tive abil­i­ties (WJ-III)”.

Studer-Luethi 2012

The sec­ond study’s data was reused for a Big Five per­son­al­ity fac­tor analy­sis in Stud­er-Luethi, Jaeg­gi, et al 2012, “Influ­ence of neu­roti­cism and con­sci­en­tious­ness on work­ing mem­ory train­ing out­come”.62

The lack of n-back score cor­re­la­tion with WM score seems in line with an ear­lier study; “Work­ing Mem­o­ry, Atten­tion Con­trol, and the N-Back Task: A Ques­tion of Con­struct Valid­ity”:

…Par­tic­i­pants also com­pleted a ver­bal WM span task (op­er­a­tion span task) and a marker test of gen­eral fluid intel­li­gence (Gf; Ravens Advanced Pro­gres­sive Matri­ces Test; J. C. Raven, J. E. Raven, & J. H. Court, 1998). N-back and WM span cor­re­lated weak­ly, sug­gest­ing they do not reflect pri­mar­ily a sin­gle con­struct; more­over, both accounted for inde­pen­dent vari­ance in Gf. N-back has face valid­ity as a WM task, but it does not demon­strate con­ver­gent valid­ity with at least 1 estab­lished WM mea­sure.

Stephenson 2010

“Does train­ing to increase work­ing mem­ory capac­ity improve fluid intel­li­gence?”:

The current study was successful in replicating Jaeggi et al.'s (2008) results. However, the current study also observed improvements in scores on the Raven's Advanced Progressive Matrices for participants who completed a variation of the dual n-back task or a short-term memory task training program. Participants' scores improved significantly for only two of the four tests of Gf, which raises the issue of whether the tests measure the construct Gf exclusively, as defined by Cattell (1963), or whether they may be sensitive to other factors. The concern is whether the training is actually improving Gf or if the training is improving attentional control and/or visuospatial skills, which improves performance on specific tests of Gf. The findings are discussed in terms of implications for conceptualizing and assessing Gf.

136 participants, split into experimental groups and a control group of 25-28 subjects each. Visual n-back improved more than audio n-back; the control group was passive (they did nothing, serving only as a control for test-retest effects).

Stephenson & Halpern 2013

“Improved matrix rea­son­ing is lim­ited to train­ing on tasks with a visu­ospa­tial com­po­nent”, Stephen­son & Halpern 2013:

Recent stud­ies (e.g., Jaeggi et al., 2008, 2010) have pro­vided evi­dence that scores on tests of fluid intel­li­gence can be improved by hav­ing par­tic­i­pants com­plete a four week train­ing pro­gram using the dual n-back task. The dual n-back task is a work­ing mem­ory task that presents audi­tory and visual stim­uli simul­ta­ne­ous­ly. The pri­mary goal of our study was to deter­mine whether a visu­ospa­tial com­po­nent is required in the train­ing pro­gram for par­tic­i­pants to expe­ri­ence gains in tests of fluid intel­li­gence. We had par­tic­i­pants com­plete vari­a­tions of the dual n-back task or a short­-term mem­ory task as train­ing. Par­tic­i­pants were assessed with four tests of fluid intel­li­gence and four cog­ni­tive tests. We were suc­cess­ful in cor­rob­o­rat­ing Jaeggi et al.’s results, how­ev­er, improve­ments in scores were observed on only two out of four tests of fluid intel­li­gence for par­tic­i­pants who com­pleted the dual n-back task, the visual n-back task, or a short­-term mem­ory task train­ing pro­gram. Our results raise the issue of whether the tests mea­sure the con­struct of fluid intel­li­gence exclu­sive­ly, or whether they may be sen­si­tive to other fac­tors. The find­ings are dis­cussed in terms of impli­ca­tions for con­cep­tu­al­iz­ing and assess­ing fluid intel­li­gence…The data in the cur­rent paper was part of Clay­ton Stephen­son’s doc­toral dis­ser­ta­tion.

Jaeggi 2011

Jaeg­gi, Buschkuehl, Jonides & Shah 2011 “Short- and long-term ben­e­fits of cog­ni­tive train­ing” (coded as Jaeggi3 in the meta-analy­sis); the abstract:

We trained ele­men­tary and mid­dle school chil­dren by means of a videogame-like work­ing mem­ory task. We found that only chil­dren who con­sid­er­ably improved on the train­ing task showed a per­for­mance increase on untrained fluid intel­li­gence tasks. This improve­ment was larger than the improve­ment of a con­trol group who trained on a knowl­edge-based task that did not engage work­ing mem­o­ry; fur­ther, this differ­en­tial pat­tern remained intact even after a 3-mo hia­tus from train­ing. We con­clude that cog­ni­tive train­ing can be effec­tive and long-last­ing, but that there are lim­it­ing fac­tors that must be con­sid­ered to eval­u­ate the effects of this train­ing, one of which is indi­vid­ual differ­ences in train­ing per­for­mance. We pro­pose that future research should not inves­ti­gate whether cog­ni­tive train­ing works, but rather should deter­mine what train­ing reg­i­mens and what train­ing con­di­tions result in the best trans­fer effects, inves­ti­gate the under­ly­ing neural and cog­ni­tive mech­a­nisms, and final­ly, inves­ti­gate for whom cog­ni­tive train­ing is most use­ful.

(This paper is not to be con­fused with the 2011 poster, “Work­ing Mem­ory Train­ing and Trans­fer to Gf: Evi­dence for Domain Speci­fici­ty?”, Jaeggi et al 2011.)

It is worth noting that the study used Single N-back (visual). Unlike Jaeggi 2008, "despite the experimental group's clear training effect, we observed no significant group × test session interaction on transfer to the measures of Gf" (so perhaps the training was long enough for subjects to hit their ceilings). The group which did n-back could be split, based on final IQ & n-back scores, into 2 groups; interestingly, "Inspection of n-back training performance revealed that there were no group differences in the first 3 wk of training; thus, it seems that group differences emerge more clearly over time [first 3 wk: t(30) < 1; P = ns; last week: t(16) = 3.00; P < 0.01] (Fig. 3)." 3 weeks is ~21 days, or >19 days (the longest period in Jaeggi 2008). It's also worth noting that Jaeggi 2011 seems to avoid Moody's most cogent criticism, the speeding of the IQ tests; from the paper's "Material and Methods" section:

We assessed matrix rea­son­ing with two differ­ent tasks, the Test of Non­ver­bal Intel­li­gence (TONI) (23) and Raven’s Stan­dard Pro­gres­sive Matri­ces (SPM) (24). Par­al­lel ver­sions were used for the pre, post-, and fol­low-up test ses­sions in coun­ter­bal­anced order. For the TONI, we used the stan­dard pro­ce­dure (45 items, five prac­tice items; untimed), whereas for the SPM, we used a short­ened ver­sion (split into odd and even items; 29 items per ver­sion; two prac­tice items; timed to 10 min after com­ple­tion of the prac­tice items. Note that vir­tu­ally all of the chil­dren com­pleted this task within the given time­frame).

The IQ results were, specifi­cal­ly, the con­trol group aver­aged 15.33/16.20 (before/after) cor­rect answers on the SPM and 20.87/22.50 on the TONI; the n-back group aver­aged 15.44/16.94 SPM and 20.41/22.03 TONI. 1.5 more right ques­tions rather than ~1 may not seem like much, but the split groups look quite differ­ent - the ‘small train­ing gain’ n-back­ing group actu­ally fell on its sec­ond SPM and improved by <0.2 ques­tions on the TONI, while the ‘large train­ing gain’ increased >3 ques­tions on the SPM and TONI. The differ­ence is not so dra­matic in the fol­lowup 3 months lat­er: the small group is now 17.43/23.43 (SPM/TONI), and the large group 15.67/24.67. Strangely in the fol­lowup, the con­trol group has a higher SPM than the large group (but not the small group), and a higher TONI than either group; the con­trol group has higher IQ scores on both TONI & SPM in the fol­lowup than the aggre­gate n-back group. (The split­ting of groups is also unortho­dox63.)
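Computing the implied gains from those reported means makes the comparison easier to see (numbers transcribed from the paper as quoted above; the dict layout is purely illustrative):

```python
# Pre/post mean correct answers from Jaeggi 2011, as quoted above.
scores = {
    "control": {"SPM": (15.33, 16.20), "TONI": (20.87, 22.50)},
    "n-back":  {"SPM": (15.44, 16.94), "TONI": (20.41, 22.03)},
}

# Gain = post-test mean minus pre-test mean, per group and test.
gains = {group: {test: round(post - pre, 2)
                 for test, (pre, post) in tests.items()}
         for group, tests in scores.items()}

print(gains)
# n-back SPM gain is ~1.5 questions vs ~0.9 for controls,
# while TONI gains are nearly identical (~1.6) across groups.
```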

UoM pro­duced a video with Jonides; Jaeggi 2011 has also been dis­cussed in main­stream media. From the Wall Street Jour­nal’s “Boot Camp for Boost­ing IQ”:

…when sev­eral dozen ele­men­tary- and mid­dle-school kids from the Detroit area used this exer­cise for 15 min­utes a day, many showed sig­nifi­cant gains on a widely used intel­li­gence test. Most impres­sive, per­haps, is that these gains per­sisted for three months, even though the chil­dren had stopped train­ing…these school­child­ren showed gains in fluid intel­li­gence roughly equal to five IQ points after one month of train­ing…There are two impor­tant caveats to this research. The first is that not every kid showed such dra­matic improve­ments after train­ing. Ini­tial evi­dence sug­gests that chil­dren who failed to increase their fluid intel­li­gence found the exer­cise too diffi­cult or bor­ing and thus did­n’t fully engage with the train­ing.

From Dis­cover’s blogs, “Can intel­li­gence be boosted by a sim­ple task? For some…”, come addi­tional details:

She [Jaeg­gi] recruited 62 chil­dren, aged between seven and ten. While half of them sim­ply learned some basic gen­eral knowl­edge ques­tions, the other half trained with a cheer­ful com­put­erised n-back task. They saw a stream of images where a tar­get object appeared in one of six loca­tions - say, a frog in a lily pond. They had to press a but­ton if the frog was in the same place as it was two images ago, forc­ing them to store a con­tin­u­ously updated stream of images in their minds. If the chil­dren got bet­ter at the task, this gap increased so they had to keep more images in their heads. If they strug­gled, the gap was short­ened.

Before and after the train­ing ses­sions, all the chil­dren did two rea­son­ing tests designed to mea­sure their fluid intel­li­gence. At first, the results looked dis­ap­point­ing. On aver­age, the n-back chil­dren did­n’t become any bet­ter at these tests than their peers who stud­ied the knowl­edge ques­tions. But accord­ing to Jaeg­gi, that’s because some of them did­n’t take to the train­ing. When she divided the chil­dren accord­ing to how much they improved at the n-back task, she saw that those who showed the most progress also improved in fluid intel­li­gence. The oth­ers did not. Best of all, these ben­e­fits lasted for 3 months after the train­ing. That’s a first for this type of study, although Jaeggi her­self says that the effect is “not robust.” Over this time peri­od, all the chil­dren showed improve­ments in their fluid intel­li­gence, “prob­a­bly [as] a result of the nat­ural course of devel­op­ment”.

…Philip Ack­er­man, who stud­ies learn­ing and brain train­ing at the Uni­ver­sity of Illi­nois, says, “I am con­cerned about the small sam­ple, espe­cially after split­ting the groups on the basis of their per­for­mance improve­ments.” He has a point - the group that showed big improve­ments in the n-back train­ing only included 18 chil­dren….Why did some of the chil­dren ben­e­fit from the train­ing while oth­ers did not? Per­haps they were sim­ply unin­ter­ested in the task, no mat­ter how colour­fully it was dressed up with storks and vam­pires. In Jaeg­gi’s ear­lier study with adults, every vol­un­teer signed up them­selves and were “intrin­si­cally moti­vated to par­tic­i­pate and train.” By con­trast, the kids in this lat­est study were signed up by their par­ents and teach­ers, and some might only have con­tin­ued because they were told to do so.

It's also possible that the changing difficulty of the game was frustrating for some of the children. Jaeggi says, "The children who did not benefit from the training found the working memory intervention too effortful and difficult, were easily frustrated, and became disengaged. This makes sense when you think of physical training - if you don't try and really run and just walk instead, you won't improve your cardiovascular fitness." Indeed, a recent study on IQ testing found that scores reflect motivation as well as intelligence.
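The adaptive staircase described in the excerpt above - raise n when the child performs well, lower it when they struggle - can be sketched as follows; the exact thresholds and scoring rule here are illustrative assumptions, not the study's parameters:

```python
def adjust_level(n, hits, false_alarms, trials):
    """Adaptive n-back staircase: raise n after a good block,
    lower it after a bad one, otherwise keep it unchanged.
    Thresholds (0.8 / 0.5) are illustrative; implementations vary."""
    accuracy = (hits - false_alarms) / trials
    if accuracy >= 0.8 and n < 9:
        return n + 1   # doing well: must hold one more item in mind
    if accuracy <= 0.5 and n > 1:
        return n - 1   # struggling: shorten the gap
    return n

print(adjust_level(2, hits=18, false_alarms=1, trials=20))  # 3: promoted
print(adjust_level(3, hits=8, false_alarms=2, trials=20))   # 2: demoted
print(adjust_level(2, hits=13, false_alarms=1, trials=20))  # 2: unchanged
```

The design choice the blog passage highlights is exactly this feedback loop: children who kept failing blocks were pushed back down and, per Jaeggi, tended to disengage rather than keep climbing.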

Schweizer et al 2011


This study inves­ti­gated whether brain-train­ing (work­ing mem­ory [WM] train­ing) improves cog­ni­tive func­tions beyond the train­ing task (trans­fer effect­s), espe­cially regard­ing the con­trol of emo­tional mate­r­ial since it con­sti­tutes much of the infor­ma­tion we process dai­ly. Forty-five par­tic­i­pants received WM train­ing using either emo­tional or neu­tral mate­ri­al, or an unde­mand­ing con­trol task. WM train­ing, regard­less of train­ing mate­ri­al, led to trans­fer gains on another WM task and in fluid intel­li­gence. How­ev­er, only brain-train­ing with emo­tional mate­r­ial yielded trans­fer­able gains to improved con­trol over affec­tive infor­ma­tion on an emo­tional Stroop task. The data sup­port the real­ity of trans­fer­able ben­e­fits of demand­ing WM train­ing and sug­gest that trans­fer­able gains across to affec­tive con­texts require train­ing with mate­r­ial con­gru­ent to those con­texts. These find­ings con­sti­tute pre­lim­i­nary evi­dence that inten­sive cog­ni­tively demand­ing brain-train­ing can improve not only our abstract prob­lem-solv­ing capac­i­ty, but also ame­lio­rate cog­ni­tive con­trol processes (e.g. deci­sion-mak­ing) in our daily emo­tive envi­ron­ments.


  1. There seems to be an IQ increase of around one ques­tion on the RPM (but there’s an odd­ity with the con­trol group which they think they cor­rect for64)
  2. The RPM does not seem to have been admin­is­tered speeded65
  3. The emotional aspect seems to be just replacing the 'neutral' existing stimuli like colors or letters or piano keys with more loaded ones66; this tweak does not seem to change the DNB/WM/IQ scores of that group67

Their later study “Train­ing the Emo­tional Brain: Improv­ing Affec­tive Con­trol through Emo­tional Work­ing Mem­ory Train­ing” did not use any mea­sure of fluid intel­li­gence.

Kundu et al 2011

"Relating individual differences in short-term memory-derived EEG to cognitive training effects" (coded as Kundu1 in the meta-analysis); 3 controls (Tetris) & 3 experimentals (Brain Workshop) trained for 1000 minutes. RAPM showed a slight increase, but the experimental sample is extremely small; this data may form part of Kundu et al 2012.

Zhong 2011

“The Effect Of Train­ing Work­ing Mem­ory And Atten­tion On Pupils’ Fluid Intel­li­gence” (abstract), Zhong 2011; orig­i­nal encrypted file (8M), screen­shots of all pages in the­sis (20M); dis­cus­sion

Appears to have found IQ gains, but no dose-re­sponse effect, using a no-con­tact con­trol group. Diffi­cult to under­stand: trans­la­tion assis­tance from Chi­nese speak­ers would be appre­ci­at­ed.

Jausovec 2012

“Work­ing mem­ory train­ing: Improv­ing intel­li­gence - Chang­ing brain activ­ity”, Jaušovec 2012:

The main objec­tives of the study were: to inves­ti­gate whether train­ing on work­ing mem­ory (WM) could improve fluid intel­li­gence, and to inves­ti­gate the effects WM train­ing had on neu­ro­elec­tric (elec­troen­cephalog­ra­phy - EEG) and hemo­dy­namic (near-in­frared spec­troscopy - NIRS) pat­terns of brain activ­i­ty. In a par­al­lel group exper­i­men­tal design, respon­dents of the work­ing mem­ory group after 30 h of train­ing sig­nifi­cantly increased per­for­mance on all tests of fluid intel­li­gence. By con­trast, respon­dents of the active con­trol group (par­tic­i­pat­ing in a 30-h com­mu­ni­ca­tion train­ing course) showed no improve­ments in per­for­mance. The influ­ence of WM train­ing on pat­terns of neu­ro­elec­tric brain activ­ity was most pro­nounced in the theta and alpha bands. Theta and low­er-1 alpha band syn­chro­niza­tion was accom­pa­nied by increased low­er-2 and upper alpha desyn­chro­niza­tion. The hemo­dy­namic pat­terns of brain activ­ity after the train­ing changed from higher right hemi­spheric acti­va­tion to a bal­anced activ­ity of both frontal areas. The neu­ro­elec­tric as well as hemo­dy­namic pat­terns of brain activ­ity sug­gest that the train­ing influ­enced WM main­te­nance func­tions as well as processes directed by the cen­tral exec­u­tive. The changes in upper alpha band desyn­chro­niza­tion could fur­ther indi­cate that processes related to long term mem­ory were also influ­enced.

14 experimentals & 15 controls; the testing was a little unusual:

Respon­dents solved four test-bat­ter­ies, for which the pro­ce­dure was the same dur­ing pre- and post-test­ing. The same test-bat­ter­ies were used on pre- and post-test­ing. The digit span sub­test (WAIS-R) was admin­is­tered sep­a­rate­ly, accord­ing to the direc­tions in the test man­ual (Wech­sler, 1981). The other three tests (RAPM, ver­bal analo­gies and spa­tial rota­tion) were admin­is­tered while the respon­dents’ EEG and NIRS mea­sures were record­ed.

The RAPM was based on a mod­i­fied ver­sion of Raven’s pro­gres­sive matri­ces (Raven, 1990), a widely used and well estab­lished test of fluid intel­li­gence (Stern­berg, Fer­rari, Clinken­beard, & Grig­orenko, 1996). The cor­re­la­tion between this mod­i­fied ver­sion of RAPM and WAIS-R was r = .56, (p < .05, n = 97). Sim­i­lar cor­re­la­tions of the order of 0.40-0.75, were also reported for the stan­dard ver­sion of RAPM (Court & Raven, 1995). There­fore it can be con­cluded that the mod­i­fied appli­ca­tion of the RAPM did not sig­nifi­cantly alter its met­ric char­ac­ter­is­tics. Used were 50 test items - 25 easy (Ad­vanced Pro­gres­sive Matri­ces Set I - 12 items and the B Set of the Col­ored Pro­gres­sive Matri­ces), and 25 diffi­cult items (Ad­vanced Pro­gres­sive Matri­ces Set II, items 12-36). Par­tic­i­pants saw a fig­ural matrix with the lower right entry miss­ing. They had to deter­mine which of the four options fit­ted into the miss­ing space. The tasks were pre­sented on a com­puter screen (po­si­tioned about 80-100 cm in front of the respon­den­t), at fixed 10 or 14 s inter­stim­u­lus inter­vals. They were exposed for 6 s (easy) or 10 s (diffi­cult) fol­low­ing a 2-s inter­val, when a cross was pre­sent­ed. Dur­ing this time the par­tic­i­pants were instructed to press a but­ton on a response pad (1-4) which indi­cated their answer.
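The fixed intervals quoted above put a hard ceiling on administration time; a quick check, with item counts and intervals taken from the quoted methods:

```python
# Timing per the quoted methods: easy items at fixed 10 s interstimulus
# intervals, difficult items at 14 s (10 s exposure + 2 s fixation cross).
easy_items, difficult_items = 25, 25
easy_isi, difficult_isi = 10, 14  # seconds

difficult_minutes = difficult_items * difficult_isi / 60
total_minutes = (easy_items * easy_isi + difficult_items * difficult_isi) / 60

print(round(difficult_minutes, 1))  # 5.8 -- the 25 hard items alone
print(round(total_minutes, 1))      # 10.0 -- all 50 items
```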

At 25 hard questions and 14 s per question, the difficult half of the RAPM was administered in ~5.8 minutes. They comment:

To further investigate possible influences of task difficulty on the observed performance gains on the RAPM a GLM for repeated measures test/retest × easy/difficult-items × group (WM, AC) was conducted. The analysis showed only a significant interaction effect for the test/retest condition and type of training used in the two groups (F(1, 27) = 4.47; p < .05; partial eta2 = .15). A GLM conducted for the WM group showed only a significant test/retest effect (F(1, 13) = 30.11; p < .05; partial eta2 = .70), but no interaction between the test/retest conditions and the difficulty level (F(1, 13) = 1.79; p = .17 not-significant; partial eta2 = .12). As can be seen in Fig. 4 after WM training an about equal increase in respondents' performance for the easy and difficult test items was observed. On the other hand, no increases in performance, neither for the easy nor for the difficult test items, in respondents of the active control group were observed (F(1, 14) = .47; p = .50 not-significant; partial eta2 = .03).

(Even on the “easy” ques­tions, no group per­formed bet­ter than 76% accu­ra­cy.)
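As a check on the quoted statistics: for a single-degree-of-freedom effect, partial eta² follows from the F value and error df by the standard identity η²p = F·df1 / (F·df1 + df2). Recomputing the reported values (small discrepancies reflect rounding of the published F statistics):

```python
def partial_eta_squared(f, df_effect, df_error):
    """Recover partial eta-squared from a reported F test
    via the standard identity eta2_p = F*df1 / (F*df1 + df2)."""
    return f * df_effect / (f * df_effect + df_error)

# F values and dfs as reported in the quoted passage:
print(round(partial_eta_squared(4.47, 1, 27), 2))   # 0.14 (reported .15)
print(round(partial_eta_squared(30.11, 1, 13), 2))  # 0.7  (reported .70)
print(round(partial_eta_squared(1.79, 1, 13), 2))   # 0.12 (reported .12)
```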

Clouter 2013

“The Effects of Dual n-back Train­ing on the Com­po­nents of Work­ing Mem­ory and Fluid Intel­li­gence: An Indi­vid­ual Differ­ences Approach”, Clouter 2013:

A num­ber of recent stud­ies have pro­vided evi­dence that train­ing work­ing mem­ory can lead to improve­ments in fluid intel­li­gence, work­ing mem­ory span, and per­for­mance on other untrained tasks. How­ev­er, in addi­tion to a num­ber of mixed results, many of these stud­ies suffer from design lim­i­ta­tions. The aim of the present study was to exper­i­men­tally inves­ti­gate the effects of a dual n-back work­ing mem­ory train­ing task on a vari­ety of mea­sures of fluid intel­li­gence, rea­son­ing, work­ing mem­ory span, and atten­tional con­trol. The present study com­pared a train­ing group with an active con­trol group (a placebo group), using appro­pri­ate meth­ods that over­came the lim­i­ta­tions of pre­vi­ous stud­ies. The dual n-back train­ing group improved more than the active con­trol group on some, but not all out­come mea­sures. Differ­en­tial improve­ment for the train­ing group was observed on fluid intel­li­gence, work­ing mem­ory capac­i­ty, and response times on con­flict tri­als in the Stroop task. In addi­tion, indi­vid­ual differ­ences in pre-train­ing fluid intel­li­gence scores and ini­tial per­for­mance on the train­ing task explain some of the vari­ance in out­come mea­sure improve­ments. We dis­cuss these results in the con­text of pre­vi­ous stud­ies, and sug­gest that addi­tional work is needed in order to fur­ther under­stand the vari­ables respon­si­ble for trans­fer from train­ing.

Jaeggi et al 2013

“The role of indi­vid­ual differ­ences in cog­ni­tive train­ing and trans­fer”:

Work­ing mem­ory (WM) train­ing has recently become a topic of intense inter­est and con­tro­ver­sy. Although sev­eral recent stud­ies have reported near- and far-trans­fer effects as a result of train­ing WM-re­lated skills, oth­ers have failed to show far trans­fer, sug­gest­ing that gen­er­al­iza­tion effects are elu­sive. Also, many of the ear­lier inter­ven­tion attempts have been crit­i­cized on method­olog­i­cal grounds. The present study resolves some of the method­olog­i­cal lim­i­ta­tions of pre­vi­ous stud­ies and also con­sid­ers indi­vid­ual differ­ences as poten­tial expla­na­tions for the differ­ing trans­fer effects across stud­ies. We recruited intrin­si­cally moti­vated par­tic­i­pants and assessed their need for cog­ni­tion (NFC; Cacioppo & Petty Jour­nal of Per­son­al­ity and Social Psy­chol­ogy 42:116-131, 1982) and their implicit the­o­ries of intel­li­gence (Dweck, 1999) prior to train­ing. We assessed the effi­cacy of two inter­ven­tions by com­par­ing par­tic­i­pants’ improve­ments on a bat­tery of fluid intel­li­gence tests against those of an active con­trol group. We observed that trans­fer to a com­pos­ite mea­sure of fluid rea­son­ing resulted from both WM inter­ven­tions. In addi­tion, we uncov­ered fac­tors that con­tributed to train­ing suc­cess, includ­ing moti­va­tion, need for cog­ni­tion, pre­ex­ist­ing abil­i­ty, and implicit the­o­ries about intel­li­gence.

This is quite a com­plex study, with a lot of analy­sis I don’t think I entirely under­stand. The quick sum­mary is table 2 on pg10: the DNB group fell on APM, rose on BOMAT (nei­ther sta­tis­ti­cal­ly-sig­nifi­can­t); the SNB group increased on APM & BOMAT (but only BOMAT was sta­tis­ti­cal­ly-sig­nifi­can­t).

Michael J. Kane has writ­ten some crit­i­cal com­ments on the results.

Savage 2013

“Near and Far Trans­fer of Work­ing Mem­ory Train­ing Related Gains in Healthy Adults”, Sav­age 2013:

Enhanc­ing intel­li­gence through work­ing mem­ory train­ing is an attrac­tive con­cept, par­tic­u­larly for mid­dle-aged adults. How­ev­er, inves­ti­ga­tions of work­ing mem­ory train­ing ben­e­fits are lim­ited to younger or older adults, and results are incon­sis­tent. This study inves­ti­gates work­ing mem­ory train­ing in mid­dle age-range adults. Fifty healthy adults, aged 30-60, com­pleted mea­sures of work­ing mem­o­ry, pro­cess­ing speed, and fluid intel­li­gence before and after a 5-week web-based work­ing mem­ory (ex­per­i­men­tal) or pro­cess­ing speed (ac­tive con­trol) train­ing pro­gram. Base­line intel­li­gence and per­son­al­ity were mea­sured as poten­tial indi­vid­ual char­ac­ter­is­tics asso­ci­ated with change. Improved per­for­mance on work­ing mem­ory and pro­cess­ing speed tasks were expe­ri­enced by both groups; how­ev­er, only the work­ing mem­ory train­ing group improved in fluid intel­li­gence. Agree­able­ness emerged as a per­son­al­ity fac­tor asso­ci­ated with work­ing mem­ory train­ing related change. Albeit lim­ited by pow­er, find­ings sug­gest that dual n-back work­ing mem­ory train­ing not only enhances work­ing mem­ory but also fluid intel­li­gence in mid­dle-aged healthy adults.

The personality correlations seem to differ from Studer-Luethi's.

Stepankova et al 2013

“The Mal­leabil­ity of Work­ing Mem­ory and Visu­ospa­tial Skills: A Ran­dom­ized Con­trolled Study in Older Adults”, Stepankova et al 2013:

There is accu­mu­lat­ing evi­dence that train­ing on work­ing mem­ory (WM) gen­er­al­izes to other non­trained domains, and there are reports of trans­fer effects extend­ing as far as to mea­sures of fluid intel­li­gence. Although there have been sev­eral demon­stra­tions of such trans­fer effects in young adults and chil­dren, they have been diffi­cult to demon­strate in older adults. In this study, we inves­ti­gated the gen­er­al­iz­ing effects of an adap­tive WM inter­ven­tion on non­trained mea­sures of WM and visu­ospa­tial skills. We ran­domly assigned healthy older adults to train on a ver­bal n-back task over the course of a month for either 10 or 20 ses­sions. Their per­for­mance change was com­pared with that of a con­trol group. Our results revealed reli­able group effects in non­trained stan­dard clin­i­cal mea­sures of WM and visu­ospa­tial skills in that both train­ing groups out­per­formed the con­trol group. We also observed a dose-re­sponse effect, that is, a pos­i­tive rela­tion­ship between train­ing fre­quency and the gain in visu­ospa­tial skills; this find­ing was fur­ther con­firmed by a pos­i­tive cor­re­la­tion between train­ing improve­ment and trans­fer. The improve­ments in visu­ospa­tial skills emerged even though the inter­ven­tion was restricted to the ver­bal domain. Our work has impor­tant impli­ca­tions in that our data pro­vide fur­ther evi­dence for plas­tic­ity of cog­ni­tive func­tions in old age.

Horvat 2014

“The effect of work­ing mem­ory train­ing on cog­ni­tive abil­i­ties”, Hor­vat 2014; Slove­ni­an, Eng­lish abstract:

In the last few years, growing evidence in the psychological literature has indicated that working memory training could serve as a useful tool to improve performance on non-trained tasks that measure higher cognitive abilities; however, the results of different studies remain inconsistent. The aim of the present master's thesis was to discover whether working memory training could improve short-term memory capacity and increase test scores on a test of fluid intelligence in normally developing children.

The final sample consisted of 29 participants, between 13 and 15 years old; 14 of them were in the experimental group, 15 were controls. The experimental group completed a series of ten working memory training sessions, based on an adaptive dual n-back task. The control group was passive and did not do any training in the meantime.

The results of our study showed that all participants in the experimental group improved their performance on the trained task. There was no statistically significant effect for the experimental group on measures of digit span and visuospatial memory span before and after training, when compared with the performance of the control group. However, the experimental group improved more on the measure of fluid intelligence compared with the control group.

The findings of our study suggest the importance of investigating factors associated with the effectiveness of working memory training in future research.

Heinzel et al 2016

“Neural cor­re­lates of train­ing and trans­fer effects in work­ing mem­ory in older adults”, Heinzel et al 2016:

As indi­cated by pre­vi­ous research, aging is asso­ci­ated with a decline in work­ing mem­ory (WM) func­tion­ing, related to alter­ations in fron­to-pari­etal neural acti­va­tions. At the same time, pre­vi­ous stud­ies showed that WM train­ing in older adults may improve the per­for­mance in the trained task (train­ing effec­t), and more impor­tant­ly, also in untrained WM tasks (trans­fer effect­s). How­ev­er, neural cor­re­lates of these trans­fer effects that would improve under­stand­ing of its under­ly­ing mech­a­nisms, have not been shown in older par­tic­i­pants as yet. In this study, we inves­ti­gated blood­-oxy­gen-level-de­pen­dent (BOLD) sig­nal changes dur­ing n-back per­for­mance and an untrained delayed recog­ni­tion (Stern­berg) task fol­low­ing 12 ses­sions (45 min­utes each) of adap­tive n-back train­ing in older adults. The Stern­berg task used in this study allowed to test for neural train­ing effects inde­pen­dent of spe­cific task affor­dances of the trained task and to sep­a­rate main­te­nance from updat­ing process­es. Thir­ty-two healthy older par­tic­i­pants (60-75 years) were assigned either to an n-back train­ing or a no-con­tact con­trol group. Before (t1) and after (t2) training/waiting peri­od, both the n-back task and the Stern­berg task were con­ducted while BOLD sig­nal was mea­sured using func­tional Mag­netic Res­o­nance Imag­ing (fMRI) in all par­tic­i­pants. In addi­tion, neu­ropsy­cho­log­i­cal tests were per­formed out­side the scan­ner. WM per­for­mance improved with train­ing and behav­ioral trans­fer to tests mea­sur­ing exec­u­tive func­tions, pro­cess­ing speed, and fluid intel­li­gence was found. In the train­ing group, BOLD sig­nal in right lat­eral mid­dle frontal gyrus/ cau­dal supe­rior frontal sul­cus (Brod­mann area, BA 6/8) decreased in both the trained n-back and the updat­ing con­di­tion of the untrained Stern­berg task at t2, com­pared to the con­trol group. 
FMRI find­ings indi­cate a train­ing-re­lated increase in pro­cess­ing effi­ciency of WM net­works, poten­tially related to the process of WM updat­ing. Per­for­mance gains in untrained tasks sug­gest that trans­fer to other cog­ni­tive tasks remains pos­si­ble in aging.


Moody 2009 (re: Jaeggi 2008)

Jaeggi 2008, you may remember, showed that training on N-back improved working memory, but it also boosted scores on tests of Gf. The latter would be a major result - indeed, unique - and is one of the main research results encouraging people to do N-back in a non-research setting. People want to believe that N-back is efficacious and particularly that it will do more than boost working memory. So we need to be wary of wishful thinking.

For­tu­nate­ly, we can dis­cuss at length the work of one David E. Moody who has pub­lished a crit­i­cism of how the odd method­ol­ogy of Jaeggi 2008 under­mines this result. He’s worth quot­ing at length, since besides being impor­tant to under­stand­ing Jaeg­gi’s study, it’s an inter­est­ing exam­ple of how sub­tle issues can be impor­tant in psy­chol­o­gy:

"The sub­jects were divided into four groups, differ­ing in the num­ber of days of train­ing they received on the task of work­ing mem­o­ry. The group that received the least train­ing (8 days) was tested on Raven’s Advanced Pro­gres­sive Matri­ces (Raven, 1990), a widely used and well-estab­lished test of fluid intel­li­gence. This group, how­ev­er, demon­strated neg­li­gi­ble improve­ment between pre- and post-test per­for­mance.

The other three groups were not tested using Raven’s Matri­ces, but rather on an alter­na­tive test of much more recent ori­gin. The Bochumer Matri­ces Test (BOMAT) (Hossiep, Tur­ck, & Hasel­la, 1999) is sim­i­lar to Raven’s in that it con­sists of visual analo­gies. In both tests, a series of geo­met­ric and other fig­ures is pre­sented in a matrix for­mat and the sub­ject is required to infer a pat­tern in order to pre­dict the next fig­ure in the series. The authors pro­vide no rea­son for switch­ing from Raven’s to the BOMAT.

The BOMAT differs from Raven’s in some impor­tant respects, but is sim­i­lar in one cru­cial attrib­ute: both tests are pro­gres­sive in nature, which means that test items are sequen­tially arranged in order of increas­ing diffi­cul­ty. A high score on the test, there­fore, is pred­i­cated on sub­jects’ abil­ity to solve the more diffi­cult items.

However, this progressive feature of the test was effectively eliminated by the manner in which Jaeggi et al. administered it. The BOMAT is a 29-item test which subjects are supposed to be allowed 45 min to complete. Remarkably, however, Jaeggi et al. reduced the allotted time from 45 min to 10. The effect of this restriction was to make it impossible for subjects to proceed to the more difficult items on the test. The large majority of the subjects - regardless of the number of days of training they received - answered less than 14 test items correctly.
By virtue of the manner in which they administered the BOMAT, Jaeggi et al. transformed it from a test of fluid intelligence into a speed test of ability to solve the easier visual analogies. The time restriction not only made it impossible for subjects to proceed to the more difficult items, it also limited the opportunity to learn about the test - and so improve performance - in the process of taking it. This factor cannot be neglected because test performance does improve with practice, as demonstrated by the control groups in the Jaeggi study, whose improvement from pre- to post-test was about half that of the experimental groups. The same learning process that occurs from one administration of the test to the next may also operate within a given administration of the test - provided subjects are allowed sufficient time to complete it.

Since the whole weight of their con­clu­sion rests upon the valid­ity of their mea­sure of fluid intel­li­gence, one might assume the authors would present a care­ful defense of the man­ner in which they admin­is­tered the BOMAT. Instead they do not even men­tion that sub­jects are nor­mally allowed 45 min to com­plete the test. Nor do they men­tion that the test has 29 items, of which most of their sub­jects com­pleted less than half.

The authors’ entire ratio­nale for reduc­ing the allot­ted time to 10 min is con­fined to a foot­note. That foot­note reads as fol­lows:

Although this pro­ce­dure differs from the stan­dard­ized pro­ce­dure, there is evi­dence that this timed pro­ce­dure has lit­tle influ­ence on rel­a­tive stand­ing in these tests, in that the cor­re­la­tion of speeded and non-speeded ver­sions is very high (r = 0.95; ref. 37).

The ref­er­ence given in the foot­note is to a 1988 study (Frear­son & Eysenck, 1986) that is not in fact designed to sup­port the con­clu­sion stated by Jaeggi et al. The 1988 study merely con­tains a foot­note of its own, which refers in turn to unpub­lished research con­ducted forty years ear­li­er. That research involved Raven’s matri­ces, not the BOMAT, and entailed a reduc­tion in time of at most 50%, not more than 75%, as in the Jaeggi study.

So instead of offer­ing a rea­soned defense of their pro­ce­dure, Jaeggi et al. pro­vide merely a foot­note which refers in turn to a foot­note in another study. The sec­ond foot­note describes unpub­lished results, evi­dently recalled by mem­ory over a span of 40 years, involv­ing a differ­ent test and a much less severe reduc­tion in time.

In this con­text it bears repeat­ing that the group that was tested on Raven’s matri­ces (with pre­sum­ably the same time restric­tion) showed vir­tu­ally no improve­ment in test per­for­mance, in spite of eight days’ train­ing on work­ing mem­o­ry. Per­for­mance gains only appeared for the groups admin­is­tered the BOMAT. But the BOMAT differs in one impor­tant respect from Raven’s. Raven’s matri­ces are pre­sented in a 3 × 3 for­mat, whereas the BOMAT con­sists of a 5 × 3 matrix con­fig­u­ra­tion.

With 15 visual figures to keep track of in each test item instead of 9, the BOMAT puts added emphasis on subjects' ability to hold details of the figures in working memory, especially under the condition of a severe time constraint. Therefore it is not surprising that extensive training on a task of working memory would facilitate performance on the early and easiest BOMAT test items - those that present less of a challenge to fluid intelligence.

This inter­pre­ta­tion acquires added plau­si­bil­ity from the nature of one of the two work­ing-mem­ory tasks admin­is­tered to the exper­i­men­tal groups. The authors main­tain that those tasks were “entirely differ­ent” from the test of fluid intel­li­gence. One of the tasks mer­its that descrip­tion: it was a sequence of let­ters pre­sented audi­to­rily through head­phones.

But the other work­ing-mem­ory task involved recall of the loca­tion of a small square in one of sev­eral posi­tions in a visual matrix pat­tern. It rep­re­sents in sim­pli­fied form pre­cisely the kind of detail required to solve visual analo­gies. Rather than being “entirely differ­ent” from the test items on the BOMAT, this task seems well-de­signed to facil­i­tate per­for­mance on that test."

Stern­berg reviewed Jaeggi 2008

Email from Jaeggi to Pontus about the time limit; visual problems bias RPM against women, who have a slightly lower average visuospatial performance.

Nut­ley 2011 dis­cusses why one test may be insuffi­cient when an exper­i­men­tal inter­ven­tion is done:

Since the definition of Gf itself stems from factor analysis, using the shared variance of several tests to define the Gf factor, a similar method should be used to measure gains in Gf. Another issue raised by Sternberg (2008) is that the use of only one single training task makes it difficult to infer if the training effect was due to some specific aspect of the task rather than the general effect of training a construct.

Ship­stead, Redick, & Engle 2012 elab­o­rate on how, while matrix-style IQ tests are con­sid­ered gold stan­dards, they are not per­fect mea­sures of IQ such that an increase in per­for­mance must reflect an increase in under­ly­ing intel­li­gence:

…far trans­fer tasks are not per­fect mea­sures of abil­i­ty. In many train­ing stud­ies, Raven’s Pro­gres­sive Matri­ces (Ravens; Raven, 1990, 1995, 1998) serves as the sole indi­ca­tor of Gf. This “matrix rea­son­ing” task presents test tak­ers with a series of abstract pic­tures that are arranged in a grid. One piece of the grid is miss­ing, and the test taker must choose an option (from among sev­er­al) that com­pletes the sequence. Jensen (1998) esti­mates that 64% of the vari­ance in Ravens per­for­mance can be explained by Gf. Sim­i­lar­ly, Fig­ure 3 indi­cates that in the study of Kane et al. (2004), 58% of the Ravens vari­ance was explained by Gf. It is clear that Ravens is strongly related to Gf. How­ev­er, 30%-40% of the vari­ance in Ravens is attrib­ut­able to other influ­ences. Thus, when Ravens (or any other task) serves as the sole indi­ca­tor of far trans­fer, per­for­mance improve­ments can be explained with­out assum­ing that a gen­eral abil­ity has changed. Instead, it can be par­si­mo­niously con­cluded that train­ing has influ­enced some­thing that is spe­cific to per­form­ing Ravens, but not nec­es­sar­ily applic­a­ble to other rea­son­ing con­texts (Car­roll, 1993; Jensen, 1998; Moody, 2009; Schmiedek et al., 2010; te Nijen­huis, van Via­nen, & van der Flier, 2007).

…Pre­emp­tion of crit­i­cisms such as Moody’s (2009) is, how­ev­er, read­ily accom­plished through demon­stra­tion of trans­fer to sev­eral mea­sures of an abil­i­ty. Unfor­tu­nate­ly, the prac­tice of equat­ing posttest improve­ment on one task with change to cog­ni­tive abil­i­ties is preva­lent within the WM train­ing lit­er­a­ture (cf. Jaeggi et al., 2008; Kling­berg, 2010). This is par­tially dri­ven by the time and mon­e­tary costs asso­ci­ated with con­duct­ing mul­ti­ses­sion, mul­ti­week stud­ies. Regard­less, train­ing stud­ies can greatly improve the per­sua­sive­ness of their results by mea­sur­ing trans­fer via sev­eral tasks that differ in periph­eral aspects but con­verge on an abil­ity of inter­est (e.g., a ver­bal, Gf, and spa­tial task from Fig­ure 3). If a train­ing effect is robust, it should be appar­ent in all tasks.
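Shipstead et al's point about converging measures can be illustrated with a toy simulation (all numbers here are hypothetical, chosen only for illustration): if training boosts only a skill specific to one test, the single-test "Gf gain" looks large, while a composite over several tests converging on the same ability is diluted toward zero.

```python
import random

random.seed(0)

def simulate(n=2000, task_specific_gain=0.5):
    """Hypothetical toy model: each test score = latent g + test-specific
    noise; training boosts only a skill specific to test #1, leaving
    latent g untouched."""
    single_gains, composite_gains = [], []
    for _ in range(n):
        g = random.gauss(0, 1)                             # latent ability, unchanged by training
        pre  = [g + random.gauss(0, 1) for _ in range(3)]  # three Gf tests at pre-test
        post = [g + random.gauss(0, 1) for _ in range(3)]  # same three tests at post-test
        post[0] += task_specific_gain                      # gain specific to test #1 only
        single_gains.append(post[0] - pre[0])
        composite_gains.append(sum(post) / 3 - sum(pre) / 3)
    mean = lambda xs: sum(xs) / len(xs)
    return mean(single_gains), mean(composite_gains)

single, composite = simulate()
# the apparent "Gf gain" on the single test is ~3x the composite gain,
# even though latent g never changed
```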

Explicit attempts at mea­sur­ing speed­ing:

Seidler 2010

“Cognitive Training As An Intervention To Improve Driving Ability In The Older Adult”, by a group which includes Susanne Jaeggi, studied the effect of DNB on the driving ability of younger/older adults. As part of the before/after test battery, a Raven's was administered:

Type 2 tests included Raven’s matri­ces (Raven et al., 1990), which is a stan­dard­ized test of fluid intel­li­gence, and the BOMAT and ver­bal analo­gies tests of intel­li­gence (Hossiep et al., 1995). We have pre­vi­ously shown that work­ing mem­ory train­ing trans­fers to per­for­mance on this task (Jaeggi et al., 2008), and we included it here for the sake of repli­ca­tion.

They found a null result:

There were no sig­nifi­cant group by test ses­sion inter­ac­tions for the intel­li­gence mea­sures or com­plex motor tasks for the young adults, although one of the intel­li­gence mea­sures exhib­ited a trend for trans­fer effects that scaled with train­ing task gains.

…Un­like in our pre­vi­ous work (Jaeggi et al., 2008) we did not observe trans­fer to mea­sures of intel­li­gence. This may have been a by-prod­uct of the rather exten­sive pre and post test bat­tery of assess­ments that we per­formed, par­tic­u­larly given that one of the intel­li­gence mea­sures was always per­formed last in the sequence of tests. Given this, par­tic­i­pants may have been too fatigued and / or unmo­ti­vated to per­form these tests well.

Jonasson 2011

“Investigating training and transfer in complex tasks with dual n-back”, a bachelor's thesis:

No clear con­sen­sus exists in the sci­en­tific com­mu­nity of what con­sti­tutes effi­cient dual-task­ing abil­i­ties. More­over, the train­ing of exec­u­tive com­po­nents has been given increased atten­tion in the lit­er­a­ture in recent years. Inves­ti­gat­ing trans­fer­abil­ity of cog­ni­tive train­ing in a com­plex task set­ting, thirty sub­jects prac­ticed for five days on a Name-Tag task (con­trols) or a Dual N-Back task (ex­per­i­men­tal), sub­se­quently being tested on two trans­fer tasks; the Auto­mated Oper­a­tion Span and a dual task (Trail Mak­ing task + Math­e­mat­i­cal Addi­tion task). Dual N-Back train­ing pre­vi­ously trans­ferred to unre­lated intel­li­gence tests and in this study is assumed to rely pri­mar­ily on exec­u­tive atten­tion. Exec­u­tive atten­tion, func­tion­ing to resolve inter­fer­ence and main­tain­ing task-rel­e­vant infor­ma­tion in work­ing mem­o­ry, has pre­vi­ously been linked to fluid intel­li­gence and to dual-task­ing. How­ev­er, no trans­fer effects were revealed. The length of train­ing may have been too short to reveal any such effects. How­ev­er, the three com­plex tasks cor­re­lated sig­nifi­cant­ly, sug­gest­ing com­mon resources, and there­fore hav­ing poten­tials as trans­fer tasks. Notably, sub­jects with the high­est task-spe­cific improve­ments per­formed worse on the trans­fer tasks than sub­jects improv­ing less, sug­gest­ing that task-spe­cific gains do not directly cor­re­late with any trans­fer effect. At pre­sent, if trans­fer exists in these set­tings, data implies that five days of train­ing is insuffi­cient for a trans­fer to occur. Impor­tant ques­tions for future research relates to the nec­es­sary con­di­tions for trans­fer to occur, such as the amount of train­ing, neural cor­re­lates, atten­tion, and moti­va­tion.

Caveats for this study:

  1. It did not attempt to mea­sure any form of Gf

  2. It used 30 total sub­jects, or 15 in each group

  3. Training was over 5-6 days, 16-20 minutes each day (although the DNB subjects did increase their scores), which may not be enough - though Jonasson comments (pg 44-45):

    Nev­er­the­less, train­ing for five days or less has also led to sig­nifi­cant improve­ments in per­for­mance on trans­fer tasks (Damos & Wick­ens, 1980; Kramer et al., 1995; Rueda et al., 2005). How­ev­er, the study by Kramer et al. (1995) may have trans­ferred a strat­egy rather than train­ing a spe­cific com­po­nent, and the study by Rueda et al. (2005) found trans­fer in chil­dren between ages four and six, the chil­dren pos­si­bly being more sus­cep­ti­ble to train­ing than adults.

  4. Jonas­son sug­gests that sub­jects were unmo­ti­vat­ed, per­haps by the train­ing being done at home on Lumosity.com; only one did the full 6 days of train­ing, and incen­tives often increase per­for­mance on IQ and other tests.

Chooi 2011

“Improv­ing Intel­li­gence by Increas­ing Work­ing Mem­ory Capac­ity”, PhD the­sis:

…The cur­rent study aimed to repli­cate and extend the orig­i­nal study con­ducted by Jaeggi et al. (2008) in a well-con­trolled exper­i­ment that could explain the cause or causes of such trans­fer if indeed the case. There were a total of 93 par­tic­i­pants who com­pleted the study, and they were ran­domly assigned to one of three groups - a pas­sive con­trol group, active con­trol group and exper­i­men­tal group. Half of the par­tic­i­pants were ran­domly assigned to the 8-day con­di­tion and the other half to the 20-day con­di­tion. All par­tic­i­pants com­pleted a bat­tery of tests at pre- and post-tests that con­sisted of short timed tests, a com­plex work­ing mem­ory span and a [un­timed] matrix rea­son­ing task. Par­tic­i­pants in the active con­trol group prac­ticed for either 8 days or 20 days on the same task as the one used in the exper­i­men­tal group, the dual n-back, but at the eas­i­est level to con­trol for Hawthorne effect. Results from the cur­rent study did not sug­gest any sig­nifi­cant improve­ment in the men­tal abil­i­ties test­ed, espe­cially fluid intel­li­gence and work­ing mem­ory capac­i­ty, after train­ing for 8 days or 20 days. This leads to the con­clu­sion that increas­ing one’s work­ing mem­ory capac­ity by train­ing and prac­tice did not trans­fer to improve­ment on fluid intel­li­gence as asserted by Jaeggi and her col­leagues (2008, 2010).

Jonathan Toomim points out a concern about statistical power: the multiple control groups mean that the number of subjects doing actual n-backing is small, and the null result is only trustworthy if one expects a dramatic effect from n-backing - a huge effect size taken from Jaeggi 2010 (but not Jaeggi 2008's smaller effect size). He comments: “the effect size for DNB training is probably less than 0.98. (Of course, that's what I believed anyway before I saw this.) The effect size could quite reasonably still be as high as 0.75.” Chooi 2011 seems to have been summarized as Chooi & Thompson 2012, which discusses the power issue further:

A major limitation of the study was the small sample size and possibly sample characteristic, which may have lowered the power of analyses conducted. When Jaeggi et al. (2010) repeated the study with 25 students who trained on the Raven's Advanced Progressive Matrices (RAPM) for 20 days, they obtained an effect size (Cohen's d) of 0.98. Additionally, participants in the Jaeggi et al. (2010) study were culturally different from the participants in the current study. Participants from the former study were undergraduates from a university in Taiwan (mean age=19.4), while those from the current study were mostly American students attending a Midwestern university. The current study was designed according to the claims put forth by Jaeggi et al. (2008) as a study of replication and extension. In that study, participants were healthy, young adults who were slightly older (mean age=25.6 years) than the current sample (mean age=20.0), and they were recruited from a university in Bern, Switzerland. Effect sizes obtained from our study for RAPM were not as high as reported by Jaeggi et al. (2008, 2010) - d = 0.65 and d = 0.98 respectively. With such large effect sizes, the analysis of paired t-test could achieve a power of 0.80 with 10-12 participants. Referring to Table 4, the highest RAPM effect size (d = 0.50) was from the 8-day passive control group that had 22 participants and this achieved a power of 0.83. The 20-day training group (n = 13) had an effect size of 0.06 in RAPM, and to achieve a power of 0.80 this group would need more than 1700 participants. On the other hand, the effect size from the 20-day active control group with 11 participants was 0.40, and power could be improved by increasing the number of participants to 34.
These obser­va­tions led us to believe that the lack of improve­ments in the test vari­ables was prob­a­bly due to a com­bi­na­tion of low sam­ple size and differ­ences in sam­ple char­ac­ter­is­tics, of which par­tic­i­pants in our study had restric­tion of range in intel­lec­tual abil­i­ty.
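The power arithmetic in the quote can be sanity-checked with a normal-approximation sketch (a simplification of the exact noncentral-t computation, so the numbers will differ slightly from Chooi & Thompson's):

```python
from math import ceil, erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(x / sqrt(2)))

Z_ALPHA = 1.96  # two-sided alpha = 0.05
Z_BETA = 0.84   # target power = 0.80

def n_for_power(d):
    """Approximate n needed for 80% power on a paired t-test at effect size d."""
    return ceil(((Z_ALPHA + Z_BETA) / d) ** 2)

def power_at(d, n):
    """Approximate power of a paired t-test with effect size d and n subjects."""
    return phi(d * sqrt(n) - Z_ALPHA)

print(n_for_power(0.06))   # a d = 0.06 effect needs thousands of subjects
print(power_at(0.98, 12))  # a d = 0.98 effect is detectable with about a dozen
```

The asymmetry is stark: the same 12-subject design that comfortably detects a d = 0.98 effect has essentially no chance of detecting d = 0.06, which is why the null result only rules out the very largest effect-size estimates.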

Preece 2011 / Palmer 2011

“The Effect of Work­ing Mem­ory (n-back) Train­ing on Fluid Intel­li­gence”, David Preece 2011:

The present study repli­cated and extended these results by test­ing the fluid intel­li­gence con­struct using a differ­ent type of fluid intel­li­gence test, and employ­ing an ‘active’ rather than ‘no-con­tact’ con­trol group to account for moti­va­tional effects on intel­li­gence test per­for­mance. 58 par­tic­i­pants were involved and their fluid intel­li­gence was assessed pre-train­ing using the Fig­ure Weights sub­test from the Wech­sler Adult Intel­li­gence Scale - Fourth Edi­tion (WAIS-IV). Par­tic­i­pants were ran­domly assigned to two groups (ex­per­i­men­tal or active con­trol), and both groups did a train­ing task on their home com­puter for 20 days, for 20 min­utes a day. The exper­i­men­tal group trained using a sin­gle n-back task whilst the con­trol group com­pleted gen­eral knowl­edge and vocab­u­lary ques­tions. After train­ing, par­tic­i­pants were retested using the Fig­ure Weights sub­test. Par­tic­i­pants’ Fig­ure Weights scores were analysed using an analy­sis of covari­ance (ANCOVA). The results of this analy­sis revealed no sig­nifi­cant differ­ence between the train­ing groups in terms of per­for­mance on the Fig­ure Weights sub­test, sug­gest­ing that the n-back task was not effec­tive in increas­ing fluid rea­son­ing abil­i­ty. These find­ings were in con­trast to those of Jaeggi et al. (2008) and Jaeggi et al. (2010) and sug­gested that differ­ences between the work­ing mem­ory group and con­trol group found in these stud­ies were likely the result of placebo/motivational effects rather than the prop­er­ties of the n-back task itself.

Sub­jects were also tested on the RAPM pre/post, but that was reported in a sep­a­rate the­sis, Vaughan Palmer’s “Improv­ing fluid intel­li­gence (Gf) though train­ing”, which is not avail­able. I have emailed the super­vis­ing pro­fes­sor for more infor­ma­tion.

2 closely related the­ses are “Improv­ing Mem­ory Using N-Back Train­ing”, Beavon 2012 (short­-term & work­ing mem­o­ry); and “Visual Mem­ory Improve­ment in Recog­ni­tion”, Prandl 2012.

Kundu et al 2012

“Behav­ioral and EEG Effects of Work­ing Mem­ory Train­ing” (RAPM sup­ple­ment); 13 con­trols and 13 exper­i­men­tals trained for 1000 min­utes on dual n-back (Brain Work­shop) or Tetris. “Train­ing does not appear to trans­fer to gf [RAPM] or com­plex span [OSPAN].” This is not a pub­lished study but a con­fer­ence poster, so details such as RAPM scores are not includ­ed. It may be related to Kundu et al 2011.

Kundu et al 2013

The interim posters Kundu 2011 & 2012 were published as Kundu et al 2013:

Although long con­sid­ered a natively endowed and fixed trait, work­ing mem­ory (WM) abil­ity has recently been shown to improve with inten­sive train­ing. What remains con­tro­ver­sial and poorly under­stood, how­ev­er, are the neural bases of these train­ing effects, and the extent to which WM train­ing gains trans­fer to other cog­ni­tive tasks. Here we present evi­dence from human elec­tro­phys­i­ol­ogy (EEG) and simul­ta­ne­ous tran­scra­nial mag­netic stim­u­la­tion (TMS) and EEG that the trans­fer of WM train­ing to other cog­ni­tive tasks is sup­ported by changes in task-re­lated effec­tive con­nec­tiv­ity in fron­topari­etal and pari­eto-oc­cip­i­tal net­works that are engaged by both the trained and trans­fer tasks. One con­se­quence of this effect is greater effi­ciency of stim­u­lus pro­cess­ing, as evi­denced by changes in EEG indices of indi­vid­ual differ­ences in short­-term mem­ory capac­ity and in visual search per­for­mance. Trans­fer to search-re­lated activ­ity pro­vides evi­dence that some­thing more fun­da­men­tal than task-spe­cific strat­egy or stim­u­lus-spe­cific rep­re­sen­ta­tions have been learned. Fur­ther­more, these pat­terns of train­ing and trans­fer high­light the role of com­mon neural sys­tems in deter­min­ing indi­vid­ual differ­ences in aspects of visu­ospa­tial cog­ni­tion.

Salminen 2012

Salminen, Strobach & Schubert 2012, Frontiers in Human Neuroscience:

Recent stud­ies have reported improve­ments in a vari­ety of cog­ni­tive func­tions fol­low­ing sole work­ing mem­ory (WM) train­ing. In spite of the emer­gence of sev­eral suc­cess­ful train­ing par­a­digms, the scope of trans­fer effects has remained mixed. This is most likely due to the het­ero­gene­ity of cog­ni­tive func­tions that have been mea­sured and tasks that have been applied. In the present study, we approached this issue sys­tem­at­i­cally by inves­ti­gat­ing trans­fer effects from WM train­ing to differ­ent aspects of exec­u­tive func­tion­ing. Our train­ing task was a demand­ing WM task that requires simul­ta­ne­ous per­for­mance of a visual and an audi­tory n-back task, while the trans­fer tasks tapped WM updat­ing, coor­di­na­tion of the per­for­mance of mul­ti­ple simul­ta­ne­ous tasks (i.e., dual-tasks) and sequen­tial tasks (i.e., task switch­ing), and the tem­po­ral dis­tri­b­u­tion of atten­tional pro­cess­ing. Addi­tion­al­ly, we exam­ined whether WM train­ing improves rea­son­ing abil­i­ties; a hypoth­e­sis that has so far gained mixed sup­port. Fol­low­ing train­ing, par­tic­i­pants showed improve­ments in the trained task as well as in the trans­fer WM updat­ing task. As for the other exec­u­tive func­tions, trained par­tic­i­pants improved in a task switch­ing sit­u­a­tion and in atten­tional pro­cess­ing. There was no trans­fer to the dual-task sit­u­a­tion or to rea­son­ing skills. These results, there­fore, con­firm pre­vi­ous find­ings that WM can be trained, and addi­tion­al­ly, they show that the train­ing effects can gen­er­al­ize to var­i­ous other tasks tap­ping on exec­u­tive func­tions.

Pas­sive con­trol group; unspeeded RAPM test.

Redick et al 2012

“No evi­dence of trans­fer after work­ing mem­ory train­ing: A con­trolled, ran­dom­ized study” (sup­ple­ment), Redick et al 2012; abstract:

Numer­ous recent stud­ies seem to pro­vide evi­dence for the gen­eral intel­lec­tual ben­e­fits of work­ing mem­ory train­ing. In reviews of the train­ing lit­er­a­ture, Ship­stead, Redick, and Engle (2010, in press) argued that the field should treat recent results with a crit­i­cal eye. Many pub­lished work­ing mem­ory train­ing stud­ies suffer from design lim­i­ta­tions (no-con­tact con­trol groups, sin­gle mea­sures of cog­ni­tive con­struct­s), mixed results (trans­fer of train­ing gains to some tasks but not oth­ers, incon­sis­tent trans­fer to the same tasks across stud­ies), and lack of the­o­ret­i­cal ground­ing (iden­ti­fy­ing the mech­a­nisms respon­si­ble for observed trans­fer). The cur­rent study com­pared young adults who received 20 ses­sions of prac­tice on an adap­tive dual n-back pro­gram (work­ing mem­ory train­ing group) or an adap­tive visual search pro­gram (ac­tive place­bo-con­trol group) with a no-con­tact con­trol group that received no prac­tice. In addi­tion, all sub­jects com­pleted pre-test, mid-test, and post-test ses­sions, com­pris­ing mul­ti­ple mea­sures of fluid intel­li­gence, mul­ti­task­ing, work­ing mem­ory capac­i­ty, crys­tal­lized intel­li­gence, and per­cep­tual speed. Despite improve­ments on both the dual n-back and visual search tasks with prac­tice, and despite a high level of sta­tis­ti­cal pow­er, there was no pos­i­tive trans­fer to any of the cog­ni­tive abil­ity tests. We dis­cuss these results in the con­text of pre­vi­ous work­ing mem­ory train­ing research, and address issues for future work­ing mem­ory train­ing stud­ies.

75 sub­jects; RAPM was speed­ed.

Rudebeck 2012

Rudebeck et al 2012:

One cur­rent chal­lenge in cog­ni­tive train­ing is to cre­ate a train­ing regime that ben­e­fits mul­ti­ple cog­ni­tive domains, includ­ing episodic mem­o­ry, with­out rely­ing on a large bat­tery of tasks, which can be time-con­sum­ing and diffi­cult to learn. By giv­ing care­ful con­sid­er­a­tion to the neural cor­re­lates under­ly­ing episodic and work­ing mem­o­ry, we devised a com­put­er­ized work­ing mem­ory train­ing task in which neu­ro­log­i­cally healthy par­tic­i­pants were required to mon­i­tor and detect rep­e­ti­tions in two streams of spa­tial infor­ma­tion (spa­tial loca­tion and scene iden­ti­ty) pre­sented simul­ta­ne­ously (i.e. a dual n-back par­a­dig­m). Par­tic­i­pants’ episodic mem­ory abil­i­ties were assessed before and after train­ing using two object and scene recog­ni­tion mem­ory tasks incor­po­rat­ing mem­ory con­fi­dence judg­ments. Fur­ther­more, to deter­mine the gen­er­al­iz­abil­ity of the effects of train­ing, we also assessed fluid intel­li­gence using a matrix rea­son­ing task. By exam­in­ing the differ­ence between pre- and post-train­ing per­for­mance (i.e. gain scores), we found that the train­ers, com­pared to non-train­ers, exhib­ited a sig­nifi­cant improve­ment in fluid intel­li­gence after 20 days. Inter­est­ing­ly, pre-train­ing fluid intel­li­gence per­for­mance, but not train­ing task improve­ment, was a sig­nifi­cant pre­dic­tor of post-train­ing fluid intel­li­gence improve­ment, with lower pre-train­ing fluid intel­li­gence asso­ci­ated with greater post-train­ing gain. Cru­cial­ly, train­ers who improved the most on the train­ing task also showed an improve­ment in recog­ni­tion mem­ory as cap­tured by d-prime scores and esti­mates of rec­ol­lec­tion and famil­iar­ity mem­o­ry. Train­ing task improve­ment was a sig­nifi­cant pre­dic­tor of gains in recog­ni­tion and famil­iar­ity mem­ory per­for­mance, with greater train­ing improve­ment lead­ing to more marked gains. 
In con­trast, lower pre-train­ing rec­ol­lec­tion mem­ory scores, and not train­ing task improve­ment, led to greater rec­ol­lec­tion mem­ory per­for­mance after train­ing. Our find­ings demon­strate that prac­tice on a sin­gle work­ing mem­ory task can poten­tially improve aspects of both episodic mem­ory and fluid intel­li­gence, and that an exten­sive train­ing regime with mul­ti­ple tasks may not be nec­es­sary.

Speeded BOMAT (“Due to time restric­tions and the pos­si­bil­ity of ceil­ing effects asso­ci­ated with some Gf tests, par­tic­i­pants were given 10 min­utes to com­plete as many pat­terns as they could in each assess­ment ses­sion (for a sim­i­lar pro­ce­dure see Jaeggi et al 2008).”); 55 sub­jects total, exper­i­men­tals trained for 400 min­utes, pas­sive con­trol group. The improve­ment pre­dic­tor sounds like a post hoc analy­sis and may be some­thing like regres­sion to the mean.
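That last suspicion is easy to demonstrate: with no real improvement at all, measurement noise alone guarantees that low pre-test scorers show the biggest "gains". A seeded toy simulation (all numbers hypothetical):

```python
import random

random.seed(1)

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# No true improvement: pre and post are just ability + independent noise.
ability = [random.gauss(100, 15) for _ in range(1000)]
pre  = [a + random.gauss(0, 10) for a in ability]
post = [a + random.gauss(0, 10) for a in ability]
gain = [p2 - p1 for p1, p2 in zip(pre, post)]

# Low scorers "gain" and high scorers "lose", with zero real change:
print(corr(pre, gain))  # reliably negative
```

So "lower pre-training scores predicted greater gains" is exactly the pattern a post hoc analysis would find even if training did nothing.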

Heinzel et al 2013

“Work­ing mem­ory train­ing improve­ments and gains in non-trained cog­ni­tive tasks in young and older adults”, Heinzel et al 2013:

Pre­vi­ous stud­ies on work­ing mem­ory train­ing have indi­cated that trans­fer to non-trained tasks of other cog­ni­tive domains may be pos­si­ble. The aim of this study is to com­pare work­ing mem­ory train­ing and trans­fer effects between younger and older adults (n = 60). A novel approach to adap­tive n-back train­ing (12 ses­sions) was imple­mented by vary­ing the work­ing mem­ory load and the pre­sen­ta­tion speed. All par­tic­i­pants com­pleted a neu­ropsy­cho­log­i­cal bat­tery of tests before and after the train­ing. On aver­age, younger train­ing par­tic­i­pants achieved diffi­culty level 12 after train­ing, while older train­ing par­tic­i­pants only reached diffi­culty level 5. In younger par­tic­i­pants, trans­fer to Ver­bal Flu­ency and Digit Sym­bol Sub­sti­tu­tion test was found. In older par­tic­i­pants, we observed a trans­fer to Digit Span For­ward, CERAD Delayed Recall, and Digit Sym­bol Sub­sti­tu­tion test. Results sug­gest that work­ing mem­ory train­ing may be a ben­e­fi­cial inter­ven­tion for main­tain­ing and improv­ing cog­ni­tive func­tion­ing in old age.

Single n-back; passive control group; no transfer in the young or old training group to “Raven's Standard Progressive Matrices (Raven's SPM) and the Figural Relations subtest of a German intelligence test (Leistungspruefsystem, LPS, Horn, 1983)” (scores increased, but the sample size is too small to reach statistical-significance in the young group); RPM speeded (7.5 minutes). See pg19 for graphs of the IQ test performance.

Onken 2013

“Trans­fer von Arbeits­gedächt­nis­train­ing auf die flu­ide Intel­li­genz”, Johanna Onken 2013; some sort of re-re­port­ing or ver­sion of the Heinzel data.

Fluid intelligence describes the ability to think abstractly, to adapt to new situations, and to solve unknown problems. It is important for learning as well as for academic and professional success. Working memory is characterized as a cognitive system that stores information over a short period of time in spite of possible distractions. Moreover, working memory is able to assess the relevance of information while requirements change. Effective implicit training is able to increase working memory capacity. Furthermore, it was shown that working memory training may also cause transfer effects to higher cognitive abilities such as fluid intelligence. To clarify the underlying processes of this transfer, various transfer models were presented, which accentuate either processing speed, executive functions, or short-term memory. The purpose of this study was to confirm transfer effects of working memory training to different cognitive abilities and, on the other hand, to investigate the mechanism of the transfer according to the proposed transfer models. 30 healthy subjects [age 22-30 years] participated in the study and were randomly assigned to either a training or control group. The training group practiced an adaptive N-back working memory task for four weeks. Before, after one week, and after four weeks of the training, a range of neuropsychological tasks was performed by the participants, testing for different cognitive abilities. Relative to the control group, which did not participate in the training, transfer effects to processing speed, executive functions, and fluid intelligence tasks were found. Additionally, the training resulted in a significant shortening of reaction time.
In summary, the present study demonstrates that complex cognitive abilities can be improved through effective working memory training. The question of which cognitive mechanisms the transfer is based on could not be answered definitively by this study. The results suggest that the adaptive working memory training led mainly to faster basal cognitive processes, which in turn resulted in faster processing of intelligence tests.

30 subjects, passive control group, 4 weeks; controls paid 50 euros, experimentals 150 euros, 480 minutes of training, single n-back. IQ tests administered: RPM, LPS, MWT-B. On pg40 are all the post-test results: “Tabelle 3.7: Deskriptive Daten der Neuropsychologie im Posttest (t3)” [“Descriptive neuropsychological data at posttest (t3)”]; discussion of RPM results on pg46.

Thompson et al 2013


…The cur­rent study attempted to repli­cate and expand those results by admin­is­ter­ing a broad assess­ment of cog­ni­tive abil­i­ties and per­son­al­ity traits to young adults who under­went 20 ses­sions of an adap­tive dual n-back work­ing mem­ory train­ing pro­gram and com­par­ing their post-train­ing per­for­mance on those tests to a matched set of young adults who under­went 20 ses­sions of an adap­tive atten­tional track­ing pro­gram. Pre- and post-train­ing mea­sure­ments of fluid intel­li­gence, stan­dard­ized intel­li­gence tests, speed of pro­cess­ing, read­ing skills, and other tests of work­ing mem­ory were assessed. Both train­ing groups exhib­ited sub­stan­tial and spe­cific improve­ments on the trained tasks that per­sisted for at least 6 months post-train­ing, but no trans­fer of improve­ment was observed to any of the non-trained mea­sure­ments when com­pared to a third untrained group serv­ing as a pas­sive con­trol. These find­ings fail to sup­port the idea that adap­tive work­ing mem­ory train­ing in healthy young adults enhances work­ing mem­ory capac­ity in non-trained tasks, fluid intel­li­gence, or other mea­sures of cog­ni­tive abil­i­ties.

Covari­ate details:

…Two groups of young adults, strat­i­fied so as to be equated on ini­tial fluid IQ scores, were ran­domly assigned to two con­di­tions (a ran­dom­ized con­trolled trial or RCT). The exper­i­men­tal group per­formed the dual n-back task (as in the orig­i­nal Jaeggi et al., 2008 study [6]) for approx­i­mately 40 min­utes per day, 5 days per week for 4 weeks (20 ses­sions of 30 blocks per ses­sion, exceed­ing the max­i­mum of 19 ses­sions of 20 blocks per day in the orig­i­nal Jaeggi et al., 2008 study). An active con­trol group per­formed a visu­ospa­tial skill learn­ing task, mul­ti­ple object track­ing (or MOT), on an iden­ti­cal train­ing sched­ule. We also tested a no-con­tact group equated for ini­tial fluid IQ in case both kinds of train­ing enhanced cog­ni­tive abil­i­ties…­Par­tic­i­pants were given 25 min­utes to com­plete each half of the RAPM…Participants in the train­ing groups were paid $20 per train­ing ses­sion, with a $20 bonus per week for com­plet­ing all five train­ing ses­sions in that week. All par­tic­i­pants were paid $20 per hour for behav­ioral test­ing, and $30 per hour for imag­ing ses­sions (data from imag­ing ses­sions are reported sep­a­rate­ly)….After recruit­ment, par­tic­i­pants under­went approx­i­mately six hours of behav­ioral test­ing spread across three days and two hours of struc­tural and func­tional mag­netic res­o­nance imag­ing. [Thomp­son says there were addi­tional pay­ments for imag­ing not men­tioned, so the true expect­ed-value of par­tic­i­pa­tion was $740.]

In line with my meta-analy­sis’s null on a dose-re­sponse effect:

One method of assess­ing whether the amount of train­ing improve­ment affects the degree of trans­fer is to mea­sure the cor­re­la­tion between train­ing and trans­fer gains. For both the n-back and MOT groups, a pos­i­tive cor­re­la­tion was observed between the amount of improve­ment dur­ing train­ing and the amount of improve­ment on the trained task between the pre- and post-assess­ment (n-back r = .85, p<.0001; MOT r = .77, p<.0001). How­ev­er, the amount of train­ing gain did not sig­nifi­cantly pre­dict improve­ment on any trans­fer task; par­tic­i­pants who improved to a greater extent on the train­ing tasks did not improve more or less on poten­tial trans­fer tasks than did par­tic­i­pants who improved to a lesser extent (all n-back r val­ues <.33, all p’s >.15; all MOT r val­ues <.38, all p’s >.11). Fig­ure S2 depicts the absence of a rela­tion between improve­ment on trained tasks and the post-train­ing changes in the RAPM and the com­bined span tasks.
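The gain-gain correlation analysis described above can be sketched on simulated data (a minimal illustration with made-up effect sizes and hypothetical variable names, not Thompson et al's actual data or code):

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
n = 20  # roughly the per-group sample size in such studies
# Simulated version of the pattern reported: training gain strongly
# predicts gain on the *trained* task (shared skill), while the
# transfer-task gain is independent noise (the null result).
training_gain = [random.gauss(2.0, 0.5) for _ in range(n)]
trained_task_gain = [g + random.gauss(0, 0.3) for g in training_gain]
transfer_gain = [random.gauss(0.1, 1.0) for _ in range(n)]

print(pearson_r(training_gain, trained_task_gain))  # typically large & positive
print(pearson_r(training_gain, transfer_gain))      # typically near zero
```

The point of the contrast: a large trained-task correlation alongside a near-zero transfer correlation is exactly the signature of task-specific skill learning rather than general improvement.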

Note also that the post hoc split of chil­dren into ‘improvers’ and not in that Jaeggi paper does not repli­cate here either:

Another analy­sis that has pre­vi­ously revealed a differ­ence in trans­fer between par­tic­i­pants who exhib­ited larger or smaller train­ing gains has been a divi­sion of par­tic­i­pants into groups based on train­ing gains above or below the group median (me­dian split) [15]. Such a median split of par­tic­i­pants in the present study who per­formed the n-back train­ing yielded no sig­nifi­cant differ­ences in trans­fer between groups (all n-back t-ra­tios <1.78, all p’s >.09). The only trans­fer mea­sure that approached sig­nifi­cance (at p = .09) was on the RAPM test, in which the par­tic­i­pants who improved less on the trained n-back task had higher scores on the post-train­ing behav­ioral test­ing. Sim­i­lar­ly, when sep­a­rat­ing the MOT par­tic­i­pants into two groups based on median MOT improve­ment, the two groups showed no sig­nifi­cant differ­ences in trans­fer per­for­mance (all MOT t-ra­tios <1.74, all p’s >.10).
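The median-split analysis is simple to reproduce; here is a sketch (hypothetical helper names; it returns only the Welch t statistic, without the p-value lookup):

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances allowed)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

def median_split_t(training_gains, transfer_scores):
    """Split participants at the median training gain, then compare
    the transfer scores of above-median vs. at-or-below-median trainers."""
    med = sorted(training_gains)[len(training_gains) // 2]
    hi = [t for g, t in zip(training_gains, transfer_scores) if g > med]
    lo = [t for g, t in zip(training_gains, transfer_scores) if g <= med]
    return welch_t(hi, lo)
```

As a design note, such post-hoc splits discard information and inflate false-positive rates compared with simply treating training gain as a continuous predictor, which is one reason the failure to replicate here is unsurprising.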

The per­son­al­ity cor­re­lates from Stud­er-Luethi 2012 also don’t work:

We also exam­ined whether per­son­al­ity assess­ments were asso­ci­ated with differ­ent train­ing or trans­fer out­comes. Nei­ther the Dweck mea­sure of atti­tude toward intel­li­gence (a “growth mind­set”) nor mea­sures of con­sci­en­tious­ness or grit cor­re­lated sig­nifi­cantly with train­ing gains on either train­ing task, although there was a trend toward a sig­nifi­cant neg­a­tive cor­re­la­tion between the growth mind­set and improve­ment on the n-back train­ing task (r = −.44, p = .051), such that par­tic­i­pants who viewed intel­li­gence as more mal­leable had less improve­ment across their n-back train­ing. A greater growth mind­set score was pos­i­tively cor­re­lat­ed, how­ev­er, with improve­ment on the Ravens Advanced Pro­gres­sive Matri­ces in the n-back group (r = .53, p = .017) and in the pas­sive con­trol group (r = .51, p = .027), but not in the MOT con­trol group (r = .031, p>.9). No other trans­fer mea­sures were sig­nifi­cantly pre­dicted by growth mind­set scores.

Although the con­sci­en­tious­ness scores and “grit” scores were highly cor­re­lated in each of the three treat­ment groups (n-back r = .75, p<.001; MOT r = .70, p<.001; pas­sive r = .76, p<.001), the two mea­sures differed in their cor­re­la­tions with the behav­ioral out­come mea­sures. A higher “grit” score pre­dicted less improve­ment on the RAPM for the n-back group (r = −.45, p = .049) and the MOT group (r = −.58, p = .009), such that par­tic­i­pants who viewed them­selves as hav­ing more “grit” improved less on the RAPM after train­ing, although this rela­tion­ship did not hold for the No-Con­tact group (r = .17, p = .5). Sim­i­lar­ly, a higher score on the con­sci­en­tious­ness mea­sure pre­dicted less improve­ment on the RAPM for the MOT group (r = −.57, p = .01), such that par­tic­i­pants who saw them­selves as more con­sci­en­tious improved less on the RAPM after train­ing, although this was not observed in either of the other two groups (n-back r = −.21, p = .37; no-con­tact r = −.04, p = .85). Final­ly, a high con­sci­en­tious­ness score pre­dicted a lower Pair Can­cel­la­tion improve­ment within the MOT group (r = −.47, p = .04), but not in the n-back or no-con­tact con­trol groups (n-back r = −.07, p = .77; no-con­tact r = −.13, p = .58). No other trans­fer mea­sures were sig­nifi­cantly pre­dicted by either con­sci­en­tious­ness or grit scores.

Smith et al 2013

“Explor­ing the effec­tive­ness of com­mer­cial and cus­tom-built games for cog­ni­tive train­ing”, Smith et al 2013

There is increasing interest in quantifying the effectiveness of computer games in non-entertainment domains. We have explored general intelligence improvements for participants using either a commercial-off-the-shelf (COTS) game [Brain Age], a custom do-it-yourself (DIY) training system for a working memory task [DNB], or an online strategy game, relative to a control group (without training). Forty university-level participants were divided into four groups (COTS, DIY, Gaming, [Passive] Control) and were evaluated three times (pre-intervention, post-intervention, 1-week follow-up) with three weeks of training. On general intelligence tests, both cognitive training systems (the COTS and DIY groups) failed to produce [statistically-]significant improvements in comparison to the control group or the gaming group. Also, neither cognitive training system produced [statistically-]significant improvements over the intervention or follow-up periods.

Dual n-back; RAPM (10 minutes each test); 1 passive control group, 2 active; 340 minutes of training; minimal compensation (course credit & entry into “a prize draw” worth £100). Very small sample sizes (~10 in each of the 4 groups).
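With only ~10 participants per group, a null result tells us little either way; a quick power calculation (my own sketch using the normal approximation to the two-sample t-test, not anything from Smith et al) shows why:

```python
import math
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample t-test for a true
    standardized effect size d (Cohen's d), via the normal approximation
    (ignores the small-sample t correction and the far rejection tail)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    noncentrality = d * math.sqrt(n_per_group / 2)
    return nd.cdf(noncentrality - z_crit)

# Even a medium effect (d = 0.5) with n = 10/group yields power of only
# roughly 0.2, so most such experiments would miss a real effect that size.
print(round(approx_power(0.5, 10), 2))
```

In other words, the study was underpowered to detect anything short of an implausibly large training effect, so its null adds little evidence on its own.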

Nussbaumer et al 2013

“Lim­i­ta­tions and chances of work­ing mem­ory train­ing”, Nuss­baumer et al 2013:

Recent stud­ies show con­tro­ver­sial results on the train­abil­ity of work­ing mem­ory (WM) capac­ity being a lim­it­ing fac­tor of human cog­ni­tion. In order to con­tribute to this open ques­tion we inves­ti­gated if par­tic­i­pants improve in trained tasks and whether gains gen­er­al­ize to untrained WM tasks, math­e­mat­i­cal prob­lem solv­ing and intel­li­gence tests.

83 adults trained over a three-week period (7.5 hours total) in one of the following conditions: a high, a medium, or a low WM load group. The present findings show that task-specific characteristics could be learned but that there was no transfer between trained and untrained tasks which had no common elements. Positive transfer occurred between two tasks focusing on inhibitory processes. It might be possible to enhance this specific component of WM, but not WM capacity as such. A possible enhancement in a learning test is of high educational interest and worth investigating further.

One of the two IQ tests was the RAPM; this was dual n-back, but it was adap­tive only in the “high” exper­i­men­tal group (so the “medium” and “low” groups are largely irrel­e­van­t). Paper does not pro­vide RAPM score details, so I emailed the lead author.

Oelhafen et al 2013

“Increased parietal activity after training of interference control”, Oelhafen et al 2013:

…In the cur­rent study, we exam­ined whether train­ing on two vari­ants of the adap­tive dual n-back task would affect untrained task per­for­mance and the cor­re­spond­ing elec­tro­phys­i­o­log­i­cal even­t-re­lated poten­tials (ERPs). 43 healthy young adults trained for three weeks with a high or low inter­fer­ence train­ing vari­ant of the dual n-back task, or they were assigned to a pas­sive con­trol group. While n-back train­ing with high inter­fer­ence led to par­tial improve­ments in the Atten­tion Net­work Test (ANT), we did not find trans­fer to mea­sures of work­ing mem­ory and fluid intel­li­gence. ERP analy­sis in the n-back task and the ANT indi­cated over­lap­ping processes in the P3 time range. More­over, in the ANT, we detected increased pari­etal activ­ity for the inter­fer­ence train­ing group alone. In con­trast, we did not find elec­tro­phys­i­o­log­i­cal differ­ences between the low inter­fer­ence train­ing and the con­trol group. These find­ings sug­gest that train­ing on an inter­fer­ence con­trol task leads to higher elec­tro­phys­i­o­log­i­cal activ­ity in the pari­etal cor­tex, which may be related to improve­ments in pro­cess­ing speed, atten­tional con­trol, or both.

Sprenger et al 2013

“Train­ing work­ing mem­o­ry: Lim­its of trans­fer”, Sprenger et al 2013; abstract:

In two exper­i­ments (to­tal­ing 253 adult par­tic­i­pants), we exam­ined the extent to which inten­sive work­ing mem­ory train­ing led to improve­ments on untrained mea­sures of cog­ni­tive abil­i­ty. Although par­tic­i­pants showed improve­ment on the trained task and on tasks that either shared task char­ac­ter­is­tics or stim­uli, we found no evi­dence that train­ing led to gen­eral improve­ments in work­ing mem­o­ry. Using Bayes Fac­tor analy­sis, we show that the data gen­er­ally sup­port the hypoth­e­sis that work­ing mem­ory train­ing was ineffec­tive at improv­ing gen­eral cog­ni­tive abil­i­ty. This con­clu­sion held even after con­trol­ling for a num­ber of indi­vid­ual differ­ences, includ­ing need for cog­ni­tion, beliefs in the mal­leabil­ity of intel­li­gence, and age.
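Sprenger et al quantify support *for* the null with Bayes factors. One simple way to compute such a quantity for a two-group comparison is the BIC approximation to the Bayes factor (a sketch under that assumption, not necessarily the method Sprenger et al used):

```python
import math

def bf01_bic(a, b):
    """BF01: evidence for the null ('one common mean') over the
    alternative ('separate group means'), via the BIC approximation
    BF01 ~= exp((BIC_alt - BIC_null) / 2)."""
    data = a + b
    n = len(data)
    grand = sum(data) / n
    rss_null = sum((x - grand) ** 2 for x in data)
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    rss_alt = (sum((x - ma) ** 2 for x in a)
               + sum((x - mb) ** 2 for x in b))
    # BIC = n*ln(RSS/n) + k*ln(n); the alternative pays for one extra
    # mean parameter (k = 2 vs. k = 1).
    bic_null = n * math.log(rss_null / n) + 1 * math.log(n)
    bic_alt = n * math.log(rss_alt / n) + 2 * math.log(n)
    return math.exp((bic_alt - bic_null) / 2)
```

BF01 > 3 is conventionally read as positive evidence for the null; unlike a merely non-significant p-value, it distinguishes "evidence of absence" from "absence of evidence", which is why the Bayesian framing matters here.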

Colom et al 2013

“Adap­tive n-back train­ing does not improve fluid intel­li­gence at the con­struct lev­el; gains on indi­vid­ual tests sug­gest train­ing may enhance visu­ospa­tial pro­cess­ing”, Colom et al 2013:

Short­-term adap­tive cog­ni­tive train­ing based on the n-back task is reported to increase scores on indi­vid­ual abil­ity tests, but the key ques­tion of whether such increases gen­er­al­ize to the intel­li­gence con­struct is not clear. Here we eval­u­ate fluid/abstract intel­li­gence (Gf), crystallized/verbal intel­li­gence (Gc), work­ing mem­ory capac­ity (WMC), and atten­tion con­trol (ATT) using diverse mea­sures, with equiv­a­lent ver­sions, for esti­mat­ing any changes at the con­struct level after train­ing. Begin­ning with a sam­ple of 169 par­tic­i­pants, two groups of twen­ty-eight women each were selected and matched for their gen­eral cog­ni­tive abil­ity scores and demo­graphic vari­ables. Under strict super­vi­sion in the lab­o­ra­to­ry, the train­ing group com­pleted an inten­sive adap­tive train­ing pro­gram based on the n-back task (vi­su­al, audi­to­ry, and dual ver­sions) across twen­ty-four ses­sions dis­trib­uted over twelve weeks. Results showed this group had the expected sys­tem­atic improve­ments in n-back per­for­mance over time; this per­for­mance sys­tem­at­i­cally cor­re­lated across ses­sions with Gf, Gc, and WMC, but not with ATT. How­ev­er, the main find­ing showed no sig­nifi­cant changes in the assessed psy­cho­log­i­cal con­structs for the train­ing group as com­pared with the con­trol group. Nev­er­the­less, post-hoc analy­ses sug­gested that spe­cific tests and tasks tap­ping visu­ospa­tial pro­cess­ing might be sen­si­tive to train­ing.

One hundred and sixty-nine psychology undergraduates completed a battery of twelve intelligence tests and cognitive tasks measuring fluid-abstract and crystallized-verbal intelligence, working memory capacity, and attention control. After computing a general index from the six intelligence tests, two groups of twenty-eight females were recruited for the study. They were paid for their participation. Members of each group were carefully matched for their general intelligence index, so they were perfectly overlapped and represented a wide range of scores. All participants were right-handed, as assessed by the Edinburgh Test (Oldfield, 1971). They also completed a set of questions asking about medical or psychiatric disorders, as well as substance intake. The recruitment process followed the Helsinki guidelines (World Medical Association, 2008) and the local ethics committee approved the study. Descriptive statistics for the demographic variables and performance on the cognitive measures for the two groups of participants (training and control) can be seen in the Appendix (Table A.1.). [200€ if assigned to the training group and 100€ if assigned to the control group.] [150 euros in dollars is $204]

The collective psychological assessment for the pretest stage was done from September 19 to October 14, 2011. Participants were assessed in groups no greater than twenty-five. The data obtained for the complete group (N = 169) were analyzed for recruiting the training (N = 28) and control (N = 28) groups based on the general index computed from the measures of fluid and crystallized intelligence (Table A.1.). The adaptive cognitive training program began on November 14, 2011, remained active until February 17, 2012, and lasted for twelve weeks (with a break from December 24, 2011 to January 9, 2012). The psychological assessment for the posttest was done individually from February 20 to March 09 (intelligence tests) and from March 12 to March 30 (cognitive tasks), 2012.

Intel­li­gence and cog­ni­tive con­structs were assessed by three mea­sures each. As noted above, fluid intel­li­gence (Gf) requires abstract prob­lem solv­ing abil­i­ties, whereas crys­tal­lized intel­li­gence (Gc) involves the men­tal manip­u­la­tion of cul­tural knowl­edge. Gf was mea­sured by screen­ing ver­sions (odd num­bered items and even num­bered items for the pretest and posttest eval­u­a­tions, respec­tive­ly) of the Raven Advanced Pro­gres­sive Matri­ces Test (RAPM), the abstract rea­son­ing sub­test from the Differ­en­tial Apti­tude Test (DAT-AR), and the induc­tive rea­son­ing sub­test from the Pri­mary Men­tal Abil­i­ties Bat­tery (PMA-R). Gc was mea­sured by screen­ing ver­sions (odd num­bered items and even num­bered items for the pretest and posttest eval­u­a­tions, respec­tive­ly) of the ver­bal rea­son­ing sub­test from the DAT (DAT-VR), the numer­i­cal rea­son­ing sub­test from the DAT (DAT-NR), and the vocab­u­lary sub­test from the PMA (PMA-V). Gf and Gc were mea­sured by tests with (PMA sub­tests) and with­out (RAPM and DAT sub­tests) highly speeded con­straints.

The framework for the cognitive training program followed the guidelines reported by Jaeggi et al. (2008) but it was re-programmed in Visual Basic (2008 version). Nevertheless, there were some differences: (a) the training began with four sessions (weeks 1 and 2) with a visual adaptive n-back version and four sessions (weeks 3 and 4) with an auditory adaptive n-back version before facing the sixteen sessions of the adaptive dual n-back program (weeks 5 to 12), and (b) while the training program is usually completed in one month, here we extended the training period to three months (12 weeks). There were two training sessions per week lasting around 30 min each and they took place under strict supervision in the laboratory. Participants worked within individual cabins and the experimenter was always available to attend to any requests they might have. Data were analyzed every week to check their progress at both the individual and the group level. Participants received systematic feedback regarding their performance. Furthermore, every two weeks participants completed a motivation questionnaire asking about their (a) involvement with the task, (b) perceived difficulty level, (c) perceived challenge of the task levels, and (d) expectations for future achievement. At the end of the training period participants were asked for their general evaluation of the program. Using a rating scale from 0 to 10, average values were (a) 8.1 (range 8.0 to 8.2 across sessions), (b) 7.9 (range 7.4 to 8.5 across sessions), (c) 8.0 (range 7.8 to 8.2 across sessions), and (d) 7 (range 6.5 to 7.7 across sessions). [12 weeks × 2 sessions × 30 minutes = 720 minutes]

The con­trol group was pas­sive. After the recruit­ment process, mem­bers of this no-con­tact con­trol group were invited to fol­low their nor­mal life as uni­ver­sity stu­dents. As rea­soned in some of our pre­vi­ous research reports address­ing the poten­tial effect of cog­ni­tive train­ing, and accord­ing to the main the­o­ret­i­cal frame­work, we were not inter­ested in com­par­ing differ­ent types of train­ing, but in the com­par­i­son between a spe­cific cog­ni­tive train­ing and doing noth­ing beyond reg­u­lar life.

Four out of six intelligence tests were applied without severe time constraints. For the RAPM there was more than one minute per item (20 minutes for 18 items). For the DAT-AR, DAT-NR, and DAT-VR, there were approximately 30 seconds per item (10 minutes for 20 items). For the speeded tests (PMA-R and PMA-V) there were between 5 and 12 seconds per item (PMA-R: 3 minutes for 15 items; PMA-V: 2 minutes for 25 items).

Posttest RAPM: training (n = 28): 37.25 (SD 6.23); control (n = 28): 35.46 (SD 8.26).


Mean differences between the odd and even items were significant (p < 0.001 for all the tests, excluding the DAT-VR), which implies that pretest (odd) and posttest (even) scores must not be directly compared.

Burki et al 2014

“Indi­vid­ual differ­ences in cog­ni­tive plas­tic­i­ty: an inves­ti­ga­tion of train­ing curves in younger and older adults”, Burki et al 2014

To date, cog­ni­tive inter­ven­tion research has pro­vided mixed but nev­er­the­less promis­ing evi­dence with respect to the effects of cog­ni­tive train­ing on untrained tasks (trans­fer). How­ev­er, the mech­a­nisms behind learn­ing, train­ing effects and their pre­dic­tors are not fully under­stood. More­over, indi­vid­ual differ­ences, which may con­sti­tute an impor­tant fac­tor impact­ing train­ing out­come, are usu­ally neglect­ed. We sug­gest inves­ti­gat­ing indi­vid­ual train­ing per­for­mance across train­ing ses­sions in order to gain fin­er-grained knowl­edge of train­ing gains, on the one hand, and assess­ing the poten­tial impact of pre­dic­tors such as age and fluid intel­li­gence on learn­ing rate, on the other hand. To this aim, we pro­pose to model indi­vid­ual learn­ing curves to exam­ine the intra-in­di­vid­ual change in train­ing as well as inter-in­di­vid­ual differ­ences in intra-in­di­vid­ual change. We rec­om­mend intro­duc­ing a latent growth curve model (LGCM) analy­sis, a method fre­quently applied to learn­ing data but rarely used in cog­ni­tive train­ing research. Such advanced analy­ses of the train­ing phase allow iden­ti­fy­ing fac­tors to be respected when design­ing effec­tive tai­lor-made train­ing inter­ven­tions. To illus­trate the pro­posed approach, a LGCM analy­sis using data of a 10-day work­ing mem­ory train­ing study in younger and older adults is report­ed.

Repub­li­ca­tion of a the­sis.

Pugin et al 2014

“Work­ing mem­ory train­ing shows imme­di­ate and long-term effects on cog­ni­tive per­for­mance in chil­dren and ado­les­cents”, Pugin et al 2014:

Work­ing mem­ory is impor­tant for men­tal rea­son­ing and learn­ing process­es. Sev­eral stud­ies in adults and school-age chil­dren have shown per­for­mance improve­ment in cog­ni­tive tests after work­ing mem­ory train­ing. Our aim was to exam­ine not only imme­di­ate but also long-term effects of inten­sive work­ing mem­ory train­ing on cog­ni­tive per­for­mance tests in chil­dren and ado­les­cents. Four­teen healthy male sub­jects between 10 and 16 years trained a visu­ospa­tial n-back task over 3 weeks (30 min dai­ly), while 15 indi­vid­u­als of the same age range served as a pas­sive con­trol group. Sig­nifi­cant differ­ences in imme­di­ate (after 3 weeks of train­ing) and long-term effects (after 2-6 months) in an audi­tory n-back task were observed com­pared to con­trols (2.5 fold imme­di­ate and 4.7 fold long-term increase in the train­ing group com­pared to the con­trol­s). The improve­ment was more pro­nounced in sub­jects who improved their per­for­mance dur­ing the train­ing. Other cog­ni­tive func­tions (ma­tri­ces test and Stroop task) did not change when com­par­ing the train­ing group to the con­trol group. We con­clude that spa­tial work­ing mem­ory train­ing in chil­dren and ado­les­cents boosts per­for­mance in sim­i­lar mem­ory tasks such as the audi­tory n-back task. The sus­tained per­for­mance improve­ment sev­eral months after the train­ing sup­ports the effec­tive­ness of the train­ing.

Heffernan 2014

“The Gen­er­al­iz­abil­ity of Dual N-Back Train­ing in Younger Adults”, Heffer­nan 2014 (Hal­i­fax, Nova Sco­tia; Canada):

Intro­duc­tion: The pop­u­lar­ity of cog­ni­tive train­ing has increased in recent years. Accu­mu­lat­ing evi­dence shows that train­ing can some­times improve trained and non-trained cog­ni­tive func­tions, and these improve­ments may be related to indi­vid­ual differ­ences in ini­tial capac­ity and per­for­mance on the train­ing task. The cur­rent study assessed the effec­tive­ness of a cus­tom-de­signed n-back task (the N-IGMA) ver­sus an active con­trol task (Block­mas­ter) at improv­ing var­i­ous forms of work­ing mem­ory capac­i­ty, atten­tion, and fluid intel­li­gence. Three mea­sures of work­ing mem­ory capac­ity were con­sid­ered: ver­bal, visu­ospa­tial and observed action. Meth­ods: Out­come mea­sures were assessed pre- and post-train­ing. Nine­teen healthy young adults (19-30 years of age) trained at-home for 30 min­utes per day, five days a week for three weeks with either the N-IGMA (n=9) or Block­mas­ter (n=10) at-home games. Results: Pre-post changes were observed for some out­come mea­sures and these were equal for the N-IGMA and active con­trol group. Out­come improve­ments could be due to sim­ple test/re-test ben­e­fits or alter­na­tively the N-IGMA and Block­mas­ter tasks may pro­duce equiv­a­lent train­ing effects. Improve­ments in the train­ing tasks did not cor­re­late with the changes in the out­come mea­sures, sug­gest­ing improve­ments in the out­come mea­sures might not be attrib­ut­able to trans­fer of learn­ing. For ver­bal work­ing mem­ory only, par­tic­i­pants with higher (ver­sus low­er) ini­tial fluid intel­li­gence demon­strated larger improve­ments on the out­come mea­sures sug­gest­ing that in future research train­ing tasks might need to be tai­lored to the indi­vid­ual par­tic­i­pant. Pre-assess­ment but not change scores were related for observed action and visu­ospa­tial work­ing mem­o­ry, con­sis­tent with some over­lap between con­tent domains. 
Con­clu­sion: Despite specifi­cally tar­get­ing work­ing mem­o­ry, the N-IGMA was not bet­ter than a visu­ospa­tial con­trol game at improv­ing a vari­ety of cog­ni­tive out­come mea­sures in this small sam­ple. Results sug­gest that the indi­vid­u­al’s ini­tial cog­ni­tive capac­ity might need to be con­sid­ered in future train­ing stud­ies. Cau­tion should be used in extrap­o­lat­ing the results of this study to other pop­u­la­tions of inter­est (e.g., older adults or indi­vid­u­als with cog­ni­tive deficits) since the present inves­ti­ga­tion included rel­a­tively high func­tion­ing indi­vid­u­als.

Hancock 2013

“Pro­cess­ing speed and work­ing mem­ory train­ing in Mul­ti­ple Scle­ro­sis: a blinded ran­dom­ized con­trolled trial”, Han­cock 2013

Between 40-65% of patients with mul­ti­ple scle­ro­sis (MS) expe­ri­ence cog­ni­tive deficits asso­ci­ated with the dis­ease. The two most com­mon areas affected are infor­ma­tion pro­cess­ing speed and work­ing mem­o­ry. Infor­ma­tion pro­cess­ing speed has been posited as a core cog­ni­tive deficit in MS, and work­ing mem­ory has been shown to impact per­for­mance on a wide vari­ety of domains for MS patients. Cur­rent­ly, clin­i­cians have few reli­able options for address­ing cog­ni­tive deficits in MS. The cur­rent study aimed to inves­ti­gate the effect of com­put­er­ized, home­-based cog­ni­tive train­ing focused specifi­cally on improv­ing infor­ma­tion pro­cess­ing speed and work­ing mem­ory for MS patients. Par­tic­i­pants were recruited and ran­dom­ized into either the Active Train­ing or Sham Train­ing group, tested with a neu­rocog­ni­tive bat­tery at base­line, com­pleted six weeks of train­ing, and then were again tested with a neu­rocog­ni­tive bat­tery at fol­low-up. After cor­rect­ing for mul­ti­ple com­par­isons, results indi­cated that the Active Train­ing group scored higher on the Paced Audi­tory Ser­ial Addi­tion Test (a test of infor­ma­tion pro­cess­ing speed and atten­tion) fol­low­ing cog­ni­tive train­ing, and data trended toward sig­nifi­cance on the Con­trolled Oral Word Asso­ci­a­tions Task (a test of exec­u­tive func­tion­ing), Let­ter Num­ber Sequenc­ing (a test of work­ing mem­o­ry), Brief Visu­ospa­tial Mem­ory Test (a test of visual mem­o­ry), and the Con­ners’ Con­tin­u­ous Per­for­mance Test (a test of atten­tion). Results pro­vide pre­lim­i­nary evi­dence that cog­ni­tive train­ing with MS patients may pro­duce mod­er­ate improve­ment in select areas of cog­ni­tive func­tion­ing. Fol­low-up stud­ies with larger sam­ples should be con­ducted to deter­mine whether these results can be repli­cat­ed, and also to deter­mine the func­tional out­come of improve­ments on neu­rocog­ni­tive tests.

Waris et al 2015

, Waris et al 2015

Dur­ing the past decade, work­ing mem­ory train­ing has attracted much inter­est. How­ev­er, the train­ing out­comes have var­ied between stud­ies and method­olog­i­cal prob­lems have ham­pered the inter­pre­ta­tion of results. The cur­rent study exam­ined trans­fer after work­ing mem­ory updat­ing train­ing by employ­ing an exten­sive bat­tery of pre-post cog­ni­tive mea­sures with a focus on near trans­fer. Thir­ty-one healthy Finnish young adults were ran­dom­ized into either a work­ing mem­ory train­ing group or an active con­trol group. The work­ing mem­ory train­ing group prac­ticed with three work­ing mem­ory tasks, while the con­trol group trained with three com­mer­cial com­puter games with a low work­ing mem­ory load. The par­tic­i­pants trained thrice a week for five weeks, with one train­ing ses­sion last­ing about 45 min­utes. Com­pared to the con­trol group, the work­ing mem­ory train­ing group showed strongest trans­fer to an n-back task, fol­lowed by work­ing mem­ory updat­ing, which in turn was fol­lowed by active work­ing mem­ory capac­i­ty. Our results sup­port the view that work­ing mem­ory train­ing pro­duces near trans­fer effects, and that the degree of trans­fer depends on the cog­ni­tive over­lap between the train­ing and trans­fer mea­sures.

Baniqued et al 2015

, Ban­iqued et al 2015:

Although some stud­ies have shown that cog­ni­tive train­ing can pro­duce improve­ments to untrained cog­ni­tive domains (far trans­fer), many oth­ers fail to show these effects, espe­cially when it comes to improv­ing fluid intel­li­gence. The cur­rent study was designed to over­come sev­eral lim­i­ta­tions of pre­vi­ous train­ing stud­ies by incor­po­rat­ing train­ing expectancy assess­ments, an active con­trol group, and “Mind Fron­tiers,” a video game-based mobile pro­gram com­prised of six adap­tive, cog­ni­tively demand­ing train­ing tasks that have been found to lead to increased scores in fluid intel­li­gence (Gf) tests. We hypoth­e­size that such inte­grated train­ing may lead to broad improve­ments in cog­ni­tive abil­i­ties by tar­get­ing aspects of work­ing mem­o­ry, exec­u­tive func­tion, rea­son­ing, and prob­lem solv­ing. Ninety par­tic­i­pants com­pleted 20 hour-and-a-half long train­ing ses­sions over four to five weeks, 45 of whom played Mind Fron­tiers and 45 of whom com­pleted visual search and change detec­tion tasks (ac­tive con­trol). After train­ing, the Mind Fron­tiers group improved in work­ing mem­ory n-back tests, a com­pos­ite mea­sure of per­cep­tual speed, and a com­pos­ite mea­sure of reac­tion time in rea­son­ing tests. No train­ing-re­lated improve­ments were found in rea­son­ing accu­racy or other work­ing mem­ory tests, nor in com­pos­ite mea­sures of episodic mem­o­ry, selec­tive atten­tion, divided atten­tion, and mul­ti­-task­ing. Per­ceived self­-im­prove­ment in the tested abil­i­ties did not differ between groups. A gen­eral expectancy differ­ence in prob­lem-solv­ing was observed between groups, but this per­ceived ben­e­fit did not cor­re­late with train­ing-re­lated improve­ment. In sum­ma­ry, although these find­ings pro­vide mod­est evi­dence regard­ing the effi­cacy of an inte­grated cog­ni­tive train­ing pro­gram, more research is needed to deter­mine the util­ity of Mind Fron­tiers as a cog­ni­tive train­ing tool.

Kuper & Karbach 2015

“Increased train­ing com­plex­ity reduces the effec­tive­ness of brief work­ing mem­ory train­ing: evi­dence from short­-term sin­gle and dual n-back train­ing inter­ven­tions”, Kuper & Kar­bach 2015:

N-back training has recently come under intense scientific scrutiny due to reports of training-related improvements in general fluid intelligence. As of yet, relatively little is known about the effects of short-term n-back training interventions, however. In a pretest-training-posttest design, we compared brief dual and single n-back training regimens in terms of training gains and transfer effects relative to a passive control group. Transfer effects indicated that, in the short-term, single n-back training may be the more effective training task: At the short training duration we employed, neither training group showed far transfer to specific task switch costs, Stroop inhibition, or matrix reasoning indexing fluid intelligence. Yet, both types of training resulted in a reduction of general task switch costs, indicating improved cognitive control during the sustained maintenance of competing task sets. Single but not dual n-back training additionally yielded near transfer to an untrained working memory updating task.

Lindeløv et al 2016

“Train­ing and trans­fer effects of N-back train­ing for brain-in­jured and healthy sub­jects”, Lin­deløv et al 2016:

Work­ing mem­ory impair­ments are preva­lent among patients with acquired brain injury (ABI). Com­put­erised train­ing tar­get­ing work­ing mem­ory has been researched exten­sively using sam­ples from healthy pop­u­la­tions but this field remains iso­lated from sim­i­lar research in ABI patients. We report the results of an actively con­trolled ran­domised con­trolled trial in which 17 patients and 18 healthy sub­jects com­pleted train­ing on an N-back task. The healthy group had supe­rior improve­ments on both train­ing tasks (SMD = 6.1 and 3.3) whereas the ABI group improved much less (SMD = 0.5 and 1.1). Nei­ther group demon­strated trans­fer to untrained tasks. We con­clude that com­put­erised train­ing facil­i­tates improve­ment of spe­cific skills rather than high­-level cog­ni­tion in healthy and ABI sub­jects alike. The acqui­si­tion of these spe­cific skills seems to be impaired by brain injury. The most effec­tive use of com­put­er-based cog­ni­tive train­ing may be to make the task resem­ble the tar­geted behav­iour(s) closely in order to exploit the stim­u­lus-speci­ficity of learn­ing.

Schwarb et al 2015

“Work­ing mem­ory train­ing improves visual short­-term mem­ory capac­ity”, Schwarb et al 2015

Lawlor-Savage & Goghari 2016

Lawlor-Savage & Goghari 2016:

Enhanc­ing cog­ni­tive abil­ity is an attrac­tive con­cept, par­tic­u­larly for mid­dle-aged adults inter­ested in main­tain­ing cog­ni­tive func­tion­ing and pre­vent­ing age-re­lated declines. Com­put­er­ized work­ing mem­ory train­ing has been inves­ti­gated as a safe method of cog­ni­tive enhance­ment in younger and older adults, although few stud­ies have con­sid­ered the poten­tial impact of work­ing mem­ory train­ing on mid­dle-aged adults. This study inves­ti­gated dual n-back work­ing mem­ory train­ing in healthy adults aged 30-60. Fifty-seven adults com­pleted mea­sures of work­ing mem­o­ry, pro­cess­ing speed, and fluid intel­li­gence before and after a 5-week web-based dual n-back or active con­trol (pro­cess­ing speed) train­ing pro­gram. Results: Repeated mea­sures mul­ti­vari­ate analy­sis of vari­ance failed to iden­tify improve­ments across the three cog­ni­tive com­pos­ites, work­ing mem­o­ry, pro­cess­ing speed, and fluid intel­li­gence, after train­ing. Fol­low-up Bayesian analy­ses sup­ported null find­ings for train­ing effects for each indi­vid­ual com­pos­ite. Find­ings sug­gest that dual n-back work­ing mem­ory train­ing may not ben­e­fit work­ing mem­ory or fluid intel­li­gence in healthy adults. Fur­ther inves­ti­ga­tion is nec­es­sary to clar­ify if other forms of work­ing mem­ory train­ing may be ben­e­fi­cial, and what fac­tors impact train­ing-re­lated ben­e­fits, should they occur, in this pop­u­la­tion.

Studer-Luethi et al 2015

“Work­ing mem­ory train­ing in chil­dren: Effec­tive­ness depends on tem­pera­ment”, Stud­er-Luethi et al 2015:

Stud­ies reveal­ing trans­fer effects of work­ing mem­ory (WM) train­ing on non-trained cog­ni­tive per­for­mance of chil­dren hold promis­ing impli­ca­tions for scholas­tic learn­ing. How­ev­er, the results of exist­ing train­ing stud­ies are not con­sis­tent and pro­voke debates about the poten­tial and lim­i­ta­tions of cog­ni­tive enhance­ment. To exam­ine the influ­ence of indi­vid­ual differ­ences on train­ing out­comes is a promis­ing approach for find­ing causes for such incon­sis­ten­cies. In this study, we imple­mented WM train­ing in an ele­men­tary school set­ting. The aim was to inves­ti­gate near and far trans­fer effects on cog­ni­tive abil­i­ties and aca­d­e­mic achieve­ment and to exam­ine the mod­er­at­ing effects of a dis­po­si­tional and a reg­u­la­tive tem­pera­ment fac­tor, neu­roti­cism and effort­ful con­trol. Nine­ty-nine sec­ond-graders were ran­domly assigned to 20 ses­sions of com­put­er-based adap­tive WM train­ing, com­put­er-based read­ing train­ing, or a no-con­tact con­trol group. For the WM train­ing group, our analy­ses reveal near trans­fer on a visual WM task, far trans­fer on a vocab­u­lary task as a proxy for crys­tal­lized intel­li­gence, and increased aca­d­e­mic achieve­ment in read­ing and math by trend. Con­sid­er­ing indi­vid­ual differ­ences in tem­pera­ment, we found that effort­ful con­trol pre­dicts larger train­ing mean and gain scores and that there is a mod­er­a­tion effect of both tem­pera­ment fac­tors on post-train­ing improve­ment: WM train­ing con­di­tion pre­dicted higher post-train­ing gains com­pared to both con­trol con­di­tions only in chil­dren with high effort­ful con­trol or low neu­roti­cism. 
Our results sug­gest that a short but inten­sive WM train­ing pro­gram can enhance cog­ni­tive abil­i­ties in chil­dren, but that suffi­cient self­-reg­u­la­tive abil­i­ties and emo­tional sta­bil­ity are nec­es­sary for WM train­ing to be effec­tive….We found no sig­nifi­cant train­ing group inter­ac­tion on the per­for­mance in the Raven’s Pro­gres­sive Matri­ces (F(2,92) = 1.57, p = .22, ηp2 = .004)…We found no sig­nifi­cant long-term effects in the vari­ables mem­ory span, cog­ni­tive con­trol, Gf, Gc, and scholas­tic tests (all T < 1.4).

Minear et al 2016

“A simul­ta­ne­ous exam­i­na­tion of two forms of work­ing mem­ory train­ing: Evi­dence for near trans­fer only”, Min­ear et al 2016

The effi­cacy of work­ing-mem­ory train­ing is a topic of con­sid­er­able debate, with some stud­ies show­ing trans­fer to mea­sures such as fluid intel­li­gence while oth­ers have not. We report the results of a study designed to exam­ine two forms of work­ing-mem­ory train­ing, one using a spa­tial n-back and the other a ver­bal com­plex span. Thir­ty-one under­grad­u­ates com­pleted 4 weeks of n-back train­ing and 32 com­pleted 4 weeks of ver­bal com­plex span train­ing. We also included two active con­trol groups. One group trained on a non-adap­tive ver­sion of n-back and the other trained on a real-time strat­egy video game. All par­tic­i­pants com­pleted pre- and post-train­ing mea­sures of a large bat­tery of trans­fer tasks used to cre­ate com­pos­ite mea­sures of short­-term and work­ing mem­ory in both ver­bal and visu­ospa­tial domains as well as ver­bal rea­son­ing and fluid intel­li­gence. We only found clear evi­dence for near trans­fer from the spa­tial n-back train­ing to new forms of n-back, and this was the case for both adap­tive and non-adap­tive n-back.

Studer-Luethi et al further describe their fluid intelligence measure: "Fluid intelligence was assessed using either the even or odd items of Raven's Progressive Matrices in counterbalanced order (RPM, 30 items; Raven, 1998). After two practice trials, children were allowed to work for 10 min and cross the right solution for each task. The number of correct solutions provided in this time limit was used as the dependent variable."


I construct a meta-analysis of the >19 studies which measure IQ after an n-back intervention, confirming that there is a gain of small-to-medium effect size; I also investigate several n-back claims, criticisms, and indicators of bias.

Due to its length and technical detail, my meta-analysis has been moved to a separate page.
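As background on what "a gain of small-to-medium effect size" means mechanically: a random-effects meta-analysis pools per-study standardized mean differences by inverse-variance weighting (here via the DerSimonian-Laird estimator). The effect sizes and variances below are invented placeholders purely for illustration, not the actual study data:

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate via the DerSimonian-Laird method."""
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q and the between-study variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # re-weight, now including the between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se

# Illustrative (invented) study effect sizes d and their variances:
d = [0.2, 0.5, 0.1, 0.4, 0.3]
v = [0.04, 0.09, 0.02, 0.05, 0.03]
est, se = dersimonian_laird(d, v)
print(round(est, 2), round(se, 2))
```

When the studies are this homogeneous the estimator reduces to the fixed-effect answer (tau² clips to 0); with real, heterogeneous n-back studies the between-study variance widens the confidence interval.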

Does it really work?

N-back improves working memory

There are quite a few stud­ies show­ing sig­nifi­cant increases in work­ing mem­o­ry: WM is some­thing that can be trained. See for exam­ple “Changes in cor­ti­cal activ­ity after train­ing of work­ing mem­ory - a sin­gle-sub­ject analy­sis.” or “Increased pre­frontal and pari­etal activ­ity after train­ing of work­ing mem­ory”.

There are a few stud­ies show­ing that DNB train­ing enhances Gf; see the sup­port sec­tion. There is also a study show­ing that WM train­ing (not DNB) enhances Gc68.

IQ Tests


Because N-back is supposed to improve your pure 'fluid intelligence' (Gf), and not, say, your English vocabulary, the most accurate tests for seeing whether N-back has done anything are going to be ones that avoid vocabulary or literature or tests of subject-area knowledge - that is, 'culture-neutral' IQ tests. (A non-neutral test focuses more on your 'crystallized intelligence', while N-back is supposed to affect 'fluid intelligence'; they do affect each other a little, but it's better to test fluid intelligence with a fluid intelligence test.)

As one ML mem­ber writes:

The WAIS test involves crystallized intelligence and is unsuitable for judging fluid intelligence. High working memory will not spawn the ability to solve complex mathematical and verbal problems on its own, you have to put your extended capacity to learning. All very-high-level IQ tests are largely crystallized IQ tests, therefore working memory gains will not be immediately apparent by their measure.

Available tests

The gold-standard of culture-neutral IQ tests is Raven's Progressive Matrices. Unfortunately, Raven's is not available for free online, but there are a number of clones one can use - bearing in mind their likely inaccuracy and the fact that many of them do not randomize their questions. It's a very good idea, if you plan to n-back for a long time, to take an IQ test before and an IQ test after, both to find out whether you improved and so you can tell the rest of us. But the interval has to be a long one: if you are testing at the beginning and end of your training, there is probably going to be a practice effect which will distort your second score upwards69; it's strongly recommended you take a particular test only once, or with intervals on the order of months (and preferably years).
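To see why the retest interval matters, here is a minimal simulation of how a pure practice effect can masquerade as an IQ gain in an uncontrolled before/after design; every number below (true score, noise, practice bonus) is an assumption chosen for illustration:

```python
import random

random.seed(0)  # deterministic for reproducibility

TRUE_IQ = 110        # assumed stable "true" ability (no real training gain)
NOISE_SD = 5         # assumed test-retest measurement noise, in IQ points
PRACTICE_GAIN = 6    # assumed boost from having seen similar items before

def take_test(prior_sittings):
    """One observed score: true ability + noise + a practice effect if retaking."""
    practice = PRACTICE_GAIN if prior_sittings > 0 else 0
    return TRUE_IQ + random.gauss(0, NOISE_SD) + practice

n = 10_000
pre = sum(take_test(0) for _ in range(n)) / n    # first-ever sittings
post = sum(take_test(1) for _ in range(n)) / n   # retests after "training"
print(round(post - pre, 1))  # an apparent ~6-point "gain" with zero real improvement
```

Averaged over many simulated test-takers, the "gain" is exactly the practice bonus; for any single person the noise term means a gain (or loss) of several more points is unremarkable, which is why the one-off score jumps reported below are hard to interpret.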

The tests are:

Raven-style matrix tests can be mechan­i­cally gen­er­ated by the San­dia Gen­er­ated Matrix Tool; the gen­er­ated matrix test scores sta­tis­ti­cally look very sim­i­lar to SPM test scores accord­ing to the paper, “Recre­at­ing Raven’s: Soft­ware for sys­tem­at­i­cally gen­er­at­ing large num­bers of Raven-like matrix prob­lems with normed prop­er­ties”.

If Raven-style tests bore you or you’ve gone through the pre­vi­ous ones, there are a wealth of diffi­cult tests at Miyaguchi’s “Uncom­monly Diffi­cult IQ Tests”, and a per­son­al­ity test, the IPIP-NEO, is free (although the con­nec­tion to IQ is min­i­mal).

Other tests that might be useful include memory span tests: they provide a non-dual-N-back method of measuring WM before one begins training and then after. There is also the Cogtest suite of spans and attention tasks, or the http://cognitivefun.net/ site (which implements many tasks). The Automated Operation Span (OSPAN) Task could be used as well.

IQ test results

Reports of IQ tests have been mixed. Some results have been stun­ning, oth­ers have shown noth­ing.


LSaul posted about his appar­ent rise in IQ back in Octo­ber. From what I remem­ber, he had recently failed to qual­ify for MENSA, which requires a score of about 131 (98th per­centile). He then got a 151 (99.97th per­centile) on a pro­fes­sion­ally admin­is­tered IQ test (WAIS) three months lat­er, after 2 months of reg­u­lar dual-n-back use. –MR

(A >20 point gain sounds very impres­sive. But pos­si­ble con­found­ing fac­tors here are that LSaul appar­ently took 2 differ­ent IQ tests; besides the gen­eral incom­pa­ra­bil­ity of differ­ent IQ tests, it sounds as if the first test was a cul­ture-neu­tral one, while the WAIS has com­po­nents such as ver­bal tests - the sec­ond might well be ‘eas­ier’ for LSaul than the first.)
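The percentiles quoted here follow directly from the normal distribution assumed by IQ norms (mean 100, SD 15): a score maps to a percentile through the standard normal CDF. A quick check of the two cutoffs above:

```python
from math import erf, sqrt

def iq_percentile(iq, mean=100.0, sd=15.0):
    """Percentile of an IQ score under normal norms (SD-15 scale)."""
    z = (iq - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF

print(round(iq_percentile(131) * 100, 1))   # ~98th percentile (Mensa cutoff)
print(round(iq_percentile(151) * 100, 2))   # ~99.97th percentile
```

Note how fast the tail thins: the same 20-point gap separates 1-in-50 rarity from roughly 1-in-3000, which is one reason a >20-point jump on professionally administered tests is surprising.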

Mike L. writes:

Empir­i­cally speak­ing, how­ev­er: I took a WAIS-IV IQ test (ad­min­is­tered pro­fes­sion­al­ly) around a year ago and got a 110. I took a deriv­a­tive of the same test recently (mind you, after about 20 days of DNB train­ing) and got a score of 121.

The blog­ger of “Inhu­man Exper­i­ment”, who played for ~22 days and went from ~2.6-back to ~4-back, reports:

The other test proved to be quite good (you can find it here). In this one, the ques­tions vary, the diffi­culty is adjusted on the go depend­ing on whether you answer them cor­rect­ly, and there’s a time limit of 45 sec­onds per ques­tion, which makes this test bet­ter suited for re-tak­ing. My first test, taken before play­ing the game, gave me a score of 126; my sec­ond test, taken yes­ter­day, gave me a score of 132 (an increase of about 5%)….As you can see, it’s kind of diffi­cult to draw any mean­ing­ful con­clu­sions from this. Yes, there was a slight increase in my score, but I would say a sim­i­lar increase could’ve been pos­si­ble even with­out play­ing the game. I think the vari­a­tion in the IQ test ques­tions reduces the “learn­ing by heart” effect, but that’s impos­si­ble to say with­out a con­trol group.

Pon­tus Granström writes that

I scored 133 on www.mensa.dk/iqtest.swf today. I have never scored that high before. I really feel the "DNB thinking" kicking in.

(He appar­ently took that test about a year ago, and avers that his orig­i­nal score on it ‘was 122. Well below 130.’)

Pheonexia writes:

Approx­i­mately three years ago I took the “Euro­pean IQ Test.” It was posted on some mes­sage board and the author of the thread said the test was cred­i­ble. At that time, I scored 126.

I’ve been n-back­ing since early Feb­ru­ary, so I fig­ured I’d try it again today. I googled “Euro­pean IQ Test” and clicked the first result, a test from Nanyang Tech­no­log­i­cal Uni­ver­sity in Sin­ga­pore.. I don’t recall any of the exact ques­tions for the first one I took three years ago, but the for­mat of this test seemed almost iden­ti­cal. Today I scored 144, 18 points higher than before. http://www3.ntu.edu.sg/home/czzhao/iq/test.htm

To me, this is anec­do­tal evi­dence that n-back­ing does increase intel­li­gence. I’ll try again for another three months and take a com­pletely differ­ent test. I will admit, how­ev­er, that I rec­og­nized one of the first ques­tions as the Fibonacci sequence, so I attribute that to crys­tal­lized, not fluid intel­li­gence. The high­est score this test allows for is 171, mean­ing you got ZERO ques­tions wrong. I got 6 wrong and 3 half ques­tions wrong where it requires two answers (that was my worst sec­tion), so either 7.5 or 9 out of 33 ques­tions wrong.


I took one of the IQ tests I did pre­vi­ously [pre­vi­ously linked as “High IQ Soci­ety Online Test”] and scored 109 on, I just took it again and scored 116…I don’t know about retest effect, but all the ques­tions were differ­ent.

Toto writes in "TNB(PIA) may improve intelligence":

While DNB proved ineffective for me (at least it didn't increase my IQ, though it improved memory) TNB may have made a difference. I took 2 high-range tests during the last 2 months and the results were higher than I expected - my IQ was somewhere between 130 and 135 on good online tests, I scored 132 on a supervised test (Raven's SM). My results on CFNSE (http://www.etienne.se/cfnse/) and GET (http://www.epiqsociety.net/get/) were approximately 10 points higher - 6 on CFNSE (8 on my second attempt) and 21 on G.E.T. It could be because of a flaw of these tests, or they may not test the same ability as timed tests (though the correlation between them and famous supervised timed tests is said to be very high), it may be for some other reason as well, but it could be because of TNB. I had tried CFNSE long ago and scored 0 (but I probably didn't try hard enough then).

christo­pher lines reports:

I did a cou­ple of the online IQ tests after about 10 days (scored 126 in one of them [iqtest.dk] and 106 [iqout.­com] in anoth­er); I repeated the same tests about a month later (about 1 month ago) and scored (133 and 109). I have no idea why the tests gave such big differ­ences in scores but I defi­nately [sic] think its eas­ier the sec­ond time you do the tests because I remem­bered the strate­gies for solv­ing the prob­lems which took some time to fig­ure out when I first did the tests. I am kind of against keep re-do­ing the tests because of learn­ing effects and a bit truobled [sic] that differ­ent test pro­duce such differ­ent results.

Tofu writes:

I’ve pur­posely not been doing any­thing to prac­tice for the tests or any­thing else I thought could increase my score so I would­n’t have to fac­tor other things into an improve­ment in iq, which makes improve­ments more likely attrib­ut­able to dual n-back. Before I took the test I scored at 117, a score about 1 in about 8 peo­ple can get (7.78 to be exac­t), and yes­ter­day I scored at 127 (a score that 1 in 28 peo­ple would get). Its a pretty big differ­ence I would say.

After a year of N-backing, Tofu has 3 sets of IQ test results using Self-Scoring IQ Tests. To summarize:

  1. 0 months (D3B; ~71%): 27,25 = 117 IQ
  2. 3 months (T4B; ~76%): 37,40 = 128
  3. 12 months (Q6B; ~47%): 34,42 = 128

Other rel­e­vant tests for Tofu:

As a side­note- after 6 months I took a prac­tice with­out any study­ing and got a 146, roughly 30th per­centile, and I took an IQ test from http://iqtest.dk/main.swf after 1 year which I scored a 115 on. Also, in high school I took a pro­fes­sion­ally admin­is­tered IQ test and got a 137 which may have been high because they took my age into account in the scor­ing like the old school IQ tests used to do, but I’m not sure if they actu­ally did that.


Last year I scored 123 in www.iqtest.dk and today I made 140. If you elim­i­nate sta­tis­tic devi­a­tions, even if it’s just 5-10 points it’s very good IMO.


I do actu­ally have gains to report on the “Advanced Cul­ture Fair Test” found on iqcom­par­ison­site.­com that I just took today. Facts: I scored 29 raw (out of 36) IQ 146 or 99.9%ile, com­pared to my 130 or 98%ile raw 21 that I scored when I took the test over a year ago.

…For com­par­i­son to other fluid mea­sures, this result is 3 points higher than my Get-gamma score and 2 points higher than my GIGI cer­ti­fied and 13 points higher than my iqtestdk result which lands in the same place every time I take it (last time I took it was less than a month ago). My cur­rent DNB level aver­ages 8+ over mul­ti­ple (10-20) ses­sions.

mile­stones later reported:

Last night I retook the iqtest.org.uk and scored higher on a sec­ond try than I did a few months ago—145 up from 133. This could be due to 1. con­sis­tent quad back prac­tice 2. being back on cre­a­tine as I have been for the last month 3. Omega 3/epa/fish oil 4. just a nor­mal swing in scores due to other fac­tors, includ­ing famil­iar­ity with the items. Or, of course, maybe some com­bi­na­tion of the above

Lach­lan Jones:

Hey guys I’ve been using brain work­shop (Dual N back) for about 2 months now and would like to report an increase in IQ from 124 to 132 (on pro­fes­sion­ally admin­is­tered IQ tests that were super­vised) The IQ tests were sep­a­rated by a period of about a year as well.

(It’s a lit­tle unclear whether this was an improve­ment or not; the sec­ond score was on the test, but Lach­lan has­n’t said what the first one was. Since differ­ent tests have differ­ent norms and what not, Lach­lan’s scores could actu­ally be the same or declin­ing.)

Jan Pouchlý:

2008/06 DNB for cca 1 mon­th, 1 - 2 hours a day, 5 times a week; after 2 weeks prob­a­bly gain +8 points IQ (I think it was Wech­sler IQ 140 admin­is­tered by school psy­chol­o­gist and after 5 months Raven IQ 148 admin­is­tered by some Mensa guy). No prob­lems at all. Bet­ter dream recall.

Argumzio com­ments on Jan:

The differ­ence between (what I assume both are) WAIS-III and RAPM is fairly sig­nifi­cant; the for­mer is about 2.667 sigma (FSIQ), and the lat­ter is just over 3 sig­ma. For those who wish to know, both are set with a stan­dard devi­a­tion of 15.

Keep in mind, how­ev­er, that with WAIS-III you get the full treat­ment while with RAPM your fluid abil­ity is assessed as in the orig­i­nal Jaeggi study, so Jan’s per­for­mance on other fac­tors may have depressed and con­cealed his (al­ready) high Gf, or per­for­mance, capa­bil­i­ties. That’s why it is para­mount to use the same test, or a test that is essen­tially of the same design.

Mug­gin­Buns (gains may’ve been from prac­tice, or from a ‘fea­ture selec­tion’-based game Mug­gin­Buns is devel­op­ing):

  • http://www.iqtest.dk/main.swf - 126, 3 months ago
  • https://www.gigiassessment.com/shop/index.php - 126, 3 months ago
  • http://www.iqtest.dk/main.swf - 140, 2 weeks ago
  • http://www.cerebrals.org/wp/?page_id=44 - 137, yesterday

Min Mae:

Pre­vi­ously I had a RAPM IQ test result was 112 by cer­ti­fied psy­chol­o­gist. In 2009 August I prac­ticed DNB 2.5 hours a day for 20 days with time off two days, Sat­ur­day and Sun­day. After 20 days I took RAPM test in my uni­ver­sity by cer­ti­fied psy­chol­o­gist. I got IQ gain was 12.1 points. In the test i was only able to answer more ques­tions that related to changes in posi­tion of objects in the test (RAPM).

At 2010/04/12 I started SNB (single N-Back, visual modality) with training time was same as on DNB training time for 20 days. IQ gain by RAPM test was no change. That time i also was able to answer more questions that related to changes in position of objects in the RAPM test.

Colin Dick­er­man in the thread “IQ test in one mon­th!”:

I took a free android IQ test (I’m com­put­er­less) and scored 123 about 4 months ago. I’ve started n-back­ing again and after 4 weeks of con­sis­tent effort, I aver­age around 70% at dual 5 back­…Okay, I jumped the gun a lit­tle bit and retook the test a day ear­ly. I scored 126…[My] N-back level stag­nated in the mid 60% at 6-back.


Ok, so a few months ago the most i could get on the http://mensa.dk/iqtest/ was about 95,115-126. Well now its 136….with the stan­dard devi­a­tion 24 of course…I got to admit that score was on one of my bad days, and I was­n’t really focused, plus I did­n’t spend much time on the ques­tions. Prob­a­bly 2 months ago the high­est was 126.


Before n-back­ing, my IQ lay in the region between 109 and 120 (most online tests always put me in the 113-120 range, but the MENSA test only gave me a result of 109). I’ve prob­a­bly com­pleted 10 IQ tests over the last 3 years and my scores seem to be rel­a­tively con­sis­ten­t….­So, I’ve spent about one and a half months on dual-n-back. I did the http://www.iqtest.dk/main.swf test and got an IQ of 123.


iqtest.dk (Eng­lish) I first attempted this test more than 2 years ago where I obtained a score of between 110-115. I attempted this test again today where I achieved a score of 138.

N-level: Well, as most of you may know, over­time I’ve very much just been rolling around in the mud of what BW has to offer, so because I haven’t stayed in one coun­try long enough to call it home, it’s pretty much impos­si­ble for me to attribute my new ‘world view’ to one par­tic­u­lar mode or another…DNB: 4-back - 8-back = Time taken to reach lev­el, 10+ months Quad-n-back: 2-back - 6-back = Time taken to reach lev­el, between 6-8 months Vari­able-arith­metic n-back: 3-back - 7-back = Time taken to reach lev­el, between 3-4 months… [de­scrip­tion of daily rou­tine]


I’ve done about 40 half-hour ses­sions of dual n back and have made gains within the task-ie higher n-back score. Per­son­ally I don’t feel much smarter but I’ve noticed I read faster and can com­pre­hend what I am read­ing at a faster speed as well. Pre­vi­ously I scored a 109, then after 40 ses­sions I scored at 122 on the Den­mark Mensa IQ test. http://www.iqtest.dk/main.swf. My con­cern is that this sup­posed gain has not made a not­i­ca­ble improve­ment in my real-world intel­li­gence, and that the Den­mark IQ test is unre­li­able…I recently took a Mensa puz­zle brain-teaser and scored a 18/30, which seems fairly medioc­re? I don’t know..I was pretty stumped by some of the ques­tions. Did­n’t make me feel too smart….Up­date: I took the same IQ test(­Den­mark Men­sa) again and I scored a 126, 4 points higher than my pre­vi­ous score of 122. Between tak­ing the tests I had prac­ticed dual n back for about 14 half hour ses­sions.


I liked IQ tests, espe­cially the iqtest.dk. I did it for the last time between 1-2 years ago. My score was 110, I’m pretty sure. I scored never higher than 110, but also not much lower than 110. Guess i had just aver­age intel­li­gence and i was feel­ing that way too. On the Wech­sler test i did 3-4 years ago i scored 107. so if i for­mu­late it cor­rectly my fluid intel­li­gence was in line with my gen­eral intel­li­gence. Now i do it for 11 con­se­quent days, just 25 min­utes a day and mostly at this point 2-3-4 back. But after the third day i felt much more clar­ity and bet­ter abil­ity to for­mu­late things, because my mem­ory seemed so much bet­ter. Take note that I’m really an extremely sen­si­tive per­son, so that is prob­a­bly the rea­son i felt it so quick­ly. Today i decided to do the iqtest.dk test again, because i was excited to do it and not wait till the 19th day. My expec­ta­tion was a iq in the 110-115 range, but guess what i scored 126…A few min­utes ago i ended the iqtest.dk again and scored 122. This means for me I’m approx­i­mately as smart as 12 days before, fur­ther­more i think i should­n’t be any differ­ences in raw intel­li­gence.

He later reported addi­tional results:

i did the full wais-iii 12-13 months ago. i scored 111 on the POI, which i think is the best mea­sure for gf (although not a pure mea­sure, but more com­pre­hen­sive than just matri­ces) This where the scores within the POI:

  • Pic­ture com­ple­tion 11
  • Block Design 10
  • Matrix Rea­son­ing 15

I trained 2.5 months from Feb­ru­ary to April 2012. Note that i am 21 years old (in­tel­li­gence is in some degree mal­leable till 22/23 years old, right?) Well, i did the wais-iii again and have the results since a week. My POI is now 125 and this is how it looks:

  • Pic­ture com­ple­tion 8 (-3)
  • Block Design 18 (+8)
  • Matrix rea­son­ing 16 (+1)

karthik bm:

I started dual n backing about 5 months ago. After training for first 2 months, I took an IQ test at Iqtest.dk. My score was in the low 120s (took the test multiple times and got almost the same score each and every time)

For the next 3 months apart from n back­ing, I included med­i­ta­tion, image stream­ing and jug­gling into my sched­ule. Yes­ter­day I took the same test at iqtest.dk and got the score as 133.


I first started play­ing DNB some 3 years ago, try­ing to play three rounds every­day (skipped at most 20% of the total days), regard­less of the n-back lev­el.

  • 2 years ago (high­est con­sis­tent DNB: 4), I took my first MENSA test - and scored 130 (SD 24), top 10%.
  • 2 weeks ago (high­est con­sis­tent DNB: 7), I took my sec­ond MENSA test - and scored 156 (SD 24), top 1%, so I am join­ing.
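Note that Mensa Denmark-style scores quoted "with SD 24" are on the Cattell scale and are not directly comparable to the usual SD-15 scores; a score must be rescaled to the same percentile first. A quick sketch of the conversion, checking the "top 10%" and "top 1%" figures above:

```python
from math import erf, sqrt

def rescale(iq, sd_from=24.0, sd_to=15.0, mean=100.0):
    """Convert an IQ score between SD conventions, preserving the percentile."""
    return mean + (iq - mean) * sd_to / sd_from

def top_fraction(iq, sd=24.0, mean=100.0):
    """Fraction of the population scoring above `iq` on the given scale."""
    z = (iq - mean) / sd
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))  # upper tail of normal CDF

print(round(rescale(130), 2), round(top_fraction(130) * 100, 1))  # ~118.75 SD-15, ~top 10%
print(round(rescale(156), 2), round(top_fraction(156) * 100, 1))  # ~135 SD-15, ~top 1%
```

So a "130 (SD 24)" is roughly a 119 on the familiar SD-15 scale, and the jump to "156 (SD 24)" corresponds to about 135 SD-15 - consistent with the top-10%/top-1% figures the poster cites, but a smaller shift than the raw 26-point difference suggests.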


I’ve been doing n-back since June of 2011 and I’m aver­ag­ing now between 8.0 and 8.25. I lay off for as long as six months at a time but get back to where I ended just two days after resum­ing. Before I started I toke iqtest.dk and scored 105. After six months and reg­u­larly scor­ing above 6 I retook the same test and scored 115. After one year and scor­ing above 7 I scored 127 IQ points. I retook the test just now and I scored 115.

No improvement

Some have not:

I took the Online Den­mark IQ test again [after N-back train­ing] and I got 140 (the same result) I took a stan­dard­ized (and charged) online IQ test from www.iqtest.­com and I got 134 (though it may be a bit higher because Eng­lish is not my mother tongue) –Crypto

jttoto reports a null result:

6 months ago I posted my IQ on this site after tak­ing the Mensa Nor­way test… [see IQ tests sec­tion] I scored a 135. After 6 months of dual n-back, triple n-back, and quad n-back train­ing, I took the same exact test. I scored exactly the same, 135. Grant­ed, I took 7 less min­utes to com­plete the test, but this was due to famil­iar­ity of some of the ques­tions. That being said, I have been see­ing sig­nifi­cant increases in my digit span and other WM gains, so while my apti­tude on ques­tions like the Raven’s may not have increased, my mem­ory has.

(It’s worth not­ing that Jtto­to’s expe­ri­ence does­n’t rule out an IQ increase of some sort, as the orig­i­nal 135 score was from an IQ test he took after at least 10 hours of n-back­ing over 5 days, accord­ing to an ear­lier email; what it shows is that Jttoto did­n’t ben­e­fit or the ben­e­fits hap­pened early on, or there’s some other con­found­ing fac­tor. Test results can be very diffi­cult to inter­pret.)

moe writes:

“After 6 months of train­ing I decided to take the tri 52 again and there has been no improve­ment in intel­li­gence (or should I say abstract rea­son­ing abil­i­ty), I’m still at 144 sd15 on that test. My digit span has gone up a bit from 9 for­ward 8 reverse to between 10-12 for­ward and reverse depend­ing on how I’m feel­ing. I’m still not sure if the improve­ments in digit span are gen­uine mem­ory improve­ments or increased skill at chunk­ing.”

Jttoto fur­ther wrote in response to moe:

“Yes, I’ve continued to train QnB myself (about 3-4 times a week). Based on the iqout.com test, if anything, I’ve gone down a little! This is not surprising and probably not attributable to n-backing. I’m at the age where cognitive decline begins and I was depressed that day. At the same time, one would think I would see measurable gains by now.”

"I’ve had pretty unimpressive findings. I’ve used Brain Workshop 4.4 for about four months, with about a half-hour’s use 4-6 days a week. I used the Denmark IQ test and scored a 112 and after DNB I scored a 110.

My max DNB level was 11. Hours and hours and no gain in IQ."


“…here they are (in order of test tak­ing): 119, 125, 125, 107, 153, 131, (and I would say between 125 and 131 was my real iq) from differ­ent online tests almost 2 years ago before start­ing n-back­ing. after two years (I took the same bunch of online iq tests 3 weeks ago before try­ing faster tri­als) I got: 126, 135, 124, 125. so there was­n’t much of a change. but I had been play­ing n-back softly for a long time. I expected my iq to jump at least by 5 to 10 points, from what I felt in my life. then after my week of faster tri­als, I did this iq test: and I got 149. if any­one who already knows his iq wants to try it, I’d be curi­ous to know if they also score higher than expect­ed, at first try of course. I thought the 153 I once got was pure chance, but maybe it was­n’t com­plete­ly, and that would be cool.”


“Been training for about 5 weeks now, 30 mins a day and made very quick progress initially, and now shuttling between n=7 and n=8 and occasionally reaching n=9 (when I set out, I begin with n=2 and the value of N for the next round depends on my performance in the round I just finished)…I took a few intelligence tests (mostly culture insensitive), and the scores have actually ‘DROPPED’ some 3-4 percent. Although I guess that doesn’t mean much because I took those tests towards the end of the day at work and was somewhat exhausted, but it sure as hell means that there is no increase in my intelligence either!!”


“I have used dual, sin­gle and com­bi­na­tion n back reg­u­larly for almost 2 years and no pos­i­tive results come from it. I have the exact same IQ as I have accord­ing to Den­mark IQ test. Not even a cou­ple of points high­er….Just to clar­i­fy, I have used n back, seen no improve­ment based on IQ tests or real-life ben­e­fits.”

Keep in mind that even if IQ is improved, that doesn’t necessarily mean anything unless one employs it to some end. It would be a shame to boost one’s IQ through N-back, but never use it because one was too busy playing!

Other effects

Between 2008 and 2011, I col­lected a num­ber of anec­do­tal reports about the effects of n-back­ing; there are many other anec­dotes out there, but the fol­low­ing are a good rep­re­sen­ta­tion - for what they’re worth71.

Besides these col­lected reports, there is an ongo­ing group sur­vey (spread­sheet results); n-back­ers are strongly encour­aged to sub­mit their dat­a­points & opin­ions there as well.


  • Ashir­go: “To be hon­est, I do not feel any obvi­ous differ­ence. There are moments in which I per­ceive a sig­nifi­cant improve­ment, though, as well as par­tic­u­lars task which are much eas­ier now.”

    “I have also expe­ri­enced bet­ter dream recall­ing, with all these rever­ies and other hal­lu­ci­na­tions includ­ed. I am more hap­pier now than ever. I did doubt it would be ever pos­si­ble! I am also more prone to get excit­ed…Now peo­ple in my moth­er­land are just bor­ing to lis­ten to. They speak too slow and seem as though it took them pains to express any­thing. I did not notice that after I had done my first ninety days of n-back, but now (after 2.5 months) it is just con­spic­u­ous.”ref

    “My change of opin­ion72 can be eas­ily attrib­uted to the improve­ment of mood, in coin­ci­dence with the mere fact that the win­ter days have passed and now there is a bright and sunny Spring in my coun­try”; when asked if the pre­vi­ous means Ashirgo attrib­utes all the improve­ment to the weath­er, Ashirgo replied: "For­tu­nate­ly, I can attribute many changes to n-back, I can now han­dle var­i­ous tasks with lit­tle effort and it takes me much less time in com­par­i­son with oth­ers (espe­cially when I know what to do). Nev­er­the­less, the main prob­lem for me is that I am also occu­pied with few things that I sup­pose to be able to test my newly acquired poten­tial, there­fore I can­not say that ‘changes’ are explicit every­where.

    On the other hand, I am start­ing to believe that any improve­ments (that one can expect) so smoothly and swiftly become a nat­ural part of one’s capa­bil­i­ties that it makes them hardly notice­able until some tests/measures are tak­en."

  • chin­mi04: “For me, it defi­nitely has taught me how to focus. But I’m still not sure whether that has some­thing to do with merely com­ing to real­ize the impor­tance of focus­ing, or whether the pro­gram has really phys­i­cally rewired my brain to focus bet­ter. In any case, it appears that I’m now faster at men­tal rea­son­ing, cre­ative think­ing and speak­ing flu­en­cy. But again, the effects are not so clear as to com­pletely elim­i­nate any doubt regard­ing the con­nec­tion with the n-back pro­gram.”

    “I have been maintaining a personal blog on WordPress since 3 years ago. Average post per month: a little over 1. Then I started with dual-n-back at the end of November… number of posts in January: 7! (none are about n-back)”

  • ArseneLupin: “Not much, yet, but I feel that I can eas­ier get a hold of a dis­cus­sion. The feel­ing is the same as when I am mas­ter­ing a cer­tain n-back in the game (a bit hard to explain).”

  • John: “I feel much sharper since I started in the mid­dle of last Novem­ber…My pro­duc­tiv­ity is much higher these days. I’m a non-fic­tion writer, so hav­ing a higher work­ing mem­ory and fluid intel­li­gence directly leads to bet­ter (and faster) per­for­mance. It’s amaz­ing to see the stuff I pro­duce today and com­pare it to before I began the Dual N-Back train­ing. Also, I am simul­ta­ne­ously learn­ing Ger­man, French and Span­ish, and I’m cer­tain this is help­ing me learn those lan­guages faster.”

  • Ginkgo: “DN-Back has prob­a­bly helped me with one of my hob­bies.”

  • BamaDoc: “I note a sub­jec­tive differ­ence in recall. There might be some increase in atten­tion, but I cer­tainly do notice a differ­ence in recall. It might be place­bo, but I am con­vinced enough that I con­tinue to find time to use the pro­gram.”ref

  • kar­nau­trahl: “Since Novem­ber how­ev­er, I began to read the Neu­ro­science book in more detail. I men­tioned late Decem­ber I think that I was find­ing I could under­stand more stuff. I’ve spent about £1000 on books since Novem­ber. The large major­ity are books on the brain, source from Ama­zon reviews, read­ing lists and out of my own pirate list when I liked a book. I stopped Dual n Back in Decem­ber, ear­ly. The ben­e­fits have stayed how­ev­er. I tested this the other day, very eas­ily going to 3 n back, which was mostly where I was before. I guess in a way I’m try­ing to say that for me, whilst the focus may have been on G increase and IQ etc, now the focus is on–what’s really hap­pened and what can I do with it. What I can do with it is choose to con­cen­trate long enough to gen­uinely under­stand fairly tech­ni­cal in depth chap­ters on sub­jects often new to me.”ref Kar­nau­trahl writes more on his self­-im­prove­ments in his thread “Sec­ond lot of train­ing start­ed-and long term expe­ri­ence over­all.”, and describes an inci­dent in which though he stopped using DNB 3 months pre­vi­ous­ly, he still dealt with a tech­ni­cal issue much faster and more effec­tively than he feels he would’ve before.

  • nega­tron: “One per­haps coin­ci­den­tal thing I noticed is that dream rec­ol­lec­tion went up sub­stan­tial­ly. A good while after I stopped I devel­oped an odd curios­ity for what I pre­vi­ously con­sid­ered unpleas­ant mate­ri­al, such as advanced math­e­mat­ics. Never imag­ined I’d con­sider the thought of advanced cal­cu­lus excit­ing. I began read­ing up on such sub­jects far more fre­quently than I used to. This was well after I’ve long for­got­ten about dual n-back so I find it hard to attribute it to a placebo effect, believ­ing that I’m more adapted to this mate­r­i­al. On the other hand I don’t recall read­ing any­thing about moti­va­tional ben­e­fits to dual n-back train­ing so I still con­sider this con­jec­ture and per­haps an event­ful coin­ci­dence just the same.”ref

  • sutur: “i did­n’t really notice any con­crete changes in my think­ing process, which prob­a­bly, if exis­tent, are rather hard to detect reli­ably any­way. one thing i did notice how­ever is an increased sense of calm­ness. i used to move my legs around an awful lot while sit­ting which i now don’t feel the urge to any­more. but of course this could be placebo or some­thing else entire­ly. i also seem to be able to read text (in books or on screen) more flu­ently now with less dan­ger of dis­trac­tion. how­ev­er, per­son­ally i am quite skep­tic when peo­ple describe the changes they notice. changes in cog­ni­tive capac­ity are prob­a­bly quite sub­tle, build up slowly and are hard to notice through intro­spec­tion.”ref

  • astriaos: “By ‘robust’, I mean prac­ti­cally every­thing I do is qual­i­ta­tively differ­ent from how I did things 30 days pre­vi­ous to the dual n-back train­ing. For instance, in physics class I went from vaguely under­stand­ing most of the con­cepts cov­ered in class to a mas­tery thor­ough enough that now my ques­tions usu­ally tran­scend the scope of the in-class and text­book mate­ri­al, rou­tinely stu­pe­fy­ing my physics teacher into longer-than-av­er­age paus­es. It’s the same expe­ri­ence for all of my class­es. Some­how, I’ve learned more-than-I usu­ally learn of physics/government/ etc. (all of my class­es, and any topic in gen­er­al) infor­ma­tion from sources out­side of class, and with­out what I con­sider sig­nifi­cant effort. I feel like my learn­ing speed has gone up by some fac­tor greater than 1; I can fol­low longer argu­ments with greater pre­ci­sion; my vocab­u­lary has improved; I can pay atten­tion longer; my prob­lem solv­ing skills are sig­nifi­cantly bet­ter… Real­ly, it’s amaz­ing how much cog­ni­tion depends on atten­tion!”ref

  • flashquar­ter­mas­ter reports N-back cured his chronic fatigue syn­drome?

  • UOChris1: “Harry Kahne was said to have devel­oped the abil­ity to per­form sev­eral tasks at one time involv­ing no less the 16 differ­ent areas of the brain….­Sur­pris­ing­ly, I am slowly devel­op­ing the abil­ity simul­ta­ne­ously per­form quad com­bi­na­tion 3-back while recit­ing the alpha­bet back­wards. The prac­tice is very diffi­cult and requires loads of con­cen­tra­tion but I am expe­ri­enc­ing per­ceiv­able gains in clar­ity of thought from one day of prac­tice to the next whereas my gains from Brain Work­shop alone were not per­ceiv­able on a daily basis.” UOChris1 wrote of another mode: “Triple-N-Back at .5sec inter­vals and piano notes instead of let­ters has greatly improved my sub­jec­tively per­ceived flu­id­ity of thought. I am much more engaged in class, can read much quick­er, and am com­ing up with many more cre­ative solu­tions now than ever before. I did­n’t notice the improve­ments as much when I was using slower inter­val­s–I feel I make more deci­sion cycles in a given amount of time before com­ing to a solu­tion.”

  • Pontus Granström: “I certainly feel calmer happier and more motivated after doing DNB, it has to do with the increase of dopamine receptors no doubt!”

  • Chris War­ren sum­ma­rizes the results of his inten­sive prac­tice (cov­ered above): “For those that are curi­ous, I noticed the largest change in my thought processes on Wednes­day. My abil­i­ties were notice­ably differ­ent, to the extent that, at some points, it was, well, star­tling. I’ve started get­ting used to the feel­ing, so I can’t really com­pare my intel­li­gence now vs. Wednes­day. How­ev­er, I’m com­pletely con­fi­dent that I’ve become smarter. Under the kind of stress I’ve put my brain through, I can’t imag­ine a sce­nario where that would­n’t hap­pen.”

    “After the first cou­ple days of train­ing, I expe­ri­enced a very rapid increase in intel­li­gence. It sud­denly became eas­ier to think. I can’t give you any hard evi­dence, since I did­n’t bother to take any tests before I start­ed. How­ev­er, I can give you this: when I woke up Wednes­day morn­ing, I felt the same as I did after the first time I tried n-back. Except the feel­ing was 10 times stronger, and my think­ing was notice­ably faster and more com­pre­hen­sive.”

  • Raman reports an ini­tial null result: “19 days with n-back are over… no sub­jec­tive ben­e­fits as such. But I am aware at what point I am com­fort­able or not. e.g. y’day play­ing the game was effort­less, and today my brain felt sort of sticky, the sequence was just not stick­ing in my brain. very strange what a few hours can do.”

  • iwan tuli­jef says that “Long time ago I was diag­nosed Adhd [sic] and for long time I took meds and this train­ing helped me to reduce my meds nearly to zero, com­pared with the doses I took before. Unfor­tu­nately this haven’t fixed the whole thing. But what I noticed was, hmm… those things are very diffi­cult to describe…. that time by time I got more con­trol about my men­tal life. Obvi­ous effects in social mat­ters were e.g. that I could fol­low con­ver­sa­tions bet­ter and behave more nat­u­ral­ly. In my edu­ca­tion mat­ters, e.g. that I under­stood maths proofs bet­ter. There are a lot of details. Inter­est­ing was, as these issues are, to under­state it a bit, not unim­por­tant for me, that in the begin­ning when I remarked changes, I got a bit euphoric, so the first effects of n-back feeled like the strongest.” and warns us that “It’s very diffi­cult and very ques­tion­able to take objec­tive infor­ma­tions out of sub­jec­tive self eval­u­a­tion.” (Iwan trained for 3-4 months, 20 rounds a day in the morn­ing & evening.)

  • jttoto saw no gain on an IQ test, but thinks he’s ben­e­fited any­way: “My friends have always called me inat­ten­tive and absen­t-mind­ed, but since play­ing n-back no one has called me that for a while. I now never for­get where I park my car, when I used to do that nearly every other day. I feel more atten­tive. Even if my abil­ity to solve prob­lems has­n’t improved, the gains in my mem­ory are real and mea­sur­able.”

  • reece: “Not that I’ve noticed [an improve­ment in ]. I have noticed an improve­ment in my work­ing mem­ory how­ev­er—seems eas­ier to jug­gle a few ideas in my head at the same time which pre­sum­ably the quad-n-back has helped with.” “I recently noticed that it appears to have made me bet­ter at play­ing ping pong and tetris. Oddly enough how­ev­er, it does­n’t appear to have improved my reac­tion time…­Work­ing mem­ory has improved, how­ever other things I’ve always strug­gled with such as uncued long term mem­ory recall have not… I’m still very absen­t-minded and believe n-back has made me more eas­ily dis­tractable (low­ered latent inhi­bi­tion?), although to be fair, I may have brought this on myself by play­ing quad n-back and this was not some­thing I noticed when only play­ing dual n-back. I seem to be able to get by on about one hour less sleep per night and per­form bet­ter cog­ni­tively when sleep deprived. Dream recall has increased sig­nifi­cantly as has lucid dream­ing. I do take a few nootrop­ics, how­ever I’ve been tak­ing the same ones for years…Ver­bal flu­ency appears to have improved, proper spelling and punc­tu­a­tion are things I’ve always strug­gled with and do not appear to have ame­lio­rated resul­tant from n-back train­ing.” (Poll) “In my expe­ri­ence with dual and mul­ti­modal n-back, the ben­e­fits I’ve most observed have been increased mul­ti­task­ing abil­ity and increased con­cen­tra­tion in the pres­ence of dis­trac­tions. For me, the ben­e­fits of n-back train­ing are most appar­ent on days I don’t take my ADHD med­ica­tion. I have been train­ing DNB with posi­tion-sound and col­or-im­age modes late­ly. I used QNB for sev­eral months in the past, how­ever I (sub­jec­tive­ly) believe DNB is giv­ing me the most ben­e­fit.” (ADHD thread)

  • Michael Camp­bell: “Some­thing very minor to some, but was good for me; I’m able to con­cen­trate while read­ing a lot more than I have been able to in the past.”

  • exi­gentsky: "I’ve seen improve­ments in exec­u­tive func­tion and moti­va­tion. After DNB, I am more inclined to study and com­plete long pend­ing items. How­ev­er, there is a con­found­ing vari­able. I don’t usu­ally do DNB when in an unhealthy state of mind (for exam­ple, with lit­tle sleep and extremely high stress). Still, I believe that I can attribute some of the effects only to DNB.

    In terms of work­ing mem­ory and other cog­ni­tive mea­sures, I’m not sure. I don’t notice any­thing dra­matic but also haven’t stuck to a DNB regime for more than a few week­s."

  • cev: "I think I’ve put my fin­ger on a par­tic­u­lar ben­e­fit of dnb train­ing: it seems to help my brain’s ‘inter­nal clock’ - I am bet­ter able to order my thoughts in time.

    DNB has also helped my foos­ball (!) play­ing: at a high level the game involves com­plex strings of motor move­ments and since I’ve been train­ing, I’ve found that my coor­di­na­tion of these move­ments has greatly improved despite no longer prac­tis­ing."

  • erm: “I can rely on this to dras­ti­cally reduce anx­i­ety, flight­i­ness, improve con­cen­tra­tion. It also seems to whet my appetite for intel­lec­tual work and increase pur­pose­ful­ness across the board.”

  • Tofu, after a year of n-backing: "N-back training may have somehow improved my verbal intelligence, but since verbal intelligence is a form of crystallized intelligence and training working memory is supposed to primarily improve fluid intelligence, it probably didn’t. My score on the verbal subtest went up and then down which would make no sense if it did have any influence…Since my IQ score increased from the first test to the second test, and stayed the same from the second test to the third test, it could possibly be that working memory only contributes to IQ up to a certain point. All in all, I feel more inclined to say that n-back training has only a little if any effect on IQ, which is why I’m probably going to stop doing the n-back training.

    On a more pos­i­tive note, since I started n-back train­ing I have noticed bet­ter con­cen­tra­tion which I had a seri­ous prob­lem with before. In gen­er­al, I feel like I think more clearly and I at least feel like I’ve become smarter too. I’ve reached a pretty high level in n-back and any gains I’ve made in the last month or two have been small, so I think I’ve reached a long-term plateau which is another rea­son for me to stop the train­ing. From my expe­ri­ence when I stop the n-back train­ing for a month or two and return to n-back train­ing I still per­form at the same level any­way. It seems like the effects from train­ing are going to last a while which is also good news. Over­all, I feel like the n-back train­ing was worth it but if I had it to do over I would have prob­a­bly stopped after a cou­ple of month­s."

  • kriegerlie: “i’ve defnitely had some ben­e­fit, like pon­tus said, dunno about being smarter, but my focus is incred­i­ble now. I can do what i thought I could never do, purely because I can focus more. Placebo or not. It’s a defi­nite effect.”

  • Rotem: “DNB works, It’s one of the best invest­ments I made in my life. I have much less anx­i­ety ( I suffered from GAD my life was a night­mare), more con­fi­dence and I guar­an­tee more intel­li­gence - I can feel it…”

  • chortly: “For a while I imag­ined that my work­ing mem­ory mus­cles were indeed strength­en­ing, the main sen­sa­tion being that I could retain the var­i­ous threads of a com­pli­cated con­ver­sa­tion bet­ter as they dan­gled and were for­got­ten by the other con­ver­sa­tion­al­ists. But that was prob­a­bly just wish­ful think­ing. Because it’s bor­ing and diffi­cult, I haven’t stuck with it, though I keep intend­ing to.”

  • JHar­ris: “I’ve been work­ing with the dual n-back pro­gram for a bit of time now. Improve­ment is slow, but seems to be hap­pen­ing; I just had a 68% run at dual 3-back. Obser­va­tions like this are not really sci­en­tific and hell­ishly sub­ject to bias, but I think I may be notic­ing it slightly eas­ier to think effec­tive­ly.”

  • Neu­ro­hacker (in a thread on ): “I’m defi­nitely find­ing it help­ful, even if it’s just giv­ing me some prac­tice at focus­ing…as a com­ple­men­tary strat­egy [to med­ica­tion], it’s cer­tainly work­ing won­ders.”

  • iwan tulijef: “n-Back helped me a lot. Especially in the beginning when I started with DNB, the effect was astounding. I got much faster in understanding written and spoken words. In the beginning I think the function of my working memory was really bad. What then happened is that I got habituated to the effect and the increases were smaller, so noticing improvements got more difficult.”

  • Arkan­j3l: “On a side, I really enjoy the lucid feel­ing I get after an hour of n-back. I start to look at things and ideas seem to flow into my head very vividly (I’ve made some of my best Lego cre­ations after an n-back ses­sion :p).”

  • Michael Logan: “…and learned very, very quickly that I had a short term mem­ory and atten­tion issue. The dual n back task laughed at me, but I vowed to over­come my inat­ten­tion and short term mem­ory issues, and within a few prac­tices, I noticed an improve­ment not only in my scores on the com­put­er­ized game, but in ses­sion with my clients….So Mind Sparke does pro­vide that kind of novel learn­ing chal­lenge. I have not taken an IQ test, but I do believe the use of the tool is help­ing me build cog­ni­tive reserve for the later stages of my life.”

  • mile­stones: “I’m grate­ful for the gains I seemed to have received from train­ing dual n back. I used to be extremely for­get­ful with remem­ber­ing where I put things and now it’s very easy to retrace steps and recall where I placed xyz item. As far as IQ tests go, I did see a gain on a well designed (un­timed) cul­ture fair test of about 1 stan­dard devi­a­tion after train­ing one DNB on and off for close 2 years. (Other tests with lower ceil­ings, how­ev­er, showed no or mar­ginal gain­s).” A later post: “The gains I’m see­ing are: faster encod­ing speed; faster and more accu­rate retrieval of data from long term mem­o­ry; as well as an increase in data-se­quenc­ing speed (the lat­ter is a rel­a­tive weak­ness of mine that now seems to have been helped by con­sis­tent quad-back train­ing—though I’ve not tested any trans­fer so this is sub­jec­tive). Also, though my fluid intel­li­gence has prob­a­bly ceased gain­ing, it seems I’m func­tion­ing at higher bands of abil­ity far more reg­u­lar­ly—even when I’m tired or slug­gish.”

  • Lach­lan Jones wrote, after a before/after IQ report, “The most sig­nifi­cant real word appli­ca­tion for me has been improve­ments in my piano play­ing. I am a pianist and can report sig­nifi­cant improve­ments in my sight read­ing and the rate at which I learn new pieces.”

  • unfunf: “While I haven’t taken an IQ test to see if it has gar­nered any IQ improve­ment, I can say I started off at dual 4-back only 2 weeks ago and I am now near­ing dual 6-back. I can also attest to a pretty large work­ing mem­ory improve­ment, beyond what I would call placebo (the effects of which I am very well aware). Even if it is not very effec­tive, I still say this game is fun.”

  • NeuroGuy: “Dual-N-Back has subjectively done more for me in less than two weeks than any single nootropic, can hardly imagine it combined with spaced-repetition.”

  • TeC­NoY­oTTa: “I also want to report that after train­ing on DNB I found that I am dream­ing almost every day…by the way I remem­ber that this effect was not directly after train­ing…un­for­tu­nately I stopped using DNB from about 2 months or some­thing like that and now I dream less”

  • dime­coin: “I make no claims, other than anec­do­tal - in that it seems to relax me and able to han­dle stress bet­ter when I do it reg­u­lar­ly.”

  • Arbo Arba: “I did find a lot of changes come to my brain and per­son­al­i­ty, but I’m not sure if it’s from improv­ing WM or if it’s just from spend­ing a lot of time in an alpha-wave dom­i­nant state. I think it’s being in a pro­longed alpha-brain wave dom­i­nant state, tbh, because I found that when I was younger and took up heavy read­ing projects I felt the same improve­ments–that is, hav­ing more focus, being able to ‘hear’ myself think very dis­tinctly to the point where I could com­pose poems/emails in my head with­out effort. I don’t know why this is, but it hap­pens so much with me that I can’t doubt that there is a real effect on my per­son­al­ity and default men­tal state when I’m doing ‘intel­lec­tual’ things.”

  • Akiyama Shinichi: “I train 3 times a day and every session lasts about 20 minutes. After a month I went to my chess club and completely crushed players who were at a completely different level. I chose one of the strongest players (at my level of course), because he was able to tell me if I’ve really improved. Then I had to reveal my secret, and after a month I’ll tell them how it works for them. I notice that I improve not only in chess. I’m a piano player and it’s really challenging. I was learning very slowly, but yesterday my teacher told me that in two weeks I learnt much more than in the last 2 months. He was even suspecting me of taking lessons from another teacher, not only from him. And that’s not all. I’m a student and once a month each of us has to prepare a presentation on some topic. A few days ago it was my turn. I didn’t notice it myself, but one of my friends told me that I was very well-prepared, because I stopped making those annoying sounds like ‘umm’, ‘yyyy’ when I was thinking what to say. When I was performing my presentation I didn’t have to think what to say next because I already knew and didn’t have to think about it much.”

  • whois­bam­bam: “My mind feels faster. I also seem to have less men­tal fatigue dur­ing study­ing, n-back­ing, etc. I am more con­fi­dent. I am con­fi­dent that my mem­ory has improved inde­pen­dent of n-back­ing (what they call ‘far trans­fer’ effec­t). I am not say­ing it is a HUGE differ­ence. I can not say the same is true for any sup­ple­ment i have taken other than pos­si­bly some small effect with mag­ne­sium l-thre­onate which also seems to make me ‘less men­tally tired’ in par­tic­u­lar, inter­est­ing­ly.”

  • Christo­pher Dzi­ało: “I’ve trained with n-back for sev­eral months and have noticed a pro­found abil­ity to sight read music and locate the notes, my speed and over­all dex­ter­ity has dras­ti­cally increased and I shall con­tinue to n-back and grow my musi­cal tal­ent.”

No benefits

  • Con­fuzedd: “[asked if felt ‘sharper’]: Noth­ing.”
  • Chris: “One thing I have noticed is the rec­ol­lec­tion of a num­ber of very unpleas­ant images in dreams. Specifi­cal­ly, images of bod­ily dis­ease, muti­la­tion, injury and post-mortem decom­po­si­tion. I find it diffi­cult to believe it’s just a coin­ci­dence, because I can’t remem­ber when I last had such a dream, and I’ve had maybe half a dozen since I started dual n-back. But per­haps it’s sim­ply owing to bet­ter recall.”ref
  • Pheonexia: “now I’m at 6-back and am con­sis­tently between 50 and 80% accu­rate….All that said, I have NOT noticed any differ­ences in my men­tal capac­i­ty, intel­li­gence, daily life, or even abil­ity to remem­ber things that just hap­pened. I still some­times for­get peo­ple’s names right after they tell me them. I’m going to keep train­ing though, because just because I haven’t con­sciously noticed these things, I have faith in sci­en­tific stud­ies, so with enough train­ing hope­fully I’ll yield some pos­i­tive ben­e­fits.”
  • The­Q17 reports lit­tle to no ben­e­fit: “At any rate, I don’t feel study­ing is any eas­ier although it was­n’t really diffi­cult to begin with for me. Per­haps I’ll give it another go over break and report back. My goal orig­i­nally was to get to P5B before adding a sec­ond sound stim­u­lus mak­ing a Sex­tu­ple Nback but I don’t know if Shamanu made an updated ver­sion to make that any eas­i­er. I’m also kind of on the fence about the effect on the depth of train­ing. It may have been more ben­e­fi­cial to do higher N lev­els instead of more stim­uli.”
  • Jonathan Graehl: “I can do dual 4-back with 95%+ accuracy and 5-back with 60%, and I’ve likely plateaued (naturally, my skill rapidly improved at first). I enjoy it as ‘practice focusing on something’, but haven’t noticed any evidence of any general improvement in memory or other mental abilities.”
  • Will New­some: “After doing 100 tri­als of dual N back stretched over a week (mostly 4 back) I noticed that I felt slightly more con­scious: my emo­tions were more salient, I enjoyed sim­ple things more, and I just felt gen­er­ally more alive. There were tons of free vari­ables for me, though, so I doubt cau­sa­tion.”
  • steven0461: “I did maybe 10-15 half-hour ses­sions of mostly D5B-D6B last year over the course of a few weeks and did­n’t notice any effects.”
  • Egg­plantWiz­ard (D3B->D10B): “I would say that there has been some form of improve­men­t—though it’s not clear if the improve­ment is task-spe­cific. I haven’t noticed any sig­nifi­cant differ­ence in my day to day life, but (to be immod­est in the name of effi­ciency for a moment) I had a very good mem­ory to begin with, and I would say strong fluid intel­li­gence. It’s pos­si­ble that peo­ple start­ing from posi­tions of lower fluid intel­li­gence would see a more pro­nounced ben­e­fit.”
  • Matt: “…I’ve cer­tainly improved at n-back type tasks, I can’t say that I’ve noticed any improve­ment while han­dling real life prob­lems. I think the effects do gen­er­al­ize - I’m quite good at highly g-loaded tasks like the now, even with­out much prac­tice - but the range of tasks which are sub­ject to improve­ment from n-back­ing seems lim­it­ed. I’m bet­ter at tasks involv­ing men­tal updat­ing, but my short term mem­ory has only slightly improved, if at all. I don’t have an accu­rate way of mea­sur­ing my change in Gf (or g), as most of the fluid rea­son­ing tasks avail­able online use the same/similar rule pat­terns or aren’t accu­rately normed, but as I said before, my real life prob­lem solv­ing abil­i­ties have not sub­jec­tively improved…”
  • Jelani Sims: “I’ve been doing DNB since the group start­ed, I haven’t noticed any­thing out of the ordi­nary in terms of cog­ni­tion. But I never took a before and after IQ test and I haven’t really done any­thing that I found men­tally diffi­cult before. So it’s very hard for me to gauge men­tal improve­ments with noth­ing for me to base it on. I also changed my diet, started mind­ful­ness med­i­ta­tion and exer­cis­ing around the same time I started DNB, in an over­all attempt to delay brain decline. Mak­ing it even more diffi­cult to attribute any­thing directly to DNB. What I can say is I have been stuck on 12 for 4 months now, each level was increas­ingly more diffi­cult to pass and 12 seems to be some sort of tem­po­rary plateau.”
  • argumzio: “I’ve seen no net ben­e­fit. Com­pared to improved nutri­tion, exer­cise, sleep­ing, and the occa­sional nootropic (e.g., Pirac­etam, Alpha GPC, CDP Citi­co­l­ine, Resver­a­trol, Kre-Al­ka­lyn & Cre­a­tine Mono­hy­drate, etc.), DNB did noth­ing. How­ev­er, in terms of sub­jec­tively improved focus (count­ing the near-cer­tain pos­si­bil­ity that the afore­men­tioned changes also influ­enced it), QNB* did the most for me, that is, allow­ing me to absorb infor­ma­tion for longer peri­ods of time and main­tain this effort much later into the evening while mit­i­gat­ing the dele­te­ri­ous effects of fatigue and allow­ing me to feel rested after unusu­ally shorter peri­ods of sleep.”


One of the worries occasionally cited is that DNB training mostly serves to increase one's focus on the task at hand. That is great in most contexts; but, the fear goes, the ability to focus on one thing is the ability to exclude ('inhibit') thoughts on all other topics, and such stray thoughts are crucial to creativity. Working memory and the ability to shift attention have a strong correlation with being able to solve insight problems with lateral thinking, but as with the WM-IQ link, that doesn't say what happens when one intervenes on one side of the correlation (correlation is not causation):

Indi­vid­u­als may have diffi­culty in keep­ing in mind alter­na­tives because mul­ti­ple pos­si­bil­i­ties can exceed their work­ing mem­ory capac­ity (Byrne, 2005; John­son-Laird and Byrne, 1991; 2002). They also need to be able to switch their atten­tion between the alter­na­tive pos­si­bil­i­ties to reach a solu­tion. On this account, key com­po­nent skills required in insight prob­lem solv­ing include atten­tion switch­ing and work­ing mem­ory skill­s….At­ten­tion and work­ing mem­ory may be cru­cial for differ­ent aspects of suc­cess­ful insight prob­lem solv­ing. Plan­ning a num­ber of moves in advance may be impor­tant to solve insight prob­lems such as the well-known nine-dot prob­lem (Chron­i­cle, Ormerod and Mac­Gre­gor, 2001). Atten­tion may play a role in help­ing peo­ple to decide what ele­ments of a prob­lem to focus on or in help­ing them to direct the search for rel­e­vant infor­ma­tion inter­nally and exter­nal­ly.

…In­di­vid­u­als who are good at solv­ing insight prob­lems are also good at switch­ing atten­tion. Cor­rect per­for­mance on the insight prob­lems was asso­ci­ated with cor­rect per­for­mance on the visual ele­va­tor task (r=.515, p<.01). Cor­rect per­for­mance on the insight prob­lems was asso­ci­ated with cor­rect per­for­mance on the plus-mi­nus prob­lems (r=-.511, n=32, p<.001)…­Con­sis­tent with this account indi­vid­u­als who are bet­ter at stor­ing and pro­cess­ing infor­ma­tion in work­ing mem­ory are bet­ter at solv­ing insight prob­lems. [cor­re­la­tion with prob­lem score: r=.39 for digit span, r=.511 for sen­tence span]73

The major piece of exper­i­men­tal evi­dence is Takeuchi 2011 & Var­tan­ian 2013, treated at length in the fol­low­ing sub­sec­tion and well worth con­sid­er­a­tion; the rest of this sec­tion will dis­cuss other lines of evi­dence.

Dopamine is related to changes caused by n-backing (see the McNab receptor study & for a general review, Söderqvist et al 2011), and increase in dopamine has been shown to cause a narrowing of focus/associations in tasks74. There are other related correlations on this; for example, Cassimjee 201075 reports that "…the temperament dimension of novelty seeking was inversely related to performance accuracy on the LNB2 (Letter-N-Back)." But as ever, correlation is not causation; this result might not mean anything about someone deliberately increasing performance accuracy by practice - we might take it to mean just that narrow, uninterested people had a small advantage at n-backing when they first began. Cassimjee 2010 cites 2 other studies suggesting what this correlation means: "…participants with higher impulsivity may lack the attentional resources to retain critical information and inhibit irrelevant information. The activation of reactive control, which is a system that monitors, modulates and regulates reactive aspects of temperament, is inhibited in individuals high in novelty seeking…" This suggests the performance difference is a weakness that can be strengthened, not a fundamental trade-off.

Reports from n-back­ers are mixed. One neg­a­tive report is from john21012101:

I've done the dual n-back task avidly for over a month and while I find it makes me mentally sharper, that comes at a high cost - the loss of creativity and lateral thinking. In fact, I experience what is called severe directed attention fatigue (see www.troutfoot.com/attn/dafintro.html).

…and even short booster ses­sions severely impair cre­ativ­ity to the point that one becomes very men­tally flat, sin­gle-mind­ed, and I’d even say zom­bie-ish.

Ashir­go, chin­mi04, & putomayo begged to differ in the same thread, with biped plump­ing for a null result.

There are some theoretical reasons to believe DNB isn't causing gains at the expense of creativity, as there is the Jaeggi study showing Gf gains, and Gf is mildly correlated with creativity; according to exigentsky:

"Fur­ther­more, if the pre­lim­i­nary results hold and dual-n-back actu­ally increased Gf, it should actu­ally con­tribute to cre­ativ­ity for most peo­ple. After all, stud­ies have shown that cre­ativ­ity (ac­cord­ing to stan­dard tests) and IQ are sig­nifi­cantly cor­re­lated to a cer­tain point (~120 on most). While both tests are imper­fect and incom­plete, they do give a gen­eral pic­ture.

I have not felt a decrease in my cre­ativ­ity and am skep­ti­cal of the idea that dual-n-back harms it. If the pur­ported mech­a­nism is increas­ing , that would be an even big­ger break­through than increas­ing IQ. The for­mer is still largely con­sid­ered immutable."

Vlad has some more details on those cor­re­la­tions:

"Last but not least, there was this research, 'Relationship of intelligence and creativity in gifted and non-gifted students', which I studied because of this today, and they found positive correlation IQ vs verbal and figural creative processes (fluency, flexibility, object designing, specific traits, insight…). And this mild correlation (of 0.3 - 0.5) did not differ for different IQ levels (higher IQs had mildly higher creativity, lower IQs had mildly lower creativity - always a mild relationship, so exceptions too, but in general more IQ meant more creativity)."

On the other hand, Vlad also points out that:

"…there are few the­o­ries how WM works, and one of the most explain­ing is, that WM and atten­tion are tied closely together (Ash always empha­sizes this and he is right :). This should work through the fact, that higher WM means more sources for inhi­bi­tion of dis­trac­tion. So, the more WM, the bet­ter you can con­cen­trate. They tested this with cock­tail party effect: in gen­er­al, only 33% of per­sons catch their name from irrel­e­vant back­ground noise, while con­cen­trat­ing on some task. Now they found, that only 20% of high WM peo­ple caught their name, but 65% of low WM. On the other side, con­tem­po­rary researches some­times differ between WM, STM, pri­mary / sec­ondary WM, even LTM… But the point is, atten­tion works at least partly as a fil­ter, and it gets bet­ter with higher WM.

Now the issue with cre­ativ­i­ty. I find this inter­est­ing, because I think some­body here wor­ried already about being sub­jec­tively less cre­ative than before BW train­ing, and I got this feel­ing few times too.

…Ev­ery cre­ator must deeply con­cen­trate on his work. Maybe there are differ­ent kinds of cre­ativ­i­ty: “ADHD” cre­ativ­i­ty, mean­ing­ful cre­ativ­i­ty, brain­storm­ing cre­ativ­i­ty, appre­ci­a­tion of art, and so on.

Btw after train­ing dnb, I got this inter­est in art - I down­loaded lots of clas­si­cal and other artis­tic pic­tures (never before), and really enjoyed choos­ing which I like. Or have you ever seen “the hours”? I fell in love with that movie and even started to read things from vir­ginia woolf"

As well, Pheonexia points out that McNab et al 2009 demonstrated increases in various things related to dopamine because of DNB, and that there is one study, "Dopamine agonists disrupt visual latent inhibition in normal males using a within-subject paradigm".

Takeuchi 2011

"Working Memory Training Using Mental Calculation Impacts Regional Gray Matter of the Frontal and Parietal Regions", Takeuchi et al 2011:

Train­ing work­ing mem­ory (WM) improves per­for­mance on untrained cog­ni­tive tasks and alters func­tional activ­i­ty. How­ev­er, WM train­ing’s effects on gray mat­ter mor­phol­ogy and a wide range of cog­ni­tive tasks are still unknown. We inves­ti­gated this issue using vox­el-based mor­phom­e­try (VBM), var­i­ous psy­cho­log­i­cal mea­sures, such as non-trained WM tasks and a cre­ativ­ity task, and inten­sive adap­tive train­ing of WM using men­tal cal­cu­la­tions (IATWMMC), all of which are typ­i­cal WM tasks. IATWMMC was asso­ci­ated with reduced regional gray mat­ter vol­ume in the bilat­eral fron­to-pari­etal regions and the left supe­rior tem­po­ral gyrus. It improved ver­bal let­ter span and com­plex arith­metic abil­i­ty, but dete­ri­o­rated cre­ativ­i­ty. These results con­firm the train­ing-in­duced plas­tic­ity in psy­cho­log­i­cal mech­a­nisms and the plas­tic­ity of gray mat­ter struc­tures in regions that have been assumed to be under strong genetic con­trol.

Takeuchi 2011 has many points of inter­est:

  • these sub­jects are really high qual­ity stu­dents and grad stu­dents - which is why a num­ber of them hit the RAPM ceil­ing (!); and it’s implied they are all stu­dents. Tohoku isn’t Tokyo U, but it’s still really good, Wikipedia telling me “It is the third old­est Impe­r­ial Uni­ver­sity in Japan and is a mem­ber of the National Seven Uni­ver­si­ties. It is con­sid­ered as one of the top uni­ver­si­ties in Japan, and one of the top 50 uni­ver­si­ties in the world.”

  • While high-quality, there aren't that many of them; Jaeggi 2008 had 35 subjects doing WM training, while this one has 18 doing the adaptive version and another 18 doing non-adaptive, with the remaining 19 of the 55 as pure controls. So a little more than half as many; this is reflected in some of the weak results, so while rather disturbing, this isn't a definitive refutation or anything.

  • the WM task subjects did not see any relative IQ gains, or much of a gain at all; the IATWMMC (adaptive arithmetic) group went from 27.3±1 to 31.3±0.7, and the placebo group (non-adaptive arithmetic) went from 29.1±0.9 to 32.0±0.8. This doesn't show any noticeable difference, the authors describing the IQ results as 'probably void'.

  • 20 hours of train­ing is more than twice as much train­ing as Jaeggi 2008’s longest group76, so one should not dis­miss this solely on the grounds ‘if only they had trained more’

  • adaptive arithmetic doesn't seem like much of a WM task; they did do some n-backing (mentioned briefly) during the fMRI pre/post, but it's not clear why they chose arithmetic over n-back. On the other hand, don't many n-backers use the arithmetic modes…?

  • the adap­tive­ness is really impor­tant; they say the group doing non-adap­tive arith­metic was the same as the no-in­ter­ven­tion group on every mea­sure! Includ­ing ‘a com­plex arith­metic task’

  • one of the key quotes:

    Behav­ioral results com­par­ing the com­bined con­trol group, and the IATWMMC group showed a sig­nifi­cantly larger pre- to post- test increase for per­for­mance of a com­plex arith­metic task (P = 0.049), for per­for­mance of the let­ter span task (P = 0.002), and for reverse Stroop inter­fer­ence (P = 0.008) in the IATWMMC group. The IATWMMC group showed a sig­nifi­cantly larger pre- to post- test decrease in cre­ativ­ity test per­for­mance (P = 0.007) (for all the results of the psy­cho­log­i­cal mea­sures, see Table 1). Also the IATWMMC group showed a sta­tis­ti­cal trend of increase in the men­tal rota­tion task (P = 0.064).

  • About the only good news for n-backers is that the results were not large enough to easily survive multiple-comparison correction:

    We per­formed sev­eral psy­cho­log­i­cal tests and did not cor­rect for the num­ber of com­par­isons between sta­tis­ti­cal tests, as is almost always the case with this kind of study. When cor­rected using the , even after remov­ing the prob­a­bly void tests (RAPM and WAIS arith­metic), the sta­tis­ti­cal value for the effect of IATWMMC on the cre­ativ­ity tests mar­gin­ally sur­passed the thresh­old of P = 0.05 (P = 0.06). Thus, the results should be inter­preted with cau­tion until repli­cat­ed.
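The arithmetic of such a correction is easy to illustrate. Here is a minimal Python sketch of a Bonferroni adjustment applied to the p-values quoted earlier; note that the exact set of tests the authors corrected over is not given here, so the number of comparisons (m=5) is illustrative only, which is why this toy calculation does not reproduce the paper's corrected P = 0.06 for creativity:

```python
# Bonferroni correction: multiply each raw p-value by the number of tests m
# (capping at 1), rather than comparing raw p-values against 0.05 directly.
def bonferroni(p_values):
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# The five p-values quoted above (arithmetic, letter span, reverse Stroop,
# creativity, mental rotation); m=5 is illustrative, not the paper's count.
raw = [0.049, 0.002, 0.008, 0.007, 0.064]
print(bonferroni(raw))
```

With m=5, the creativity decrease (raw p = 0.007) would survive at an adjusted ~0.035; the paper's own correction over its fuller battery pushed it just past 0.05, hence the call for caution.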

Vartanian 2013

“Work­ing Mem­ory Train­ing Is Asso­ci­ated with Lower Pre­frontal Cor­tex Acti­va­tion in a Diver­gent Think­ing Task”; empha­sis added:

Work­ing mem­ory (WM) train­ing has been shown to lead to improve­ments in WM capac­ity and fluid intel­li­gence. Given that diver­gent think­ing loads on WM and fluid intel­li­gence, we tested the hypoth­e­sis that WM train­ing would improve per­for­mance and mod­er­ate neural func­tion in the Alter­nate Uses Task (AUT)-a clas­sic test of diver­gent think­ing. We tested this hypoth­e­sis by admin­is­ter­ing the AUT in the func­tional mag­netic res­o­nance imag­ing scan­ner fol­low­ing a short reg­i­men of WM train­ing (ex­per­i­men­tal con­di­tion), or engage­ment in a choice reac­tion time task not expected to engage WM (ac­tive con­trol con­di­tion). Par­tic­i­pants in the exper­i­men­tal group exhib­ited sig­nifi­cant improve­ment in per­for­mance in the WM task as a func­tion of train­ing, as well as a sig­nifi­cant gain in fluid intel­li­gence. Although the two groups did not differ in their per­for­mance on the AUT, acti­va­tion was sig­nifi­cantly lower in the exper­i­men­tal group in ven­tro­lat­eral pre­frontal and dor­so­lat­eral pre­frontal cor­tex-two brain regions known to play dis­so­cia­ble and crit­i­cal roles in diver­gent think­ing. Fur­ther­more, gain in fluid intel­li­gence medi­ated the effect of train­ing on brain acti­va­tion in ven­tro­lat­eral pre­frontal cor­tex. These results indi­cate that a short reg­i­men of WM train­ing is asso­ci­ated with lower pre­frontal acti­va­tion - a marker of neural effi­ciency - in diver­gent think­ing.

Non-IQ or non-DNB gains

This sec­tion is for stud­ies that tested non-DNB WM inter­ven­tions on IQ, or DNB inter­ven­tions on non-IQ prop­er­ties, and mis­cel­la­neous.

Chein 2010

“Expand­ing the mind’s work­space: Train­ing and trans­fer effects with a com­plex work­ing mem­ory span task” (FLOSS imple­men­ta­tion); from the intro­duc­tion:

In the present study, a novel work­ing mem­ory (WM) train­ing par­a­digm was used to test the mal­leabil­ity of WM capac­ity and to deter­mine the extent to which the ben­e­fits of this train­ing could be trans­ferred to other cog­ni­tive skills. Train­ing involved ver­bal and spa­tial ver­sions of a com­plex WM span task designed to empha­size simul­ta­ne­ous stor­age and pro­cess­ing require­ments. Par­tic­i­pants who com­pleted 4 weeks of WM train­ing demon­strated sig­nifi­cant improve­ments on mea­sures of tem­po­rary mem­o­ry. These WM train­ing ben­e­fits gen­er­al­ized to per­for­mance on the Stroop task and, in a novel find­ing, pro­moted sig­nifi­cant increases in read­ing com­pre­hen­sion. The results are dis­cussed in rela­tion to the hypoth­e­sis that WM train­ing affects domain-gen­eral atten­tion con­trol mech­a­nisms and can thereby elicit far-reach­ing cog­ni­tive ben­e­fits. Impli­ca­tions include the use of WM train­ing as a gen­eral tool for enhanc­ing impor­tant cog­ni­tive skills.

While WM training yielded many valuable benefits, such as increased reading comprehension, it did not improve IQ as measured by an unspeeded Advanced Progressive Matrices (APM) IQ test:

How­ev­er, such power lim­i­ta­tions do not read­ily account for our fail­ure to repli­cate a trans­fer of WM train­ing ben­e­fits to mea­sures of fluid intel­li­gence (as was observed by Jaeggi et al., 2008), since we did not find even a trend for improve­ment in trained par­tic­i­pants on Raven’s APM. Beyond sta­tis­ti­cal expla­na­tions, differ­ences in the train­ing par­a­digms used for the two stud­ies may explain the differ­ences in trans­fer effects. The train­ing pro­gram used by Jaeggi et al. (2008) involved 400 tri­als per train­ing ses­sion, with a dual n-back train­ing par­a­digm designed to empha­size bind­ing processes and task man­age­ment. Con­verse­ly, our train­ing par­a­digm included only 32 tri­als per ses­sion and more heav­ily empha­sized main­te­nance in the face of dis­trac­tion. Final­ly, the seem­ingly con­flict­ing results may be due to differ­ences in intel­li­gence test admin­is­tra­tion. As was pointed out in a recent cri­tique (Moody, 2009), Jaeggi et al. (2008) used atyp­i­cal speeded pro­ce­dures in admin­is­ter­ing their tests of fluid intel­li­gence, and these alter­ations may have con­founded the appar­ent effect of WM train­ing on intel­li­gence.

Colom 2010

“Improve­ment in work­ing mem­ory is not related to increased intel­li­gence scores” (full text) trained 173 stu­dents on WM tasks (such as the ) with ran­dom­ized diffi­cul­ties, and found no linked IQ improve­ment; the IQ tests were “the Advanced Pro­gres­sive Matri­ces Test (APM) along with the abstract rea­son­ing (DAT-AR), ver­bal rea­son­ing (DAT-VR), and spa­tial rela­tions (DAT-SR) sub­tests from the Differ­en­tial Apti­tude Test Bat­tery”. None were speeded as in Jaeggi 2008. Abstract:

The acknowl­edged high rela­tion­ship between work­ing mem­ory and intel­li­gence sug­gests com­mon under­ly­ing cog­ni­tive mech­a­nisms and, per­haps, shared bio­log­i­cal sub­strates. If this is the case, improve­ment in work­ing mem­ory by repeated expo­sure to chal­leng­ing span tasks might be reflected in increased intel­li­gence scores. Here we report a study in which 288 uni­ver­sity under­grad­u­ates com­pleted the odd num­bered items of four intel­li­gence tests on time 1 and the even num­bered items of the same tests one month later (time 2). In between, 173 par­tic­i­pants com­pleted three ses­sions, sep­a­rated by exactly one week, com­pris­ing ver­bal, numer­i­cal, and spa­tial short­-term mem­ory (STM) and work­ing mem­ory (WMC) tasks impos­ing high pro­cess­ing demands (STM-WMC group). 115 par­tic­i­pants also com­pleted three ses­sions, sep­a­rated by exactly one week, but com­pris­ing ver­bal, numer­i­cal, and spa­tial sim­ple speed tasks (pro­cess­ing speed, PS, and atten­tion, ATT) with very low pro­cess­ing demands (PS-ATT group). The main find­ing reveals increased scores from the pre-test to the post-test intel­li­gence ses­sion (more than half a stan­dard devi­a­tion on aver­age). How­ev­er, there was no differ­en­tial improve­ment on intel­li­gence between the STM-WMC and PS-ATT groups.

Com­men­ta­tors on the ML dis­cus­sion crit­i­cized the study for:

  1. Not using DNB itself
  2. appar­ently lit­tle train­ing time on the WM tasks (3 ses­sions over weeks, each of unclear dura­tion)
  3. the ran­dom­iza­tion of diffi­culty (as opposed to DNB’s adap­tive­ness)
  4. the large increase in scores on the WM tasks over the 3 sessions (suggesting growing familiarity rather than real challenge & growth)
  5. and the statistical observation that if IQ gains were linear with training and started small, then 173 participants is not enough to observe any improvement with confidence.

Loosli et al 2011

“Work­ing mem­ory train­ing improves read­ing processes in typ­i­cally devel­op­ing chil­dren”, Loosli, Buschkuehl, Per­rig, and Jaeg­gi:

The goal of this study was to inves­ti­gate whether a brief cog­ni­tive train­ing inter­ven­tion results in a spe­cific per­for­mance increase in the trained task, and whether there are trans­fer effects to other non­trained mea­sures. A com­put­er­ized, adap­tive work­ing mem­ory inter­ven­tion was con­ducted with 9- to 11-year-old typ­i­cally devel­op­ing chil­dren. The chil­dren con­sid­er­ably improved their per­for­mance in the trained work­ing mem­ory task. Addi­tion­al­ly, com­pared to a matched con­trol group, the exper­i­men­tal group sig­nifi­cantly enhanced their read­ing per­for­mance after train­ing, pro­vid­ing fur­ther evi­dence for shared processes between work­ing mem­ory and read­ing.

This shows a connection to useful tasks, but not any gain to IQ. The difference in score improvement between groups was small (half a point), and the training period fairly short; the authors write:

Due to the short train­ing time, we did not expect large effects on Gf (cf. Jaeggi et al., 2008), also since two other stud­ies that trained ADHD chil­dren observed trans­fer effects on Gf only after 5 weeks involv­ing ses­sions of 40 min­utes each (Kling­berg et al., 2002, 2005).

In addi­tion, the same group failed to show trans­fer on Gf with a shorter train­ing (Thorell et al., 2008). Thus, con­sid­er­ing that our train­ing inter­ven­tion was merely 10 ses­sions long, our lack of trans­fer to Gf is hardly sur­pris­ing; although there is now recent evi­dence that trans­fer to Gf is pos­si­ble with very lit­tle train­ing time (; poster). Our results, how­ev­er, are com­pa­ra­ble to those of Chein and Mor­ri­son (2010), who also trained their par­tic­i­pants on a com­plex WM task and found no trans­fer to Gf.

(Sim­i­lar stud­ies have also found improve­ment in read­ing skills after WM train­ing, eg Dahlin 2011 and Shi­ran & Breznitz 2011, but I do not believe oth­ers used n-back or looked for pos­si­ble IQ gain­s.)

Nutley 2011

“Gains in fluid intel­li­gence after train­ing non-ver­bal rea­son­ing in 4-year-old chil­dren: a con­trolled, ran­dom­ized study”, Sis­sela Bergman Nut­ley et al:

Fluid intel­li­gence (Gf) pre­dicts per­for­mance on a wide range of cog­ni­tive activ­i­ties, and chil­dren with impaired Gf often expe­ri­ence aca­d­e­mic diffi­cul­ties. Pre­vi­ous attempts to improve Gf have been ham­pered by poor con­trol con­di­tions and sin­gle out­come mea­sures77. It is thus still an open ques­tion whether Gf can be improved by train­ing. This study included 4-year-old chil­dren (N = 101) who per­formed com­put­er­ized train­ing (15 min/day for 25 days) of either non-ver­bal rea­son­ing, work­ing mem­o­ry, a com­bi­na­tion of both, or a placebo ver­sion of the com­bined train­ing. Com­pared to the placebo group, the non-ver­bal rea­son­ing train­ing group improved sig­nifi­cantly on Gf when analysed as a latent vari­able of sev­eral rea­son­ing tasks. Smaller gains on prob­lem solv­ing tests were seen in the com­bi­na­tion train­ing group. The group train­ing work­ing mem­ory improved on mea­sures of work­ing mem­o­ry, but not on prob­lem solv­ing tests. This study shows that it is pos­si­ble to improve Gf with train­ing, which could have impli­ca­tions for early inter­ven­tions in chil­dren.


  1. The WM tasks were not n-back:

    “The WM train­ing was the same as described in Thorell et al. (2009) devel­oped by Cogmed Sys­tems Inc. There were seven differ­ent ver­sions of visuo-s­pa­tial WM tasks, out of which three were trained every day on a rotat­ing sched­ule. Briefly, the tasks all con­sisted of a num­ber of ani­mated fig­ures pre­sented in differ­ent set­tings (e.g. swim­ming in a pool, rid­ing on a roller­coast­er). Some of the fig­ures (start­ing with two fig­ures and then increas­ing in num­ber depend­ing on the child’s per­for­mance) made a sound and changed colour dur­ing a short time peri­od. The task then con­sisted of remem­ber­ing which fig­ures had changed colour and in what order this had occurred.”

  2. The mag­ni­tude of Gf increase was not sus­pi­ciously large:

    “The NVR train­ing group showed trans­fer both when this was esti­mated with sin­gle tests, as well as when Gf was mea­sured as a latent vari­able. The mag­ni­tude of this improve­ment was approx­i­mately 8% (com­pared to the placebo group) which is com­pa­ra­ble with pre­vi­ously reported gains of Gf of 5-13.5% (Hamers et al., 1998; Jaeggi et al., 2008; Klauer & Willmes, 2002; Stankov, 1986).”

  3. There are some pos­si­ble coun­ter-ar­gu­ments to gen­er­al­iz­ing the lack of Gf gains in the WM-only group, mostly related to the young age:

    "This could mean that WM is not a lim­it­ing fac­tor for 4-year-old chil­dren solv­ing rea­son­ing prob­lems such as Raven’s CPM and Block Design. The mod­er­ate cor­re­la­tions between the Grid Task and the rea­son­ing tests (be­tween 0.3 and 0.6, see Table 1) point to the some­what coun­ter­in­tu­itive con­clu­sion that cor­re­la­tion between two under­ly­ing abil­i­ties is not a suffi­cient pre­dic­tor to deter­mine amount of trans­fer of train­ing effects between these abil­i­ties. A sim­i­lar con­clu­sion was drawn after the lack of train­ing effects on WM after train­ing inhibitory func­tions (Thorell et al., 2009). In that study WM capac­ity cor­re­lated with per­for­mance on the inhibitory tasks at base­line (R = 0.3). An imag­ing study also showed that per­for­mance on a WM grid task and inhibitory tasks acti­vate over­lap­ping parts of the cor­tex (). Inhibitory train­ing improved per­for­mance on the trained tasks, yet there was no trans­fer seen on WM tasks. The prin­ci­ples gov­ern­ing the type of cog­ni­tive train­ing that will trans­fer are still unclear and pose an impor­tant ques­tion for future stud­ies.

    One way to find these prin­ci­ples may be through under­stand­ing the neural mech­a­nisms of train­ing. For exam­ple, WM train­ing in 4-year-olds might have a more pro­nounced effect on the pari­etal lobe, com­pared to the less mature frontal lobe. If the trans­fer to Gf is depen­dent on pre­frontal func­tions, it may explain the lack of trans­fer from WM train­ing to Gf in 4-year-olds. In other words, trans­fer effects may differ with the pro­gres­sion of devel­op­men­t."

Zhao et al 2011

“Effect of updat­ing train­ing on fluid intel­li­gence in chil­dren”, Chi­nese Sci­ence Bul­letin:

Recent stud­ies have indi­cated that work­ing mem­ory (WM) train­ing can improve fluid intel­li­gence. How­ev­er, these ear­lier stud­ies con­fused the impact of WM stor­age and cen­tral exec­u­tive func­tion on the effects of train­ing. The cur­rent study used the run­ning mem­ory task to train the updat­ing abil­ity of [33] 9-11 year-old chil­dren using a dou­ble-blind con­trolled design. The results revealed that chil­dren’s fluid intel­li­gence was sig­nifi­cantly improved by mem­o­ry-up­dat­ing train­ing. Over­all, our find­ings sug­gest that the increase in fluid intel­li­gence achieved with WM train­ing is related to improv­ing cen­tral exec­u­tive func­tion.

Roughan & Hadwin 2011

“The impact of work­ing mem­ory train­ing in young peo­ple with social, emo­tional and behav­ioural diffi­cul­ties”, Laura Roughan & Julie A. Had­win 2011:

This study exam­ined the impact of a work­ing mem­ory (WM) train­ing pro­gramme on mea­sures of WM, IQ, behav­ioural inhi­bi­tion, self­-re­port test and trait anx­i­ety and teacher reported emo­tional and behav­ioural diffi­cul­ties and atten­tional con­trol before and after WM train­ing and at a 3 month fol­low-up. The WM train­ing group (N=7) showed sig­nifi­cantly bet­ter post-train­ing on mea­sures of IQ, inhi­bi­tion, test anx­i­ety and teacher-re­ported behav­iour, atten­tion and emo­tional symp­toms, com­pared with a non-in­ter­ven­tion pas­sive con­trol group (N=8). Group differ­ences in WM were also evi­dent at fol­low-up. The results indi­cated that WM train­ing has some poten­tial to be used to reduce the devel­op­ment of school related diffi­cul­ties and asso­ci­ated men­tal health prob­lems in young peo­ple. Fur­ther research using larger sam­ple sizes and mon­i­tor­ing over a longer time period is needed to repli­cate and extend these results.

The WM training was done using Cogmed; it's unclear whether the Cogmed tasks use DNB or not (it seems to include similar tasks, at least), but the study did find IQ gains:

Considering T1-T2 IQ difference scores, the analysis revealed a significant group effect with a large ES (F(1,14) = 10.37, p<.01, η² = 0.44); the intervention group showed increased IQ difference scores (N = 7, mean = 5.36, SD = 6.52, range = -2.5 to 17.5) compared with the control group (N = 7, mean = -6.35, SD = 7.21, range = -15 to 5). T1-T3 analyses indicated that the T1-T3 difference was not significant (see Fig. 1).

Note the means as com­pared with the stan­dard devi­a­tion; these are very trou­bled young peo­ple.
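As a sanity check on the reported statistics: assuming the F test is a one-way group comparison, the (partial) eta-squared effect size can be recovered from F and its degrees of freedom, and doing so for the reported F(1,14) = 10.37 lands close to the reported 0.44:

```python
# Recover (partial) eta-squared from an F statistic for a one-way design:
# eta^2 = (F * df_effect) / (F * df_effect + df_error)
def eta_squared(f, df_effect, df_error):
    return (f * df_effect) / (f * df_effect + df_error)

print(round(eta_squared(10.37, 1, 14), 2))  # 0.43, near the reported 0.44
```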

Brehmer et al 2012

“Work­ing-mem­ory train­ing in younger and older adults: train­ing gains, trans­fer, and main­te­nance”:

Working memory (WM), a key determinant of many higher-order cognitive functions, declines in old age. Current research attempts to develop process-specific WM training procedures, which may lead to general cognitive improvement. Adaptivity of the training as well as the comparison of training gains to performance changes of an active control group are key factors in evaluating the effectiveness of a specific training program. In the present study, 55 younger adults (20-30 years of age) and 45 older adults (60-70 years of age) received 5 weeks of computerized training on various spatial and verbal WM tasks. Half of the sample received adaptive training (i.e., individually adjusted task difficulty), whereas the other half worked on the same task material but on a low task difficulty level (active controls). Performance was assessed using criterion, near-transfer, and far-transfer tasks before training, after 5 weeks of intervention, as well as after a 3-month follow-up interval. Results indicate that (a) adaptive training generally led to larger training gains than low-level practice, (b) training and transfer gains were somewhat greater for younger than for older adults in some tasks, but comparable across age groups in other tasks, (c) far-transfer was observed to a test on sustained attention and for a self-rating scale on cognitive functioning in daily life for both young and old, and (d) training gains and transfer effects were maintained across the 3-month follow-up interval across age.

Used Cogmed, which Jaeggi says is not dual n-back.


One fascinating psychology result is that strongly right-handed people can improve their memory (and possibly N-back performance) by simply taking 30 seconds and flicking ("saccading") their eyes left and right (for a summary, see "A quick eye-exercise can improve your performance on memory tests (but only if you're right-handed)").

Version 4.5 of Brain Workshop introduced a saccading feature: a dot alternates sides of the screen and one is to follow it with one's eyes. You activate it by pressing 'e' while in fullscreen mode (setting WINDOW_FULLSCREEN = True in the configuration file). It may or may not be a bad idea to alternate rounds of N-back with rounds of saccading. At my request, saccading logs are now kept by BW, so at some point in the future it should be possible to request logs from users and see whether saccading in general correlates with N-back performance; I personally randomized use of saccading, but saw no benefits (see next section).

Ashirgo writes that her pre­vi­ous advice encom­passes this eye­-move­ment result; Pheonexia reports that after try­ing the sac­cad­ing before a BW ses­sion, he “per­formed bet­ter than I ever have before”.

The most recent study on this effect seems to be “Eye move­ments enhance mem­ory for indi­vid­u­als who are strongly right-handed and harm it for indi­vid­u­als who are not”. It says:

Subjects who make repetitive saccadic eye movements before a memory test subsequently exhibit superior retrieval in comparison with subjects who do not move their eyes. It has been proposed that eye movements enhance retrieval by increasing interaction of the left and right cerebral hemispheres. To test this, we compared the effect of eye movements on subsequent recall (Experiment 1) and recognition (Experiment 2) in two groups thought to differ in baseline degree of hemispheric interaction - individuals who are strongly right-handed (SR) and individuals who are not (nSR). For SR subjects, who naturally may experience less hemispheric interaction than nSR subjects, eye movements enhanced retrieval. In contrast, depending on the measure, eye movements were either inconsequential or even detrimental for nSR subjects. These results partially support the hemispheric interaction account, but demand an amendment to explain the harmful effects of eye movements for nSR individuals.

(Note that very impor­tant caveat: this is a use­ful tech­nique only for strongly right-handed peo­ple; weak right­ies and left­ies are out­right harmed by this tech­nique.)

See also "Interhemispheric Interaction and Saccadic Horizontal Eye Movements: Implications for Episodic Memory, EMDR, and PTSD"; "The efficacy and psychophysiological correlates of dual-attention tasks in eye movement desensitization and reprocessing (EMDR)"; "Horizontal saccadic eye movements enhance the retrieval of landmark shape and location information"; "Reduced misinformation effects following saccadic bilateral eye movements"; "Is saccade-induced retrieval enhancement a potential means of improving eyewitness evidence?"


Brain Workshop now has logging of saccading implemented; this was added at my request to make experimenting with saccading easier, since you can't compare scores unless you know when you were saccading or not. After this was added (thanks Jonathan etc), I began to randomize each day to saccading or not-saccading before rounds with a coin flip. Blinding is impossible, so I did nothing about that. After 158 rounds over roughly 35 days between 2012-09-10 and 2012-11-05, the result is: no difference. Not even close. So apparently, though I am strongly right-handed as the original study's memory effect required, saccading makes no difference to my n-back performance.


My BW data had to be parsed by hand and some Emacs macros because I couldn't figure out a clean programmatic way to parse it and spit out scores divvied up by whether they came from a saccading or non-saccading day (so if you want to replicate my analysis, you'll have to do that yourself). The analysis78 using BEST reveals a difference of less than 1% right (+0.4%) per round, and the estimates of effect size are negative almost as often as they are positive:

Bayesian MCMC esti­mates of differ­ence in sac­cad­ing and non-sac­cad­ing scores

Since there’s hardly any evi­dence even though this looks like plenty of data, I think I’ll stop doing sac­cad­ing. I can only speak for myself, so I would be pleased if other right-handed n-back­ers could adopt a sim­i­lar pro­ce­dure and see whether per­haps I am an excep­tion.
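For anyone wanting to adopt a similar procedure: the comparison can be sketched without BEST (which requires R/JAGS) using a plain bootstrap of the difference in mean per-round scores. This is only a stand-in for the Bayesian analysis, and the scores below are hypothetical placeholders, not my actual BW logs:

```python
import random
import statistics

# Hypothetical per-round percent-correct scores, split by the coin-flip
# condition (saccading vs. not); real values would come from the BW logs.
saccade = [61, 64, 58, 70, 66, 59, 63, 68, 62, 65]
control = [62, 63, 60, 69, 65, 61, 64, 67, 61, 66]

def bootstrap_diff(a, b, n_boot=10_000, seed=0):
    """Bootstrap a 95% interval for the difference in mean scores."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(statistics.mean(resampled_a) - statistics.mean(resampled_b))
    diffs.sort()
    # If the interval straddles 0, the data are consistent with no effect.
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]

lo, hi = bootstrap_diff(saccade, control)
print(f"95% CI for mean difference: [{lo:.2f}, {hi:.2f}]")
```

With data like mine, the interval comfortably straddles zero, which is the "no difference, not even close" conclusion in numeric form.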


“Sleep Accel­er­ates the Improve­ment in Work­ing Mem­ory Per­for­mance”, Kuriyama 2008:

Work­ing mem­ory (WM) per­for­mance, which is an impor­tant fac­tor for deter­min­ing prob­lem-solv­ing and rea­son­ing abil­i­ty, has been firmly believed to be con­stant. How­ev­er, recent find­ings have demon­strated that WM per­for­mance has the poten­tial to be improved by repet­i­tive train­ing. Although var­i­ous skills are reported to be improved by sleep, the ben­e­fi­cial effect of sleep on WM per­for­mance has not been clar­i­fied. Here, we show that improve­ment in WM per­for­mance is facil­i­tated by post­train­ing nat­u­ral­is­tic sleep. A spa­tial vari­ant of the n-back WM task was per­formed by 29 healthy young adults who were assigned ran­domly to three differ­ent exper­i­men­tal groups that had differ­ent time sched­ules of repet­i­tive n-back WM task ses­sions, with or with­out inter­ven­ing sleep. Inter­group and inter­s­es­sion com­par­isons of WM per­for­mance (ac­cu­racy and response time) pro­files showed that n-back accu­racy after post­train­ing sleep was sig­nifi­cantly improved com­pared with that after the same period of wake­ful­ness, inde­pen­dent of sleep tim­ing, sub­jec­t’s vig­i­lance lev­el, or cir­ca­dian influ­ences. On the other hand, response time was not influ­enced by sleep or repet­i­tive train­ing sched­ules. The present study indi­cates that improve­ment in n-back accu­ra­cy, which could reflect WM capac­i­ty, essen­tially ben­e­fits from post­train­ing sleep.

(In this test, the baseline/unpracticed per­for­mance of the two groups was the same; but the sched­ule in which sub­jects trained at 10 PM and went to bed resulted in greater improve­ments in per­for­mance than sched­ules in which sub­jects trained when they got up at 8 AM and went to bed ~10 PM.)

Lucid dreaming

Stephen LaBerge, pioneer of lucid dreaming research, writes79:

“Why then is CNS acti­va­tion nec­es­sary for lucid dream­ing? Evi­dently the high level of cog­ni­tive func­tion involved in lucid dream­ing requires a cor­re­spond­ingly high level of neu­ronal acti­va­tion. In terms of Antrobus’s (1986) adap­ta­tion of Ander­son’s (1983) ACT* model of cog­ni­tion to dream­ing, work­ing mem­ory capac­ity is pro­por­tional to cog­ni­tive acti­va­tion, which in turn is pro­por­tional to cor­ti­cal acti­va­tion. Becom­ing lucid requires an ade­quate level of work­ing mem­ory to active the presleep inten­tion to rec­og­nize that one is dream­ing. This level of acti­va­tion is appar­ently not always avail­able dur­ing sleep but nor­mally only dur­ing pha­sic REM.”

It has been speculated80 that WM and the prefrontal cortex are partially de-activated during REM sleep and this is why dreamers do not realize they are dreaming - the same region that n-back tasks activate.81 The suggestion then goes that n-back training will enable greater dream recognition & recall, which are crucial skills for any would-be lucid dreamer. A number of people have reported more vivid dreams and lucid dreams as a result of n-back training (eg. Boris & Michael).

On the other hand, I have seen anecdotal reports that any intense mental exercise or learning causes increased dreaming, even if the exercise is domain-specific (eg. the famous Tetris effect) or just memorization (as in use of Mnemosyne for spaced repetition), and LaBerge also remarks (pg 165 of Exploring the World of Lucid Dreaming):

Most peo­ple assume that a major func­tion of sleep­ing and dream­ing is rest and recu­per­a­tion. This pop­u­lar con­cep­tion has been upheld by research. Thus, for humans, phys­i­cal exer­cise leads to more sleep, espe­cially delta sleep. Growth hor­mone, which trig­gers growth in chil­dren and the repair of stressed tis­sues, is released in delta sleep. On the other hand, men­tal exer­cise or emo­tional stress appears to result in increases in REM sleep and dream­ing.


General cognitive factors like working memory and processing speed (& perceptual processing82) are traits that peak in early adulthood and then decline over a lifetime; the following image was adapted by Gizmodo from a study of age-related decline, "Models of visuospatial and verbal memory across the adult life span"83. The units are standard scores, ie. units of standard deviations (so for the 80-year-olds to be two full units below the 20-year-olds indicates a profound fall in the averages84); the first image is from Park et al 2002:

Graph of mul­ti­ple men­tal traits due to age-re­lated decline (in stan­dard devi­a­tions) Schaie 1996, Intel­lec­tual Devel­op­ment…

An analysis in the Cambridge brain-training study found "Age was by far the most significant predictor of performance, with the mean scores of individuals in their 60s ~1.7 SDs below those in their early 20s (Figure 4a). (Note, in intelligence testing, 1 SD is equivalent to 15 IQ points)." These declines in reasoning affect valuable real-world activities like personal finance85, and simple everyday questions:

from Agar­wal et al 2009, “The Age of Rea­son: Finan­cial Deci­sions over the Life-Cy­cle with Impli­ca­tions for Reg­u­la­tion”
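The SD-to-IQ conversion quoted above (1 SD = 15 IQ points) makes it easy to translate these declines into familiar units; a throwaway helper, with the 1.7 SD Cambridge gap as the worked example:

```python
# Convert an age-related decline expressed in standard deviations into
# IQ-point equivalents, using the convention that 1 SD = 15 IQ points.
SD_TO_IQ = 15

def sd_to_iq_points(sd_units: float) -> float:
    """Translate a gap in standard deviations into IQ-point units."""
    return sd_units * SD_TO_IQ

# The Cambridge study's ~1.7 SD gap between 60-somethings and early-20s
# works out to roughly 25 IQ points:
print(sd_to_iq_points(1.7))
```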

These results may be surprising because some studies did not find such dramatic declines, but apparently part of the decline can be hidden by practice effects86, and they are consistent with other results on lifelong cognitive change87. Longitudinal studies are pessimistic, finding declines early on, in one's 40s (Singh-Manoux et al 2011). The degradation of white matter and its effects on episodic memory retrieval have been studied using neuroimaging as well. Another 2011 study testing 2,000 individuals between 18 and 60 found that "Top performances in some of the tests were accomplished at the age of 22. A notable decline in certain measures of abstract reasoning, brain speed and in puzzle-solving became apparent at 27."88 (Of course, like the previous study, a correlation over many individuals of varying ages is not as good as having a series of performance measurements for one aging individual. But time will cure that fault, hopefully.) The abstract of this Salthouse study says:

…Re­sults from three meth­ods of esti­mat­ing retest effects in this pro­ject, together with results from stud­ies com­par­ing non-hu­man ani­mals raised in con­stant envi­ron­ments and from stud­ies exam­in­ing neu­ro­bi­o­log­i­cal vari­ables not sus­cep­ti­ble to retest effects, con­verge on a con­clu­sion that some aspects of age-re­lated cog­ni­tive decline begin in healthy edu­cated adults when they are in their 20s and 30s.

From the optimistic perspective, Salthouse tested Fortune 500 CEOs and found that their representation by age didn't start dropping until their 60s, suggesting that they remained reasonably mentally sharp or were, in practice, compensating for the many insults of age;89 this way of thinking has obvious flaws for the rest of us.

There are a number of results indicating that the elderly, perhaps because they have so much more severe cognitive deficits than the young, respond better to treatment. (This is common in nootropics: finding that something does not work in the young but does in the elderly, eg. creatine.) IQ gains in young adults are difficult and minimal even in Jaeggi 2008, but older adults improve about as much as young adults in Brehmer et al 2012, and instructing older adults to think aloud during an IQ test boosts scores (yet not younger adults')90; training >65-year-olds in one adaptive WM task similar to SNB led to gains of ~6 IQ points which were still present 8 months later. "Working Memory Training in Older Adults: Evidence of Transfer and Maintenance Effects" & Carretti et al 2012 make for interesting reading91:

Few stud­ies have exam­ined work­ing mem­ory (WM) train­ing-re­lated gains and their trans­fer and main­te­nance effects in older adults. This present research inves­ti­gates the effi­cacy of a ver­bal WM train­ing pro­gram in adults aged 65-75 years, con­sid­er­ing spe­cific train­ing gains on a ver­bal WM (cri­te­ri­on) task as well as trans­fer effects on mea­sures of visu­ospa­tial WM, short­-term mem­o­ry, inhi­bi­tion, pro­cess­ing speed, and fluid intel­li­gence. Main­te­nance of train­ing ben­e­fits was eval­u­ated at 8-month fol­low-up. Trained older adults showed higher per­for­mance than did con­trols on the cri­te­rion task and main­tained this ben­e­fit after 8 months. Sub­stan­tial gen­eral trans­fer effects were found for the trained group, but not for the con­trol one. Trans­fer main­te­nance gains were found at fol­low-up, but only for fluid intel­li­gence and pro­cess­ing speed tasks. The results are dis­cussed in terms of cog­ni­tive plas­tic­ity in older adults.

For more on aging and the brain, one forum member recommends reading:

Hedden T, Gabrieli JD. Nat Rev Neuroscience. 2004 Feb;5(2):87-96. 'Insights into the aging mind: a view from cognitive neuroscience'. PMID 14735112, which is available as full text from this link: http://brainybehavior.com/blog/wp-content/uploads/2007/11/agingbrain.pdf. I cannot recommend this paper highly enough. Additionally, the Salthouse Cognitive Aging Laboratory, which oversees the Virginia Cognitive Aging Project (VCAP) at the University of Virginia, is the premier facility in the US (and arguably the world) undertaking active, longitudinal studies of aging. The VCAP study has done comprehensive cognitive assessments in adults ranging from 18 to 98 years of age. Approximately 3,800 adults have participated in their three-session (6-8 hour) assessment at least once, with about 1,600 participating at least twice, and about 450 of them participating three or more times. The data from this project have served as the basis for a veritable cornucopia of scientific publications which are available in the Resources Section of their website http://faculty.virginia.edu/cogage/links/publications/. Nearly 200 papers on the cognitive impact of aging are available free of charge on their website. It is necessary to register with your name and email address to access the papers, but it is well worth it.


Oth­ers to fol­low up on:

There are sev­eral stud­ies show­ing that work­ing mem­ory and intel­li­gence are strongly relat­ed. How­ev­er, work­ing mem­ory tasks require simul­ta­ne­ous pro­cess­ing and stor­age, so the causes of their rela­tion­ship with intel­li­gence are cur­rently a mat­ter of dis­cus­sion. The present study exam­ined the simul­ta­ne­ous rela­tion­ships among short­-term mem­ory (STM), work­ing mem­ory (WM), and gen­eral intel­li­gence (g). Two hun­dred and eight par­tic­i­pants per­formed six ver­bal, quan­ti­ta­tive, and spa­tial STM tasks, six ver­bal, quan­ti­ta­tive, and spa­tial WM tasks, and eight tests mea­sur­ing flu­id, crys­tal­lized, spa­tial, and quan­ti­ta­tive intel­li­gence. Espe­cial care is taken to avoid mis­rep­re­sent­ing the rela­tions among the con­structs being stud­ied because of spe­cific task vari­ance. Struc­tural equa­tion mod­el­ing (SEM) results revealed that (a) WM and g are (al­most) iso­mor­phic con­structs, (b) the iso­mor­phism van­ishes when the stor­age com­po­nent of WM is par­tialed out, and (c) STM and WM (with its stor­age com­po­nent par­tialed out) pre­dict g.

  • Colom et al. "General intelligence and memory span: Evidence for a common neuroanatomic framework"; Cognitive Neuropsychology, Volume 24, Issue 8 (December 2007), pages 867-878

Gen­eral intel­li­gence (g) is highly cor­re­lated with work­ing-mem­ory capac­ity (WMC). It has been argued that these cen­tral psy­cho­log­i­cal con­structs should share com­mon neural sys­tems. The present study exam­ines this hypoth­e­sis using struc­tural mag­netic res­o­nance imag­ing to deter­mine any over­lap in brain areas where regional grey mat­ter vol­umes are cor­re­lated to mea­sures of gen­eral intel­li­gence and to mem­ory span. In nor­mal vol­un­teers (N = 48) the results (p < .05, cor­rected for mul­ti­ple com­par­isons) indi­cate that a com­mon anatomic frame­work for these con­structs impli­cates mainly frontal grey mat­ter regions belong­ing to Brod­mann area (BA) 10 (right supe­rior frontal gyrus and left mid­dle frontal gyrus) and, to a lesser degree, the right infe­rior pari­etal lob­ule (BA 40). These find­ings sup­port the nuclear role of a dis­crete pari­eto-frontal net­work.



There are many free imple­men­ta­tions in Flash etc. online:







See also Lucas Charles’s August 2011 review of 6 Android DNB apps.






Offline N-back

You can play N-back in the real world, with­out a com­put­er, if you like. See the ML thread “Non-elec­tronic game ver­sion of N-back task” and the Snap­Back rules. Jonathan Toomim points out that N-back can be eas­ily done with a deck of cards alone, and the FAQ’s author sug­gests a sim­ple men­tal arith­metic rou­tine suit­able for med­i­ta­tion that is much like SNB.
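The deck-of-cards version amounts to checking each dealt card against the one N positions back; a minimal sketch of the scoring logic (the helper function is my own illustration, not taken from the thread):

```python
import random

# Single N-back with a deck of cards: deal cards one at a time and call
# out whenever the current card's rank matches the card N positions back.
def nback_matches(sequence, n):
    """Return the indices where sequence[i] equals sequence[i - n]."""
    return [i for i in range(n, len(sequence)) if sequence[i] == sequence[i - n]]

# Simulate a 30-card deal and find the 2-back matches a player should call.
ranks = [random.choice("A23456789TJQK") for _ in range(30)]
print("matches at positions:", nback_matches(ranks, 2))
```

Playing with a partner, one person deals and tracks `nback_matches` mentally while the other verifies against the face-up discard pile.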

What else can I do?

Transcranial direct current stimulation (tDCS) may increase WM, although it remains unclear whether the performance gains persist afterwards. See Boggio et al 2005, Fregni et al 2005, Ohn et al 2007, Boggio et al 2008, Jo et al 2009, Andrews et al 2011, Zaehle et al 2011, Berryhill & Jones 2012, Tseng et al 2012, Martin et al 2014, Matzen & Trumbo 2014, Carvalho et al 2014, Moreno et al 2015, de Putter et al 2015, Choe et al 2016 (but also Marshall et al 2005, Steenbergen et al 2015, Hoy et al 2015, van Wessel et al 2015, Rethans et al 2015, Nilsson et al 2017, an informal incomplete tDCS-DNB experiment, and the meta-analysis Hill et al 2015).

Forum mem­bers have rec­om­mended a num­ber of other things for gen­eral men­tal fit­ness:


Nootropics (see the Nootropics page for the author's own experiences with them) may help boost performance. The relation of caffeine to learning & memory is complicated; for now, see the thread on it or my Nootropics page.


A useful pharmaceutical is piracetam; TheQ17 mentions that "Personally, I have found piracetam to be quite useful in helping me stay alert and focused during long study hours or doing redundant tasks." Other members also swear by piracetam+choline.

The author of this FAQ reports that piracetam helped reduce mental fatigue and gave a small (~10%) increase in his D4B score.


Reece writes

I've tried [a chemical extracted from an herb] (actually been using it for about a year now) and it is quite effective for both lucid dreaming and increasing dream recall if taken shortly before bed, not to mention the other benefits you'd expect from a potent acetylcholinesterase inhibitor. I haven't had anything in the way of negative side effects when I've stuck to a 5 day/week dosage of 200mcg.

I've never tried piracetam; however, oxiracetam felt like a placebo when compared to the benefits I've received from huperzine A. At larger doses, I've found huperzine A to be far more powerful than any nootropic I've ever tried (haven't tried any prescription meds), however the side effects such as blurry vision and light-headedness weren't something I could tolerate.

He fur­ther com­pared their effects:

I found Oxirac­etam to have a some­what “speedy” effec­t—you would cer­tainly know you took some­thing if some­one slipped that in your drink! As for effects, Oxirac­etam seemed to help most with ver­bal flu­ency (au­di­tory work­ing mem­o­ry?) and cre­ativ­i­ty. Huperzine helped more with work­ing mem­ory although it did­n’t have some of the inter­est­ing effects Oxirac­etam had on cre­ativ­i­ty, nor the speedy rush that some­times seemed like a pow­er­ful moti­va­tor to get work done.

(Reece did not take the oxirac­etam with any choline sup­ple­ments, which is usu­ally rec­om­mend­ed.)


In the realm of unusual supplements to n-backing, a few other compounds have been suggested, but the evidence is too weak to say much.

See Also

  • The author’s own Brain Work­shop sta­tis­tics can be found here


Flaws in mainstream science (and psychology)

2013 dis­cus­sion of how sys­temic biases in sci­ence, par­tic­u­larly med­i­cine and psy­chol­o­gy, have resulted in a research lit­er­a­ture filled with false pos­i­tives and exag­ger­ated effects, called ‘the Repli­ca­tion Cri­sis’.

Long-standing problems in standard scientific methodology have exploded as the "Replication Crisis": the discovery that many results in fields as diverse as psychology, economics, medicine, biology, and sociology are in fact false or grossly inaccurate in magnitude. I cover here a handful of the issues and publications on this large, important, and rapidly developing topic up to about 2013, at which point the Replication Crisis became too large a topic to cover more than cursorily.

The cri­sis is caused by meth­ods & pub­lish­ing pro­ce­dures which inter­pret ran­dom noise as impor­tant results, far too small datasets, selec­tive analy­sis by an ana­lyst try­ing to reach expected/desired results, pub­li­ca­tion bias, poor imple­men­ta­tion of exist­ing best-prac­tices, non­triv­ial lev­els of research fraud, soft­ware errors, philo­soph­i­cal beliefs among researchers that false pos­i­tives are accept­able, neglect of known con­found­ing like genet­ics, and skewed incen­tives (fi­nan­cial & pro­fes­sion­al) to pub­lish ‘hot’ results.

Thus, any indi­vid­ual piece of research typ­i­cally estab­lishes lit­tle. Sci­en­tific val­i­da­tion comes not from small p-val­ues, but from dis­cov­er­ing a reg­u­lar fea­ture of the world which dis­in­ter­ested third par­ties can dis­cover with straight­for­ward research done inde­pen­dently on new data with new pro­ce­dures—repli­ca­tion.
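The mechanics of how underpowered studies plus selective publication inflate effects (the "winner's curse") can be demonstrated in a few lines of simulation; the sample size, true effect, and significance threshold below are arbitrary illustrative choices:

```python
import random
import statistics

# Simulate the "winner's curse": run many underpowered studies of a small
# true effect, "publish" only the statistically significant ones, and see
# how badly the published literature exaggerates the effect.
random.seed(1)
TRUE_EFFECT, N, STUDIES = 0.2, 20, 2000

published = []
for _ in range(STUDIES):
    control = [random.gauss(0, 1) for _ in range(N)]
    treated = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.stdev(treated) ** 2 / N + statistics.stdev(control) ** 2 / N) ** 0.5
    if abs(diff / se) > 1.96:          # crude z-test at p < 0.05
        published.append(diff)

print(f"true effect: {TRUE_EFFECT}")
print(f"mean published effect: {statistics.mean(published):.2f}")
```

With only 20 subjects per arm, the significant studies overestimate the true effect several-fold, which is why a literature of small positive studies can be consistent with a near-zero real effect.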

Split out to a separate page.

  1. By IQ, I mean fluid intelligence, not crystallized intelligence, since it's unlikely that any generic training would teach you Latinate vocabulary terms or middle-school geometry. For those who object to the entire idea, please see Wikipedia; for a balanced overview of what IQ can predict and the exceptions, see Sternberg et al's 2001 review, "The Predictive Value of IQ".↩︎

  2. After a large amount of train­ing, a task may become learned and cease to stress the bot­tle­neck: eg “Vir­tu­ally Per­fect Time Shar­ing in Dual-task Per­for­mance: Uncork­ing the Cen­tral Cog­ni­tive Bot­tle­neck”.↩︎

  3. See for exam­ple “Do work­ing mem­ory and sus­cep­ti­bil­ity to inter­fer­ence pre­dict indi­vid­ual differ­ences in fluid intel­li­gence?”, Borella 2006; WM pre­dicts IQ bet­ter than strong focus/attention, with the cor­re­la­tion com­ing mostly from focus with only a small load­ing on exec­u­tive con­trol (Chud­er­ski & Necka 2012).↩︎

  4. "Brain networks for working memory and factors of intelligence assessed in males and females with fMRI and DTI", Tang 2010; it found that "individual differences in activation during the n-back task were correlated to the general intelligence factor (g), as well as to distilled estimates (removing g) of speed of reasoning, numerical ability, and spatial ability, but not to memory". PDF available in Group Files.

    A more recent result is the fMRI study Chein 2011, “Domain-gen­eral mech­a­nisms of com­plex work­ing mem­ory span”, which abstract says “For both ver­bal and spa­tial ver­sions of the task, com­plex work­ing mem­ory span per­for­mance increased the activ­ity in lat­eral pre­frontal, ante­rior cin­gu­late, and pari­etal cor­tices dur­ing the Encod­ing, Main­te­nance, and Coor­di­na­tion phase of task per­for­mance. Mean­while, over­lap­ping activ­ity in ante­rior pre­frontal and medial tem­po­ral lobe regions was asso­ci­ated with both ver­bal and spa­tial recall from work­ing mem­o­ry.”↩︎

  5. eg. “Rea­son­ing=­work­ing mem­ory ≠ atten­tion”, Buehner & Krummb & Pick 2005; more back­ground is avail­able on pg 10/92 of “Work­ing mem­o­ry, fluid intel­li­gence, and sci­ence learn­ing”. But see the meta-analy­ses in Ack­er­man et al 2005 which find that WM ≠ IQ.↩︎

  6. from Jaeggi et al 2010:

    The find­ings of Study 1 con­firm other find­ings from the lit­er­a­ture (Jaeg­gi, Buschkuehl, Per­rig, & Meier, 2010; Kane, Con­way, Miu­ra, & Colflesh, 2007): Con­sis­tent with our hypothe­ses, both n-back task vari­ants were highly cor­re­lat­ed, and both were best pre­dicted by Gf.

    In gen­er­al, matrix rea­son­ing tasks seem to be bet­ter pre­dic­tors for both the sin­gle and the dual n-back tasks than a mea­sure of work­ing mem­ory capac­i­ty. As the reli­a­bil­ity esti­mates were appro­pri­ate for the n-back tasks, the lack of cor­re­la­tion between the n-back tasks and the mea­sure of work­ing mem­ory capac­ity can­not be attrib­uted to insuffi­cient reli­a­bil­ity (Jaeg­gi, Buschkuehl, Per­rig, & Meier, 2010). Rather, it seems that per­for­mance for the two tasks relies on differ­ent sources of vari­ance, which might result from the differ­ent mem­ory processes that are involved in the two tasks: whereas the n-back task relies on pas­sive recog­ni­tion process­es, per­for­mance in work­ing mem­ory capac­ity tasks requires active and strate­gic recall processes (Kane, Con­way, Miu­ra, & Colflesh, 2007).

  7. “Work­ing mem­ory capac­ity and fluid abil­i­ties: Exam­in­ing the cor­re­la­tion between Oper­a­tion Span and Raven”, Unsworth, Intel­li­gence 2005:

    How­ev­er, as shown in Fig. 2, the cor­re­la­tions between solu­tion accu­racy for each item and Ospan, although fluc­tu­at­ing wide­ly, does not appear to increase in any sys­tem­atic man­ner as diffi­culty increas­es. Indeed, the cor­re­la­tion between Ospan and accu­racy on the first prob­lem was as high as with prob­lem 24 (i.e., prob­lem 1 r=0.26, prob­lem 24 r=0.26). These results are strik­ingly sim­i­lar to those of Salt­house (1993) who showed roughly the same pat­tern of cor­re­la­tions between solu­tion accu­racy and a WM com­pos­ite. Both sets of results sug­gest that there is not a clear rela­tion­ship between item vari­a­tions in diffi­culty on Raven and mea­sures of WM.

    …Although there seems to be ade­quate vari­abil­ity for quar­tile 4, this low cor­re­la­tion is prob­a­bly due to the fact that not as many sub­jects attempted these prob­lems. Indeed, 80% of par­tic­i­pants attempted the first 27 prob­lems, but only 47% of par­tic­i­pants fin­ished the test. Thus, only quar­tiles 1-3 should be inter­pret­ed. With this in mind, the results demon­strate that the cor­re­la­tion between solu­tion accu­racy and Ospan does not increase as diffi­culty increases but instead remains fairly con­stant across increas­ing lev­els of diffi­cul­ty.

    …One reviewer was con­cerned that only high work­ing mem­ory capac­ity indi­vid­u­als would fin­ish the test. How­ev­er, of those par­tic­i­pants clas­si­fied as high work­ing mem­ory (one stan­dard devi­a­tion above the mean on Ospan), only 25% of them actu­ally fin­ished the test, whereas 71% of those clas­si­fied as low work­ing mem­ory (one stan­dard devi­a­tion below the mean on Ospan) fin­ished the test. This results in some­what lower scores for these 76 indi­vid­u­als on the two mea­sures as com­pared the full sam­ple (i.e. M Ospan=11.12, S.D.=5.90; M Raven=17.50, S.D.=7.59).

  8. “Does work­ing mem­ory train­ing gen­er­al­ize?”, Ship­stead et al 2010; abstract:

    Recent­ly, attempts have been made to alter the capac­ity of work­ing mem­ory (WMC) through exten­sive prac­tice on adap­tive work­ing mem­ory tasks that adjust diffi­culty in response to user per­for­mance. We dis­cuss the design cri­te­ria required to claim valid­ity as well as gen­er­al­iz­abil­ity and how recent stud­ies do or do not sat­isfy those cri­te­ria. It is con­cluded that, as of yet, the results are incon­sis­tent and this is likely dri­ven by inad­e­quate con­trols and ineffec­tive mea­sure­ment of the cog­ni­tive abil­i­ties of inter­est.

  9. See Min­ear & Shah 2008:

    Per­for­mance on task switch­ing, a par­a­digm com­monly used to mea­sure exec­u­tive func­tion, has been shown to improve with prac­tice. How­ev­er, no study has tested whether these ben­e­fits are spe­cific to the tasks learned or are trans­fer­able to new sit­u­a­tions. We report evi­dence of trans­fer­able improve­ment in a cued, ran­domly switch­ing par­a­digm as mea­sured by mix­ing cost, but we report no con­sis­tent improve­ment for switch cost. Improve­ment in mix­ing costs arises from a rel­a­tive reduc­tion in time to per­form both switch and non­switch tri­als that imme­di­ately fol­low switch tri­als, impli­cat­ing the abil­ity to recover from unex­pected switches as the source of improve­ment. These results add to a grow­ing num­ber of stud­ies demon­strat­ing gen­er­al­iz­able improve­ment with train­ing on exec­u­tive pro­cess­ing.

  10. “Guest Column: Can We Increase Our Intel­li­gence?”; Sam Wang & San­dra Aamodt; The New York Times

    Differ­ences in work­ing mem­ory capac­ity account for 50-70% of indi­vid­ual differ­ences in fluid intel­li­gence (ab­stract rea­son­ing abil­i­ty) in var­i­ous meta-analy­ses, sug­gest­ing that it is one of the major build­ing blocks of I.Q. (Ack­er­man et al; Kane et al; Süss et al 2002) This idea is intrigu­ing because work­ing mem­ory can be improved by train­ing.

    See also the 2012 NYT fol­lowup, “Can You Make Your­self Smarter?”↩︎

  11. Is this right? I have no idea. But it is a curi­ous col­lec­tion of stud­ies and an inter­est­ing pro­posed mod­el: Hat­ton 1997:

    For years I sub­scribed to such a prin­ci­ple: that mod­u­lar­iza­tion, or struc­tural decom­po­si­tion, is a good design con­cept and there­fore always improves sys­tems. This belief is so wide­spread as to be almost unchal­lenge­able. It is respon­si­ble for the impor­tant pro­gram­ming lan­guage con­cept of com­pi­la­tion mod­el­s-which are either sep­a­rate, with guar­an­teed inter­face con­sis­tency (such as C++, Ada, and Mod­u­la-2), or inde­pen­dent, whereby a sys­tem is built in pieces and glued together later (C and For­tran, for exam­ple). It is a very attrac­tive con­cept with strong roots in the “divide and con­quer” prin­ci­ple of tra­di­tional engi­neer­ing. How­ev­er, this con­ven­tional wis­dom may be wrong. Only those com­po­nents that fit best into human short­-term mem­ory cache seem to use it effec­tive­ly, thereby pro­duc­ing the low­est fault den­si­ties. Big­ger and smaller aver­age com­po­nent sizes appear to degrade reli­a­bil­i­ty.

    …It is easy to get the impression from these case histories that developing software systems with low fault densities is exceedingly difficult. In fact, analysis of the literature reveals that faults per KLOC will approach an asymptote as time increases. In reality, only this asymptote makes sense for comparing the reliability of different systems. So, given that the asymptote can never be reached, the faults per KLOC and the rate of change of this value are required to compare such systems effectively. Of course, real systems are subject to continual noncorrective change, so things become rather more complex. No notion of rate of change of faults per KLOC was available for any of the data in this study, although both mature and immature systems were present, with the same behavior observed. This would suggest that the observed defect behavior is present through the life cycle, supporting even further the conjecture that it is a macroscopic property. If only immature systems had been present in the studies, it could have been argued that smaller components may get exercised more. This does not seem to be the case.

    A further related point, also observed in the NAG library study, is that when component fault densities are plotted as a function of size, the usage of each component must be taken into account. The models discussed in this article are essentially asymptotic, and the fault densities they predict are therefore an envelope to which component fault densities will tend only as they are used sufficiently to begin to flush out faults. An unused component has complexity but no faults, by definition. The literature reports apparently near-zero-defect systems that have turned out on closer inspection to have been unused. Consider the data shown in Figure 2, compiled from NASA Goddard data by the University of Maryland's Software Engineering Laboratory, as quoted in the December 1991 special edition of Business Week. First of all, in spite of NASA's enormous resources and talent pool, the average was still five to six faults per KLOC. Other studies have reported similar fault densities.4,8 More telling is the observation that in Figure 2, improvement has been achieved mostly by improving the bad processes, not the good ones. This fact suggests that consistency, a process issue, has improved much more than actual fault density, a product issue. The simple conclusion is that the average across many languages and development efforts for "good" software is around six faults per KLOC, and that with our best techniques, we can achieve 0.5-1 fault per KLOC. Perfection will always elude us, of course, but the intractability of achieving systematically better fault densities than have been achieved so far also suggests that some other limitation may be at work.

    THE PROPOSED MODEL …Re­cov­ery code scram­bling is an impor­tant fac­tor in my pro­posed mod­el. The evi­dence sug­gests that any­thing that fits in a short­-term or cache mem­ory is eas­ier to under­stand and less fault­-prone; pieces that are too large over­flow, involv­ing use of the more error-prone recov­ery code mech­a­nism used for long-term stor­age. Thus, if a pro­gram­mer is work­ing with a com­po­nent of com­plex­ity Ω, and that com­po­nent fits entirely into the cache or short­-term mem­o­ry, which in turn can be manip­u­lated with­out recourse to back­-up or long-term mem­o­ry, the incre­men­tal increase in bugs or dis­or­der dE due to an incre­men­tal increase of com­plex­ity of dΩ is sim­ply dE = (1/Ω) dΩ.

    This resembles the argument leading to Boltzmann’s law relating entropy to complexity, where the analogue of equipartition of energy in a physical system is mirrored by the apparently equal distribution of rehearsal activity in the short-term memory. In other words, because no part of the cache is favored and the cache accurately manipulates symbols, the incremental increase in disorder is inversely proportional to the existing complexity, making the ideal case when pieces just fit into cache. It is assumed without loss of generality that both E and Ω are continuously valued variables. What happens when we encounter complexity greater than Ω′ (the complexity which will just fit into the cache)? The increase in disorder will correspond to the complexity in the (now-full) cache contents, plus a contribution proportional to the number of times the cache memory must be reloaded from the long-term memory. In other words, dE = (1/(2Ω′)) × (1 + Ω/Ω′) dΩ

    The factor of 1/2 matches Equation 1 when Ω = Ω′, that is, when the complexity of the program is about to overflow the cache memory. The second term is directly proportional to the cache overflow effect and mimics the scrambling of the recovery codes. Integrating Equations 1 and 2 suggests that E = ln Ω for Ω ≤ Ω′ and E = (1/2) × (Ω/Ω′ + Ω²/(2Ω′²)) for Ω > Ω′
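The integration step can be spelled out directly (my notation, not Hatton’s; integration constants are dropped, as in the excerpt):

```latex
% Region 1: component fits in the short-term-memory cache (Equation 1)
E(\Omega) = \int \frac{d\Omega}{\Omega} = \ln \Omega
  \qquad (\Omega \le \Omega')

% Region 2: cache overflows (Equation 2); at \Omega = \Omega' the integrand
% reduces to 1/\Omega', matching Equation 1 as the excerpt notes
E(\Omega) = \int \frac{1}{2\Omega'}\left(1 + \frac{\Omega}{\Omega'}\right) d\Omega
          = \frac{1}{2\Omega'}\left(\Omega + \frac{\Omega^{2}}{2\Omega'}\right)
          = \frac{1}{2}\left(\frac{\Omega}{\Omega'} + \frac{\Omega^{2}}{2\,\Omega'^{2}}\right)
  \qquad (\Omega > \Omega')
```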

    …The Ada data and the assem­bly and macro-assem­bly data pro­vide strong empir­i­cal sup­port for this behav­ior, with about 200 to 400 lines cor­re­spond­ing to the com­plex­ity Ω′ at which cache mem­ory over­flows into long-term mem­o­ry. That such dis­parate lan­guages can pro­duce approx­i­mately the same tran­si­tion point from log­a­rith­mic to qua­dratic behav­ior sup­ports the view that Ω is not the under­ly­ing algo­rith­mic com­plex­ity but the sym­bolic com­plex­ity of the lan­guage imple­men­ta­tion, given that a line of Ada would be expected to gen­er­ate five or more lines of assem­bly. This is directly anal­o­gous to the obser­va­tion that it is fit, rather than the actual infor­ma­tion con­tent of the cache that is rel­e­van­t.9

    …To sum­ma­rize, if a sys­tem is decom­posed into pieces much smaller than the short­-term mem­ory cache, the cache is used ineffi­ciently because the inter­face of such a com­po­nent with its neigh­bors is not “rehearsed” explic­itly into the cache in the same way, and the result­ing com­po­nents tend to exhibit higher defect den­si­ties. If com­po­nents exceed the cache size, they are less com­pre­hen­si­ble because the recov­ery codes con­nect­ing com­pre­hen­sion with long-term mem­ory break down. Only those com­po­nents that match the cache size well use it effec­tive­ly, thereby pro­duc­ing the low­est fault den­si­ties.

    …Suppose that a particular functionality requires 1,000 “lines” to implement, where a “line” is some measure of complexity. The immediate implication of the earlier discussion is that, to be reliable, we should implement it as five 200-line components (each fitting in cache) rather than as 50 20-line components. The former would lead to perhaps 5 × ln(200) ≈ 25 bugs while the latter would lead to 50 × ln(20) ≈ 150 bugs. This apparently inescapable but unpleasant conclusion runs completely counter to conventional wisdom. …The additional unreliability caused by splitting up the system might be due to simple interface inconsistencies. The Basili-Perricone study considered this a possible explanation, as did Moller-Paulish. However, it was not a factor in the Hatton-Hopkins study, since the internally reusable components in the NAG library (largely externally used reusable components) had high interface consistency. Furthermore, it is unlikely to explain the Compton-Withrow data because Ada mandates interface consistency in language implementations. (This may be responsible for the difference in small components in Figure 4.)
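The comparison can be sketched numerically. This is my own illustration, not Hatton’s code: the threshold Ω′ = 250 is an assumed value within his reported 200-400 line range, and the formulas are the quoted ones with integration constants dropped (so the two regimes are not continuous at Ω′; only relative comparisons within a regime are meaningful).

```python
import math

# Assumed cache-overflow threshold Omega' (in "lines"); Hatton reports ~200-400.
OMEGA_PRIME = 250

def component_defects(size: float, cache: float = OMEGA_PRIME) -> float:
    """Predicted disorder E for one component of complexity `size`,
    per the quoted piecewise model: logarithmic while the component
    fits in the short-term-memory cache, quadratic growth beyond it."""
    if size <= cache:
        return math.log(size)  # E = ln(size)
    return 0.5 * (size / cache + size ** 2 / (2 * cache ** 2))

def system_defects(total_lines: float, n_components: int) -> float:
    """Total predicted defects when `total_lines` is split evenly
    into `n_components` pieces."""
    size = total_lines / n_components
    return n_components * component_defects(size)

# The excerpt's 1,000-line example (both decompositions fit in cache,
# so only the logarithmic regime applies):
print(round(system_defects(1000, 5)))   # five 200-line components  -> 26 (Hatton's "perhaps 25")
print(round(system_defects(1000, 50)))  # fifty 20-line components  -> 150
```

Fewer, cache-sized components win under this model because each component contributes only logarithmically in its size, so multiplying the number of components multiplies the count of those logarithmic terms.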

  12. Jeff Atwood; “Mul­ti­ple Mon­i­tors and Pro­duc­tiv­ity”, “The Pro­gram­mer’s Bill of Rights”, “Join­ing the Pres­ti­gious Three Mon­i­tor Club”, “Does More Than One Mon­i­tor Improve Pro­duc­tiv­i­ty?” etc.↩︎

  13. See cita­tion roundup at the Skep­tics Stack­Ex­change.↩︎

  14. Jeff Atwood, “The Large Dis­play Para­dox”↩︎

  15. See his blog posts, primarily “Programming’s Dirtiest Little Secret”. One dissenting viewpoint is John D. Cook’s “How much does typing speed matter?”, which points out that since typing speeds don’t vary by more than an order of magnitude or two or take up much time for the most part, you can’t expect the overall productivity boost of faster typing to be too big (though it could still be well worth your while). Nevertheless, I think the most compelling argument for learning typing well or using a good input device in general is to simply spend some time using a keyboard with a broken key or a worn-out mouse/trackball which once in a while misclicks: even an objectively ‘small’ error rate is enough to drive one batty and destroy ‘flow’. I’m reminded of Dan Luu’s articles on the subject.↩︎

  16. Page 457, Coders at Work:

    Seibel: “Is there any­thing you would have done differ­ently about learn­ing to pro­gram? Do you have any regrets about the sort of path you took or do you wish you had done any­thing ear­lier?”

    : “Oh, sure, sure. In high school I wish I’d taken typ­ing. I suffer from poor typ­ing yet today, but who knew. I did­n’t plan any­thing or do any­thing. I have no dis­ci­pline. I did what I wanted to do next, peri­od, all the time. If I had some fore­sight or plan­ning or some­thing, there are things, like typ­ing, I would have done when I had the chance.”

  17. When I was younger, I reasoned that early in life is the best time to learn to read fast since one reaps the greatest gains over the longest possible period (I still agree with my former reasoning) and so did a great deal of reading on speed-reading and the related academic literature, and spent more than a few hours working with tachistoscopic-style software. My ultimate conclusion was that it was a good use of my time as it bumped my WPM up to ~400-500 WPM from the ordinary 300 WPM, but the techniques were not going to give any useful ability beyond that, as greater speed becomes an indication one is reading too easy material or one should be using more sophisticated search capabilities. In particular, the techniques weren’t very useful for non-practice reading and were least useful on deep or heavily-hyperlinked content. “Photoreading”, however, is simply a scam or very shallow skimming. Unfortunately, I omitted to take notes on specific studies or programs, though, being too young to care about being able to explain & defend my beliefs later - but that is just as well since by now, all the websites would be gone, programs bitrotten, and links broken. Readers will just have to do their own research on the topic if they care (much easier in this age of Wikipedia). One starting point: Scott Young.↩︎

  18. From the inter­view anthol­ogy (2009), pg 114:

    Peter Seibel: “Do you think that pro­gram­ming is at all biased toward being young?”

    : "I used to think so. A few years ago I had , but I did­n’t know it. I thought I was just get­ting tired and old, and I got to the point where it was so diffi­cult to con­cen­trate that I could­n’t pro­gram any­more because I just could­n’t keep enough stuff in my head. A lot of pro­gram­ming is you keep stuff in your head until you can get it writ­ten down and struc­tured prop­er­ly. And I just could­n’t do it.

    I had lost that ability and I thought it was just because I was getting older. Fortunately, I got better and it came back and so I’m programming again. I’m doing it well and maybe a little bit better now because I’ve learned how not to depend so much on my memory. I’m better at documenting my code now than I used to be because I’m less confident that I’ll remember next week why I did this. In fact, sometimes I’ll be going through my stuff and I’m amazed at stuff that I had written: I don’t remember having done it and it’s either really awful or brilliant. I had no idea I was capable of that."

    From pg 154:

    Seibel: “How do you design code?”

    : "A lot of pro­to­typ­ing. I used to do sort of high­-level pseudocode, and then I’d start fill­ing in bot­tom up. I do less of the high­-level pseudocode because I can usu­ally hold it in my head and just do bot­tom-up until it joins.

    Often I’m work­ing with exist­ing pieces of code adding some new sub­sys­tem or some­thing on the side and I can almost do it bot­tom-up. When I get in trou­ble in the mid­dle I do still write pseudo-code and just start work­ing bot­tom up until I can com­plete it. I try not to let that take too long because you’ve got to be able to test it; you’ve got to be able to see it run and step through it and make sure it’s doing what it’s sup­posed to be doing."

    From pg 202, a cogent reminder that ’tis a good wind that blows no ill (and that as William T. Pow­ers wrote some­where on the CSGNet ML, “Some peo­ple revel in com­plex­i­ty, and what’s worse, they have the brain power to deal with vast sys­tems of arcane equa­tions. This abil­ity can be a hand­i­cap because it leads to over­look­ing sim­ple solu­tions.”):

    Seibel: “Speak­ing of writ­ing intri­cate code, I’ve noticed that peo­ple who are too smart, in a cer­tain dimen­sion any­way, make the worst code. Because they can actu­ally fit the whole thing in their head they can write these great reams of spaghetti code.”

    : “I agree with you that peo­ple who are both smart enough to cope with enor­mous com­plex­ity and lack empa­thy with the rest of us may fall prey to that. They think, ‘I can under­stand this and I can use it, so it has to be good.’”

    From pg 236:

    : “I read somewhere that you have to have a good memory to be a reasonable programmer. I believe that to be true.”

    Seibel: “Bill Gates once claimed that he could still go to a blackboard and write out big chunks of the code to the BASIC that he had written for the Altair, a decade or so after he had originally written it. Do you think you can remember your old code that way?”

    Arm­strong: “Yeah. Well, I could recon­struct some­thing. Some­times I’ve just com­pletely lost some old code and it does­n’t worry me in the slight­est.”

    From page 246:

    : “Yeah, that’s right. So essen­tially we wrote out our types by draw­ing them on large sheets of papers with arrows. That was our type sys­tem. That was a pretty large pro­gram-in fact it was over ambi­tious; we never com­pleted it.”

    Seibel: “Do you think you learned any lessons from that fail­ure?”

    Peyton Jones: “That was probably when I first became aware that writing a really big program you could end up with problems of scale: you couldn’t keep enough of it in your head at the same time. Previously all the things I had written, you could keep the whole thing in your head without any trouble. So it was probably the first time I’d done any serious attempt at long-standing documentation.”

    Seibel: “But even that was­n’t enough, in this case…”

    From page 440:

    [:] “The second reason I like Python is that (and maybe this is just the way my brain has changed over the years) I can’t keep as much stuff in my head as I used to. It’s more important for me to have stuff in front of my face. So the fact that in Smalltalk you effectively cannot put more than one method on the screen at a time drives me nuts. As far as I’m concerned the fact that I edit Python programs with Emacs is an advantage because I can see more than ten lines’ worth at a time.”

  19. From an inter­view given by to Dikran Karagueuzian, the direc­tor of CSLI Pub­li­ca­tions:

    I couldn’t keep up with all my teaching at Stanford, though. I’m not on sabbatical, but I found that doing software was much, much harder than writing books and doing research papers. It takes another level of commitment; you have to have so much in your head at the time when you’re doing software that I had to take leave of absence from Stanford from my ordinary teaching for several quarters during this period.

  20. The best pro­gram­mers seem to suffer few dis­trac­tions and the worst had many, although it is hard to infer causal­ity from this strik­ing cor­re­la­tion. From “The Rise of the New Group­think”, Susan Cain, The New York Times (draw­ing on the 1987 book or per­haps the related excerpts “Why Mea­sure Per­for­mance”):

    Pri­vacy also makes us pro­duc­tive. In a fas­ci­nat­ing study known as the Cod­ing War Games, con­sul­tants Tom DeMarco and Tim­o­thy Lis­ter com­pared the work of more than 600 com­puter pro­gram­mers at 92 com­pa­nies. They found that peo­ple from the same com­pa­nies per­formed at roughly the same level - but that there was an enor­mous per­for­mance gap between orga­ni­za­tions. What dis­tin­guished pro­gram­mers at the top-per­form­ing com­pa­nies was­n’t greater expe­ri­ence or bet­ter pay. It was how much pri­va­cy, per­sonal work­space and free­dom from inter­rup­tion they enjoyed. 62% of the best per­form­ers said their work­space was suffi­ciently pri­vate com­pared with only 19% of the worst per­form­ers. 76% of the worst pro­gram­mers but only 38% of the best said that they were often inter­rupted need­less­ly.

  21. “Who is Likely to Acquire Programming Skills?”, Shute 1991; Shute measured WM for students learning to program and of course found that higher WM correlated with faster learning, but despite using g-loaded tests, unfortunately she apparently did not measure against IQ directly, so possibly it’s just IQ correlating with the programming skill:

    Fol­low­ing instruc­tion, an online bat­tery of cri­te­rion tests was admin­is­tered mea­sur­ing pro­gram­ming knowl­edge and skills acquired from the tutor. Results showed that a large amount (68%) of the out­come vari­ance could be pre­dicted by a work­ing-mem­ory fac­tor, spe­cific word prob­lem solv­ing abil­i­ties (i.e., prob­lem iden­ti­fi­ca­tion and sequenc­ing of ele­ments) and some learn­ing style mea­sures (i.e., ask­ing for hints and run­ning pro­gram­s).

  22. In “Why Angry Birds is so successful and popular: a cognitive teardown of the user experience”, writer Charles L. Mauro singles out selective stressing of working memory as key to the game’s management of the difficulty of its puzzles:

    It is a well-known fact of cog­ni­tive sci­ence that human short­-term mem­ory (SM), when com­pared to other attrib­utes of our mem­ory sys­tems, is exceed­ingly lim­it­ed….Where things get inter­est­ing is the point where poor user inter­face design impacts the demand placed on SM. For exam­ple, a user inter­face design solu­tion that requires the user to view infor­ma­tion on one screen, store it in short­-term mem­o­ry, and then reen­ter that same infor­ma­tion in a data field on another screen seems like a triv­ial task. Research shows that it is diffi­cult to do accu­rate­ly, espe­cially if some other form of stim­u­lus flows between the mem­o­riza­tion of the data from the first screen and before the user enters the data in the sec­ond. This dis­rup­tive data flow can be in almost any form, but as a gen­eral rule, any­thing that is engag­ing, such as con­ver­sa­tion, noise, motion, or worst of all, a com­bi­na­tion of all three, is likely to totally erase SM. When you encounter this type of data flow before you com­plete trans­fer of data using short­-term mem­o­ry, chances are very good that when you go back to retrieve impor­tant infor­ma­tion from short­-term mem­o­ry, it is gone!

    Angry Birds is a sur­pris­ingly smart man­ager of the play­er’s short­-term mem­o­ry.

    By sim­ple manip­u­la­tion of the user inter­face, Angry Birds design­ers cre­ated sig­nifi­cant short­-term mem­ory loss, which in turn increases game play com­plex­ity but in a way that is not per­ceived by the player as neg­a­tive and adds to the addic­tive nature of the game itself. The sub­tle, yet pow­er­ful con­cept employed in Angry Birds is to bend short­-term mem­ory but not to actu­ally break it. If you do break SM, make sure you give the user a very sim­ple, fast way to accu­rately reload. There are many exam­ples in the Angry Birds game model of this prin­ci­ple in action….

    One of the main ben­e­fits of play­ing Angry Birds on the iPad [rather than the smaller iPhone] is the abil­ity to pinch down the win­dow size so you can keep the entire game space (birds & pigs in hous­es) in full view all the time. Keep­ing all aspects of the game’s inter­face in full view pre­vents short­-term mem­ory loss and improves the rate at which you acquire skills nec­es­sary to move up to a higher game lev­el. Side note: If you want the ulti­mate Angry Birds expe­ri­ence use a POGO pen on the iPad with the dis­play pinched down to view the entire game space. This gives you finer con­trol, bet­ter tar­get­ing and rapidly chang­ing game play. The net impact in cog­ni­tive terms is a vastly supe­rior skill acqui­si­tion pro­file. How­ev­er, you will also find that the game is less inter­est­ing to play over extended peri­ods. Why does this hap­pen?

  23. “Reex­am­in­ing the Fault Den­si­ty-Com­po­nent Size Con­nec­tion”, Les Hat­ton (extended excerpts):

    For years I subscribed to such a principle: that modularization, or structural decomposition, is a good design concept and therefore always improves systems. This belief is so widespread as to be almost unchallengeable. It is responsible for the important programming language concept of compilation models, which are either separate, with guaranteed interface consistency (such as C++ and Ada), or independent, whereby a system is built in pieces and glued together later (C and Fortran, for example). It is a very attractive concept with strong roots in the “divide and conquer” principle of traditional engineering. However, this conventional wisdom may be wrong. Only those components that fit best into human short-term memory cache seem to use it effectively, thereby producing the lowest fault densities. Bigger and smaller average component sizes appear to degrade reliability.

  24. “Walk­ing is free, but Amer­i­cans spent $13 mil­lion on brain-fit­ness soft­ware and games last year [2009]…”; from Newsweek↩︎

  25. See for exam­ple Nature’s cov­er­age of the Cam­bridge study, “No gain from brain train­ing: Com­put­er­ized men­tal work­outs don’t boost men­tal skills, study claims”; or Dis­cover’s blog dis­cus­sion.↩︎

  26. Newsweek:

    Training your memory, reasoning, or speed of processing improves that skill, found a large government-sponsored study called ACTIVE. Unfortunately, there is no transfer: improving processing speed does not improve memory, and improving memory does not improve reasoning. Similarly, doing crossword puzzles will improve your ability to…do crosswords. “The research so far suggests that cognitive training benefits only the task used in training and does not generalize to other tasks,” says Columbia’s Stern.

  27. “This Is Your Brain. Aging. Sci­ence is reshap­ing what we know about get­ting old­er. (The news is bet­ter than you think.)”, Newsweek:

    Doing crossword puzzles would seem to be ideal brain exercise since avid puzzlers do them daily and say it keeps them mentally sharp, especially with vocabulary and memory. But this may be confusing cause and effect. It is mostly people who are good at figuring out “Dole’s running mate” who do crosswords regularly; those who aren’t, don’t. In a recent study, Salthouse and colleagues found “no evidence” that people who do crosswords have “a slower rate of age-related decline in reasoning.” As he put it in a 2006 analysis, there is “little scientific evidence that engagement in mentally stimulating activities alters the rate of mental aging,” an idea that is “more of an optimistic hope than an empirical reality.” (P.S.: Bob Dole’s 1996 VP choice was Jack Kemp.)

  28. Music correlates with increased SAT scores, which has been cited as a justification for teaching students music, but it exhibits a common pattern for claims of far transfer: it appears in simple analyses, disappears in randomized experiments, and finally a thorough analysis including a wide range of covariates like Elpus 2013 finds the correlation disappears because it was due to some confound like the higher-performing students also being wealthier. The background for music:

    An entire spe­cial issue of the Jour­nal of Aes­thetic Edu­ca­tion (JAE) in 2000, titled “The Arts and Aca­d­e­mic Achieve­ment: What the Evi­dence Shows”, was ded­i­cated to exam­in­ing the aca­d­e­mic per­for­mance of arts and non-arts stu­dents. In that vol­ume, Win­ner and Cooper (2000) meta-an­a­lyzed some 31 pub­lished and unpub­lished stud­ies, yield­ing 66 sep­a­rate effect sizes exam­in­ing the gen­eral research ques­tion of whether arts edu­ca­tion, broadly defined, pos­i­tively influ­enced aca­d­e­mic achieve­ment. Results of the meta-analy­sis showed that arts edu­ca­tion was mod­er­ately pos­i­tively asso­ci­ated with higher achieve­ment in math, ver­bal, and com­pos­ite math­-ver­bal out­comes. In the same jour­nal issue, Vaughan and Win­ner (2000) sought to ana­lyze the link between arts course work and SAT scores specifi­cal­ly. Using data from 12 years of national SAT means reported by the Col­lege Board in the annual Pro­files of Col­lege Bound Seniors report, Vaughan and Win­ner found that stu­dents who self­-re­ported on the SAT’s Stu­dent Descrip­tive Ques­tion­naire that they had pur­sued arts course work outscored stu­dents who reported they had not taken any arts course work. Meta-analy­ses of music stu­dents’ per­for­mance on ver­bal (But­zlaff, 2000) and math­e­mat­i­cal (Vaugh­an, 2000) stan­dard­ized tests were some­what incon­clu­sive: Although pos­i­tive asso­ci­a­tions were found in the cor­re­la­tional research lit­er­a­ture, meta-analy­ses of results from the few exper­i­men­tal stud­ies located in the lit­er­a­ture showed lit­tle to no influ­ence of music on ver­bal or math test scores…In British Columbia, Canada (Gouzoua­sis, Guhn, & Kishor, 2007), results of an obser­va­tional study indi­cated an asso­ci­a­tion between music enroll­ment and higher sub­jec­t-area stan­dard­ized test scores among high school stu­dents. 
    The results of a randomized experiment in Montreal, Canada, showed no effects of piano instruction on subject-area standardized tests among elementary school children from low socioeconomic backgrounds (Costa-Giomi, 2004).

  29. A specific example: chess-playing children had superior chessboard recall compared to adults, but adults still had better recall of numbers. Exactly as expected from training with no transfer.↩︎

  30. National memory champion Tatiana Cooley: “I’m incredibly absent-minded. I live by Post-its.” Or the Washington Post, reviewing Joshua Foer’s 2011 book Moonwalking with Einstein:

    Foer sets out to meet the legendary “Brainman,” who learned Spanish in a single weekend, could instantly tell if any number up to 10,000 was prime, and saw digits in colors and shapes, enabling him to hold long lists of them in memory. The author also tracks down “Rain Man”, the famous savant whose astonishing ability to recite all of Shakespeare’s works, reproduce scores from a vast canon of classical music and retain the contents of 9,000 books was immortalized in the film starring Dustin Hoffman. When Foer is told that the Rain Man had an IQ of merely 87 - that he was actually missing a part of his brain; that memory champions have no more intelligence than you or I; that building a memory is a matter of dedication and training - he decides to try for the U.S. memory championship himself. Here is where the book veers sharply from science journalism to a memoir of a singular adventure.

  31. “Get­ting a Grip on Drink­ing Behav­ior: Train­ing Work­ing Mem­ory to Reduce Alco­hol Abuse”, Houben et al 2011:

    Alco­hol abuse dis­rupts core exec­u­tive func­tions, includ­ing work­ing mem­ory (WM)-the abil­ity to main­tain and manip­u­late goal-rel­e­vant infor­ma­tion. When exec­u­tive func­tions like WM are weak­ened, drink­ing behav­ior gets out of con­trol and is guided more strongly by auto­matic impuls­es. This study inves­ti­gated whether train­ing WM restores con­trol over drink­ing behav­ior. Forty-eight prob­lem drinkers per­formed WM train­ing tasks or con­trol tasks dur­ing 25 ses­sions over at least 25 days. Before and after train­ing, we mea­sured WM and drink­ing behav­ior. Train­ing WM improved WM and reduced alco­hol intake for more than 1 month after the train­ing. Fur­ther, the indi­rect effect of train­ing on alco­hol use through improved WM was mod­er­ated by par­tic­i­pants’ lev­els of auto­matic impuls­es: Increased WM reduced alco­hol con­sump­tion in par­tic­i­pants with rel­a­tively strong auto­matic pref­er­ences for alco­hol. These find­ings are con­sis­tent with the the­o­ret­i­cal frame­work and demon­strate that train­ing WM may be an effec­tive strat­egy to reduce alco­hol use by increas­ing con­trol over auto­matic impulses to drink alco­hol.

  32. Bickel et al 2011; WM tasks were digit span, reverse digit span, and a list-of-words-matching task. Decreasing their delay discounting does not actually show any reduced drug abuse or better odds of rehabilitation, but it is hopeful.↩︎

  33. “Self-Discipline Outdoes IQ in Predicting Academic Performance of Adolescents”, Duckworth & Seligman 2005; abstract:

    In a lon­gi­tu­di­nal study of 140 eighth-grade stu­dents, self­-dis­ci­pline mea­sured by self­-re­port, par­ent report, teacher report, and mon­e­tary choice ques­tion­naires in the fall pre­dicted final grades, school atten­dance, stan­dard­ized achieve­men­t-test scores, and selec­tion into a com­pet­i­tive high school pro­gram the fol­low­ing spring. In a repli­ca­tion with 164 eighth graders, a behav­ioral delay-of-grat­i­fi­ca­tion task, a ques­tion­naire on study habits, and a group-ad­min­is­tered IQ test were added. Self­-dis­ci­pline mea­sured in the fall accounted for more than twice as much vari­ance as IQ in final grades, high school selec­tion, school atten­dance, hours spent doing home­work, hours spent watch­ing tele­vi­sion (in­verse­ly), and the time of day stu­dents began their home­work. The effect of self­-dis­ci­pline on final grades held even when con­trol­ling for first-mark­ing-pe­riod grades, achieve­men­t-test scores, and mea­sured IQ. These find­ings sug­gest a major rea­son for stu­dents falling short of their intel­lec­tual poten­tial: their fail­ure to exer­cise self­-dis­ci­pline.

  34. This is prob­a­bly not sur­pris­ing, since even in adults, those with higher WMs are bet­ter at con­trol­ling their emo­tions when asked to do so; abstract of “Work­ing mem­ory capac­ity and spon­ta­neous emo­tion reg­u­la­tion: High capac­ity pre­dicts self­-en­hance­ment in response to neg­a­tive feed­back”:

    Although pre­vi­ous evi­dence sug­gests that work­ing mem­ory capac­ity (WMC) is impor­tant for suc­cess at emo­tion reg­u­la­tion, that evi­dence may reveal sim­ply that peo­ple with higher WMC fol­low instruc­tions bet­ter than those with lower WMC. The present study tested the hypoth­e­sis that peo­ple with higher WMC more effec­tively engage in spon­ta­neous emo­tion reg­u­la­tion fol­low­ing neg­a­tive feed­back, rel­a­tive to those with lower WMC. Par­tic­i­pants were ran­domly assigned to receive either no feed­back or neg­a­tive feed­back about their emo­tional intel­li­gence. They then com­pleted a dis­guised mea­sure of self­-en­hance­ment and a self­-re­port mea­sure of affect. Exper­i­men­tal con­di­tion and WMC inter­acted such that higher WMC pre­dicted more self­-en­hance­ment and less neg­a­tive affect fol­low­ing neg­a­tive feed­back. This research pro­vides novel insight into the con­se­quences of indi­vid­ual differ­ences in WMC and illus­trates that cog­ni­tive capac­ity may facil­i­tate the spon­ta­neous self­-reg­u­la­tion of emo­tion.

  35. “Inves­ti­gat­ing the pre­dic­tive roles of work­ing mem­ory and IQ in aca­d­e­mic attain­ment”, Alloway 2010:

    …The find­ings indi­cate that chil­dren’s work­ing mem­ory skills at 5 years of age were the best pre­dic­tor of lit­er­acy and numer­acy 6 years lat­er. IQ, in con­trast, accounted for a smaller por­tion of unique vari­ance to these learn­ing out­comes. The results demon­strate that work­ing mem­ory is not a proxy for IQ but rather rep­re­sents a dis­so­cia­ble cog­ni­tive skill with unique links to aca­d­e­mic attain­ment. Crit­i­cal­ly, we find that work­ing mem­ory at the start of for­mal edu­ca­tion is a more pow­er­ful pre­dic­tor of sub­se­quent aca­d­e­mic suc­cess than IQ….

    Less striking but still relevant is “Working Memory, but Not IQ, Predicts Subsequent Learning in Children with Learning Difficulties”, Alloway 2009:

    The purpose of the present study was to compare the predictive power of working memory and IQ in children identified as having learning difficulties…Children aged between 7 and 11 years were tested at Time 1 on measures of working memory, IQ, and learning. They were then retested 2 years later on the learning measures. The findings indicated that working-memory capacity and domain-specific knowledge at Time 1, but not IQ, were significant predictors of learning at Time 2.

  36. “Computerized Training of Working Memory in Children With ADHD - A Randomized, Controlled Trial”, Klingberg et al 2005; abstract:

    …For the span-board task, there was a significant treatment effect both post-intervention and at follow-up. In addition, there were significant effects for secondary outcome tasks measuring verbal WM, response inhibition, and complex reasoning. Parent ratings showed significant reduction in symptoms of inattention and hyperactivity/impulsivity, both post-intervention and at follow-up. Conclusions: This study shows that WM can be improved by training in children with ADHD. This training also improved response inhibition and reasoning and resulted in a reduction of the parent-rated inattentive symptoms of ADHD.

    See also Green et al 2012.↩︎

  37. “Training and transfer effects of executive functions in preschool children”, Thorell et al 2009↩︎

  38. “Differential effects of reasoning and speed training in children” (the list of reasoning games, page 5, does not seem to include any direct analogues to n-back):

    The goal of this study was to determine whether intensive training can ameliorate cognitive skills in children. Children aged 7 to 9 from low socioeconomic backgrounds participated in one of two cognitive training programs for 60 minutes/day and 2 days/week, for a total of 8 weeks. Both training programs consisted of commercially available computerized and non-computerized games. Reasoning training emphasized planning and relational integration; speed training emphasized rapid visual detection and rapid motor responses. Standard assessments of reasoning ability - the Test of Non-Verbal Intelligence (TONI-3) and cognitive speed (Coding B from WISC-IV) - were administered to all children before and after training. Neither group was exposed to these standardized tests during training. Children in the reasoning group improved substantially on TONI (Cohen’s d = 1.51), exhibiting an average increase of 10 points in Performance IQ, but did not improve on Coding. By contrast, children in the speed group improved substantially on Coding (d = 1.15), but did not improve on TONI. Counter to widespread belief, these results indicate that both fluid reasoning and processing speed are modifiable by training.
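
The effect sizes quoted above (Cohen’s d = 1.51 and d = 1.15) are differences of means scaled by a standard deviation; a minimal sketch using the common pooled-SD convention (the paper does not state which variant it used, and the scores below are invented for illustration):

```python
from statistics import mean, stdev

def cohens_d(pre, post):
    """Cohen's d: difference of group means divided by the pooled SD.
    (One common convention; the paper's exact formula is not given.)"""
    n1, n2 = len(pre), len(post)
    pooled_sd = (((n1 - 1) * stdev(pre) ** 2 + (n2 - 1) * stdev(post) ** 2)
                 / (n1 + n2 - 2)) ** 0.5
    return (mean(post) - mean(pre)) / pooled_sd

# Hypothetical pre/post test scores, for illustration only:
print(round(cohens_d([10, 12, 11, 13], [13, 15, 14, 16]), 2))  # → 2.32
```

A d above 1 (as reported for both training groups) means the average post-training score exceeded roughly 84% of the pre-training distribution, which is why these gains are described as substantial.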

  39. See again Sternberg et al’s 2001 review, “The Predictive Value of IQ”:

    Evidence from studies of the natural course of development: Some get more intelligent, others get less intelligent. The Berkeley Guidance Study (Honzik, Macfarlane, & Allen, 1948) investigated the stability of IQ test performance over 12 years. The authors reported that nearly 60% of the sample changed by 15 IQ points or more from 6 to 18 years of age. A similar result was found in the Fels study (Sontag, Baker, & Nelson, 1958): Nearly two thirds of the children changed more than 15 IQ points from age 3 to age 10. Researchers also investigated the so-called intelligence lability score, which is a child’s standard deviation from his or her own grand mean IQ. Bayley (1949), in the Berkeley Growth study, detected very large individual differences in lability across the span of 18 years. Rees and Palmer (1970) combined the data from five large-scale longitudinal studies, selecting those participants who had scores at both age 6 and age 12 or at both age 12 and age 17. They found that about 30% of the selected participants changed by 10 or more IQ points.

    Sternberg et al also discuss the dramatic IQ gains possible during infancy when adoptees move from a bad environment (Third or Second World orphanages) to good ones (First World homes), but also the discouraging examples of early-intervention programs in the USA where initial IQ gains often fade away over the years.↩︎
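
The “intelligence lability score” quoted above is just the standard deviation of one child’s repeated IQ scores around their own grand mean. A sketch (the scores are invented, and whether Bayley used the population or sample SD is not specified here, so the population form is one plausible reading):

```python
from statistics import pstdev

def lability(iq_scores):
    """Standard deviation of a single child's repeated IQ scores
    around his or her own grand mean (population form)."""
    return pstdev(iq_scores)

# A child swinging between 95 and 115 across four testings:
print(round(lability([100, 115, 95, 110]), 1))  # → 7.9
```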

  40. See (media coverage: “IQ Isn’t Set In Stone, Suggests Study That Finds Big Jumps, Dips In Teens”):

    Neuroimaging allows us to test whether unexpected longitudinal fluctuations in measured IQ are related to brain development. Here we show that verbal and non-verbal IQ can rise or fall in the teenage years, with these changes in performance validated by their close correlation with changes in local brain structure. A combination of structural and functional imaging showed that verbal IQ changed with grey matter in a region that was activated by speech, whereas non-verbal IQ changed with grey matter in a region that was activated by finger movements. By using longitudinal assessments of the same individuals, we obviated the many sources of variation in brain structure that confound cross-sectional studies. This allowed us to dissociate neural markers for the two types of IQ and to show that general verbal and non-verbal abilities are closely linked to the sensorimotor skills involved in learning.

    It’s worth noting that substantial changes in the brain continue to take place towards the end of adolescence and early adulthood, and at least some are about reducing one’s mental flexibility; from National Geographic, “Beautiful Brains: Moody. Impulsive. Maddening. Why do teenagers act the way they do? Viewed through the eyes of evolution, their most exasperating traits may be the key to success as adults”:

    Meanwhile, in times of doubt, take inspiration in one last distinction of the teen brain - a final key to both its clumsiness and its remarkable adaptability. This is the prolonged plasticity of those late-developing frontal areas as they slowly mature. As noted earlier, these areas are the last to lay down the fatty insulation - the brain’s white matter - that speeds transmission. And at first glance this seems like bad news: If we need these areas for the complex task of entering the world, why aren’t they running at full speed when the challenges are most daunting?

    The answer is that speed comes at the price of flexibility. While a myelin coating greatly accelerates an axon’s bandwidth, it also inhibits the growth of new branches from the axon. According to Douglas Fields, an NIH neuroscientist who has spent years studying myelin, “This makes the period when a brain area lays down myelin a sort of crucial period of learning - the wiring is getting upgraded, but once that’s done, it’s harder to change.”

    The window in which experience can best rewire those connections is highly specific to each brain area. Thus the brain’s language centers acquire their insulation most heavily in the first 13 years, when a child is learning language. The completed insulation consolidates those gains - but makes further gains, such as second languages, far harder to come by. So it is with the forebrain’s myelination during the late teens and early 20s. This delayed completion - a withholding of readiness - heightens flexibility just as we confront and enter the world that we will face as adults.

  41. Jaeggi, S. M., Seewer, R., Nirkko, A. C., Eckstein, D., Schroth, G., Groner, R., et al. (2003). “Does excessive memory load attenuate activation in the prefrontal cortex? Load-dependent processing in single and dual tasks: functional magnetic resonance imaging study”, NeuroImage 19(2), 210-225.↩︎

  42. 2^2 = 4; 4 - 1 = 3. For DNB, the 3 responses are:

    1. audio match
    2. visual match
    3. audio & visual matches
  43. or 2^3 - 1↩︎

  44. or 2^4 - 1↩︎

  45. or 2^5 - 1↩︎
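
Footnotes 42-45 count the possible match responses as 2^k - 1 for k simultaneous stimulus modalities - every non-empty subset of modalities can match at once. A quick enumeration confirms the arithmetic:

```python
from itertools import combinations

def match_responses(modalities):
    """All non-empty subsets of modalities that could simultaneously
    match n trials back: 2^k - 1 possibilities for k modalities."""
    return [c for r in range(1, len(modalities) + 1)
            for c in combinations(modalities, r)]

# Dual n-back (audio + visual): the 3 responses footnote 42 lists
print(len(match_responses(["audio", "visual"])))            # → 3
# Triple, quad, pentuple n-back:
print([len(match_responses(range(k))) for k in (3, 4, 5)])  # → [7, 15, 31]
```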

  46. Spreading one’s efforts over a variety of activities is not necessarily a good thing, and can be sub-optimal; consider the charity example (“Giving Your All”, Steven E. Landsburg):

    People constantly ignore my good advice by contributing to the American Heart Association, the American Cancer Society, CARE, and public radio all in the same year - as if they were thinking, “OK, I think I’ve pretty much wrapped up the problem of heart disease; now let’s see what I can do about cancer.”

  47. see eg. McNab or Westerberg.↩︎

  48. I’m not the only one to notice this. ‘y offs et’ mentions during a discussion of TNB that:

    It’s interesting how doing n-back proves that time is relative and based upon our perception of its passing

    When I’m doing well, the next instance comes with metronome exactness as expected from a machine. When I’m resetting after a tricky double-back, the next instance always comes way too quickly, as if a second had been removed. The same perception happens on an upped level, and it is so persistent. It’s like some time had vanished.

    For the longest time I thought the program had a bug, being the mere human.

  49. ↩︎

  50. Sleep affects IQ, not just vigilance or energy: Johnstone et al 2010; abstract:

    Fluid intelligence involves novel problem-solving and may be susceptible to poor sleep. This study examined relationships between adolescent sleep, fluid intelligence, and academic achievement. Participants were 217 adolescents (42% male) aged 13 to 18 years (mean age, 14.9 years; SD = 1.0) in grades 9-11. Fluid intelligence was predicted to mediate the relationship between adolescent sleep and academic achievement. Students completed online questionnaires of self-reported sleep, fluid intelligence (Letter Sets and Number Series), and self-reported grades. Total sleep time was not significantly related to fluid intelligence nor academic achievement (both p > 0.05); however, sleep difficulty (e.g. difficulty initiating sleep, unrefreshing sleep) was related to both (p < 0.05)…

    Further, we can easily delude ourselves about our own mental states:

    Still, while it’s tempting to believe we can train ourselves to be among the five-hour group - we can’t, Dinges says - or that we are naturally those five-hour sleepers, consider a key finding from Van Dongen and Dinges’s study: after just a few days, the four- and six-hour group reported that, yes, they were slightly sleepy. But they insisted they had adjusted to their new state. Even 14 days into the study, they said sleepiness was not affecting them. In fact, their performance had tanked. In other words, the sleep-deprived among us are lousy judges of our own sleep needs.

  51. “NASA Naps: NASA-supported sleep researchers are learning new and surprising things about naps.”, 2005-06-03:

    “To our amazement, working memory performance benefited from the naps, [but] vigilance and basic alertness did not benefit very much,” says Dinges.

  52. Exercise has been shown to improve mental fitness. One small study with old diabetics found improvement in working memory/executive function caused by an aerobic exercise regimen, and another found increased brain volume and increased hippocampal volume & BDNF secretion in healthy old people; a review found benefits in 8 of 11 aerobic interventions in the elderly. And exercise improves working memory (or at least is correlated with intelligence & education in twins), and there is some suggestive evidence that other interventions may help as well. One possible mechanism (in rats, anyway) is increases in chemical energy storage in the brain. For further reading, see the cited reviews.↩︎

  53. See for example “Zinc status and cognitive function of pregnant women in Southern Ethiopia” or “Zinc supplementation improved cognitive performance and taste acuity in Indian adolescent girls”↩︎

  54. “Acute hypoglycemia impairs nonverbal intelligence: importance of avoiding ceiling effects in cognitive function testing”. While we’re at it, blood sugar seems to be closely linked to attention/self-control/self-discipline (see LW discussions: “The Physiology of Willpower”, “Willpower: not a limited resource?”, “What would you do if blood glucose theory of willpower was true?”, Vladimir Golovin, and “Superstimuli and the Collapse of Western Civilization”). For a roundup of all the research, read Baumeister & Tierney’s 2011 book Willpower.↩︎

  55. Quotes from “Do You Suffer From Decision Fatigue?”, NYT, itself quoting from Baumeister & Tierney 2011:

    Once you’re mentally depleted, you become reluctant to make trade-offs, which involve a particularly advanced and taxing form of decision making. In the rest of the animal kingdom, there aren’t a lot of protracted negotiations between predators and prey. To compromise is a complex human ability and therefore one of the first to decline when willpower is depleted. You become what researchers call a cognitive miser, hoarding your energy. If you’re shopping, you’re liable to look at only one dimension, like price: just give me the cheapest. Or you indulge yourself by looking at quality: I want the very best (an especially easy strategy if someone else is paying). Decision fatigue leaves you vulnerable to marketers who know how to time their sales, as Jonathan Levav, the Stanford professor, demonstrated in experiments involving tailored suits and new cars.

    Most of us in America won’t spend a lot of time agonizing over whether we can afford to buy soap, but it can be a depleting choice in rural India. Dean Spears, an economist at Princeton, offered people in 20 villages in Rajasthan in northwestern India the chance to buy a couple of bars of brand-name soap for the equivalent of less than 20 cents. It was a steep discount off the regular price, yet even that sum was a strain for the people in the 10 poorest villages. Whether or not they bought the soap, the act of making the decision left them with less willpower, as measured afterward in a test of how long they could squeeze a hand grip. In the slightly more affluent villages, people’s willpower wasn’t affected significantly…To establish cause and effect, researchers at Baumeister’s lab tried refueling the brain in a series of experiments involving lemonade mixed either with sugar or with a diet sweetener. The sugary lemonade provided a burst of glucose, the effects of which could be observed right away in the lab; the sugarless variety tasted quite similar without providing the same burst of glucose. Again and again, the sugar restored willpower, but the artificial sweetener had no effect. The glucose would at least mitigate the ego depletion and sometimes completely reverse it. The restored willpower improved people’s self-control as well as the quality of their decisions: they resisted irrational bias when making choices, and when asked to make financial decisions, they were more likely to choose the better long-term strategy instead of going for a quick payoff. The ego-depletion effect was even demonstrated with dogs in two studies by Holly Miller and Nathan DeWall at the University of Kentucky. After obeying sit and stay commands for 10 minutes, the dogs performed worse on self-control tests and were also more likely to make the dangerous decision to challenge another dog’s turf. But a dose of glucose restored their willpower. The results of the experiment were announced in January, during Heatherton’s speech accepting the leadership of the Society for Personality and Social Psychology, the world’s largest group of social psychologists. In his presidential address at the annual meeting in San Antonio, Heatherton reported that administering glucose completely reversed the brain changes wrought by depletion - a finding, he said, that thoroughly surprised him. Heatherton’s results did much more than provide additional confirmation that glucose is a vital part of willpower; they helped solve the puzzle over how glucose could work without global changes in the brain’s total energy use. Apparently ego depletion causes activity to rise in some parts of the brain and to decline in others. Your brain does not stop working when glucose is low. It stops doing some things and starts doing others. It responds more strongly to immediate rewards and pays less attention to long-term prospects.

    …The psychologists gave preprogrammed BlackBerrys to more than 200 people going about their daily routines for a week. The phones went off at random intervals, prompting the people to report whether they were currently experiencing some sort of desire or had recently felt a desire. The painstaking study, led by Wilhelm Hofmann, then at the University of Würzburg, collected more than 10,000 momentary reports from morning until midnight.

    Desire turned out to be the norm, not the exception. Half the people were feeling some desire when their phones went off - to snack, to goof off, to express their true feelings to their bosses - and another quarter said they had felt a desire in the past half-hour. Many of these desires were ones that the men and women were trying to resist, and the more willpower people expended, the more likely they became to yield to the next temptation that came along. When faced with a new desire that produced some I-want-to-but-I-really-shouldn’t sort of inner conflict, they gave in more readily if they had already fended off earlier temptations, particularly if the new temptation came soon after a previously reported one. The results suggested that people spend between three and four hours a day resisting desire. Put another way, if you tapped four or five people at any random moment of the day, one of them would be using willpower to resist a desire. The most commonly resisted desires in the phone study were the urges to eat and sleep, followed by the urge for leisure, like taking a break from work by doing a puzzle or playing a game instead of writing a memo. Sexual urges were next on the list of most-resisted desires, a little ahead of urges for other kinds of interactions, like checking Facebook. To ward off temptation, people reported using various strategies. The most popular was to look for a distraction or to undertake a new activity, although sometimes they tried suppressing it directly or simply toughing their way through it. Their success was decidedly mixed. They were pretty good at avoiding sleep, sex and the urge to spend money, but not so good at resisting the lure of television or the Web or the general temptation to relax instead of work.

    …‘Good decision making is not a trait of the person, in the sense that it’s always there,’ Baumeister says. ‘It’s a state that fluctuates.’ His studies show that people with the best self-control are the ones who structure their lives so as to conserve willpower. They don’t schedule endless back-to-back meetings. They avoid temptations like all-you-can-eat buffets, and they establish habits that eliminate the mental effort of making choices. Instead of deciding every morning whether or not to force themselves to exercise, they set up regular appointments to work out with a friend. Instead of counting on willpower to remain robust all day, they conserve it so that it’s available for emergencies and important decisions…. ‘Even the wisest people won’t make good choices when they’re not rested and their glucose is low,’ Baumeister points out. That’s why the truly wise don’t restructure the company at 4 p.m. They don’t make major commitments during the cocktail hour. And if a decision must be made late in the day, they know not to do it on an empty stomach. ‘The best decision makers,’ Baumeister says, ‘are the ones who know when not to trust themselves.’

  56. Although that said, note that the blood-sugar paradigm appears to have fallen victim to the replication crisis: “An opportunity cost model of subjective effort and task performance”, Kurzban et al 2013; “A Meta-Analysis of Blood Glucose Effects on Human Decision Making”, Orquin & Kurzban 2016; “Is Ego-Depletion a Replicable Effect? A Forensic Meta-Analysis of 165 Ego Depletion Articles”.↩︎

  57. See “Effects of prior light exposure on early evening performance, subjective sleepiness, and hormonal secretion” (coverage), Münch et al 2012:

    …For cognitive performance we found a significant interaction between light conditions, mental load (2- or 3-back task) and the order of light administration. On their first evening, subjects performed with similar accuracy after both light conditions, but on their second evening, subjects performed significantly more accurately after the DL in both n-back versions and committed fewer false alarms in the 2-back task compared to the AL group. Lower sleepiness in the evening was significantly correlated with better cognitive performance (p < .05).

  58. “With regards to changes in n-back level, I went up about 1 solid level on all the tasks that I trained. That is, I went from 7 to 8 for dual, 6 to 7 for position-sound-color, 6 to 7 for position-sound-shape, and 4 to 5 on quad. I don’t use any strategies.”↩︎

  59. “Training of Working Memory Impacts Structural Connectivity”, Takeuchi et al 2010.↩︎

  60. “Lies, Damn Lies, and Chinese Science: The People’s Republic is becoming a technological superpower, but who’s checking the facts? Sam Geall seeks out the Chinese science cops” (see also the Lancet, Nature, NYT):

    This publish-or-perish culture has led to unrealistic targets at Chinese universities - and as a predictable consequence, rampant plagiarism. In January, the peer-reviewed international journal Section E announced the retraction of more than 70 papers by Chinese scientists who had falsified data. Three months later, the same publication announced the removal of another 39 articles “as a result of problems with the data sets or incorrect atom assignments”, 37 of which were entirely produced in Chinese universities. The New Jersey-based Centenary College closed its affiliated Chinese business school programme in July after a review “revealed evidence of widespread plagiarism, among other issues, at a level that ordinarily would have resulted in students’ immediate dismissal from the college.” A government study, cited by Nature, found that about one-third of over 6,000 scientists surveyed at six top Chinese institutions had practised “plagiarism, falsification or fabrication”. But it’s not only the emphasis on quantity that damages scientific quality in China. Publication bias - the tendency to privilege the results of studies that show a significant finding, rather than inconclusive results - is notoriously pervasive. One systematic review of acupuncture studies from 1998, published in Controlled Clinical Trials, found that every single clinical trial originating in China was positive - in other words, no trial published in China had found a treatment to be ineffective.

    “Traditional Chinese medicine: Big questions: Journal reports of benefits often lack methodological rigor or details”:

    …focuses exclusively on reports published since 1999 in Chinese academic journals, roughly half of which were specialty publications. Clinicians authored half of the papers. Almost 85% of the reports focused on herbal remedies - anything from bulk herbs or pills to “decoctions”. Most of the remaining reviews assessed the value of acupuncture, although about 1% of the reports dealt with Tuina massage…The papers were reviews, or what are typically referred to in Western journals as meta-analyses…Many of the papers were incomplete, roughly one-third contained statistical errors and others provided data or comparisons that the authors termed misleading. Fewer than half of the surveyed papers described how the data they were presenting had been collected, how those data had been analyzed or how a decision had been made about which studies to compare. The majority of papers also did not assess the risk of bias across studies or offer any information on potential conflict-of-interest factors (such as who funded or otherwise offered support for the research being reviewed)….Overall, “the quality of these reviews is troubling,” the Lanzhou researchers conclude in the May 25 PLoS One.

  61. “Plagiarism Plague Hinders China’s Scientific Ambition”, NPR:

    In 2008, when her scientific publication, the Journal of Zhejiang University-Science, became the first in China to use CrossCheck text-analysis software to spot plagiarism, Zhang was pleased to be a trailblazer. But when the first set of results came in, she was upset and horrified. “In almost 2 years, we find about 31% of papers with unreasonable copy[ing] and plagiarism,” she says, shaking her head. “This is true.” For computer science and life science papers, that figure went up to almost 40 percent…Despite the outpouring of Chinese papers, Chinese research isn’t that influential globally. Thomson Reuters’ Science Watch website notes that China isn’t even in the top 20 when measuring the number of times a paper is cited on a national basis. ScienceNet’s Zhao says he fears Chinese research is still about quantity rather than quality….However, China’s leaders have committed to fighting scientific fraud. And Zhang, the journal editor, says that one year on, plagiarism at her publication has fallen noticeably, to 24% of all submissions.

    “China’s academic scandal: call toll-free hotlines to get your name published”; “Looks good on paper: A flawed system for judging research is leading to academic fraud”; “SAGE Publications busts ‘peer review and citation ring’, 60 papers retracted”; outsourced meta-analysis writing.

    Pouring more money in seems to not be helping (“Fraud Scandals Sap China’s Dream of Becoming a Science Superpower”), and the fox is in charge of the hen house.↩︎

  62. Abstract:

    We investigated whether and how individual differences in personality determine cognitive training outcomes. 47 participants were either trained on a single or on a dual n-back task for a period of 4 weeks. 52 additional participants did not receive any training and served as a no-contact control group. We assessed neuroticism and conscientiousness as personality traits as well as performance in near and far transfer measures. The results indicated a significant interaction of neuroticism and intervention in terms of training efficacy. Whereas dual n-back training was more effective for participants low in neuroticism, single n-back training was more effective for participants high in neuroticism. Conscientiousness was associated with high training scores in the single n-back and improvement in near transfer measures, but lower far transfer performance, suggesting that subjects scoring high in this trait developed task-specific skills preventing generalizing effects. We conclude by proposing that individual differences in personality should be considered in future cognitive intervention studies to optimize the efficacy of training.

  63. Research programmer Jonathan Graehl writes in the LW discussion of Jaeggi 2011:

    …If you separated the “active control” group into high and low improvers post-hoc just like was done for the n-back group, you might see that the active control “high improvers” are even smarter than the n-back “high improvers”. We should expect some 8-9 year olds to improve in intelligence or motivation over the course of a month or two, without any intervention. Basically, this result sucks, because of the artificial post-hoc division into high- and low-responders to n-back training, needed to show a strong “effect”. I’m not certain that the effect is artificial; I’d have to spend a lot of time doing some kind of sampling to show how well the data is explained by my alternative hypothesis.

  64. The DNB groups gain ~1 point (question), and the control group falls ~2 points after starting off ~2 points higher. In other words, if the control group had not fallen so much, the DNB groups would at no point have scored higher!

    Replicating their results, we found a significant gain in Gf scores in the training group over and above gains on the digit span task, F(1, 26) = 3.00, P = 0.05, ηp2 = 0.10. In contrast, the control group showed a non-significant decrease in Gf, F < 1, and the critical group by time interaction was significant, F(1, 40) = 7.47, P = 0.01, ηp2 = 0.16. As can be seen in Figure 3, there was a trend toward a significant group difference in Gf (RPM scores) at pre-training, p ≤ 0.10. This raises the possibility that the relative gains in Gf in the training versus control groups may be to some extent an artefact of baseline differences. However, the interactive effect of transfer as a function of group remained significant even after more closely matching the training and control groups for pre-training RPM scores (by removing the highest-scoring controls), F(1, 30) = 3.66, P = 0.032, ηp2 = 0.10. The adjusted means (standard deviations) for the control and training groups were now 27.20 (1.93), 26.63 (2.60) at pre-training (t(43) = 1.29, P > 0.05) and 26.50 (4.50), 27.07 (2.16) at post-training, respectively. Moreover, there was a trend for the gain in Gf to be positively correlated with improvements in n-back performance across training, r(29) = 0.36 at P = 0.057, suggesting that such gains were indeed a function of training…. Although the Gf transferable gains we found appear to be somewhat related to training gains and the effects remain when we trim the groups to provide a better match for pre-training Gf, it is important to note that some degree of regression to the mean may be influencing the results.
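
The regression-to-the-mean worry above can be made concrete with a toy simulation (all numbers arbitrary, not the study’s data): if observed RPM scores are noisy measurements of a stable ability, then a group that happened to pretest ~2 points above the mean will tend to fall at posttest with no intervention at all.

```python
import random

random.seed(1)

# Toy model: observed score = true ability + measurement noise.
true_ability = [random.gauss(26, 2) for _ in range(100_000)]
pre = [t + random.gauss(0, 3) for t in true_ability]
post = [t + random.gauss(0, 3) for t in true_ability]

# "Controls" who happened to pretest well above the population mean:
lucky = [(p, q) for p, q in zip(pre, post) if p > 28]
pre_mean = sum(p for p, _ in lucky) / len(lucky)
post_mean = sum(q for _, q in lucky) / len(lucky)
print(round(pre_mean - post_mean, 1))  # posttest drops with no treatment
```

Selecting on a lucky pretest guarantees the posttest regresses back toward the true-ability mean, which is exactly the pattern footnote 64 describes in the control group.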

  65. At least, they seem to administer the whole thing with no mention of such a variation:

    We assessed Gf with the Raven’s Progressive Matrices (RPM; [35]) - a standard measure in the literature. Each RPM item presented participants with a matrix of visual patterns with one pattern missing. The participant chose how the matrix should be completed by selecting a pattern from a series of alternatives. We used parallel versions of the RPM (even and uneven numbered pages), which we counterbalanced across participants and pre- and post-training. The RPM is scored on a scale from 0-30, with each correct matrix earning participants one point.

  66. From the paper:

    The fig­ure depicts a block of the emo­tional ver­sion of the dual n-back task (train­ing task) where n = 1. The top row shows the sequence across tri­als (A, B, C, D, etc.) of visu­ally pre­sented stim­uli in a 4×4 grid (the visual stim­uli were pre­sented on a stan­dard 1280×1024 pixel com­puter dis­play). A pic­ture of a face appeared in one of the 16 pos­si­ble grid posi­tions on each tri­al. Simul­ta­ne­ous­ly, with the pre­sen­ta­tion of these visual stim­uli on the com­puter dis­play, par­tic­i­pants heard words over head­phones (sec­ond row in the fig­ure). Par­tic­i­pants were required to indi­cate, by but­ton press, whether the trial was a ‘tar­get trial’ or not. Tar­gets could be visual or audi­to­ry. In the exam­ple here, Trial C is a visual tar­get. That is, the face in Trial C is pre­sented in the same loca­tion as the face in Trial B (i.e., n = 1 posi­tions back). Note, the faces are of differ­ent actors. For visual stim­uli par­tic­i­pants were asked to ignore the con­tent of the image and solely attend to the loca­tion in which the images were pre­sent­ed. In the cur­rent exam­ple, Trial D was an audi­tory tar­get trial because ‘Evil’ is the same word as the word pre­sented in Trial C - n posi­tions back (where n = 1). Each block con­sisted of 20+n tri­als.

    (If you look at Fig­ure 1, exam­ple stim­uli words are ‘dead’, ‘hate’, ‘evil’, ‘rape’, ‘slum’, and a pic­ture of a very angry male face.)↩︎

  67. The difference doesn’t seem to change progress on n-back in either group, which is good: if there were differences (eg. if the affective n-back group didn’t advance as many levels), any subsequent results would be more dubious:

    Per­for­mance of the two n-back groups pre- to post- train­ing did not differ sig­nifi­cantly on either the neu­tral F(1, 27) = 1.02, P>0.05 or affec­tive F (1, 27)<1 n-back tasks. Sim­i­lar­ly, the con­trol group showed a sig­nifi­cantly greater pre- to post-train­ing improve­ment on the fea­ture match task they trained on, com­pared with the n-back groups F(1, 42) = 41.09, P<0.001, ηp2 = 0.67.

    And as one would hope, both DNB groups increased their WM scores:

    As pre­dict­ed, par­tic­i­pants in the train­ing group showed a sig­nifi­cant improve­ment on digit span F(1, 28) = 33.96, p < 0.001, ηp2 = 0.55. How­ev­er, this was not true of con­trols F(1, 15) = 1.89, p = 0.19, ηp2 = 0.11, and the gain was sig­nifi­cantly greater in the train­ing group par­tic­i­pants com­pared to con­trols F(1,43) = 5.92, p = 0.02, ηp2 = 0.12.

  68. Alloway 2009, “The efficacy of working memory training in improving crystallized intelligence” (PDF): 7 children with learning disabilities received the training for 8 weeks; Gc was measured using the vocabulary & math sections of the Wechsler IQ test.↩︎

  69. The prac­tice effect can last for many years. “Influ­ence of Age on Prac­tice Effects in Lon­gi­tu­di­nal Neu­rocog­ni­tive Change”, Salt­house 2010:

    Longitudinal comparisons of neurocognitive functioning often reveal stability or age-related increases in performance among adults under about 60 years of age. Because nearly monotonic declines with increasing age are typically evident in cross-sectional comparisons, there is a discrepancy in the inferred age trends based on the two types of comparisons…Increased age was associated with significantly more negative longitudinal changes with each ability. All of the estimated practice effects were positive, but they varied in magnitude across neurocognitive abilities and as a function of age. After adjusting for practice effects the longitudinal changes were less positive at younger ages and slightly less negative at older ages. Conclusions: It was concluded that some, but not all, of the discrepancy between cross-sectional and longitudinal age trends in neurocognitive functioning is attributable to practice effects positively biasing the longitudinal trends.

  70. Tofu: “I should also add, my score on the num­ber test jumped dra­mat­i­cally from the first test to the sec­ond test prob­a­bly because I taught myself how to do long divi­sion before the sec­ond test (which was the only study­ing I did for all 3 test­s).”↩︎

  71. Ship­stead, Redick, & Engle 2012 men­tion an amus­ing study I had­n’t heard of before:

    Green­wald et al. (1991) pro­vided a use­ful demon­stra­tion of the prob­lems asso­ci­ated with sub­jec­tive reports. Par­tic­i­pants in this study received com­mer­cially pro­duced audio­tapes that con­tained sub­lim­i­nal mes­sages intended to improve either self­-es­teem or mem­o­ry. Unknown to the par­tic­i­pants, half of the tapes that were designed to improve mem­ory were rela­beled “self­-es­teem” and vice ver­sa. At a 5-week posttest, par­tic­i­pants’ scores on sev­eral stan­dard mea­sures of self­-es­teem and mem­ory were improved, but this change was inde­pen­dent of the mes­sage and the label on the audio­tape (i.e., par­tic­i­pants showed across the board improve­men­t). How­ev­er, in response to sim­ple ques­tions regard­ing per­ceived effects, roughly 50% of par­tic­i­pants reported expe­ri­enc­ing improve­ments that were con­sis­tent with the label on the audio­tape, while only 15% reported improve­ments in the oppo­site domain. The self­-re­port mea­sures were nei­ther related to actual improve­ments in trans­fer task per­for­mance nor related to the con­tent of the inter­ven­tion. Instead, they were attrib­ut­able to expec­ta­tion of out­come.

  72. from at 90 days see­ing lit­tle effect, to 2.5 months later pro­duc­ing the sec­ond tes­ta­ment↩︎

  73. “Atten­tion and Work­ing Mem­ory in Insight Prob­lem-Solv­ing”, Mur­ray 2011. The study does not seem to have con­trolled for IQ, so it’s hard to say whether the WM/attention are respon­si­ble for increased per­for­mance or not.↩︎

  74. From Sandberg & Bostrom 2006:

    Giving levodopa, a dopamine precursor, to healthy volunteers did not affect direct semantic priming (faster recognition of words directly semantically related to a previous word, such as “black-white”) but did inhibit indirect priming (faster recognition of more semantically distant words, such as “summer-snow”) (Kischka et al. 1996). This was interpreted by the authors of the study as dopamine inhibiting the spread of activation within the semantic network, that is, a focusing on the task.

  75. “Tem­pera­ment and char­ac­ter cor­re­lates of neu­ropsy­cho­log­i­cal per­for­mance”, June 2010, Psy­cho­log­i­cal Soci­ety of South Africa↩︎

  76. Jaeggi 2008’s notes say the daily training was ~25 minutes; the longest group was 19 days, for a total of at most ~8 hours of training.↩︎

  77. See the crit­i­cal review of WM train­ing research, “Does work­ing mem­ory train­ing gen­er­al­ize?” (Ship­stead et al 2010).↩︎

  78. The R code (the `on`/`off` data vectors are truncated in this transcription; `BESTmcmc` & `BESTplot` come from the BEST package, loaded via `library(BEST)`):

    on <- c(35,31,27,66,25,38,35,43,60,47,38,58,50,23,50,45,60,37,22,28,50,20,41,42,47,55,47,42,35,
    off <- c(17,43,46,50,36,31,38,33,66,30,68,42,40,29,69,40,41,45,37,18,44,60,31,46,46,45,27,35,45,
    # [1] 158
    mcmcChain = BESTmcmc(off, on)
    postInfo = BESTplot(off, on, mcmcChain) # image
    #            SUMMARY.INFO
    # PARAMETER       mean   median     mode  HDIlow  HDIhigh pcgtZero
    #   mu1       40.96178 40.95536 40.93523 38.1887  43.7104       NA
    #   mu2       41.37400 41.37365 41.39874 38.8068  44.0550       NA
    #   muDiff    -0.41222 -0.41368 -0.45968 -4.2497   3.3593    41.54
    #   sigma1    12.32844 12.27614 12.28283 10.3024  14.4116       NA
    #   sigma2    11.21408 11.15464 10.99812  9.2924  13.1895       NA
    #   sigmaDiff  1.11436  1.10736  0.94511 -1.6011   3.9756    78.73
    #   nu        45.65240 37.49245 22.16426  5.3586 108.1555       NA
    #   nuLog10    1.56504  1.57394  1.61572  0.9956   2.1157       NA
    #   effSz     -0.03528 -0.03519 -0.03547 -0.3588   0.2851    41.54

    For those who prefer a regular two-sample test:

    wilcox.test(off, on)
    #     Wilcoxon rank sum test with continuity correction
    # data:  off and on
    # W = 3004, p-value = 0.7066
    # alternative hypothesis: true location shift is not equal to 0
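
The `effSz` row in the BEST output above is just the posterior mean difference scaled by the pooled standard deviation, (mu1 - mu2) / sqrt((sigma1^2 + sigma2^2)/2). A sanity-check sketch (in Python, plugging in the posterior means from the table):

```python
import math

def best_effect_size(mu1, mu2, sigma1, sigma2):
    """Cohen's-d-style effect size as reported by BEST: mean difference over pooled SD."""
    return (mu1 - mu2) / math.sqrt((sigma1 ** 2 + sigma2 ** 2) / 2)

# Posterior means from the BESTmcmc summary above:
print(round(best_effect_size(40.96178, 41.37400, 12.32844, 11.21408), 3))  # -0.035
```

This reproduces the reported `effSz` of -0.035, ie. a negligible effect.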
  79. “The Psy­chophys­i­ol­ogy of Lucid Dream­ing”, col­lected in Con­scious Mind, Sleep­ing Brain.↩︎

  80. pg 97 of his book The Dream Drug­store (2001)↩︎

  81. Abstract:

    Func­tional neu­roimag­ing stud­ies car­ried out on healthy vol­un­teers while per­form­ing differ­ent n-back tasks have shown a com­mon pat­tern of bilat­eral fron­topari­etal acti­va­tion, espe­cially of the dor­so­lat­eral pre­frontal cor­tex (DLPFC). Our objec­tive was to use func­tional mag­netic res­o­nance imag­ing (fMRI) to com­pare the pat­tern of brain acti­va­tion while per­form­ing two sim­i­lar n-back tasks which differed in their pre­sen­ta­tion modal­i­ty. Thir­teen healthy vol­un­teers com­pleted a ver­bal 2-back task pre­sent­ing audi­tory stim­uli, and a sim­i­lar 2-back task pre­sent­ing visual stim­uli. A con­junc­tion analy­sis showed bilat­eral acti­va­tion of fron­topari­etal areas includ­ing the DLPFC. The left DLPFC and the supe­rior tem­po­ral gyrus showed a greater acti­va­tion in the audi­tory than in the visual con­di­tion, whereas pos­te­rior brain regions and the ante­rior cin­gu­late showed a greater acti­va­tion dur­ing the visual than dur­ing the audi­tory task. Thus, brain areas involved in the visual and audi­tory ver­sions of the n-back task showed an impor­tant over­lap between them, reflect­ing the supramodal char­ac­ter­is­tics of work­ing mem­o­ry. How­ev­er, the differ­ences found between the two modal­i­ties should be con­sid­ered in order to select the most appro­pri­ate task for future clin­i­cal stud­ies.

  82. Schnei­der, B, Pichora-Fuller, MK. “Impli­ca­tions of per­cep­tual dete­ri­o­ra­tion for cog­ni­tive aging Research”. In: Craik, FI, Salt­house, TA, edi­tors. The hand­book of aging and cog­ni­tion, Psy­chol­ogy Press, 2000. ISBN-10: 080585990X↩︎

  83. Abstract: “The authors inves­ti­gated the dis­tinc­tive­ness and inter­re­la­tion­ships among visu­ospa­tial and ver­bal mem­ory processes in short­-term, work­ing, and long-term mem­o­ries in 345 adults. Begin­ning in the 20s, a con­tin­u­ous, reg­u­lar decline occurs for pro­cess­ing-in­ten­sive tasks (e.g., speed of pro­cess­ing, work­ing mem­o­ry, and long-term mem­o­ry), whereas ver­bal knowl­edge increases across the life span [Be­sides Salt­house, for the ver­bal flu­ency claim see Schaie, K. W. Intel­lec­tual Devel­op­ment in Adult­hood: The Seat­tle Lon­gi­tu­di­nal Study. Cam­bridge Uni­ver­sity Press, 1996]. There is lit­tle differ­en­ti­a­tion in the cog­ni­tive archi­tec­ture of mem­ory across the life span. Visu­ospa­tial and ver­bal work­ing mem­ory are dis­tinct but highly inter­re­lated sys­tems with domain-spe­cific short­-term mem­ory sub­sys­tems. In con­trast to recent neu­roimag­ing data, there is lit­tle evi­dence for ded­iffer­en­ti­a­tion of func­tion at the behav­ioral level in old com­pared with young adults.” That the neu­roimag­ing shows no change in gen­eral loca­tions of activ­ity is prob­a­bly inter­pretable as the lower per­for­mance being due to gen­eral low-level prob­lems and ineffi­cien­cies of age, and not the elder­ly’s brains start­ing to ‘unlearn’ spe­cific tasks.↩︎

  84. “The Z-s­core rep­re­sents the age-con­tin­gent mean, mea­sured in units of stan­dard devi­a­tion rel­a­tive to the pop­u­la­tion mean. More pre­cise­ly, the Z-s­core is (age-con­tin­gent mean minus pop­u­la­tion mean) / (pop­u­la­tion stan­dard devi­a­tion).” –Agar­wal et al 2009↩︎
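That definition is a one-liner; a minimal sketch with made-up illustrative numbers (not figures from the paper):

```python
def z_score(age_group_mean, population_mean, population_sd):
    """Agarwal et al 2009's Z-score: the age-contingent mean in population-SD units."""
    return (age_group_mean - population_mean) / population_sd

# Hypothetical: an age group averaging 9.2 on a test whose population mean is 8.5 (SD 1.4):
print(round(z_score(9.2, 8.5, 1.4), 2))  # 0.5
```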

  85. Agarwal et al 2009 (slides):

    …The preva­lence of demen­tia explodes after age 60, dou­bling with every 5 years of age.5 In the cohort above age 85, the preva­lence of demen­tia exceeds 30%. More­over, many older adults with­out a strict diag­no­sis of demen­tia, still expe­ri­ence sub­stan­tial cog­ni­tive impair­ment. For exam­ple, the preva­lence of the diag­no­sis “cog­ni­tive impair­ment with­out demen­tia” is nearly 30% between ages 80 and 89.6 Draw­ing these facts togeth­er, among the pop­u­la­tion between ages 80 and 89, about half of the pop­u­la­tion either has a diag­no­sis of demen­tia or cog­ni­tive impair­ment with­out demen­tia.

    …Third, using a new dataset, we doc­u­ment a link between age and the qual­ity of finan­cial deci­sion-mak­ing in debt mar­kets. In a cross-sec­tion of prime bor­row­ers, mid­dle-aged adults bor­row at lower inter­est rates and pay fewer fees rel­a­tive to younger and older adults. Aver­ag­ing across ten credit mar­kets, fee and inter­est pay­ments are min­i­mized around age 53. The mea­sured effects are not explained by observed risk char­ac­ter­is­tics. Com­bin­ing mul­ti­ple data sets we do not find evi­dence that selec­tion effects and cohort effects explain our results. The lead­ing expla­na­tion for the pat­terns that we observe is that expe­ri­ence rises with age, but ana­lyt­i­cal abil­i­ties decline with it.

    …Neu­ro­log­i­cal patholo­gies rep­re­sent one impor­tant path­way for age effects in older adults. For instance, demen­tia is pri­mar­ily attrib­ut­able to Alzheimer’s Dis­ease (60%) and vas­cu­lar dis­ease (25%). The preva­lence of demen­tia dou­bles with every five addi­tional years of life­cy­cle age (Ferri et al., 2006; Fratiglioni, De Ronchi, and Agüero-Tor­res, 1999).10 For exam­ple, Table 1 reports that the preva­lence of demen­tia in North Amer­ica rises from 3.3% for adults ages 70–74, to 6.5% for adults ages 75–79, to 12.8% for adults ages 80–84, to 30.1% for adults at least 85 years of age (Ferri et al. 2006). Many older adults also suffer from a less severe form of cog­ni­tive impair­ment, which is diag­nosed as “cog­ni­tive impair­ment with­out demen­tia.” For exam­ple, the preva­lence of this diag­no­sis rises from 16.0% for adults ages 71-79, to 29.2% for adults ages 80-89.

    • 10: There is also grow­ing lit­er­a­ture that iden­ti­fies age-re­lated changes in the nature of cog­ni­tion (see Park and Schwarz, 1999 [Cog­ni­tive Aging: A Primer]; and Den­burg, Tranel, and Bechara 2005). Mather and Carstensen (2005) and iden­tify age-vari­a­tion in cog­ni­tive pref­er­ences. Sub­jects with short time hori­zons or older ages attend to neg­a­tive infor­ma­tion rel­a­tively less than sub­jects with long time hori­zons or younger ages.

    …Fig­ure 4d plots naive and con­trol per­for­mance in the Tele­phone Inter­view of Cog­ni­tive Sta­tus (TICS) task. This task asks the respon­dent ten triv­ial ques­tions and assigns one point for each cor­rect answer: What is the cur­rent year? Mon­th? Day? Day of the week? What do you usu­ally use to cut paper? What do you call the kind of prickly plant that grows in the desert? Who is the cur­rent pres­i­dent? Vice pres­i­dent? Count back­wards from twenty to ten (twice). At age 63, the aver­age score is 9.2 out of 10. By age 90, the aver­age (con­trol) score is 7.5. Final­ly, we present two mea­sures of prac­ti­cal numer­a­cy. 4e plots naive and con­trol per­for­mance in response to the ques­tion: If the chance of get­ting a dis­ease is 10 per­cent, how many peo­ple out of 1,000 would be expected to get the dis­ease? At age 53, 79% answer cor­rect­ly. By age 90, 50% answer cor­rect­ly. Fig­ure 4f plots naive and con­trol per­for­mance in response to the ques­tion: If 5 peo­ple all have the win­ning num­bers in the lot­tery and the prize is two mil­lion dol­lars, how much will each of them get? We believe that this ques­tion is impre­cisely posed, since the log­i­cal answer could be either $2,000,000 or $400,000. How­ev­er, the results are still inter­est­ing, since the frac­tion answer­ing $400,000 (the offi­cial cor­rect answer) drops pre­cip­i­tous­ly. At age 53, 52% answer $400,000. By age 90, 10% give this answer.

    …For the 1989, 1998, 2001, and 2004 sur­veys, we com­pute the ratios of income, edu­ca­tion, and net worth for bor­row­ers to the pop­u­la­tion as a whole, by age group; results are pre­sented in the online appen­dix. We find that within age groups, bor­row­ers almost always have higher lev­els of income and edu­ca­tion than the pop­u­la­tion as a whole, and often have higher lev­els of net worth. More­over, older bor­row­ers appear to have rel­a­tively higher lev­els of income and edu­ca­tion rel­a­tive to their peers than mid­dle-aged bor­row­ers do. Hence these data sug­gest that selec­tion effects by age go in the oppo­site direc­tion: older bor­row­ers appear to be a bet­ter pool than mid­dle-aged bor­row­ers. We present addi­tional results in the online appen­dix show­ing that bor­row­ing by age does not appear to vary by race, and that older bor­row­ers do not appear to have dis­pro­por­tion­ately lower incomes, FICO score, or higher debt lev­els. None of these analy­ses lend sup­port to the idea that sam­ple selec­tion effects con­tribute to the U-shape pat­terns that we see in the data.

    …The effects we find have a wide range of dol­lar mag­ni­tudes, reported in Table 4. We esti­mate that, for home­-e­quity lines of cred­it, 75-year-olds pay about $265 more each year than 50-year-olds, and 25-year-olds pay about $295 more. For other quan­ti­ties, say, credit card fees, the implied age differ­en­tials are small - roughly $10-$20 per year for each kind of fee. The impor­tance of the U-shaped effects we esti­mate goes beyond the eco­nomic sig­nifi­cance of each indi­vid­ual choice, how­ev­er: it lies in the fact that the appear­ance of a U-shaped pat­tern of costs in such a wide vari­ety of cir­cum­stances points to a phe­nom­e­non that might apply to many areas.
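
The quoted “doubling with every five additional years” can be checked against the Ferri et al 2006 prevalence figures with a simple exponential model (a rough sketch; the anchor ages below are bracket midpoints I have assumed, not values from the paper):

```python
def dementia_prevalence(age, base_prevalence=3.3, base_age=72, doubling_years=5):
    """Prevalence (%) under a 'doubles every 5 years' model, anchored at the 70-74 bracket."""
    return base_prevalence * 2 ** ((age - base_age) / doubling_years)

# Observed (Ferri et al 2006): 3.3% (70-74), 6.5% (75-79), 12.8% (80-84), 30.1% (85+)
for age, observed in [(77, 6.5), (82, 12.8), (87, 30.1)]:
    print(age, round(dementia_prevalence(age), 1), observed)
```

The model tracks the late 70s and early 80s well (predicting 6.6% & 13.2%) but underpredicts the oldest bracket (26.4% vs 30.1%), where prevalence accelerates.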

  86. The prac­tice effect can last for many years. “Influ­ence of Age on Prac­tice Effects in Lon­gi­tu­di­nal Neu­rocog­ni­tive Change”, Salt­house 2010:

    Longitudinal comparisons of neurocognitive functioning often reveal stability or age-related increases in performance among adults under about 60 years of age. Because nearly monotonic declines with increasing age are typically evident in cross-sectional comparisons, there is a discrepancy in the inferred age trends based on the two types of comparisons…Increased age was associated with significantly more negative longitudinal changes with each ability. All of the estimated practice effects were positive, but they varied in magnitude across neurocognitive abilities and as a function of age. After adjusting for practice effects the longitudinal changes were less positive at younger ages and slightly less negative at older ages. Conclusions: It was concluded that some, but not all, of the discrepancy between cross-sectional and longitudinal age trends in neurocognitive functioning is attributable to practice effects positively biasing the longitudinal trends.

  87. Perhaps surprisingly, the common wisdom that people adopt conservative attitudes as part of the aging process may not be correct; the observed conservatism of old people may instead be due to their coming from a more conservative time (ie. the past, as the 20th century saw a grand sweep of liberal beliefs through First World societies); “Population Aging, Intracohort Aging, and Sociopolitical Attitudes”, Danigelis et al 2007’s abstract (excerpts):

    Pre­vail­ing stereo­types of older peo­ple hold that their atti­tudes are inflex­i­ble or that aging tends to pro­mote increas­ing con­ser­vatism in sociopo­lit­i­cal out­look. In spite of mount­ing sci­en­tific evi­dence demon­strat­ing that learn­ing, adap­ta­tion, and reassess­ment are behav­iors in which older peo­ple can and do engage, the stereo­type per­sists. We use U.S. Gen­eral Social Sur­vey data from 25 sur­veys between 1972 and 2004 to for­mally assess the mag­ni­tude and direc­tion of changes in atti­tudes that occur within cohorts at differ­ent stages of the life course. We decom­pose changes in sociopo­lit­i­cal atti­tudes into the pro­por­tions attrib­ut­able to cohort suc­ces­sion and intra­co­hort aging for three cat­e­gories of items: atti­tudes toward his­tor­i­cally sub­or­di­nate groups, civil lib­er­ties, and pri­va­cy. We find that sig­nifi­cant intra­co­hort change in atti­tudes occurs in cohort­s-in-later-stages (age 60 and old­er) as well as cohort­s-in-ear­lier-stages (ages 18 to 39), that the change for cohort­s-in-later-stages is fre­quently greater than that for cohort­s-in-ear­lier-stages, and that the direc­tion of change is most often toward increased tol­er­ance rather than increased con­ser­vatism. These find­ings are dis­cussed within the con­text of pop­u­la­tion aging and devel­op­ment.

  88. “Cog­ni­tive Decline Begins In Late 20s, Study Sug­gests”, Sci­ence Daily↩︎

  89. “This Is Your Brain. Aging. Sci­ence is reshap­ing what we know about get­ting old­er. (The news is bet­ter than you think.)”, Newsweek:

    The [Salthouse] graph shows two roller-coastering lines. One represents the proportion of people of each age who are in the top 25% on a standard lab test of reasoning ability. The other shows the proportion of CEOs of companies of each age. Reasoning ability peaks at about age 28 and then plummets, tracing that well-known plunge that makes those older than 30 (OK, fine, 40) cringe: only 6% of top scorers are in their 50s, and only 4% are in their 60s. But the age distribution of CEOs is an almost perfect mirror image: it peaks just before age 60. About half are older than 55. And the number under 40 is about zero.

    …Salthouse deduces more counterintuitive, and hopeful, lessons. The first is that in real life, rather than in psych labs, people rely on mental abilities that stand up very well to age and discover work-arounds for the mental skills that do fade.

  90. “How to Gain Eleven IQ Points in Ten Min­utes: Think­ing Aloud Improves Raven’s Matri­ces Per­for­mance in Older Adults”, Fox et al 2009:

    “Few stud­ies have exam­ined the impact of age on reac­tiv­ity to con­cur­rent think-aloud (TA) ver­bal reports. An ini­tial study with 30 younger and 31 older adults revealed that think­ing aloud improves older adult per­for­mance on a short form of the Raven’s Matri­ces (Bors & Stokes, 1998, Edu­ca­tional and Psy­cho­log­i­cal Mea­sure­ment, 58, p. 382) but did not affect other tasks. In the repli­ca­tion exper­i­ment, 30 older adults (mean age = 73.0) per­formed the Raven’s Matri­ces and three other tasks to repli­cate and extend the find­ings of the ini­tial study. Once again older adults per­formed sig­nifi­cantly bet­ter only on the Raven’s Matri­ces while think­ing aloud. Per­for­mance gains on this task were sub­stan­tial (d = 0.73 and 0.92 in Exper­i­ments 1 and 2, respec­tive­ly), cor­re­spond­ing to a fluid intel­li­gence increase of nearly one stan­dard devi­a­tion.”
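
The titular “eleven IQ points” is just the reported effect size re-expressed on the IQ scale: a gain of d standard deviations corresponds to 15 × d IQ points, since the IQ scale is defined with SD = 15. A quick check (not code from the paper):

```python
def d_to_iq_points(d, iq_sd=15):
    """Convert a standardized effect size d into IQ points (IQ scale has SD = 15)."""
    return d * iq_sd

print(round(d_to_iq_points(0.73)))  # 11 -- Experiment 1's d = 0.73
print(round(d_to_iq_points(0.92)))  # 14 -- Experiment 2's d = 0.92
```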

  91. Some rel­e­vant excerpts:

    Buschkuehl et al. (2008) pro­posed an adap­tive visual WM train­ing pro­gram to old-old adults: Their results showed sub­stan­tial gains in the WM trained tasks. Short and long-term trans­fer effects were found only for tasks with the same stim­uli con­tent. Sim­i­lar­ly, Li et al. (2008) found in young and older adults spe­cific improve­ment in the task prac­ticed-a spa­tial 2 n-back WM task-that involved two con­di­tions: one stan­dard, one more demand­ing. Trans­fer effects were found on a more demand­ing 3 n-back visual task as well as on numer­i­cal n-back tasks. Although near trans­fer effects to the same (vi­su­al) and also differ­ent (nu­mer­i­cal) modal­ity were shown, no far trans­fer effects to more com­plex WM tasks (op­er­a­tion and rota­tion span tests) were found. With regard to main­te­nance effects, Buschkuehl et al. (2008) failed to find any main­te­nance 1 year after com­ple­tion of train­ing, in com­par­i­son with pretest. In con­trast, Li et al. (2008) showed a main­te­nance of prac­tice gains and of near-trans­fer effects at 3-month fol­low-up; nonethe­less, in con­trast with young adults, older par­tic­i­pants showed a per­for­mance decre­ment from post­prac­tice to fol­low-up.

    …Com­mon mea­sures used in cog­ni­tive aging research, and the­o­ret­i­cally related to WM, were cho­sen: short­-term mem­o­ry, fluid intel­li­gence, inhi­bi­tion, and pro­cess­ing speed (Craik & Salt­house, 2000; Ver­haeghen, Steitz, Sli­win­ski, & Cerel­la, 2003). For near­est-trans­fer effects, a visu­ospa­tial WM task (Dot Matrix task; adapted from Miyake, Fried­man, Ret­tinger, Shah, & Hegar­ty, 2001) was includ­ed. This task involves processes (elab­o­ra­tion and pro­cess­ing phase) sim­i­lar to the one prac­ticed. How­ev­er, the nature of the mate­r­ial and the sec­ondary require­ment are differ­ent from those of the trained task. The For­ward and Back­ward Digit Span tests were used to assess near-trans­fer effects because they are part of the gen­eral mem­ory fac­tor, but the task requests were differ­ent from those of the WM tasks (see Bopp & Ver­haeghen, 2005). Because these tasks mea­sure the same nar­row or same broad abil­i­ty, we expect trans­fer effects onto them. To deter­mine the pres­ence of far trans­fer effects, we chose clas­sic tasks: the Cat­tell task to mea­sure non­ver­bal rea­son­ing abil­i­ty; the Stroop Color test to index inhi­bi­tion-re­lated mech­a­nisms; and the Pat­tern Com­par­i­son test to assess pro­cess­ing speed. The trans­fer abil­i­ties were cho­sen with con­sid­er­a­tion of their rela­tion­ship to WM process­es. Work­ing mem­ory impair­ment in older adults is gen­er­ally attrib­uted to gen­eral mech­a­nisms such as inhi­bi­tion and pro­cess­ing speed (Borella et al., 2008). Fur­ther­more, WM is fre­quently advanced as one of the mech­a­nisms that also accounts for age-re­lated differ­ences in intel­li­gence tasks (de Rib­aupierre & Lecerf, 2006; Rab­bitt & Lowe, 2000; Schaie & Hert­zog, 1986)…

    The Cat­e­go­riza­tion Work­ing Mem­ory Span task (CWMS; Borella et al. 2008; De Beni, Borel­la, Car­ret­ti, Marigo, & Nava, 2008) is sim­i­lar to the clas­sic WM tasks, such as the Lis­ten­ing Span test (Borella et al., 2008), the only differ­ence being that it involves pro­cess­ing lists of words rather than sen­tences, lim­it­ing the role of seman­tic pro­cess­ing. The mate­ri­als con­sisted of 10 sets of words, each set com­pris­ing 20 lists of words, which were orga­nized in series of word lists of differ­ent lengths (from 2 to 6). Each list con­tained 5 words of high­-medium fre­quen­cy. Fur­ther­more, the lists con­tained zero, one, or two ani­mal nouns, present in any posi­tion, includ­ing last. An exam­ple list is house, moth­er, dog, word, night. Of the total num­ber of words (200) in the task, 28% were ani­mal words. Par­tic­i­pants lis­tened to the lists of words audiorecorded pre­sented at a rate of 1 s per word and had to tap their hand on the table when­ever they heard an ani­mal noun (pro­cess­ing phase). The inter­val between series of word lists was 2 s (the pre­sen­ta­tion was thus paced by the exper­i­menter). At the end of the series, par­tic­i­pants recalled the last word of each string in ser­ial order (main­te­nance phase). Two prac­tice tri­als of 2-word length were given before the exper­i­ment start­ed. Words recalled were writ­ten down by the exper­i­menter on a pre­pared form. The total num­ber of cor­rectly recalled words was used as the mea­sure of WM per­for­mance (max­i­mum score 20). This score has been demon­strated to show large cor­re­la­tions with visu­ospa­tial (Jig­saw Puz­zle test) and ver­bal (Lis­ten­ing Span test) WM tasks (Borella et al., 2008), and mea­sures of fluid intel­li­gence (Borella et al., 2006).

    …Cul­ture Fair test, Scale 3 (Cat­tell & Cat­tell, 1963). Scale 3 of the Cat­tell test con­sists of two par­al­lel forms (A and B), each con­tain­ing four sub­tests to be com­pleted in 2.5 to 4 min, depend­ing on the sub­test. In the first sub­test, Series, par­tic­i­pants saw an incom­plete series of abstract shapes and fig­ures and had to choose from six alter­na­tives that best com­pleted the series. In the sec­ond sub­test, Clas­si­fi­ca­tions, par­tic­i­pants saw 14 prob­lems com­pris­ing abstract shapes and fig­ures and had to choose which 2 of the 5 differed from the other 3. In the third sub­test, Matri­ces, par­tic­i­pants were pre­sented with 13 incom­plete matri­ces con­tain­ing four to nine boxes of abstract fig­ures and shapes plus an empty box and six choic­es: Their task was to select the answer that cor­rectly com­pleted each matrix. In the final sub­test, Con­di­tions, par­tic­i­pants were pre­sented with 10 sets of abstract fig­ures, lines, and a sin­gle dot, along with five alter­na­tives: Their task was to assess the rela­tion­ship among the dot, fig­ures, and lines, then choose the alter­na­tive in which a dot could be posi­tioned in the same rela­tion­ship. The depen­dent vari­able was the num­ber of cor­rectly solved items across the four sub­sets (max­i­mum score of 50). One of the two par­al­lel forms (A or B) was admin­is­tered at pretest, the other at posttest in coun­ter­bal­anced fash­ion across test­ing ses­sions.

    …Far-transfer effect. For the Cattell test, results indicated that trained participants performed significantly better than did controls (Mdiff = 3.22, p < .001). Posttest and follow-up performances were significantly better than on pretest (Mdiff = 3.40, p < .001, and Mdiff = 2.75, p < .001, respectively). No significant difference was found between posttest and follow-up. Post hoc comparisons revealed that only the trained group showed significant improvement in performance between pretest and both posttest (p < .001) and follow-up (p < .001), although posttest performance was not different from that of follow-up. By contrast, no significant difference was found for the control group. The trained group performed better at both posttest and follow-up than did the control group (p < .001).

    …First, the par­tic­i­pants involved in our study were young-old (mean age of 69 years), whereas in Buschkuehl et al.’s (2008) study as well as that of Li et al. (2008), they were old-old adults (mean age of 80.1 and 74.5 years, respec­tive­ly). In the con­text of episodic mem­o­ry, the meta-analy­sis by Ver­haeghen et al. (1992) has pointed out that the ben­e­fit of inter­ven­tions is neg­a­tively related to par­tic­i­pant age (see also Singer, Lin­den­berg­er, & Bal­tes, 2003). It has been shown that cog­ni­tive plas­tic­ity is reduced over the adult life span (Jones et al., 2006), with young-old exhibit­ing larger train­ing-re­lated gains than old-old (Singer et al., 2003). The impor­tance of par­tic­i­pant age is evi­dent from con­sid­er­ing the results of train­ing focused on exec­u­tive con­trol tasks-for exam­ple, task-switch­ing (Buch­ler, Hoy­er, & Cerel­la, 2008; Kar­bach & Kray, 2009; Kramer, Hahn, & Gopher, 1999), dual tasks (Bherer et al., 2005, 2008), or gen­eral exec­u­tive func­tions (Basak et al., 2008)-for which trans­fer effects emerged with a sam­ple com­pris­ing young-old (age range between 60 and 75 years, mean age between 65 and 71 years; Basak et al., 2008; Bherer et al., 2005, 2008; Kar­bach & Kray, 2009; Kramer et al., 1995). The ques­tion of whether trans­fer effects of WM train­ing can also be deter­mined by par­tic­i­pant age range is of inter­est and should be addressed in fur­ther research.

    Sec­ond, as is men­tioned at the begin­ning of this sec­tion, the task and the pro­ce­dure used to train par­tic­i­pants can be con­sid­ered an impor­tant source of differ­ence. For exam­ple, Buschkuehl et al. (2008) reported that trained par­tic­i­pants claimed to have gen­er­ated task-spe­cific strate­gies in one of the vari­ants of the WM task in which they were trained, lead­ing to greater train­ing gains (62%) with respect to the other two vari­ants (44% and 15%, respec­tive­ly). The diffi­culty of trans­fer­ring the gains obtained in a spe­cific task to other tasks sug­gests that the WM train­ing by Buschkuehl et al. did not fos­ter an increase in flex­i­bil­i­ty, but sim­ply the ten­dency to find a strat­egy to recall as many items as pos­si­ble but in the con­text of each WM task. In the case of Li et al. (2008), the mod­est trans­fer effects to the WM task can be explained by reflect­ing on the nature of the trained task: n-back task, which involves the manip­u­la­tion and main­te­nance of infor­ma­tion as well as updat­ing of tem­po­ral order and con­tex­tual infor­ma­tion and bind­ing processes between stim­uli and cer­tain rep­re­sen­ta­tion (Ober­auer, 2005). Although the n-back shares com­mon pro­cess­ing mech­a­nisms with com­plex span tasks, the under­ly­ing mech­a­nisms of the n-back are not com­pletely under­stood (Schmiedek, Hilde­brandt, Löv­den, Wil­helm, & Lin­den­berg­er, 2009). More­over, the few stud­ies that used it with other WM tasks- com­plex span tasks- have shown vari­able cor­re­la­tions (from very low or nul­l-Kane, Con­way, Miu­ra, & Colflesh, 2007; Roberts & Gib­son, 2002-to large-Schmiedek et al., 2009; Shamosh et al., 2008).

  92. Specifically, performance on the attentional blink task; see Slagter 2007; cf. “Study Suggests Meditation Can Help Train Attention” (New York Times).↩︎

  93. “Can Med­i­ta­tion Curb Heart Attacks?” (New York Times)↩︎

  94. Wright R, Thompson WL, Ganis G, Newcombe NS, Kosslyn SM. “Training generalized spatial skills”. Psychonomic Bulletin & Review 2008 Aug;15(4):763-71.

    …The present study inves­ti­gated whether inten­sive long-term prac­tice leads to change that tran­scends stim­u­lus and task para­me­ters. Thir­ty-one par­tic­i­pants (14 male, 17 female) were tested on three cog­ni­tive tasks: a com­put­er­ized ver­sion of the Shep­ard-Met­zler (1971) men­tal rota­tion task (MRT), a men­tal paper-fold­ing task (MPFT), and a ver­bal analo­gies task (VAT). Each indi­vid­ual then par­tic­i­pated in daily prac­tice ses­sions with the MRT or the MPFT over 21 days. Post­prac­tice com­par­isons revealed trans­fer of prac­tice gains to novel stim­uli for the prac­ticed task, as well as trans­fer to the oth­er, non­prac­ticed spa­tial task. Thus, prac­tice effects were process based, not instance based. Improve­ment in the non­prac­ticed spa­tial task was greater than that in the VAT; thus, improve­ment was not merely due to greater ease with com­put­er­ized test­ing.

  95. [A pre­vi­ous ver­sion of this foot­note dis­cussed method­olog­i­cal prob­lems & trans­ferra­bil­ity of ani­mal research; this has been moved else­where.] “First Direct Evi­dence of Neu­ro­plas­tic Changes Fol­low­ing Brain­wave Train­ing” & “Mind­ful­ness Med­i­ta­tion Train­ing Changes Brain Struc­ture in Eight Weeks”, Sci­ence Daily↩︎