Zeo sleep self-experiments

EEG recordings of sleep and my experiments with things affecting sleep quality or durations: melatonin, potassium, vitamin D etc
experiments, biology, psychology, nootropics, statistics, predictions, Zeo, Haskell, R, power-analysis, Bayes
2010-12-28 to 2018-02-28; in progress; certainty: likely; importance: 8

I dis­cuss my beliefs about Quan­ti­fied Self, and demon­strate with a series of self­-ex­per­i­ments using a Zeo. A Zeo records sleep via EEG; I have made many mea­sure­ments and per­formed many exper­i­ments. This is what I have learned so far:

  1. the Zeo head­band is wear­able long-term
  2. mela­tonin improves my sleep
  3. one-legged stand­ing does lit­tle
  4. Vit­a­min D at night dam­ages my sleep & Vit­a­min D in morn­ing does not affect my sleep
  5. potas­sium (over the day but not so much the morn­ing) dam­ages my sleep and does not improve my mood/productivity
  6. small quan­ti­ties of alco­hol appear to make lit­tle differ­ence to my sleep qual­ity
  7. I may be bet­ter off chang­ing my sleep tim­ing by wak­ing up some­what ear­lier & going to bed some­what ear­lier
  8. lithium oro­tate does not affect my sleep
  9. Red­shift causes me to go to bed ear­lier
  10. ZMA: incon­clu­sive results slightly sug­ges­tive of ben­e­fits

Quantified Self (QS) is a movement with many faces and as many variations as participants, but the core of everything is this: experiment with things that can improve your life.

What is QS?

Quan­ti­fied Self is not expen­sive devices, or meet-ups, or videos, or even ebooks telling you what to do. Those are tools to an end. If read­ing this page does any­thing, my hope is to pass on to some read­ers the Quan­ti­fied Self atti­tude: a play­ful thought­ful atti­tude, of won­der­ing whether this thing affects that other thing and what impli­ca­tions could be eas­ily test­ed. “Sci­ence” with­out the cap­i­tal “S” or the belief that only sci­en­tists are allowed to think.

That’s all Quan­ti­fied Self is, no mat­ter how sim­ple or com­pli­cated your devices, no mat­ter how auto­mated your data col­lec­tion, no mat­ter whether you found a pedome­ter lying around or hand-engi­neered your own EEG head­set.

Quan­ti­fied Self is sim­ply about hav­ing ideas, gath­er­ing some data, see­ing what it says, and improv­ing one’s life based on the data. If gath­er­ing data is too hard and would make your life worse off—then don’t do it! If the data can’t make your life bet­ter—then don’t do it! Not every idea can or should be test­ed.

The QS cycle is straight­for­ward and flex­i­ble:

  1. Have an idea
  2. Gather data
  3. Test the data
  4. Make a change; GOTO 1

Any of these steps can over­lap: you may be col­lect­ing sleep data long before you have the idea (in the expec­ta­tion that you will have an idea), or you may be mak­ing the change as part of the data in an exper­i­men­tal design, or you may inad­ver­tently engage in a “nat­ural exper­i­ment” before won­der­ing what the effects were (per­haps the baby wakes you up on ran­dom nights and lets you infer the costs of poor sleep).

The point is not publishable scientific rigor. If you are the sort of person who wants to run such rigorous self-experiments, fantastic! The point is making your life better, for which scientific certainty is not necessary. Imagine you are choosing between two sleep pills of equal price and equal safety: the first will make you fall asleep 1 minute faster and has been validated in countless scientific trials, while the second has, in the past week, ended the sweaty nightmares that have plagued you every few days since childhood, but alas has only a few small trials in its favor. Which would you choose? I would choose the second pill!

To put it in more economic/statistical terms, what we want from a self­-ex­per­i­ment is for it to give us a con­fi­dence just good enough to tell whether the expected value of our idea is more than the idea will cost. But we don’t need more con­fi­dence unless we want to per­suade other peo­ple! (So from this per­spec­tive, it is pos­si­ble to do a QS self­-ex­per­i­ment which is “too good”. Much like one can over­pay for safety and buy too much insur­ance—­like on elec­tron­ics such as video game con­soles, a noto­ri­ous rip-off.)

What QS Is Not: (Just) Data Gathering

One failure mode which is particularly dangerous for QSers is to overdo the data collection and collect masses of data they never use. Famous computer entrepreneur & mathematician Stephen Wolfram exemplified this for me in March 2012 with his lengthy blog post “The Personal Analytics of My Life”, in which he did some impressive graphing and exploration of data from 1989 to 2012: a third of a million (!) emails, full keyboard logging, calendar, phone call logs (with missed calls included), a pedometer, revision history of his tome A New Kind of Science, file types accessed per date, parsing scanned documents for dates, a treadmill, and perhaps more he didn’t mention.

Wol­fram’s dataset is well-de­picted in infor­ma­tive graphs, breath­tak­ing in its thor­ough­ness, and even more impres­sive for its dura­tion. So why do I read his post with sor­row? I am sad for him because I have read the post sev­eral times, and as far as I can see, he has not ben­e­fited in any way from his data col­lec­tion, with one minor excep­tion:

Very early on, back in the 1990s, when I first ana­lyzed my e-mail archive, I learned that a lot of e-mail threads at my com­pany would, by a cer­tain time of day, just resolve them­selves. That was a use­ful thing to know, because if I jumped in too early I was just wast­ing my time.

Noth­ing else in his life was bet­ter 1989-2012 because he did all this, and he shows no indi­ca­tion that he will ben­e­fit in the future (be­sides hav­ing a very nifty blog post). And just read­ing through his post with a lit­tle imag­i­na­tion sug­gests plenty of exper­i­ments he could do:

  1. He men­tions that 7% of his key­strokes are the Back­space key.

    This seems remarkably high, vastly higher than my own use of backspace, and must be slowing down his typing by a nontrivial amount. Why doesn’t he try a typing tutor to see if he can improve his typing skill, or learn the keyboard shortcuts in his text editor? If he is wasting >7% of all his typing (because he had to type what he is Backspacing over, of course), then he is wasting typing time, slowing things down, adding frustration to his computer interactions and, worst, putting himself at greater risk of crippling RSI.

  2. How often does he access old files? Since he records access to all files, he can ask whether all the log­ging is pay­ing for itself.

  3. Is there any con­nec­tion between the steps his pedome­ter records and things like his mood or email­ing? Exer­cise has been linked to many ben­e­fits, both phys­i­cal and men­tal, but on the other hand, walk­ing isn’t a very quick form of exer­cise. Which effect pre­dom­i­nates? This could have the prac­ti­cal con­se­quence of sched­ul­ing a daily walk just as he tries to make sure he can have din­ner with his fam­i­ly.

  4. Does a flurry of emails or phone calls dis­rupt his other forms of pro­duc­tiv­ity that day? For exam­ple, while writ­ing his book would he have been bet­ter off bar­ri­cad­ing him­self in soli­tude or work­ing on it in between other tasks?

  5. His email counts are aston­ish­ingly high in gen­er­al:

    Is answering so many emails really necessary? Perhaps he has put too much emphasis on email communication, or perhaps this indicates he should delegate more; or if running his company is so time-consuming, perhaps he should re-evaluate his life and ask whether that is what he truly wants to do now. I have no idea what the answers to any of these questions are or whether an experiment of any kind could be run on them, but these are key life decisions which could have been prompted by the data, but weren’t.

Another QS piece (“It’s Hard to Stay Friends With a Digital Exercise Monitor”) struck me when the author, Jenna Wortham, reflected on her experience with her motion sensor:

The for­get­ful­ness and guilt I expe­ri­enced as my Fuel­Band hon­ey­moon wore off is not uncom­mon, accord­ing to peo­ple who study behav­ioral sci­ence. The col­lected data is often inter­est­ing, but it is hard to ana­lyze and use in a way that spurs change. “It does­n’t trig­ger you to do any­thing habit­u­al­ly,” said Michael Kim, who runs Kairos Labs, a Seat­tle-based com­pany spe­cial­iz­ing in design­ing social soft­ware to influ­ence behav­ior…Mr. Kim, whose résumé includes a stint as direc­tor of Xbox Live, the online gam­ing sys­tem cre­ated by Microsoft, said the game-like mech­a­nisms of the Nike device and oth­ers like it were “not enough” for the aver­age user. “Points and badges do not lead to behav­ior change,” he said.

Finally, Neal Stephenson, in discussing his treadmill desk use, focuses on estimating mileage & caloric expenditure and showing the effects of bad posture he developed, but he entirely ignores issues of whether it affected his typing, his writing, or anything that might actually matter.

One thinks of a saying of W. Edwards Deming: “Experience by itself teaches nothing.” Indeed. A QS experiment is a 4-legged beast: if any leg is far too short or far too long, it can’t carry our burdens.

And with Wol­fram and Wortham, we see that 2 legs of the poor beast have been ampu­tat­ed. They col­lected data, but they had no ideas and they made no changes in their life; and because QS was not part of their life, it soon left their life. Wortham seems to have dropped the approach entire­ly, and Wol­fram may only per­se­vere for as long as the data con­tin­ues to be use­ful in demon­strat­ing the abil­i­ties of his com­pa­ny’s prod­ucts.

Zeo QS

On Christmas 2010, I received one of Zeo Inc.’s (founded 2003, shut down 2013) Zeo bedside units after long coveting it and dreaming of using it for all sorts of sleep-related questions. (As of February 2013, the bedside unit seems to’ve been discontinued; the most comparable Zeo Inc. product seems to be the Zeo Sleep Manager Pro, ~$90.) With it, I began to apply my thoughts about Quantified Self.

A Zeo is a scaled-down (one-electrode) EEG sensor-headband, which happens to have an alarm clock attached. The EEG data is processed to estimate whether one is asleep and what stage of sleep one is in. Zeo breaks sleep down into waking, REM sleep, light sleep, and deep sleep. (The phases aren’t necessarily that physiologically distinct.) It’s been compared with regular polysomnography (PSG) by Zeo Inc and others[1] and seems to be reasonably accurate. (Since regular sleep tests cost hundreds to thousands of dollars per session and are of questionable external validity, since they take place in a different, uncomfortable setting than your own bedroom, I am fine with a Zeo being just “reasonably” accurate in predicting PSG ratings.)

The data is much better than what you would get from more popular methods like cellphones with accelerometers, since an accelerometer only knows if you are moving or not, which isn’t a very reliable indicator of sleep[2]. (You could just be lying there staring at the ceiling, wide awake. Or perhaps the cat is kneading you while you are in light sleep.) As well, half the interest is how exactly sleep phases are arranged and how long the cycles are; you could use that information to devise a custom polyphasic schedule or just figure out a better nap length than the rule-of-thumb of 20 minutes. And the price isn’t too bad: $150 for the normal Zeo as of February 2012. (The basic mobile Zeo is much cheaper, but I’ve seen people complain about it and apparently it doesn’t collect the same data as the more expensive mobile version or the original bedside unit.)


“A thinker sees his own actions as exper­i­ments & ques­tion­s—as attempts to find out some­thing. Suc­cess and fail­ure are for him answers above all.”

Friedrich Nietzsche, The Gay Science, §41

I per­son­ally want the data for a few dis­tinct pur­pos­es, but in the best Quan­ti­fied Self vein, mostly exper­i­ment­ing:

  1. more thoroughly quantifying the benefits of melatonin

    • and dose lev­els: 1.5mg may be too much. I should exper­i­ment with a vari­ety: 0.1, 0.5, 1.0, 1.5, and 3mg?
  2. quan­ti­fy­ing the costs of

  3. test­ing ben­e­fits of 3

  4. design­ing & start­ing

  5. assist­ing

  6. reduc­ing sleep time in gen­eral (bet­ter & less sleep)

  7. investigating effects of n-backing:

    • do n-back­ing just before sleep, and see whether per­cent­ages shift (more deep sleep as the brain grows/changes?) or whether one sleeps bet­ter (fewer awak­en­ings, less light sleep).
    • do n-back­ing after wak­ing up, to look for cor­re­la­tion between good/bad sleeps and per­for­mance (one would expect good sleep → good scores).
    • test the costs of polyphasic sleep on memory[4]
  8. (pos­i­tive) effect of one-legged stand­ing on sleep depth/efficiency

  9. pos­si­ble sleep reduc­tions due to med­i­ta­tion

  10. ser­ial cable uses:

    • quan­ti­fy­ing med­i­ta­tion (eg. length of gamma fre­quen­cies)
    • rank music by dis­tractibil­i­ty?
    • mea­sure focus over the day and dur­ing spe­cific activ­i­ties (eg. cor­re­late fre­quen­cies against n-back­ing per­for­mance)
  11. Mea­sure neg­a­tive effect of nico­tine on sleep & deter­mine appro­pri­ate buffer

  12. test claims of sleep ben­e­fits from mag­ne­sium

  13. caffeine pill wake-up trick

I have tried to do my little self-experiments as well as I know how to, and hopefully my results are less bogus than the usual anecdotes one runs into online. What I would really like is for other people (especially Zeo owners) to replicate my results. To that end I have taken pains to describe my setups in complete detail so others can use them, and provided the data and the complete programs used in analysis. If anyone replicates my results in any fashion, please contact me and I would be happy to link your self-experiment here!

First impressions

First night

Christmas morning, I unpacked it and admired the packaging, and then looked through the manual. The base-station/alarm-clock seems pretty sturdy and has a large clear screen. The headband seemed comfortable enough that it wouldn’t bother me. The various writings with it seemed rather fluffy and preppy, but I did my technical homework beforehand, so could ignore their crap.

Late that night (quite late, since the girls stayed up play­ing and Xbox danc­ing games and what not), I turn in weari­ly. I had noticed that the alarm seemed to be set for ~3:30 AM, but I was very tired from the long day and tak­ing my mela­ton­in, and did­n’t inves­ti­gate fur­ther—I mean, what elec­tronic would ship with the alarm both enabled and enabled for a bizarre time? It was­n’t worth both­er­ing the other sleeper by turn­ing on the light and mess­ing with it. I put on the head­band, ver­i­fied that the Zeo seemed to be doing stuff, and turned in. Come 3 AM, and the damn music goes off! I hit snooze, too dis­com­bob­u­lated to fig­ure out how to turn off the alarm.

So that explains the strange Zeo data for the first day:

First night

The major surprise in this data was how quickly I fell asleep: 18 minutes. I had always thought that I took much longer to fall asleep, more like 45 minutes, and had budgeted accordingly; but apparently being deluded about when you are awake and asleep is common, which raises a question: if your memories disagree with the Zeo, which should you believe? The rest of the data seemed too messed up by the alarm to learn anything from.



Meditation

One possible application for Zeo was meditation. Most meditation studies are very small & methodologically weak, so it might be worthwhile to verify for oneself any interesting claims. If Zeo’s measuring via EEG, then presumably it’s learning something about how relaxed and activity-less one’s mind is. I’m not seeking enlightenment, just calmness, which would seem to be in the purview of an EEG signal. (As Charles Babbage said, errors made using insufficient data are still less than errors made using no data at all.) But alas, I meditated for a solid 25 minutes and the Zeo stubbornly read at the same wake level the entire time; I then read my book, Modern Japanese Diaries, for a similar period with no change at all. It is possible that the 5-minute averaging (Zeo measures every 2 seconds) is hiding useful changes, but probably it’s simply not picking up any real differences. Oh well.

Smart alarm

The sec­ond night I had set the alarm to a more rea­son­able time, and also enabled its smart alarm mode (“Smart­Wake”), where the alarm will go off up to 30 min­utes early if you are ever detected to be awake or in light sleep (as opposed to REM or deep sleep). One thing I for­got to do was take my mela­ton­in; I keep my sup­ple­ments in the car and there was a howl­ing bliz­zard out­side. It did­n’t bother me since I am not addicted to mela­tonin.

In the morning, the smart alarm mode seemed to work pretty well. I woke up early in a good mood, thought clearly and calmly about the situation, and went back to sleep. (It’s a holiday, after all.)
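The SmartWake rule described above can be sketched in a few lines of R. This is only an illustration of the rule as described (wake at the first epoch of wake or light sleep within the 30 minutes before the set alarm, otherwise at the alarm itself), not Zeo's actual firmware; the function name `smart_wake`, the stage labels, the 5-minute epoch length, and the toy night are all assumptions made for the example.

```r
# Hypothetical sketch of a SmartWake-style rule; not Zeo's real algorithm.
# stages:     character vector of sleep stages, one per epoch; the last epoch
#             is the set alarm time.
# epoch_min:  assumed epoch length in minutes.
# window_min: how many minutes early the alarm is allowed to fire.
smart_wake <- function(stages, epoch_min = 5, window_min = 30) {
  n      <- length(stages)
  window <- ceiling(window_min / epoch_min)   # epochs in the early window
  first  <- max(1, n - window)                # earliest epoch allowed to trigger
  hits   <- which(stages[first:n] %in% c("Wake", "Light")) + first - 1
  if (length(hits) > 0) hits[1] else n        # otherwise fall back to the set alarm
}

# Toy night (made-up stages), ending at the set alarm time.
night <- c("Light", "Deep", "Deep", "REM", "REM", "REM", "Light", "REM", "REM", "REM")
wake_epoch    <- smart_wake(night)
minutes_early <- (length(night) - wake_epoch) * 5
```

In this toy night the only light-sleep epoch inside the window is the seventh, so the alarm would fire there, 15 minutes before the set time; a night spent entirely in deep or REM sleep falls through to the set alarm.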

Replacing headband

Around 2011-05-15, I gave up on the original headband (it was getting too dirty to get good readings) and decided to rip it apart to see what it was made of, and to order a new set of three for $35 (which seems reasonable given the expensive material that the contacts are made of: silver fabric); they then cost $50. A little googling found me a coupon, FREESHIP, but apparently it only applied to the Zeo itself and so the pads were actually $40, or ~$13 apiece. I won’t say that buying replacement headbands semi-annually is something that thrills me, but $20 a year for sleep data is a small sum. Certainly it’s more cost-effective than most of the nootropics I have used. (Full disclosure: 9 months after starting this page, Zeo offered me a free set of headbands. I used them and when the news broke about Zeo going out of business, I bought another set.)

The old headband, with electrical tape residue

The disposable headband with the cloth covering removed

Said headband with plastic removed; notice discoloration of metal despite cleaning

The reverse side

The new headband’s wrapper

The new headband

In the future, I might try to make my own; eok.gnah claims that buying the silver fabric is cheaper than ordering from Zeo, marciot reports success in making headbands, and it seems one can even hook up other sensors to the headband. Another alternative, since the Zeo headband is a one-electrode EEG headset, is to take an approach similar to the EEG people and occasionally add small dabs of conductive paste, since fairly large quantities are cheap (eg. 12oz for $30). Zeo Inc was experimenting with disposable adhesive-gel ECG electrodes with offset press-stud connections, but they never entered wide use before it shut down.

One prob­lem with the sen­sor mounted on the head­band is that the lithium bat­tery inside it can stop hold­ing a charge. The cas­ing is extremely diffi­cult to open with­out dam­ag­ing the cir­cuitry or con­nec­tions, and the bat­tery inside is sol­dered to the cir­cuit board:

An opened Zeo head­band sen­sor

Once safely opened, the bat­tery can be replaced by another one of sim­i­lar size. For details, see the Quan­ti­fied Self forum thread.


Melatonin

Before writing my melatonin article, I had used melatonin regularly for 6+ years, ever since I discovered (somewhen in high school or college) that it was useful for enforcing bedtimes and seemed to improve sleep quality; when I posted my writeup to LessWrong, people were naturally a little skeptical of my specific claim that it improved the quality of my sleep such that I could reduce scheduled time by an hour or so. Now that I had a Zeo, wouldn’t it be a good idea to see whether it did anything, lo these many years later?

The fol­low­ing sec­tion rep­re­sents 5 or 6 months of data (raw CSV data; guide to Zeo CSV). My basic dosage was 1.5mg of mela­tonin taken 0-30 min­utes before going to sleep.


Deep sleep and ‘time in wake’ were both appar­ently unaffect­ed; ‘time in wake’ appar­ently had too small a sam­ple to draw much con­clu­sion:

Melatonin, time in deep over five months

Sur­pris­ing­ly, total REM sleep fell:

Mela­ton­in, time in REM

While the raw ZQ falls, the regression takes into account the correlated variables and indicates that this is something of an artifact:

Mela­ton­in, ZQ

REM’s aver­age fell by 29 min­utes, deep sleep fell by 1 min­ute, but total sleep fell by 54 min­utes; this implies that light sleep fell by 24 min­utes. (The aver­ages were 254.2 & 233.3) I am not sure what to make of this. While my orig­i­nal heuris­tic of a one hour reduc­tion turns out to be sur­pris­ingly accu­rate, I had expected light and deep sleep to take most of the time hit. Do I get enough REM sleep? I don’t know how I would answer that.
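The implied change in light sleep is just the remainder after subtracting the REM and deep-sleep changes from the total. A quick check in R, using only the figures quoted above (the variable names are mine):

```r
# Changes in average minutes on melatonin days, as quoted in the text.
total_change <- -54
rem_change   <- -29
deep_change  <- -1

# Total sleep = REM + deep + light, so the light-sleep change is what is left over.
light_change <- total_change - rem_change - deep_change
light_change   # -24
```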

I did feel fine on the days after melatonin use, but I didn’t track it very systematically. The best I have is the ‘morning feel’ parameter, which the Zeo asks you for on waking up; in practice I entered the values as: a ‘2’ means I woke feeling poor or unrested, ‘3’ was fine or mediocre, and ‘4’ was feeling good. When we graph the average of morning feel against melatonin use or non-use, we find that melatonin was noticeably better (3.17 with vs 2.95 without):

Mela­ton­in, Morn­ing Feel

Graph­ing some more of the raw data:

Mela­ton­in, Total Time Asleep (To­tal Z)
Mela­ton­in, Times Woken per Night

Unfortunately, during this period, I didn’t regularly do my n-backing either, so there’d be little point trying to graph that. What I spent a lot of my free time doing was editing Gwern.net, so it might be worth looking at whether nights on melatonin correspond to increased edits the next day. In this graph of edits, the red dots are days without melatonin and the green are days with melatonin; I don’t see any clear trend, although it’s worth noting almost all of the very busy days were melatonin days:

Days ver­sus # of edits ver­sus mela­tonin on/off

Melatonin analysis

The data is very noisy (especially towards the end, perhaps as the headband got dirty) and the response variables are intercorrelated, which makes interpretation difficult, but hopefully the overall conclusions are not entirely untrustworthy. Let’s look at some averages. Zeo’s website lets you enter in a 3-valued variable and then graph the average day for each value against a particular recorded property like ZQ or total length of REM sleep. I defined one dummy variable, and decided that a ‘0’ would correspond to not using melatonin, ‘1’ would correspond to using it, and ‘2’ would correspond to using a double-dose or more (on the rare occasions I felt I needed sleep insurance).

The following additional p-value analysis[5] is done by importing the CSV into R; given all the issues with self-experimentation (these melatonin days weren’t even blinded), the p-values should be treated as gross guesses, where <0.01 indicates I should take it seriously, <0.05 is pretty good, <0.10 means I shouldn’t sweat it, and anything bigger than 0.20 is, at most, interesting while >0.5 means ignore it; we’ll also look at correcting for multiple comparisons[6], for the heck of it. A mnemonic: p-values are about whether the effect exists, and d-values are whether we care. For a visualization of effect sizes, see “Windowpane as a Jar of Marbles”.

The analy­sis ses­sion in the R inter­preter:

# Read in data w/ variable names in header; uninteresting columns deleted in OpenOffice.org
zeo <- read.csv("https://www.gwern.net/docs/zeo/2011-zeo-melatonin.csv")

# "Melatonin" was formerly "SSCF 10";
# I also edited the CSV to convert all '3' to '1' (& so a binary)

R> l <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM,
                 Time.in.Deep, Awakenings, Morning.Feel, Time.in.Light)
            ~ Melatonin, data=zeo)
R> summary(manova(l))
#           Df Pillai approx F num Df den Df Pr(>F)
# Melatonin    1  0.102    0.717      9     57   0.69
# Residuals 65
R> summary(l)
# Response ZQ :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    83.52       4.13   20.21   <2e-16
# Melatonin      2.43       4.99    0.49     0.63
# Response Total.Z :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   452.38      22.86   19.79   <2e-16
# Melatonin       9.68      27.59    0.35     0.73
# Response Time.to.Z :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    19.48       2.59    7.52  2.1e-10
# Melatonin      -5.04       3.13   -1.61     0.11
# Response Time.in.Wake :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    7.095      1.521    4.66  1.6e-05
# Melatonin     -0.247      1.836   -0.13     0.89
# Response Time.in.REM :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   144.62       9.38   15.41   <2e-16
# Melatonin      -3.73      11.32   -0.33     0.74
# Response Time.in.Deep :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    54.33       3.26   16.68   <2e-16
# Melatonin       5.56       3.93    1.41     0.16
# Response Awakenings :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    3.095      0.524    5.90  1.4e-07
# Melatonin     -0.182      0.633   -0.29     0.77
# Response Morning.Feel :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    2.952      0.142   20.78   <2e-16
# Melatonin      0.222      0.171    1.29      0.2
# Response Time.in.Light :
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   253.86      13.59   18.68   <2e-16
# Melatonin       7.93      16.40    0.48     0.63

The MANOVA indicates no statistically-significant difference between the groups of days, taking all variables into account (p = 0.69). To summarize the regression:

Variable        Correlate/Effect   p-value   Coefficient’s sign is…
Time.to.Z       -5.04              0.11      better
Awakenings      -0.18              0.77      better
Time.in.Wake    -0.25              0.89      better
Time.in.Deep     5.56              0.16      better
Time.in.Light    7.93              0.63      worse
Time.in.REM     -3.73              0.74      worse
Total.Z          9.68              0.73      better
ZQ               2.43              0.63      better
Morning.Feel     0.22              0.20      better
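Because Melatonin is a 0/1 dummy variable, each intercept in the regression output is the average on non-melatonin days, and intercept plus coefficient is the average on melatonin days. A small sketch that recovers both sets of averages from the printed estimates (the vector and column names are mine, not from the original analysis):

```r
# Estimates copied from the regression output above.
intercept <- c(ZQ = 83.52, Total.Z = 452.38, Time.to.Z = 19.48, Time.in.Wake = 7.095,
               Time.in.REM = 144.62, Time.in.Deep = 54.33, Awakenings = 3.095,
               Morning.Feel = 2.952, Time.in.Light = 253.86)
effect    <- c(ZQ = 2.43, Total.Z = 9.68, Time.to.Z = -5.04, Time.in.Wake = -0.247,
               Time.in.REM = -3.73, Time.in.Deep = 5.56, Awakenings = -0.182,
               Morning.Feel = 0.222, Time.in.Light = 7.93)

# With a 0/1 predictor: mean(off) = intercept, mean(on) = intercept + coefficient.
group_means <- data.frame(off = intercept, on = intercept + effect)
round(group_means, 2)
```

The Morning.Feel row comes out as 2.95 off and 3.17 on, matching the morning-feel averages quoted earlier.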

Part of the problem is that too many days wound up being useless, and each lost day costs us information and reduces our true sample size. (None of the metrics are strong enough to survive multiple-comparison correction[7], sadly.)
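One way to check the multiple-comparisons claim is R's built-in `p.adjust`, applied to the nine p-values from the summary. The choice of the Holm and Benjamini-Hochberg methods here is an assumption for illustration; the text does not say which correction was used.

```r
# p-values copied from the regression summary above.
p <- c(Time.to.Z = 0.11, Awakenings = 0.77, Time.in.Wake = 0.89, Time.in.Deep = 0.16,
       Time.in.Light = 0.63, Time.in.REM = 0.74, Total.Z = 0.73, ZQ = 0.63,
       Morning.Feel = 0.20)

# Holm controls the family-wise error rate; BH controls the false discovery rate.
adjusted <- cbind(raw  = p,
                  holm = p.adjust(p, method = "holm"),
                  BH   = p.adjust(p, method = "BH"))
round(adjusted, 2)

# Does anything fall below 0.05 after either correction?
any(adjusted[, c("holm", "BH")] < 0.05)   # FALSE
```

Even the strongest result (Time.to.Z, raw p = 0.11) is pushed to 0.99 under Holm, so nothing here is close to surviving.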

And also unfortunately, this data series doesn’t distinguish between addiction to melatonin and benefits from melatonin: perhaps the 3.2 is my ‘normal’ sleep quality and the 2.9 comes from a ‘withdrawal’ of sorts. The research on melatonin doesn’t indicate any addiction effect, but who knows?

If I were to run further experiments, I would definitely run them double-blind, and maybe even test <1.5mg doses as well to see if I’ve been taking too much; 3mg turned out to be excessive, and there are one or two studies indicating that <1mg doses are best for normal people. I wound up using 1.5mg doses. (There could be 3 conditions: placebo, 0.75mg, and 1.5mg. For looking at melatonin’s effect in general, the data on the 2 dosages could be combined. Melatonin has a short half-life, so probably there would be no point in randomizing in blocks of more than 2-3 days[8]: we can randomize each day separately and assume that days are independent of each other.)
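A per-day randomization like the one proposed could be generated ahead of time, so that someone else can prepare the pills and keep the experimenter blind. A minimal sketch, assuming a 60-day experiment and equal assignment probability for the three arms (the length, the seed, and the column names are my assumptions, not part of the plan above):

```r
# Hypothetical schedule: each day is independently assigned one of three arms.
set.seed(2011)                      # fixed seed only so the schedule is reproducible
arms   <- c("placebo", "0.75mg", "1.5mg")
n_days <- 60                        # assumed experiment length
schedule <- data.frame(day  = seq_len(n_days),
                       dose = sample(arms, n_days, replace = TRUE))

table(schedule$dose)                # how many days each arm received

# For the "does melatonin do anything at all" question, pool the two active doses.
schedule$melatonin <- as.integer(schedule$dose != "placebo")
```

Independent per-day draws will not give exactly equal group sizes; if balance matters, shuffling a fixed vector of 20 days per arm would be the obvious variation.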

Worth com­par­ing are Jayson Viris­si­mo’s pre­lim­i­nary results:

Accord­ing to the pre­lim­i­nary [Zeo] data, while on mela­ton­in, I seemed to get more total sleep, more REM sleep, less deep sleep, and wake up about the same num­ber of times each night. Because this isn’t enough data to be very con­fi­dent in the results, I plan on con­tin­u­ing this exper­i­ment for at least another 4 months (2 on and 2 off of mela­ton­in) and will ana­lyze the results for the [sta­tis­ti­cal] sig­nifi­cance and mag­ni­tude of the effects (if there really are any) while throw­ing out the out­liers (since my sleep sched­ule is so errat­ic).

Value of Information (VoI)

See also the dis­cus­sion as applied to order­ing modafinil and test­ing nootrop­ics

We all know it’s possible to spend more time figuring out how to “save time” on a task than we would actually save, like rearranging books on a shelf or cleaning up in the name of efficiency (xkcd even has a cute chart listing the break-even points for various possibilities, “Is It Worth The Time?”), and similarly, it’s possible to spend more money trying to “save money” than one would actually save; less appreciated is that the same thing is also possible to do with gaining information.

The value of an exper­i­ment is the infor­ma­tion it pro­duces. What is the value of infor­ma­tion? Well, we can take the eco­nomic tack and say value of infor­ma­tion is the value of the deci­sions it changes. (Would you pay for a weather fore­cast about some­where you are not going to? No. Or a weather fore­cast about your trip where you have to make that trip, come hell or high water? Only to the extent you can make prepa­ra­tions like bring­ing an umbrel­la.)

For a risk-neutral person, the value of perfect information is “value of decision situation with perfect information” minus “value of current decision situation”. (Imperfect information is just weakened perfect information: if your information was not 100% reliable but 99% reliable, well, that’s still worth a lot.)

The decision is the binary take or not take. Melatonin costs ~$10 a year (if you buy in bulk during sales, as I did). Suppose I had perfect information it worked; I would not change anything, so the value is $0. Suppose I had perfect information it did not work; then I would stop using it, saving me $10 a year in perpetuity, which has a net present value[9] (at 5% discounting) of $205. So the best-case value of perfect information (the case in which it changes my actions) is $205, because it would save me from blowing $10 every year for the rest of my life. My melatonin experiment is not perfect since I didn’t randomize or double-blind it, but I had a lot of data and it was well powered, with something like a >90% chance of detecting the decent effect size I expected, so the imperfection is just a loss of 10%, down to $184. From my previous research and personal use over years, I am highly confident it works: say, 80%[10].
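The $205 figure is what one gets by treating the $10 a year as a perpetuity discounted continuously at 5%, that is, dividing the annual amount by ln(1.05). That reading is my reconstruction from the numbers, since the formula itself is not shown here. A quick check:

```r
annual_cost   <- 10      # dollars per year spent on melatonin
discount_rate <- 0.05    # 5% annual discounting

# Net present value of paying annual_cost forever, with continuous discounting
# (assumed formula: amount / ln(1 + rate)).
npv_perfect <- annual_cost / log(1 + discount_rate)
round(npv_perfect)       # 205

# The experiment is taken to capture ~90% of the value of perfect information.
npv_experiment <- 0.90 * npv_perfect
round(npv_experiment)    # 184
```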

If the experiment says melatonin works, the information is useless to me since I continue using melatonin, and if the experiment says it doesn’t, then let’s assume I decide to quit melatonin[11] and then save $10 a year or $184 total. What’s the expected value of obtaining the information, given these two outcomes? (0.80 × $0) + (0.20 × $184) = $36.8. Or another way, redoing the net present value: (10 / ln 1.05) × 0.90 × 0.20 ≈ $36.9, the same up to rounding. At a minimum-wage opportunity cost of $7 an hour, $36.8 is worth 5.25 hours of my time. I spent much time on screenshots, summarizing, and analysis, and I’d guess I spent closer to 10-15 hours all told.
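The expected-value calculation can be written out as a probability-weighted sum over the two outcomes. The sketch below uses only figures from the text (an 80% prior that melatonin works, $184 saved if it does not, $7 an hour); the variable names are mine.

```r
p_works     <- 0.80   # prior confidence that melatonin works
value_works <- 0      # nothing changes if the experiment confirms it
value_fails <- 184    # discounted savings from quitting if it does not work

# Expected value of running the experiment.
expected_value <- p_works * value_works + (1 - p_works) * value_fails
expected_value        # 36.8

# Break-even time budget at a minimum-wage opportunity cost.
hourly_wage      <- 7
break_even_hours <- expected_value / hourly_wage
break_even_hours      # a bit over 5 hours, versus the 10-15 hours actually spent
```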

This worked-out example demonstrates that when a substance is cheap and you are highly confident it works, a long costly experiment may not be worth it. (Of course, I would have done it anyway due to factors not included in the calculation: to try out my Zeo, learn a bit about sleep experimentation, do something cool, and have something neat to show everyone.)

Melatonin data

The data looked much better than the first night, except for a big 2-hour gap where I vaguely recall the sensor headband having slipped off. (I don’t think it was because it was uncomfortable but due to shifting positions or something.) Judging from the cycle of sleep phases, I think I lost data on a REM peak. The REM peaks interest me because it’s a standard theory of polyphasic sleeping that thriving on 2 or 3 hours of sleep a day is possible because REM (and deep sleep) are the only phases that truly matter, and REM can dominate sleep time through REM rebound and training.

Sec­ond night

Besides that, I noticed that time to sleep was 19 min­utes that night. I also had for­got­ten to take my mela­tonin. Hmm…

Since I’ve begun this inadvertent experiment, I’ll try continuing it, alternating days of melatonin usage. I claim in my melatonin article that usage seems to save about 1 hour of sleep/time, but there are several possible avenues for that saving. One could be quicker to fall asleep; one could awake fewer times; and one could have a greater percentage of REM or deep sleep, reducing light sleep. (Light sleep doesn’t seem very useful; I sometimes feel worse after light sleep.)

Dur­ing the after­noon, I took a quick nap. I’m not a very good nap­per, it seem­s—only the first 5 min­utes reg­is­tered as even light sleep.

A dose of mela­tonin (1.5mg) and off to bed a bit ear­ly. I’m a lit­tle more impressed with the smart alarm; since I’m hard-of-hear­ing and audio alarms rarely if ever work, I usu­ally use a Sonic Alert vibrat­ing alarm clock. But in the morn­ing I woke up within a minute of the alarm, despite the lack of vibra­tion or flash­ing lights. (The chart does­n’t reflect this, but as a pre­vi­ous link says, dis­tin­guish­ing wak­ing from sleep­ing can be diffi­cult and the tran­si­tions are the least trust­wor­thy parts of the data.)

The data was espe­cially good today, with no big gaps:

2010-12-27 ZQ sleep logs

You can see an impressively regular sleep cycle, cycling between REM and light sleep. What’s disturbing is the relative lack of deep sleep: down 4-5% (and there wasn’t a lot to begin with). I suspect that the lack of deep sleep indicates I wasn’t sleeping very well, but not badly enough to wake up, and this is probably due either to light from the Zeo itself (I only figured out how to turn it off a few days later) or my lack of regular blankets and use of a sleeping bag. But the awakenings around 4-6 AM and on other days have made me suspicious that one of the cats is bothering me around then and I’m just forgetting it as I fall asleep.

The next night is another no-me­la­tonin night. This time it took 79 min­utes to fall asleep. Very bad, but far from unprece­dent­ed; this sort of thing is why I was inter­ested in mela­tonin in the first place. Deep sleep is again lim­ited in dis­per­sion, with a block at the begin­ning and end, but mostly a reg­u­lar cycle between light and REM:

2010-12-28 ZQ sleep logs

Mela­tonin night, and 32 min­utes to sleep. (I’m start­ing to notice a trend here.) Another fairly reg­u­lar cycle of phas­es, with some deep sleep at the begin­ning and end; 32 min­utes to fall asleep isn’t great but much bet­ter than 79 min­utes.

2010-12-29 ZQ sleep logs

Per­haps I should try a bipha­sic sched­ule where I sleep for an hour at the begin­ning and end? That’d seem to pick up most of my deep sleep, and REM would hope­fully take care of itself with REM rebound. Need to sum my aver­age REM & deep sleep times (that sum seems to differ quite a bit, eg one fel­low needs 4+ hours. My own need seems to be sim­i­lar) so I don’t try to pick a sched­ule doomed to fail.

Another night, no mela­tonin. Time to sleep, just 18 min­utes and the ZQ sets a new record even though my cat Stormy woke me up in the morn­ing12:

2010-12-30 ZQ sleep logs

I personally blame this on being exhausted from 10 hours working on my transcription of The Notenki Memoirs. But a data point is a data point.

I spend New Year’s Eve pretty much fin­ish­ing The Notenki Mem­oirs (tran­scrib­ing the last of the biogra­phies, the round-table dis­cus­sion, and edit­ing the images for inclu­sion), which exhausts me a fair bit as well; the cham­pagne does­n’t help, but between that and the mela­ton­in, I fall asleep in a record-set­ting 7 min­utes. Unfor­tu­nate­ly, the head­band came off some­where around 5 AM:

2010-12-31 ZQ sleep logs

A cat? Wak­ing up? Dun­no.

Another rel­a­tively quick falling asleep night at 20 min­utes. Which then gets screwed up as I sim­ply can’t stay asleep and then the cat begins both­er­ing the heck out of me in the early morn­ing:

2011-01-01 ZQ sleep logs

Mela­tonin night, which sub­jec­tively did­n’t go too bad­ly; 20 min­utes to sleep. But lots of wake time (long enough wakes that I remem­bered them) and 2 or 3 hours not recorded (prob­a­bly from adjust­ing my scarf and the head­band):

2011-01-03 ZQ sleep logs

Acci­den­tally did another mela­tonin night (thought Mon­day was a no-me­la­tonin night). Very good sleep­—set records for REM espe­cially towards the late morn­ing which is curi­ous. (The dreams were also very curi­ous. I was an Evan­ge­lion char­ac­ter (Ka­woru) tasked with rid­ing that kind of car­ni­val-like ride that goes up and drops straight down.) Also another quick falling asleep:

2011-01-04 ZQ sleep logs

Rather than 3 mela­tonin nights in a row, I skipped mela­tonin this night (and thus will have it the next one). Per­haps because I went to sleep so very late, and despite some awak­en­ings, this was a record-set­ting night for ZQ and TODO deep sleep or REM sleep? :

2011-01-05 ZQ sleep logs

I also switched the alarm sounds 2 or 3 days ago to ‘for­est’ sounds; they seem some­what more pleas­ant than the beep­ing musi­cal tones. The next night, data is all screwed up. What hap­pened there? It did­n’t even record the start of the night, though it seemed to be active and work­ing when I checked right before going to sleep. Odd.

Next 2 days aren’t very inter­est­ing; first is no-me­la­ton­in, sec­ond is mela­ton­in:

2011-01-07 ZQ sleep logs
2011-01-08 ZQ sleep logs

One of my chief Zeo com­plaints was the bright blue-white LCD screen. I had resorted to turn­ing the base sta­tion over and sur­round­ing it with socks to block the light. Then I looked closer at the labels for the but­tons and learned that the up-down but­tons changed the bright­ness and the LCD screen could be turned off. And I had read the part of the man­ual that explained that. D’oh!

Off (for­got)

Off, but no data on the 22nd. No idea what the prob­lem is—the head­set seems to have been on all night.

On with a dou­ble-dose of mela­tonin because I was going to bed ear­ly; as you can see, did­n’t work:

2011-01-23 ZQ sleep logs

Off, no data on the 24th. On, no data on the 25th. I don’t know what went wrong on these two nights.


The 27th (on for mela­ton­in) yielded no data because, frus­trat­ing­ly, the Zeo was print­ing a ‘write-pro­tected’ error on its screen; I assumed it had some­thing to do with upload­ing ear­lier that day—per­haps I had yanked it out too quick­ly—and put it back in the com­put­er, unmounted and went to eject it. But the mem­ory card splin­tered on me! It was stuck and the end was splin­ter­ing and lit­tle nee­dles of plas­tic break­ing off. I could­n’t get it out and gave up. The next day (I slept rea­son­ably well) I went back with a pair of needle-nose pli­ers. I had a backup mem­ory card. After much trial and error, I fig­ured out the card had to be FAT-formatted and have a direc­tory struc­ture that looked like ZEO/ZEOSLEEP.DAT. So that’s that.

  • Off
  • On
  • 30: on
  • 31: off
  • 1: on
  • 2: off
  • 3: on

Unfor­tu­nate­ly, this night con­tin­ues a long run of no data. Look­ing back, it does­n’t seem to have been the fault of the new mem­ory card, since some nights did have enough data for the Zeo web­site to gen­er­ate graphs. I sus­pect that the issue is the pad get­ting dirty after more than a month of use. I hope so, any­way. I’ll look around for rub­bing alco­hol to clean it. That night ini­tially starts bad­ly—the rub­bing alco­hol seemed to do noth­ing. After some mess­ing around, I fig­ure out that the head­band seems to have loos­ened over the weeks and so while the sen­sor felt rea­son­ably snug and tight and was trans­mit­ting, it was­n’t snug enough. I tighten it con­sid­er­ably and actu­ally get some decent data:

  • Off
  • 5: on
  • Off
  • 7: on
  • 8: off
  • 9: on
  • Off
  • 11: on?

The previous night, I began paying closer attention to when it was and was not reading me (usually the latter). Pushing hard on it made it eventually read me, but tightening the headband hadn’t helped the previous several nights. Pushing and not pushing, I noticed a subtle click. Apparently the band part with the metal sensor pad connects to the wireless unit by 3 little black metal nubs; 2 were solidly in place, but the third was completely loose. Suspicious, I try pulling on the band without pushing on the wireless unit, leaving the loose connection loose. Sure enough, no connection was registered. I push on the unit while loosening the headband, and the connection worked. I felt I finally had solved it. It wasn’t a loose headband or me pulling it off at night or oils on the metal sensors or a problem with the SD card. I was too tired to fix it when I had the realization, but resolved the next morning to fix it by wrapping a rubber band around the wireless unit and band. This turned out to not interfere with recharging, and when I took a short nap, the data looked fine and gapless. So! The long data drought is hopefully over.

2011-02-11 ZQ sleep logs

On the 15th of Feb­ru­ary, I had a very early flight to San Fran­cis­co. That night and every night from then on, I was using mela­ton­in, so we’ll just include all the nights for which any sen­si­ble data was gath­ered. Oddly enough, the data and ZQs seem bad (as one would expect from sleep­ing on a couch), but I wake up feel­ing fairly refreshed. By this point we have the idea how the sleep charts work, so I will sim­ply link them rather than dis­play them.

Then I took a long break on updating this page; when I had a month or two of data, I uploaded to Zeo again, and buckled down and figured out how to have ImageMagick crop the pages. The shell script (for screenshots of my browser, YMMV) is: for file in *.png; do mogrify +repage -crop 700x350+350+285 "$file"; done

Gen­eral obser­va­tions: almost all these nights were on mela­tonin. Not far into this peri­od, I real­ized that the lit­tle rub­ber band was not work­ing, and I hauled out my red and tight­ened it but good; and again, you can see the tran­si­tion from crappy record­ings to much cleaner record­ings. The rest of Feb­ru­ary:



April 4th was one of the few nights that I was not on mela­tonin dur­ing this times­pan; I occa­sion­ally take a week­end and try to drop all sup­ple­ments and nootrop­ics besides the mul­ti­vi­t­a­mins and fish oil, which includes my mela­tonin pills. This night (or more pre­cise­ly, that Sun­day evening) I also stayed up late work­ing on my com­put­er, get­ting in to bed at 12:25 AM. You can see how well that worked out. Dur­ing the 2 AM wake peri­od, it occurred to me that I did­n’t espe­cially want to sac­ri­fice a day to show that com­puter work can make for bad sleep (which I already have plenty of cita­tions for in the Mela­tonin essay), and I gave in, tak­ing a pill. That worked out much bet­ter, with a rel­a­tively nor­mal num­ber of wak­ings after 2 AM and a rea­son­able amount of deep & REM sleep.


One-legged standing

Seth Roberts found that for him, stand­ing a lot helped him sleep. This seems very plau­si­ble to me—­more fatigue to repair, closer to ances­tral con­di­tions of con­stant walk­ing—and tal­lied with my own expe­ri­ence. (One sum­mer I worked at a sum­mer camp, where I spent the entire day on my feet; I always slept very well though my bunk was uncom­fort­able.) He also found that stress­ing his legs by stand­ing on one at a time for a few min­utes also helped him sleep. That did not seem as plau­si­ble to me. But still worth try­ing: stand­ing is free, and if it does noth­ing, at least I got a lit­tle more exer­cise.

Roberts tried a fairly com­pli­cated ran­dom­ized rou­tine. I am sim­ply alter­nat­ing days as with mela­tonin (note that I have resumed tak­ing mela­tonin every day). My stand­ing method is also sim­ple; for 5 min­utes, I stand on one leg, rise up onto the ball of my foot (be­cause my calves are in good shape), and then sink down a foot or two and hold it until the burn­ing sen­sa­tion in my thigh forces me to switch to the other leg. (I seem to alter­nate every minute.) I walk my dog most every day, so the effect is not as sim­ple as ‘some mod­er­ate exer­cise that day’; in the next exper­i­ment, I might try 5 min­utes of dumb­bell bicep curls instead.

One-legged standing analysis

The ini­tial results were promis­ing. Of the first 5 days, 3 are ‘on’ and 2 are off; all 3 on-days had higher ZQs than the 2 off-days. Unfor­tu­nate­ly, the full time series did not seem to bear this out. Look­ing at the ~70 recorded days between 2011-06-11 and 2011-08-27 (raw CSV data), the raw uncor­rected aver­ages looked like this (as before, the ‘3’ means the inter­ven­tion was used, ‘0’ that it was not):

[Figures not reproduced: standing vs non-standing comparisons of ZQ, morning feel rating, total sleep time, total deep sleep time, total REM sleep time, number of times woken, and total time awake.]

R analy­sis, using mul­ti­vari­ate lin­ear regres­sion13 turns in a non-sig­nifi­cant value for one-legged­ness in gen­eral (p = 0.23); by vari­able:

Variable        Effect  p-value  Coefficient’s sign is…
ZQ               -1.24     0.16  worse
Total.Z          -4.09     0.37  worse
Time.to.Z         0.47     0.51  worse
Time.in.Wake     -0.37     0.80  better
Time.in.REM      -5.33     0.02  worse
Time.in.Light     2.76     0.38  worse
Time.in.Deep     -1.56     0.10  worse
Awakenings       -0.05     0.79  better
Morning.Feel     -0.05     0.32  worse

No p-values survived multiple-comparison correction14.
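
As a rough check, the nine per-variable p-values from the table can be passed through base R’s p.adjust. This is a sketch only: the text here does not say which correction was used, so the three methods shown (Bonferroni, Holm, Benjamini-Hochberg) are assumptions, and the p-values are the rounded ones printed in the table.

```r
# Rounded p-values copied from the table above
p_values <- c(ZQ = 0.16, Total.Z = 0.37, Time.to.Z = 0.51,
              Time.in.Wake = 0.80, Time.in.REM = 0.02,
              Time.in.Light = 0.38, Time.in.Deep = 0.10,
              Awakenings = 0.79, Morning.Feel = 0.32)

# Three common corrections (which one the analysis used is an assumption)
bonferroni <- p.adjust(p_values, method = "bonferroni")
holm       <- p.adjust(p_values, method = "holm")
bh         <- p.adjust(p_values, method = "BH")

# Variables still below 0.05 under even the most lenient of the three
survivors <- names(p_values)[pmin(bonferroni, holm, bh) < 0.05]

print(round(cbind(raw = p_values, bonferroni, holm, bh), 2))
print(survivors)
```

Even the smallest raw p-value (Time.in.REM, 0.02) is far from 0.05 once nine comparisons are accounted for, so the conclusion does not depend on the choice of method.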

While I did not replicate Roberts’s setup exactly in the interest of time and ease, and it was not blinded, I tried to compensate with an unusually large sample: 69 nights of data. This was a mixed experiment: there seems to be a negative effect, but none of the changes seem to have large effect sizes or strong p-values.

The one-legged standing was not done to the exclusion of melatonin, which I had used most every night. I thought I might go on using one-legged standing, perhaps skipping it on nights when I am up particularly late or lack the willpower, but I’ve abandoned it because it is a lot of work to use and the result looked weak. In the future, I should look into whether walks before bedtime help.

Vitamin D

Vit­a­min D is a hor­mone endoge­nously cre­ated by expo­sure to sun­light; due to his­tor­i­cally low out­doors activ­ity lev­els, it has become a pop­u­lar sup­ple­ment and I use it. Some anec­dotes sug­gest that vit­a­min D may have cir­ca­dian and zeit­ge­ber effects due to its orig­in, and is harm­ful to sleep when taken at night. I ran a blinded ran­dom­ized self­-ex­per­i­ment on tak­ing vit­a­min D pills at bed­time. The vit­a­min D dam­aged my sleep and espe­cially how rested I felt upon wak­en­ing, sug­gest­ing vit­a­min D did have a stim­u­lat­ing effect which obstructed sleep. I con­ducted a fol­lowup blinded ran­dom­ized self­-ex­per­i­ment on the log­i­cal next ques­tion: if vit­a­min D is a day­time cue, then would vit­a­min D taken in the morn­ing show some ben­e­fi­cial effects? The results were incon­clu­sive (but slightly in favor of ben­e­fit­s). Given the asym­me­try, I sug­gest that vit­a­min D sup­ple­ments should be taken only in the morn­ing.

Main arti­cle: .


Potassium

Potassium and magnesium are minerals that many Americans are deficient in. I tried using potassium citrate and immediately noticed difficulty sleeping. A short randomized (but not blinded) self-experiment of ~4g potassium taken throughout the day confirmed large negative effects on my sleep. A longer followup randomized and blinded self-experiment used standardized doses taken once a day early in the morning, and also found some harm to sleep, and I discontinued potassium use entirely.

Main arti­cle: .

LSD microdosing

In the middle of the five-fold experiment, I paused part of it to run an LSD microdosing self-experiment; I included sleep metrics to check for disturbances. It did not seem to affect latency, total sleep, or awakenings, but did improve the “morning feel” (d = 0.42), though not statistically-significantly after the multiple-comparison correction. Unfortunately, given that it seemed to negatively affect more important metrics like the self-rating of mood/productivity & creativity, this is not nearly enough to begin to justify further use of LSD microdosing for me.


Alcohol

Suspicious that alcohol was delaying my sleep and worsening my sleep when I did finally go to bed, I recorded my alcohol consumption for a year. Correlating alcohol use against when I go to bed shows no interesting correlation, nor with any of the other sleep variables Zeo records, even after correcting for a shift in my sleep patterns over that year. So it would seem I was wrong.

In May 2013, I began to wonder if alcohol was damaging my sleep; I don’t drink alcohol too often and never more than a glass or two, so I don’t have any tolerance built up. I noticed that on nights when I drank some red wine or had some of my mead, it seemed to take me much longer to fall asleep and I would regularly wake up in the middle of the night. So I began noting down days on which I drank any alcohol, to see if it correlated with sleep problems (and probably then just refrain from alcohol in the evening, since I don’t care enough to run a randomized experiment).

In May 2014, I ran out of all my mead and also a gal­lon of bur­gundy wine I had bought to make beef bour­guignon with, so that marked a nat­ural close to the data col­lec­tion. I com­piled the alco­hol data along with the Zeo data in the rel­e­vant time peri­od, and looked at the key met­rics with a mul­ti­vari­ate mul­ti­ple regres­sion. The main com­plex­ity here is that I ear­lier dis­cov­ered that I had grad­u­ally shifted my sleep down and now Start.of.Night looks like a sig­moid, so to con­trol for that, I fit a sig­moid to the Date using non­lin­ear least squares, and then plugged the esti­mated val­ues in. The code, show­ing only the results for the Alcohol boolean:

library(minpack.lm) # provides nlsLM & nls.lm.control
drink <- read.csv("https://www.gwern.net/docs/zeo/2014-gwern-alcohol.csv")
summary(nlsLM(Start.of.Night ~ Alcohol + as.integer(Date) + (a / (1 + exp(-b * (as.integer(Date) - c)))),
              start = list(a = 6.15e+05, b = -1.18e-04, c = -5.15e+04),
              control=(nls.lm.control(ftol = sqrt(.Machine$double.eps)/4.9, maxfev=1024, maxiter=1024)),
              data=drink))
# Parameters:
#    Estimate Std. Error t value Pr(>|t|)
# a  5.61e+06   6.49e+09    0.00     1.00
# b -1.00e-03   2.44e-04   -4.10  4.8e-05
# c -8.26e+03   1.16e+06   -0.01     0.99
summary(lm(cbind(Start.of.Night, Time.to.Z, Time.in.Wake, Awakenings, Morning.Feel, Total.Z, Time.in.REM, Time.in.Deep) ~
                  Alcohol +
                  as.integer(Date) + I(5.61e+06 / (1 + exp(-(1.00e-03) * (as.integer(Date) - (-8.26e+03))))),
           data=drink))
# Response Start.of.Night :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                   -8.96e-01   4.75e+00   -0.19     0.85
# Response Time.to.Z :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                   -2.50e+00   1.41e+00   -1.77    0.077
# Response Time.in.Wake :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                   -2.04e+00   2.40e+00   -0.85   0.3956
# Response Awakenings :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                   -2.03e-01   2.85e-01   -0.71     0.48
# Response Morning.Feel :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                   -5.03e-02   9.16e-02   -0.55   0.5836
# Response Total.Z :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                    1.04e+01   7.89e+00    1.32     0.19
# Response Time.in.REM :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# (Intercept)                                                    7.59e+05   9.83e+05    0.77     0.44
# AlcoholTRUE                                                    1.84e+00   3.58e+00    0.51     0.61
# Response Time.in.Deep :
# Coefficients:
#                                                                Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE                                                    1.14e+00   1.41e+00    0.80     0.42

Zilch. No cor­re­la­tion is at all inter­est­ing.

So it looks like alco­hol—at least in the small quan­ti­ties I con­sume—­makes no differ­ence.
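
The two-stage approach described above (fit a sigmoid to the date, then include the fitted curve as a covariate in the regression) can be sketched on simulated data. Everything here is an assumption for illustration: the data are generated rather than taken from the alcohol CSV, the variable names are hypothetical, and base R’s nls stands in for nlsLM so the sketch needs no extra packages.

```r
set.seed(2014)
n   <- 365
day <- seq_len(n)

# Simulated data (not the author's): bedtime in minutes drifts later along a
# sigmoid, and the hypothetical 'alcohol' indicator has no true effect.
alcohol <- rbinom(n, 1, 0.2) == 1
bedtime <- 1400 + 60 / (1 + exp(-0.05 * (day - 180))) + rnorm(n, sd = 10)

# Stage 1: estimate the sigmoid trend by nonlinear least squares.
trend_fit <- nls(bedtime ~ base + a / (1 + exp(-b * (day - mid))),
                 start = list(base = 1390, a = 50, b = 0.04, mid = 170))
trend <- as.numeric(fitted(trend_fit))

# Stage 2: ordinary regression with the fitted trend as a covariate.
model <- lm(bedtime ~ alcohol + trend)
alcohol_row <- coef(summary(model))["alcoholTRUE", ]
print(round(alcohol_row, 3))
```

Because the simulated indicator has no effect, its coefficient should usually come out small relative to its standard error, which is the pattern the real analysis reports for every response.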


Bed time for better sleep

Some­one asked if I could turn up a bet­ter bed­time using their Zeo data. I accept­ed, but the sleep data comes with quite a few vari­ables and it’s not clear which vari­able is the ‘best’—for exam­ple, I don’t think much of the ZQ vari­able, so it’s not as sim­ple as regress­ing ZQ ~ Bedtime and find­ing what value of Bed­time max­i­mizes ZQ. I decided that I could try find­ing the opti­mal bed­time by two strate­gies:

  1. look for some underlying factor of good sleep using factor analysis (I’d expect maybe 2 or 3 factors: one for total sleep, one for insomnia, and maybe one for REM sleep), and maximize the good ones and minimize the bad ones, equally weighted
  2. just do a mul­ti­vari­ate regres­sion and weight each vari­able equally

So, setup:

zeo <- read.csv("https://www.gwern.net/docs/zeo/gwern-zeodata.csv")
zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
## convert "05/12/2014 06:45" to "06:45"
zeo$Start.of.Night <- sapply(strsplit(as.character(zeo$Start.of.Night), " "), function(x) { x[2] })
## convert "06:45" to 405 (minutes after midnight)
interval <- function(x) { if (!is.na(x)) { if (grepl(" s",x)) as.integer(sub(" s","",x))
                                           else { y <- unlist(strsplit(x, ":")); as.integer(y[[1]])*60 + as.integer(y[[2]]); }
                                         }
                          else NA
                        }
zeo$Start.of.Night <- sapply(zeo$Start.of.Night, interval)
## correct for the switch to new unencrypted firmware in March 2013;
## I don't know why the new firmware subtracts 15 hours
zeo[(zeo$Sleep.Date >= as.Date("2013-03-11")),]$Start.of.Night <-
    (zeo[(zeo$Sleep.Date >= as.Date("2013-03-11")),]$Start.of.Night + 900) %% (24*60)

## after midnight (24*60=1440), Start.of.Night wraps around to 0, which obscures any trends,
## so we'll map anything before 7AM to time+1440
zeo[zeo$Start.of.Night<420 & !is.na(zeo$Start.of.Night),]$Start.of.Night <-
    (zeo[zeo$Start.of.Night<420 & !is.na(zeo$Start.of.Night),]$Start.of.Night + (24*60))

## keep only the variables we're interested in:
zeo <- zeo[,c(2:10, 23)]
## define naps or nights with bad data as total sleep time under ~1.5 hours (100m) & delete
zeo <- zeo[zeo$Total.Z>100,]
write.csv(zeo, file="bedtime-factoranalysis.csv", row.names=FALSE)
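
The time handling in the setup code can be checked in isolation. The following is a small standalone sketch, not the author’s code: the helper names to_minutes and unwrap_after_midnight are hypothetical, and the sample bedtimes are made up for illustration.

```r
# Convert an "HH:MM" string to minutes after midnight (NA stays NA)
to_minutes <- function(x) {
  if (is.na(x)) return(NA_integer_)
  parts <- as.integer(strsplit(x, ":", fixed = TRUE)[[1]])
  parts[1] * 60L + parts[2]
}

# Map times before the cutoff (7 AM by default) onto the previous evening's
# scale by adding 24 hours, so that 00:25 sorts after 23:21
unwrap_after_midnight <- function(minutes, cutoff = 7 * 60) {
  ifelse(!is.na(minutes) & minutes < cutoff, minutes + 24 * 60, minutes)
}

bedtimes  <- c("22:48", "23:21", "00:25", "06:45", NA)
minutes   <- vapply(bedtimes, to_minutes, integer(1), USE.NAMES = FALSE)
unwrapped <- unwrap_after_midnight(minutes)

print(data.frame(bedtimes, minutes, unwrapped))
```

Without the unwrapping step, a bedtime just after midnight would look like the earliest bedtime of all, which would distort any regression on Start.of.Night.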

Let’s begin with a sim­ple fac­tor analy­sis, look­ing for a ‘good sleep’ fac­tor. Zeo Inc appar­ently was try­ing for this with the ZQ vari­able but I’ve always been sus­pi­cious of it because it does­n’t seem to track Morning.Feel or Awakenings very well but sim­ply be how long you slept (Total.Z):

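zeo <- read.csv("https://www.gwern.net/docs/zeo/2014-07-26-bedtime-factoranalysis.csv")
library(psych) # provides nfactors & fa
## compare candidate numbers of factors (output below as printed by psych's nfactors)
nfactors(zeo)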
# VSS complexity 1 achieves a maximimum of 0.8  with  6  factors
# VSS complexity 2 achieves a maximimum of 0.94  with  6  factors
# The Velicer MAP achieves a minimum of 0.09  with  1  factors
# Empirical BIC achieves a minimum of  466.5  with  5  factors
# Sample Size adjusted BIC achieves a minimum of  39396  with  5  factors
# Statistics by number of factors
#    vss1 vss2   map dof chisq prob sqresid  fit RMSEA   BIC SABIC complex  eChisq    eRMS eCRMS eBIC
# 1  0.71 0.00 0.090  35 41394    0  6.4648 0.71  0.99 41145 41256     1.0 1.8e+03 0.12926  0.15 1577
# 2  0.77 0.85 0.099  26 40264    0  3.3366 0.85  1.13 40079 40162     1.2 9.4e+02 0.09275  0.12  755
# 3  0.78 0.89 0.139  18 40323    0  2.1333 0.91  1.36 40195 40253     1.4 9.0e+02 0.09075  0.14  772
# 4  0.75 0.89 0.216  11 39886    0  1.3401 0.94  1.73 39808 39843     1.5 8.0e+02 0.08560  0.17  722
# 5  0.78 0.89 0.280   5 39415    0  0.7267 0.97  2.56 39380 39396     1.4 5.0e+02 0.06779  0.20  467
# 6  0.80 0.94 0.450   0 38640   NA  0.3194 0.99    NA    NA    NA     1.2 2.2e+02 0.04479    NA   NA
# 7  0.80 0.92 0.807  -4 37435   NA  0.1418 0.99    NA    NA    NA     1.2 1.0e+02 0.03075    NA   NA
# 8  0.78 0.91 4.640  -7 30474   NA  0.0002 1.00    NA    NA    NA     1.3 2.5e-02 0.00048    NA   NA
# 9  0.78 0.91   NaN  -9 30457   NA  0.0002 1.00    NA    NA    NA     1.3 2.5e-02 0.00048    NA   NA
# 10 0.78 0.91    NA -10 30440   NA  0.0002 1.00    NA    NA    NA     1.3 2.5e-02 0.00048    NA   NA

## BIC says 5 factors, so we'll go with that:
factorization <- fa(zeo, nfactors=5); factorization
# Standardized loadings (pattern matrix) based upon correlation matrix
#                  MR1   MR2   MR5   MR4   MR3   h2    u2 com
# ZQ              0.87 -0.14 -0.01  0.25 -0.04 0.99 0.013 1.2
# Total.Z         0.96  0.04 -0.01  0.07 -0.04 0.99 0.011 1.0
# Time.to.Z       0.05 -0.03  0.92  0.03  0.10 0.84 0.159 1.0
# Time.in.Wake   -0.18  0.90 -0.02  0.04 -0.15 0.83 0.168 1.1
# Time.in.REM     0.87  0.05  0.03  0.05  0.09 0.78 0.215 1.0
# Time.in.Light   0.94  0.02 -0.04 -0.20 -0.14 0.84 0.158 1.1
# Time.in.Deep    0.02  0.03  0.01  0.99 -0.02 0.98 0.023 1.0
# Awakenings      0.35  0.75  0.08 -0.03  0.26 0.79 0.209 1.7
# Start.of.Night -0.21  0.00  0.10 -0.05  0.86 0.84 0.162 1.2
# Morning.Feel    0.22 -0.13 -0.55  0.11  0.46 0.66 0.343 2.5
#                        MR1  MR2  MR5  MR4  MR3
# SS loadings           3.65 1.44 1.21 1.16 1.08
# Proportion Var        0.37 0.14 0.12 0.12 0.11
# Cumulative Var        0.37 0.51 0.63 0.75 0.85
# Proportion Explained  0.43 0.17 0.14 0.14 0.13
# Cumulative Proportion 0.43 0.60 0.74 0.87 1.00
#  With factor correlations of
#       MR1   MR2   MR5   MR4   MR3
# MR1  1.00  0.03 -0.18  0.34 -0.03
# MR2  0.03  1.00  0.27 -0.09  0.00
# MR5 -0.18  0.27  1.00 -0.09  0.09
# MR4  0.34 -0.09 -0.09  1.00  0.03
# MR3 -0.03  0.00  0.09  0.03  1.00
# Mean item complexity =  1.3
# Test of the hypothesis that 5 factors are sufficient.
# The degrees of freedom for the null model are  45  and the objective function was  40.02 with Chi Square of  48376
# The degrees of freedom for the model are 5  and the objective function was  32.69
# The root mean square of the residuals (RMSR) is  0.07
# The df corrected root mean square of the residuals is  0.2
# The harmonic number of observations is  1152 with the empirical chi square  473.1  with prob <  5.1e-100
# The total number of observations was  1214  with MLE Chi Square =  39412  with prob <  0
# Tucker Lewis Index of factoring reliability =  -6.359
# RMSEA index =  2.557  and the 90 % confidence intervals are  2.527 2.569
# BIC =  39377
# Fit based upon off diagonal values = 0.97

This looks like MR1=overall sleep; MR2=insomnia/bad-sleep; MR5=difficulty-falling-asleep?; MR4=deep-sleep (not part of MR1!); MR3=dunno. MR1 and MR4 correlate 0.34, and MR2/MR5 0.27, which makes sense. I want to maximize overall sleep and deep sleep (deep sleep seems connected to health), so MR1 and MR4.

Now that we have our fac­tors, we can extract them and plot them over time for a graph­i­cal look:

MR1 <- predict(factorization, data=zeo)[,1]
MR4 <- predict(factorization, data=zeo)[,4]

par(mfrow=c(2,1), mar=c(4,4.5,1,1))
plot(MR1 ~ I(Start.of.Night/60), xlab="",        ylab="Total sleep (MR1)", data=zeo)
plot(MR4 ~ I(Start.of.Night/60), xlab="Bedtime", ylab="Deep sleep (MR4)",  data=zeo)
Total & deep sleep fac­tors vs bed­time

This looks like an overall linear decline (later=worse), but possibly with a peak somewhere, looking like a quadratic.

So we’ll try fit­ting qua­drat­ics:

factorModel <- lm(cbind(MR1, MR4) ~ Start.of.Night + I(Start.of.Night^2), data=zeo); summary(factorModel)
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -6.63e+01   7.65e+00   -8.67   <2e-16
# Start.of.Night       9.74e-02   1.07e-02    9.13   <2e-16
# I(Start.of.Night^2) -3.56e-05   3.72e-06   -9.57   <2e-16
# Residual standard error: 0.829 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.152,   Adjusted R-squared:  0.15
# F-statistic:  101 on 2 and 1127 DF,  p-value: <2e-16
# Response MR4 :
# Call:
# lm(formula = MR4 ~ Start.of.Night + I(Start.of.Night^2), data = zeo)
# Residuals:
#    Min     1Q Median     3Q    Max
# -3.057 -0.651 -0.017  0.600  4.329
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -5.06e+01   8.97e+00   -5.64  2.1e-08
# Start.of.Night       7.23e-02   1.25e-02    5.79  9.3e-09
# I(Start.of.Night^2) -2.58e-05   4.36e-06   -5.92  4.2e-09
# Residual standard error: 0.971 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0384,  Adjusted R-squared:  0.0367
# F-statistic: 22.5 on 2 and 1127 DF,  p-value: 2.57e-10

## on the other hand, if we had ignored the quadratic term, we'd
## get a much worse fit
summary(lm(cbind(MR1, MR4) ~ Start.of.Night, data=zeo))
# Coefficients:
#                 Estimate Std. Error t value Pr(>|t|)
# (Intercept)     6.643744   0.653047    10.2   <2e-16
# Start.of.Night -0.004613   0.000457   -10.1   <2e-16
# Residual standard error: 0.861 on 1128 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0829,  Adjusted R-squared:  0.0821
# F-statistic:  102 on 1 and 1128 DF,  p-value: <2e-16
# Response MR4 :
# Coefficients:
#                 Estimate Std. Error t value Pr(>|t|)
# (Intercept)     2.337279   0.747401    3.13   0.0018
# Start.of.Night -0.001627   0.000523   -3.11   0.0019
# Residual standard error: 0.986 on 1128 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.00851, Adjusted R-squared:  0.00764
# F-statistic: 9.69 on 1 and 1128 DF,  p-value: 0.0019

So we want to use the qua­drat­ic. Given this qua­dratic mod­el, what’s the opti­mal bed­time?

estimatedFactorValues <- predict(factorModel, newdata=data.frame(Start.of.Night=1:max(zeo$Start.of.Night, na.rm=TRUE)))
## when is MR1 maximized?
which(estimatedFactorValues[,1] == max(estimatedFactorValues[,1]))
# 1368
1368 / 60
# [1] 22.8
## 10:48 PM seems reasonable
## when is MR4 maximized?
which(estimatedFactorValues[,2] == max(estimatedFactorValues[,2]))
# 1401
## 11:21 PM seems reasonable

## summing the factors isn't quite the average of the two times, but it's close:
combinedFactorSums <- rowSums(estimatedFactorValues)
which(combinedFactorSums == max(combinedFactorSums))
# 1382
## 11:02PM
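
As a cross-check on the grid search, the peak of a fitted quadratic can also be computed directly from its coefficients as -b/(2a). This sketch plugs in the rounded coefficients printed in the regression output above, so agreement with the grid search is only approximate in principle; the helper names vertex and as_clock are hypothetical.

```r
# Peak of y = intercept + linear*x + quadratic*x^2 (requires quadratic < 0)
vertex <- function(linear, quadratic) -linear / (2 * quadratic)

# Rounded coefficients copied from the printed summaries for MR1 and MR4
mr1_peak <- vertex(9.74e-02, -3.56e-05)
mr4_peak <- vertex(7.23e-02, -2.58e-05)

# The sum of two quadratics is a quadratic whose coefficients are the sums
combined_peak <- vertex(9.74e-02 + 7.23e-02, -3.56e-05 - 2.58e-05)

# Format minutes after midnight as a 24-hour clock time
as_clock <- function(minutes) {
  m <- as.integer(round(minutes))
  sprintf("%02d:%02d", (m %/% 60L) %% 24L, m %% 60L)
}

print(c(MR1 = as_clock(mr1_peak), MR4 = as_clock(mr4_peak),
        combined = as_clock(combined_peak)))
```

The closed form agrees with the grid-search answers of 1368, 1401, and 1382 minutes, and it shows why the combined optimum is a weighted compromise between the two peaks rather than their simple average.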

Maybe using fac­tors was­n’t a good idea? We can try a mul­ti­vari­ate regres­sion on the vari­ables direct­ly:

quadraticModel <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM,
                           Time.in.Light, Time.in.Deep, Awakenings, Morning.Feel)
                       ~ Start.of.Night + I(Start.of.Night^2), data=zeo)
summary(quadraticModel)
# Response ZQ :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -7.84e+02   1.06e+02   -7.38  3.1e-13
# Start.of.Night       1.29e+00   1.48e-01    8.68  < 2e-16
# I(Start.of.Night^2) -4.70e-04   5.16e-05   -9.10  < 2e-16
# Residual standard error: 11.5 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.139,   Adjusted R-squared:  0.137
# F-statistic: 90.9 on 2 and 1127 DF,  p-value: <2e-16
# Response Total.Z :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -4.48e+03   5.54e+02   -8.08  1.7e-15
# Start.of.Night       7.32e+00   7.73e-01    9.47  < 2e-16
# I(Start.of.Night^2) -2.67e-03   2.69e-04   -9.91  < 2e-16
# Residual standard error: 60 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.158,   Adjusted R-squared:  0.156
# F-statistic:  106 on 2 and 1127 DF,  p-value: <2e-16
# Response Time.to.Z :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -6.09e+02   1.22e+02   -4.98  7.3e-07
# Start.of.Night       8.43e-01   1.71e-01    4.94  8.8e-07
# I(Start.of.Night^2) -2.81e-04   5.95e-05   -4.73  2.6e-06
# Residual standard error: 13.2 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0431,  Adjusted R-squared:  0.0415
# F-statistic: 25.4 on 2 and 1127 DF,  p-value: 1.61e-11
# Response Time.in.Wake :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -1.26e+02   1.76e+02   -0.72     0.47
# Start.of.Night       2.15e-01   2.45e-01    0.88     0.38
# I(Start.of.Night^2) -7.83e-05   8.55e-05   -0.92     0.36
# Residual standard error: 19.1 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.00149, Adjusted R-squared:  -0.000283
# F-statistic: 0.84 on 2 and 1127 DF,  p-value: 0.432
# Response Time.in.REM :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -1.43e+03   2.69e+02   -5.32  1.2e-07
# Start.of.Night       2.32e+00   3.75e-01    6.19  8.6e-10
# I(Start.of.Night^2) -8.39e-04   1.31e-04   -6.42  2.0e-10
# Residual standard error: 29.1 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0608,  Adjusted R-squared:  0.0592
# F-statistic: 36.5 on 2 and 1127 DF,  p-value: 4.37e-16
# Response Time.in.Light :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -2.45e+03   3.43e+02   -7.15  1.5e-12
# Start.of.Night       4.07e+00   4.78e-01    8.50  < 2e-16
# I(Start.of.Night^2) -1.50e-03   1.67e-04   -9.00  < 2e-16
# Residual standard error: 37.2 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.164,   Adjusted R-squared:  0.162
# F-statistic:  110 on 2 and 1127 DF,  p-value: <2e-16
# Response Time.in.Deep :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -5.88e+02   1.10e+02   -5.34  1.1e-07
# Start.of.Night       9.27e-01   1.53e-01    6.04  2.1e-09
# I(Start.of.Night^2) -3.30e-04   5.35e-05   -6.17  9.5e-10
# Residual standard error: 11.9 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0398,  Adjusted R-squared:  0.0381
# F-statistic: 23.4 on 2 and 1127 DF,  p-value: 1.12e-10
# Response Awakenings :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -1.18e+02   2.71e+01   -4.36  1.4e-05
# Start.of.Night       1.68e-01   3.77e-02    4.46  9.0e-06
# I(Start.of.Night^2) -5.67e-05   1.32e-05   -4.31  1.7e-05
# Residual standard error: 2.93 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0274,  Adjusted R-squared:  0.0256
# F-statistic: 15.9 on 2 and 1127 DF,  p-value: 1.62e-07
# Response Morning.Feel :
# Coefficients:
#                      Estimate Std. Error t value Pr(>|t|)
# (Intercept)         -2.12e+01   7.02e+00   -3.01  0.00266
# Start.of.Night       3.32e-02   9.79e-03    3.39  0.00073
# I(Start.of.Night^2) -1.15e-05   3.41e-06   -3.37  0.00079
# Residual standard error: 0.761 on 1127 degrees of freedom
#   (84 observations deleted due to missingness)
# Multiple R-squared:  0.0103,  Adjusted R-squared:  0.0085
# F-statistic: 5.84 on 2 and 1127 DF,  p-value: 0.00301

## Likewise, what's the optimal predicted time?
estimatedValues <- predict(quadraticModel, newdata=data.frame(Start.of.Night=1:max(zeo$Start.of.Night, na.rm=TRUE)))
# but what time is best? we have so many choices of variable to optimize.
# Let's simply sum them all and say bigger is better
# first, we need to negate 'Time.in.Wake', 'Time.to.Z', 'Awakenings',
# as for those, bigger is worse
estimatedValues[,3] <- -estimatedValues[,3] # Time.to.Z
estimatedValues[,4] <- -estimatedValues[,4] # Time.in.Wake
estimatedValues[,8] <- -estimatedValues[,8] # Awakenings
combinedSums <- rowSums(estimatedValues)
which(combinedSums == max(combinedSums))
# 1362

Or 10:42PM, which is almost identical to the MR1 estimate of 10:48PM. So the conclusion is the same as before.
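
Since each fitted curve is a parabola, the grid search can be cross-checked against the closed-form turning point, -b1 / (2 × b2). A minimal sketch using the rounded ZQ and Total.Z coefficients printed in the output above; `quadratic_vertex` is a hypothetical helper, and because the printed coefficients are rounded to three significant figures the result is only approximate.

```r
## Hypothetical helper: turning point of y = b0 + b1*x + b2*x^2
quadratic_vertex <- function(b1, b2) -b1 / (2 * b2)

## rounded coefficients as printed for the ZQ and Total.Z responses
zq_peak     <- quadratic_vertex(b1 = 1.29, b2 = -4.70e-04)
totalz_peak <- quadratic_vertex(b1 = 7.32, b2 = -2.67e-03)

## minutes since midnight, then hours
c(ZQ = zq_peak, Total.Z = totalz_peak)
c(ZQ = zq_peak, Total.Z = totalz_peak) / 60
```

Both turning points land near minute 1371-1372 (about 10:52PM), within roughly 10 minutes of the summed-variable optimum of 1362.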

Both approaches sug­gest that I go to bed some­what ear­lier than I do now. This has the same cor­re­la­tion ≠ causal­ity issue as the rise-time analy­sis does (per­haps I am espe­cially sleepy on the days I go to bed a bit early and so nat­u­rally sleep more), but on the other hand, it’s not sug­gest­ing I go to bed at 7PM or any­thing crazy, so I am more inclined to take a chance on it.

Rise time for productivity

I noticed a claim that for one per­son, ris­ing at 3-5AM (!) seemed to improve their days “because the morn­ing hours have no dis­trac­tions” and I won­dered whether there might be any such cor­re­la­tion for myself, so I took my usual MP daily self­-rat­ing and plot­ted against rise-time that day:

Self­-rat­ing vs rise time, n = 841

It looks like a cubic sug­gest­ing one peak around 8:30AM and then a later peak, but that’s based on so lit­tle I ignore it. The causal rela­tion­ship is also unclear: maybe get­ting up ear­lier really does cause higher MP self­-rat­ings, but per­haps on days I don’t feel like doing any­thing I am more likely to sleep in, or some other com­mon cause. The avail­able sam­ples sug­gest that ear­lier than that is worse, pos­si­bly much worse, so I am not inclined to try out some­thing I expect to make me mis­er­able.

The source code of the graph & analy­sis; pre­pro­cess­ing:

mp <- read.csv("~/selfexperiment/mp.csv", colClasses=c("Date","integer"))
zeo <- read.csv("https://www.gwern.net/docs/zeo/gwern-zeodata.csv")
## we want the date of the day sleep ended, not started, so we ignore the usual 'Sleep.Date' and construct our own 'Date':
zeo$Date <- as.Date(sapply(strsplit(as.character(zeo$Rise.Time), " "), function(x) { x[1] }), format="%m/%d/%Y")
## convert "05/12/2014 06:45" to "06:45"
zeo$Rise.Time <- sapply(strsplit(as.character(zeo$Rise.Time), " "), function(x) { x[2] })
## convert "06:45" to minutes since midnight (405)
interval <- function(x) { if (!is.na(x)) { if (grepl(" s",x)) as.integer(sub(" s","",x))
                                           else { y <- unlist(strsplit(x, ":")); as.integer(y[[1]])*60 + as.integer(y[[2]]); }
                                         }
                          else NA
                        }
zeo$Rise.Time <- sapply(zeo$Rise.Time, interval)
## doesn't always work, so delete missing data:
zeo <- zeo[!is.na(zeo$Date),]

## correct for the switch to new unencrypted firmware in March 2013;
## I don't know why the new firmware changed things; adjustment of 226 minutes was estimated using:
# library(changepoint); cpt.mean(na.omit(zeo$Rise.Time)); '$mean [1] 566.7 340.2';  566.7 - 340.2 = 226
zeo[(zeo$Date >= as.Date("2013-03-11")),]$Rise.Time  <-
 (zeo[(zeo$Date >= as.Date("2013-03-11")),]$Rise.Time + 226) %% (24*60)

allData <- merge(mp,zeo)
morning <- data.frame(MP=allData$MP, Rise.Time=allData$Rise.Time)
morning$Rise.Time.Hour <- morning$Rise.Time / 60
write.csv(morning, file="morning.csv", row.names=FALSE)
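
The two fiddly steps in this preprocessing are the "HH:MM" parsing and the wrap-around in the firmware correction. A small sketch of both on made-up inputs; `to_minutes` is a hypothetical vectorized variant of `interval`, not the function used above, and it handles only the "HH:MM" form.

```r
## Hypothetical vectorized parser: "HH:MM" -> minutes since midnight, NA-safe
to_minutes <- function(x) {
  vapply(strsplit(as.character(x), ":", fixed = TRUE), function(parts) {
    if (length(parts) != 2 || anyNA(parts)) return(NA_integer_)
    as.integer(parts[1]) * 60L + as.integer(parts[2])
  }, integer(1))
}
parsed <- to_minutes(c("06:45", NA, "23:59"))
parsed

## the +226 minute correction wraps past midnight via modular arithmetic:
## 405 stays within the same day, while 1300 + 226 rolls over to 86
shifted <- (c(405, 1300) + 226) %% (24 * 60)
shifted
```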

Graph­ing and fit­ting:

library(ggplot2)
morning <- read.csv("https://www.gwern.net/docs/zeo/2014-07-26-risetime-mp.csv")
ggplot(data = morning, aes(x=Rise.Time.Hour, y=jitter(MP, factor=0.2))) +
 xlab("Wake time (24H)") +
 ylab("Mood/productivity self-rating (2/3/4)") +
 geom_point(size=I(4)) +
 ## cross-validation suggests 0.8397 but looks identical to auto-LOESS span choice
 stat_smooth(span=0.8397)

## looks 100% like a cubic function
linear <- lm(MP ~ Rise.Time,         data=morning)
cubic  <- lm(MP ~ poly(Rise.Time,3), data=morning)
anova(linear, cubic)
# Model 1: MP ~ Rise.Time
# Model 2: MP ~ poly(Rise.Time, 3)
#   Res.Df RSS Df Sum of Sq    F Pr(>F)
# 1    839 442
# 2    837 437  2      5.36 5.14 0.0061
AIC(linear, cubic)
#        df  AIC
# linear  3 1852
# cubic   5 1846
summary(cubic)
# ...Coefficients:
#                     Estimate Std. Error t value Pr(>|t|)
# (Intercept)           3.0571     0.0249  122.70   <2e-16
# poly(Rise.Time, 3)1  -0.9627     0.7225   -1.33    0.183
# poly(Rise.Time, 3)2  -1.4818     0.7225   -2.05    0.041
# poly(Rise.Time, 3)3   1.7795     0.7225    2.46    0.014
# Residual standard error: 0.723 on 837 degrees of freedom
# Multiple R-squared:  0.0142,    Adjusted R-squared:  0.0107
# F-statistic: 4.02 on 3 and 837 DF,  p-value: 0.00749

# plot(morning$Rise.Time,morning$MP); points(morning$Rise.Time,fitted(cubic),pch=19)
which(fitted(cubic) == max(fitted(cubic))) / 60
#  516   631   762
# 8.60 10.52 12.70
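
One caveat worth checking here: `which()` applied to fitted values returns row positions in the data frame, not rise times, so the rise time has to be looked up on the maximizing row. A minimal sketch on synthetic, noise-free data (the `toy` data frame and its peak at 500 minutes are made up for illustration and are not the real `morning` data):

```r
## synthetic stand-in for `morning`; values are illustrative only
toy <- data.frame(Rise.Time = seq(300, 900, by = 7))
toy$MP <- -(toy$Rise.Time - 500)^2 / 1e4   # smooth curve peaking near 500 minutes
fit <- lm(MP ~ poly(Rise.Time, 3), data = toy)

peak_row  <- which.max(fitted(fit))        # row position of the highest fitted value
peak_time <- toy$Rise.Time[peak_row]       # the rise time (in minutes) on that row
peak_time / 60                             # in hours
```

On the real data, the same lookup would be `morning$Rise.Time[which.max(fitted(cubic))] / 60`.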

Magnesium citrate

Re-an­a­lyz­ing data from a mag­ne­sium self­-ex­per­i­ment, I find both pos­i­tive and neg­a­tive effects of the mag­ne­sium on my sleep. It’s not clear what the net effect is.

I became interested in magnesium after noting a possible effect on my productivity from TruBrain (which among other things included a magnesium tablet), and then a clear correlation from some magnesium l-threonate. I’d also long heard of magnesium helping sleep, and was curious about that too. So I began a large (~207 days) RCT trying out 136mg then 800mg of elemental magnesium per day from late 2013 to early 2014. (This was not a large enough experiment to definitively answer questions about both productivity and sleep, but since I have all the data on hand, I thought I’d look.)

The results of the main experiment were surprising: it seemed that the magnesium caused an initial large boost to my productivity, but the boost began to fade and after 20 days or so, the effect became negative, and the period with the larger dose had a worse effect, suggesting a cumulative overdose.

With the differ­ing effect of the doses in mind, I looked at the effect on my sleep data.



magnesium <- read.csv("https://www.gwern.net/docs/nootropics/2013-2014-magnesium.csv")
magnesium$Date <- as.Date(magnesium$Date)

zeo <- read.csv("https://www.gwern.net/docs/zeo/gwern-zeodata.csv")
zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
zeo$Date <- zeo$Sleep.Date
# create an equally-weighted index of bad sleep: the sum of the z-scores of the 3 bad things
zeo$Disturbance <- scale(zeo$Time.to.Z) + scale(zeo$Awakenings) + scale(zeo$Time.in.Wake)

magnesiumSleep <- merge(zeo, magnesium)
write.csv(magnesiumSleep, file="2014-07-27-magnesium-sleep.csv", row.names=FALSE)
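
The point of summing z-scores is that each component is first rescaled to mean 0 and standard deviation 1, so minutes and counts contribute equally to the index. A small sketch on simulated data (the numbers are made up; only the column names follow the Zeo export):

```r
set.seed(1)
## simulated nights; the distributions are arbitrary placeholders
toy <- data.frame(Time.to.Z    = rnorm(50, mean = 15, sd = 10),
                  Awakenings   = rpois(50, lambda = 5),
                  Time.in.Wake = rnorm(50, mean = 20, sd = 15))

## scale() returns a one-column matrix, so convert back to a plain vector
z <- function(x) as.numeric(scale(x))
toy$Disturbance <- z(toy$Time.to.Z) + z(toy$Awakenings) + z(toy$Time.in.Wake)

## the index is centered at 0; its SD depends on how correlated the parts are
c(mean = mean(toy$Disturbance), sd = sd(toy$Disturbance))
```

A night with a missing value in any component gets an `NA` index, since `NA` propagates through the sum.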

(I then hand-edited the CSV to delete unused column­s.)

Graph­ing Dis­tur­bance:

Sleep disturbance over time, colored by magnesium dose, with LOESS-smoothed trend-lines

library(ggplot2)
magnesiumSleep <- read.csv("https://www.gwern.net/docs/zeo/2014-07-27-magnesium-sleep.csv")
magnesiumSleep$Date <- as.Date(magnesiumSleep$Date)
## historical baseline:
magnesiumSleep[is.na(magnesiumSleep$Magnesium.citrate),]$Magnesium.citrate <- -1
ggplot(data = magnesiumSleep, aes(x=Date, y=Disturbance, col=as.factor(magnesiumSleep$Magnesium.citrate))) +
 ylab("Disturbance z-score (lower=better)") +
 geom_point(size=I(4)) +
 stat_smooth() +
 scale_colour_manual(values=c("gray49", "grey35", "red1", "red2" ),
                     name = "Magnesium")

Analy­sis (first dis­tur­bances, then all vari­ables):

magnesiumSleep <- read.csv("https://www.gwern.net/docs/zeo/2014-07-27-magnesium-sleep.csv")
l0 <- lm(Disturbance ~ as.factor(Magnesium.citrate), data=magnesiumSleep)
summary(l0)
# ...Coefficients:
#                                   Estimate Std. Error  t value  Pr(>|t|)
# (Intercept)                     -0.5020571  0.1862795 -2.69518 0.0076218
# as.factor(Magnesium.citrate)136 -0.0566556  0.3101388 -0.18268 0.8552318
# as.factor(Magnesium.citrate)800 -0.5394708  0.3259212 -1.65522 0.0994178

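So magnesium citrate may decrease sleep problems, but the evidence is weak: both dose estimates point toward less disturbance, yet neither reaches statistical-significance (p = 0.86 for 136mg, p = 0.10 for 800mg).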

l1 <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM, Time.in.Light,
               Time.in.Deep, Awakenings, Morning.Feel)
               ~ as.factor(Magnesium.citrate),
           data=magnesiumSleep)
summary(l1)
# Response ZQ : ...Coefficients:
#                                 Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     95.85149    1.29336 74.11065  < 2e-16
# as.factor(Magnesium.citrate)136 -3.27254    2.15332 -1.51976  0.13012
# as.factor(Magnesium.citrate)800  1.49545    2.26290  0.66086  0.50945
# Response Total.Z : ...Coefficients:
#                                  Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     536.35644    6.59166 81.36898  < 2e-16
# as.factor(Magnesium.citrate)136 -27.37398   10.97453 -2.49432 0.013414
# as.factor(Magnesium.citrate)800  15.86805   11.53300  1.37588 0.170367
# Response Time.to.Z : ...Coefficients:
#                                 Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     12.59406    1.24108 10.14766  < 2e-16
# as.factor(Magnesium.citrate)136  4.26559    2.06629  2.06437 0.040247
# as.factor(Magnesium.citrate)800 -2.43079    2.17144 -1.11944 0.264269
# Response Time.in.Wake : ...Coefficients:
#                                 Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     24.09901    1.87720 12.83776  < 2e-16
# as.factor(Magnesium.citrate)136 -3.66041    3.12537 -1.17119  0.24289
# as.factor(Magnesium.citrate)800 -4.16023    3.28441 -1.26666  0.20672
# Response Time.in.REM : ...Coefficients:
#                                  Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     171.45545    2.99387 57.26889  < 2e-16
# as.factor(Magnesium.citrate)136  -6.45545    4.98452 -1.29510  0.19675
# as.factor(Magnesium.citrate)800   2.27925    5.23818  0.43512  0.66393
# Response Time.in.Light : ...Coefficients:
#                                  Estimate Std. Error  t value   Pr(>|t|)
# (Intercept)                     304.54455    4.08746 74.50709 < 2.22e-16
# as.factor(Magnesium.citrate)136 -23.33403    6.80525 -3.42883 0.00073338
# as.factor(Magnesium.citrate)800  20.51667    7.15156  2.86884 0.00455323
# Response Time.in.Deep : ...Coefficients:
#                                 Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                     60.88119    1.20888 50.36152  < 2e-16
# as.factor(Magnesium.citrate)136  2.48723    2.01268  1.23578  0.21796
# as.factor(Magnesium.citrate)800 -6.81996    2.11510 -3.22441  0.00147
# Response Awakenings : ...Coefficients:
#                                  Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                      6.039604   0.238675 25.30475  < 2e-16
# as.factor(Magnesium.citrate)136 -0.548376   0.397372 -1.38001  0.16910
# as.factor(Magnesium.citrate)800 -0.427359   0.417594 -1.02338  0.30734
# Response Morning.Feel : ...Coefficients:
#                                   Estimate Std. Error  t value Pr(>|t|)
# (Intercept)                      2.7227723  0.0762575 35.70497  < 2e-16
# as.factor(Magnesium.citrate)136  0.1193330  0.1269620  0.93991  0.34837
# as.factor(Magnesium.citrate)800 -0.1513437  0.1334229 -1.13432  0.25799
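l2 <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM, Time.in.Light,
               Time.in.Deep, Awakenings, Morning.Feel) ~ Magnesium.citrate,
           data=magnesiumSleep)
## multivariate test (Pillai's trace) for l1, the factor version:
#                              Df    Pillai approx F num Df den Df     Pr(>F)
# as.factor(Magnesium.citrate)  2 0.3265357 4.271083     18    394 2.3902e-08
# Residuals                    204
## multivariate test (Pillai's trace) for l2, the linear-dose version:
#                              Df    Pillai approx F num Df den Df     Pr(>F)
# Magnesium.citrate            1 0.1815233  4.85456      9    197  7.1454e-06
# Residuals         205
## rows of the 27-row summary table below whose p-values remain < 0.05
## after multiple-comparison adjustment:
# [1] 13 14 15 17 18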

A table sum­ma­riz­ing the results by dose (‘all’ is the net effect from the non-fac­tor ver­sion):

Variable       Dose (mg)       Coef  p       Effect
Morning.Feel   136          0.11933  0.3483  better
Morning.Feel   800         -0.15134  0.2579  worse
Morning.Feel   all         -0.00022  0.1752  worse
ZQ             136         -3.27254  0.1301  worse
ZQ             800          1.49545  0.5094  better
ZQ             all          0.00270  0.3344  better
Total.Z        136         -27.3739  0.0134  worse
Total.Z        800          15.8680  0.1703  better
Total.Z        all          0.02698  0.0632  better
Time.in.REM    136         -6.45545  0.1967  worse
Time.in.REM    800          2.27925  0.6639  better
Time.in.REM    all          0.00447  0.4895  better
Time.in.Light  136         -23.3340  0.0007  worse
Time.in.Light  800          20.5166  0.0045  better
Time.in.Light  all          0.03202  0.0005  better
Time.in.Deep   136          2.48723  0.2179  better
Time.in.Deep   800         -6.81996  0.0014  worse
Time.in.Deep   all         -0.00939  0.0004  worse
Time.to.Z      136          4.26559  0.0402  worse
Time.to.Z      800         -2.43079  0.2642  better
Time.to.Z      all         -0.00415  0.1262  better
Time.in.Wake   136         -3.66041  0.2428  better
Time.in.Wake   800         -4.16023  0.2067  better
Time.in.Wake   all         -0.00449  0.2673  better
Awakenings     136         -0.54837  0.1691  better
Awakenings     800         -0.42735  0.3073  better
Awakenings     all         -0.00042  0.4144  better

For the low dose, 4⁄9 were better; for the high dose, 7⁄9 were better. After adjusting for multiple comparisons at p < 0.05, the surviving effects are:

Variable       Dose (mg)       Coef  p       Effect
Time.in.Light  136         -23.3340  0.0007  worse
Time.in.Light  800          20.5166  0.0045  better
Time.in.Light  all          0.03202  0.0005  better
Time.in.Deep   800         -6.81996  0.0014  worse
Time.in.Deep   all         -0.00939  0.0004  worse
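
The adjustment can be re-run from the 27 p-values in the summary table. The sketch below assumes the Benjamini-Hochberg procedure; the method is not named in the text, but Benjamini-Hochberg reproduces exactly the five surviving rows, whereas Bonferroni would drop the 800mg Time.in.Light row.

```r
## p-values copied from the summary table, in row order
## (Morning.Feel, ZQ, Total.Z, Time.in.REM, Time.in.Light, Time.in.Deep,
##  Time.to.Z, Time.in.Wake, Awakenings; each at 136mg, 800mg, 'all')
p <- c(0.3483, 0.2579, 0.1752,  0.1301, 0.5094, 0.3344,  0.0134, 0.1703, 0.0632,
       0.1967, 0.6639, 0.4895,  0.0007, 0.0045, 0.0005,  0.2179, 0.0014, 0.0004,
       0.0402, 0.2642, 0.1262,  0.2428, 0.2067, 0.2673,  0.1691, 0.3073, 0.4144)

## Assumption: Benjamini-Hochberg (false discovery rate) adjustment
surviving_bh   <- which(p.adjust(p, method = "BH") < 0.05)
## for comparison, the stricter family-wise Bonferroni correction
surviving_bonf <- which(p.adjust(p, method = "bonferroni") < 0.05)

surviving_bh
surviving_bonf
```

Rows 13-15 are the three Time.in.Light entries and rows 17-18 are the 800mg and 'all' Time.in.Deep entries.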


I ran a randomized experiment from 2012-2013 with a free program (Redshift) which reddens screens at night to avoid tampering with melatonin secretion & sleep, measuring sleep changes with my Zeo. With 533 days of data, the main result is that Redshift causes me to go to sleep half an hour earlier but otherwise does not improve sleep quality.

Main arti­cle: .


As part of a self­-ex­per­i­ment involv­ing low doses of lithium oro­tate blinded & ran­dom­ized in 7-day paired blocks, I checked for effects on Zeo sleep data dur­ing the self­-ex­per­i­ment. No vari­ables reached sta­tis­ti­cal-sig­nifi­cance in that exper­i­ment, includ­ing the sleep ones.

Main arti­cle: Nootrop­ics page.


I ran a blinded ran­dom­ized self­-ex­per­i­ment of 2.5g nightly ZMA pow­der effect on Zeo-recorded sleep data dur­ing March-Oc­to­ber 2017 (n = 127). The lin­ear model and SEM model show no sta­tis­ti­cal­ly-sig­nifi­cant effects or high pos­te­rior prob­a­bil­ity of ben­e­fits, although all point-es­ti­mates were in the direc­tion of ben­e­fits. Data qual­ity issues reduced the avail­able dataset, ren­der­ing the exper­i­ment par­tic­u­larly under­pow­ered and the results more incon­clu­sive. I decided to not con­tinue use of ZMA after run­ning out; ZMA may help my sleep but I need to improve data qual­ity before attempt­ing any fur­ther sleep self­-ex­per­i­ments on it.

Main arti­cle: .


Ever since I was a little kid watching Gilligan’s Island, I had one burning question about the antics of the cast and their island idyll/prison: what was it like to sleep in a hammock, anyway‽ Skipper and Gilligan slept in hammocks all the time, but the show stubbornly refused to go into any details about the nature of hammock sleeping. Was it better than beds? Worse? Hotter? Colder? Did it hurt the neck?

While my beds usually are good as far as beds go, I’ve never been completely happy with them: as a side sleeper, it’s all too easy for me to wake up with a paralyzed arm or a crick in the neck. (It is irritating to pop like a sheet of bubble-wrap in the morning.) And anytime I have to move a bed, I can’t help wondering if beds really have to be as bulky and heavy as they are. But it seemed to me that a hammock, enfurling & enclosing one as it does, might resolve that problem. What does the scientific literature say about this? The topic seems to be almost completely unresearched. For example, almost every hit for the word “hammock” on PubMed is due to the author B.D. Hammock. Google Scholar does a little bit better: the first few pages of hits, besides turning up B.D. Hammock again, point at a short experiment, “Rocking synchronizes brain waves during a short nap”, which compared 12 men napping on a swaying bed, and suggest some literature on the effect of spinal angle on sleep. This silence is a little surprising, considering that a nontrivial fraction of humanity sleeps in hammocks or hammock-like things (you’d think navies, at the very least, would be interested in the subject of whether hammocks were better than beds), but so it goes.

The questions, at irregular intervals over the years, continued to prey on my mind, occasionally prompted by mention of sailors. Of course I periodically would run into lawn/garden hammocks, but those wretched contraptions were no answer: the cord made for an uncomfortable rest, and the enormous spreader bars led to severe instability (although they made for great pranks). Finally in 2014, it dawned on me that I had access to an unused stand for a lawn hammock; I had room to set it up in my bedroom; and from idly browsing Amazon, I knew I could get a hammock for under $50, which seemed reasonable for an experiment. Why cunctate and repine further? I couldn’t think of any reason why not, so after some more browsing, the cheapest hammock seemed to be the Army Green Ultra Light Hammocks with Tree Strap for $22.50, and I ordered it in September.

I was a little surprised how small and lightweight the hunter-green nylon hammock turned out to be (the whole package fits in a padded envelope mailer and weighs under a pound), and quickly set it up.

The frame creaked alarmingly under my 200 pounds, but it held up. It feels very different from a bed, more like a slide at an amusement park in how one is lying back into a tube. Lying in this hammock is also much more stable than in a lawn hammock, at least once you get into it successfully. Another issue was the gradual discomfort of having my feet elevated due to the V-shape of the hammock as it sagged under my weight. This seemed mostly resolved by tightening the ropes and lying at more of a diagonal.

I found it easy to take a brief nap or rest in it, but it felt like it was squeez­ing my shoul­ders into my chest and my first attempt to sleep overnight failed. The sec­ond & third nights went bet­ter, but still not as good as the bed.

The prob­lem seems to be the arms/chest squeez­ing, caused by noth­ing ‘push­ing apart’ the two walls of the ham­mock at the top. The Wikipedia arti­cle on ham­mocks men­tions sailors using a “spreader bar”, which sounds like a solu­tion to my prob­lem. So I need to find a piece of wood and tweak it into a suit­able form, while avoid­ing any sharp cor­ners which might cut the nylon mate­r­ial of the ham­mock.

My first approach was to take a short nar­row plank about the width of my shoul­ders and saw V-shaped notches in each end, then bevel and sand the edges of the notches so they would­n’t fray the nylon cords. This was easy enough, and then one sim­ply sticks it in between the two cords on one end. It turns out that the plank slips out very eas­i­ly, and the pres­sure causes it to slide up halfway to form a dia­mond con­fig­u­ra­tion, which does­n’t accom­plish the goal of spread­ing the ham­mock, since the ham­mock is still dan­gling from a point where one wants a per­pen­dic­u­lar line. I could force it down the strings towards the end of the ham­mock, spread­ing out the ham­mock, but the instant any pres­sure was placed on the sys­tem (such as by get­ting in), the plank would either col­lapse or revert to a dia­mond+­point.

So if it kept slipping, I’d force it to stay. This time, I drilled two holes through opposite ends of the plank, fed the two ends of the nylon cord through the two holes, drew the cord as tight as possible, then put a knot behind/above each hole in the plank. Now they couldn’t slip because the knots would not pass through the drilled holes, and I had left no slack, so when weight was put on it, a point was not formed but more of a triangle. (I did wind up adding extra knots as slack slowly grew.)

That worked. Now I could lie back and my chest was not being com­pressed. Com­bined with an inter­me­di­ate loose­ness of hang (it turns out tighter is not always bet­ter once the squeeze prob­lem is resolved and you’re sleep­ing slightly diag­o­nal­ly), the ham­mock was now very pleas­ant for nap­ping, and pretty good for sleep­ing.

I gave sleeping in the hammock a few more nights of trying, and ran into a new problem: the same feature that makes hammocks so good for hot climates also makes them problematic in the winter, ie you’re exposed to the air. The chill woke me up early in the morning twice, even after I added in a fuzzy blanket to sleep on top of. Probably I could fix this by adding a thicker blanket underneath, but I decided to pack up the hammock (which takes very little space) and retry in spring when my room starts getting warm again.

In May 2015, after it became warm enough again that I needed air con­di­tion­ing on at night, I set the ham­mock back up and gave it another try. After 3 failed nights, I gave up: the cold­ness was great, but no posi­tion or ten­sion seemed to give my shoul­ders enough room to move and let me roll over. I con­cluded that I’m too used to sleep­ing in a bed to adapt to a ham­mock in the absence of trop­i­cal incen­tives.

So I moved the ham­mock out­side—what­ever its prob­lems for sleep­ing in, it’s great for nap­ping and infi­nitely more com­fort­able than the com­mon rope lawn ham­mock.

A partial solution to the cold is foam earplugs; another interesting possibility is a hollow pillow or framework which reduces the need for a thick blanket/pillow to clutch and support one while lying on one’s side. (I don’t know what to do about the neck issues. Probably just put up with it as I always have.) The hollow kind don’t seem easily available without ordering from sketchy sites, so I took an old comforter blanket, rolled it up roughly the width of my stomach to my chin, and tied it tightly into a bundle with nylon line, which works for now.

In progress

Someone suggested that instead of running experiments serially, with limited sample sizes (because I am impatient to try the next interesting suggestion), I could instead take a step up in statistical sophistication and use a factorial design: use multiple experimental interventions simultaneously for a much larger sample size, and then run ANOVA analyses rather than simpler two-sample t-tests. Multifactorial experiments are praised as being more efficient: squeezing more data out of a given sample. Hence, I thought a crazy thought: my lithium experiment was going to run for ~360 days, and so I kept putting it off. But what if I ran multiple experiments for 360 days? If I had 4 or 5, then by the end of the year, I would have 5 results to show, and I would have the statistical equivalent of more than n = 72 (360⁄5) for each experiment. Win-win.

Classic multifactorial designs arrange to have every possible combination of the n experiments happen on some day or other. However, with 5 experiments, each of which has 2 states (on and off), that means I only have 2^5 = 32 possible arrangements, all of which ought to be covered over 360 days, terminating in March 2013. (It actually will take much longer, as I paused the lithium sub-experiment for several months to run the X self-experiment.) So I will be lazy and will independently randomize each experiment.
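
The difference between enumerating every combination and randomizing each intervention independently can be sketched as follows. The intervention names `A` to `E` are placeholders, and the seed and 50/50 allocation are assumptions for illustration, not the randomization actually used.

```r
## five hypothetical on/off interventions; the names are placeholders
factors <- c("A", "B", "C", "D", "E")

## classic full factorial: every on/off combination appears exactly once
full_factorial <- expand.grid(setNames(rep(list(c(0, 1)), length(factors)), factors))
n_combinations <- nrow(full_factorial)   # 2^5

## the "lazy" alternative: an independent coin flip per intervention per day
set.seed(2012)
days <- 360
schedule <- as.data.frame(replicate(length(factors), rbinom(days, 1, 0.5)))
names(schedule) <- factors

## how many of the possible combinations actually occurred at least once?
covered <- nrow(unique(schedule))
c(possible = n_combinations, covered = covered)
```

With 360 days, the chance that any one particular combination never comes up is (31⁄32)^360, on the order of 0.00001, so independent randomization covers the design almost as reliably as an explicit arrangement.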

As it wound up, I had bit­ten off too much in try­ing to run inter­con­nected exper­i­ments: while the Red­shift exper­i­ment ran with­out too much prob­lem, an unex­pected and abrupt move in July 2012 com­pletely dis­rupted my daily rou­tine and I was unable to main­tain my habit of ran­dom­iz­ing my med­i­ta­tion ses­sions. So I will be ana­lyz­ing the exper­i­ments sep­a­rate­ly.


Rather than dumb­bells (might be hard to find in the dark), I decided to try out push-ups since I rou­tinely do 25 push-ups after show­er­ing and it ought to be men­tally easy to shift those push-ups to before/after bed­time. As before, alter­nate-day, but with a twist: on-days, I do the push-ups imme­di­ately before going to bed, but off-days entail imme­di­ately upon awak­en­ing. (I don’t exer­cise enough in gen­er­al.) I began 2011-09-21.

I inter­rupted the exper­i­ment for a long period to run the vit­a­min D exper­i­ments; when I resumed on 2012-05-08, I decided to avoid the alter­nate-day pro­ce­dure and instead ran­dom­ize morn­ing vs evening push ups with a coin. Non-blind­ed.

On 2012-11-13, I decided I was suffi­ciently con­vinced that exer­cise imme­di­ately before bed was dam­ag­ing my sleep latency that I did­n’t want to con­tinue to pay the price of worse sleep, and I dis­con­tin­ued this vari­able. Hope­fully the pre­vi­ous data will be suffi­cient to con­firm or dis­con­firm any effect.


The prac­tice of med­i­ta­tion can be time-in­ten­sive; a claimed anec­do­tal ben­e­fit is that one sleeps less and so the time require­ment isn’t as bad as it may seem.

Meditation has been linked with sleep changes multiple times. In particular, Kaul et al 2010 found a correlation between long meditation and reduced sleep need. The general link seems plausible: deliberate relaxation may reduce the need for another kind of relaxation (although I doubt meditation is going as far as reducing synaptic weights as the “synaptic homeostasis” hypothesis predicts). But I can think of at least 2 plausible ways the correlation would not be causation (1. those with less sleep need can afford to spend time on meditation; 2. meditation is partially sleep so there’s no correlation or causation to explain).

Ran­dom­ized on a daily basis: either 20-3015 min­utes of med­i­ta­tion or none. (I am not sure what a good placebo would be so I will omit it.) Non-blind­ed. My med­i­ta­tion is noth­ing fan­cy: sim­ple breath-fol­low­ing (based on early chap­ters of Mind­ful­ness in Plain Eng­lish).

Plausibly, any decrease in sleep need could be due to long-term changes in the brain itself, as meditation has been associated with changes in some brain areas. Kaul et al 2010 above did not randomize the long-term meditators’ use of meditation or apparently investigate whether sleep time averages correlated with meditation. If the changes are long-term, then there will be relatively little variation during the 360 days and instead a gradual trend of less sleep. If no clear effect shows up in the analysis, I’ll try a before-after comparison: compare n days before the experiment started to n days after the experiment and see if there is a difference in the averages.

Power calculation

Kaul et al 2010 describes the long-term meditators as spending “2-3 hrs/day” in meditation. (Their experiment used novices who meditated for 1 hour.) If meditation indeed reduces sleep time, but I am meditating for only 1⁄3 of an hour, can I detect any effect?

The difference between the long-term meditators and their normal Indian counterparts was 5.2 hours of sleep per day versus 7.8. Assuming the worst case of 3 hours of meditation, this implies that meditation is indeed a net cost in time (5.2 + 3 = 8.2 > 7.8), but also that each hour of meditation is equivalent to almost an hour of sleep ((7.8 - 5.2) / 3 ≈ 0.866). So at that conversion rate, 20 minutes of meditation translates to 17.32 minutes less sleep (20 × 0.866). We will steal code and data from the previous Redshift power calculation: assume the same control sleep, same standard deviation, and subtract 17.32 from the control to get the true mean of the intervention:

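# install.packages("pwr")
library(pwr)
# effect size d = (control mean - intervention mean) / SD = 17.32 / 131.4656
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.5,type="paired",alternative="greater")

     Paired t test power calculation

              n = 157.237

# we're getting 360 days or 180 pairs; let's ask for more than 50-50 power;
# what does n = 180 buy us? Not much!
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.55,type="paired",alternative="greater")

     Paired t test power calculation

              n = 181.9631

# how many pairs *do* we need for good results?
# (this call is truncated in the source; sig.level=0.01 is inferred from the printed n)
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.75,sig.level=0.01,type="paired",alternative="greater")

     Paired t test power calculation

              n = 521.5252

# (likewise truncated in the source; completed on the same inferred sig.level=0.01)
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.56,sig.level=0.01,type="paired",alternative="greater")

     Paired t test power calculation

              n = 356.2923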

This is dis­cour­ag­ing. With 180 pairs, we only have a 55% chance of see­ing any­thing at p = 0.05? That’s awful! But there’s no point in look­ing fur­ther into this power cal­cu­la­tion: I’m not going to be doing a paired t-test, after all, but some sort of ANOVA, and I’m not sure how much power the inter­fer­ing exper­i­ments cost me. The first cal­cu­la­tion is the most impor­tant: to sat­isfy some­what rea­son­able cri­te­ria, I need less than half the data I will get, which ought to be an ade­quate mar­gin of safe­ty.
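As a cross-check that needs only base R, the question can be asked the other way around: given the 180 pairs actually available, what power results? This sketch uses `power.t.test` from the built-in stats package rather than `pwr`, and reuses the effect (17.32 minutes) and standard deviation (131.4656) from above. Treating that standard deviation as the SD of the paired differences is the same simplification the calls above make.

```r
# Power of a one-sided paired t-test with 180 pairs, base R only.
effect_minutes <- 17.32      # assumed sleep reduction from 20 minutes of meditation
sd_minutes     <- 131.4656   # standard deviation borrowed from the Redshift data

res <- power.t.test(n = 180, delta = effect_minutes, sd = sd_minutes,
                    sig.level = 0.05, type = "paired",
                    alternative = "one.sided")
round(res$power, 2)          # a little better than a coin flip
```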


For back­ground on “value of infor­ma­tion” cal­cu­la­tions, see the first cal­cu­la­tion.

I find meditation useful when I am screwing around and can’t focus on anything, but I don’t meditate as much as I might because I lose half an hour. Hence, I am interested in the suggestion that meditation may not be as expensive as it seems because it reduces sleep need to some degree. If for every two minutes I meditate I need one less minute of sleep, that halves the time cost: I spend 30 minutes meditating and gain back 15 minutes from sleep, for a net time loss of 15 minutes. So if I meditate regularly but there is no substitution, I lose out on 15 minutes a day. Figuring that I meditate 2 days out of 3, that’s a total lost time of (15 × 2/3 × 365.25) / 60 ≈ 61 hours a year, or $427 at minimum wage. I find the theory somewhat plausible (60%), and my year-long experiment has roughly a 55% chance of detecting the effect size (estimated based on the sleep reduction in an Indian sample of meditators). So the value of information is roughly (427 / ln 1.05) × 0.60 × 0.55 ≈ $2,888. The experiment itself is unusually time-intensive, since it involves ~180 sessions of meditation, which if I am “overpaying” translates to 45 hours (180 × 15 / 60) of wasted time or $315. But even including the design and analysis, that’s less than the calculated value of information.
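The arithmetic in that paragraph can be laid out explicitly. This is a sketch under stated assumptions rather than the original calculation: the $7/hour wage is inferred from “45 hours … or $315”, and the 5% discount rate with the perpetuity shortcut (annual value divided by ln 1.05) is taken from the footnote on net present value.

```r
# Value-of-information arithmetic for the meditation experiment.
# Assumptions: wage of $7/hour (inferred from 45 hours = $315) and a 5%
# discount rate with the shortcut NPV = annual value / log(1.05).
wage        <- 7
annual_loss <- 427           # dollars per year if meditation buys back no sleep
prior       <- 0.60          # belief that the substitution theory is true
power       <- 0.55          # chance the experiment detects the effect
npv         <- annual_loss / log(1.05)
voi         <- npv * prior * power

sessions       <- 180
wasted_minutes <- 15         # extra minutes lost per session if "overpaying"
cost           <- sessions * wasted_minutes / 60 * wage

c(voi = round(voi), cost = cost)
```

On these assumptions the value of information exceeds the time cost of the experiment by a wide margin, which is the comparison the paragraph relies on.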

This exam­ple demon­strates that drugs aren’t the only expen­sive things for which you should do exten­sive test­ing.


Orgasm has been linked occasionally with changes in sleep latency, although one 1985 experimental study found no changes. Some inconclusive followup studies cover related matters like whether arousal or brief viewing of porn interferes with sleep (no).

Ran­dom­ized on a daily basis before going to bed; no place­bo, but absti­nence. Non-blind­ed. Since the the­ory has always been about a very short­-term effect, there’s no need to worry about day­time activ­i­ties. (This would only mat­ter if I were test­ing some­thing like the folk wis­dom that mas­tur­ba­tion reduces testos­terone lev­els, where the tim­ing is not as impor­tant as the quan­ti­ty.)

Treadmill / walking desk

In June 2012, I acquired a free treadmill. I became interested in using it as a treadmill desk, reasoning that it was an easy way to get more exercise. My initial days of use led me to suspect that the treadmill desk’s exercise might come at the expense of some concentration or productivity. While I was able to quickly rule out any noticeable negative correlation of treadmill use with typing speed/accuracy, that still leaves other possible negative effects.


Starting it part way, I lose potential power: there are only ~330 days left. The effect of most interest is productivity, where I expect a negative effect, but we also need a more stringent p-value since we’re looking at so many variables; so 330 samples (165 pairs) gives a floor on detectable effect size of:


     Paired t test power calculation

              n = 165
              d = -0.2355713

Not that great. We may wind up being able to con­clude lit­tle about the effect on pro­duc­tiv­i­ty; sim­i­larly for sleep­—the effect would have to be com­pa­ra­ble to vit­a­min D or mela­tonin to be detectable.
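The call that produced the output above is not shown. One base-R sketch that reproduces its magnitude uses `power.t.test` and solves for the effect size with 165 pairs. The `power = 0.75` and `sig.level = 0.01` settings are assumptions, chosen only because they recover the printed d = -0.2355713.

```r
# Smallest standardized effect detectable with 165 pairs.
# Assumptions: power = 0.75 and sig.level = 0.01 are guesses that happen to
# reproduce the printed d; the original call is not available.
res <- power.t.test(n = 165, sd = 1, power = 0.75, sig.level = 0.01,
                    type = "paired", alternative = "one.sided")
round(res$delta, 3)   # magnitude only; the expected direction is negative
```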


The VoI calculation for this investigation is very difficult: the treadmill may improve sleep, and it may improve or worsen productivity, but regardless it provides very valuable exercise; scrapping the practice has immediate cash value; and none of this is certain, with few guides from experimental studies.

If it turns out the treadmill is not helpful, I can probably sell it for ~$100 based on prices listed on Craigslist. (I wound up selling it for $70.) If it’s helpful, I gain considerable exercise (1 MPH implies an 8-hour day could be 8 miles of exercise a day!) with the related benefits. I strongly suspect that this much exercise would influence my sleep for the better, but I’m not sure the treadmill desk really does allow for productivity like regular sitting does. If it does reduce productivity somewhat but I otherwise can adapt, it’s probably still a net gain because of the extra exercise. However, a small-to-medium decrease (let’s say an effect size of d ≤ -0.4) would be enough to cause me to scrap the treadmill. This is highly unlikely. The large sample gives a very good shot at detecting it. Running the experiment is relatively easy since the treadmill desk can be set up and put away in ~5 minutes. Without running numbers on this one, my best guess is that the VoI is negative; so this is another experiment I am doing because it is interesting and other people may find it interesting, rather than because running the experiment makes economic sense.

Morning caffeine pills

One trick to com­bat morn­ing slug­gish­ness is to get caffeine extra-early by using caffeine pills shortly before or upon try­ing to get up. From 2013-2014 I ran a blinded & place­bo-con­trolled ran­dom­ized exper­i­ment mea­sur­ing the effect of caffeine pills in the morn­ing upon awak­en­ing time and daily pro­duc­tiv­i­ty. The esti­mated effect is small and the pos­te­rior prob­a­bil­ity rel­a­tively low, but a deci­sion analy­sis sug­gests that since caffeine pills are so cheap, it would be worth­while to con­duct another exper­i­ment.

Main arti­cle: .

CO2/Bedroom ventilation experiment

Some psy­chol­ogy stud­ies find that CO2 impairs cog­ni­tion, and some sleep stud­ies find that bet­ter ven­ti­la­tion may improve sleep qual­i­ty. Use of a Netatmo air qual­ity sen­sor reveals that clos­ing my bed­room tightly to reduce morn­ing light also causes CO2 lev­els to spike overnight to 7x day­time lev­els. To inves­ti­gate the pos­si­ble harm­ful effects, in 2016 I run a non-blind self­-ex­per­i­ment ran­dom­iz­ing an open bed­room door and a bed­room box fan (2x2) and ana­lyze the data using a struc­tural equa­tion model of air qual­ity effects on a latent sleep fac­tor with mea­sure­ment error.

Main arti­cle: .


Inverse correlation of sleep quality with productivity?

Curiously, when I played around with the full potassium data after the 2013 morning experiment, poor sleep quality seemed to correlate with higher mood/productivity ratings:

cor.test(pot$Disturbance, pot$MP)
#     Pearson's product-moment correlation
# data:  pot$Disturbance and pot$MP
# t = 1.224, df = 49, p-value = 0.2269
# alternative hypothesis: true correlation is not equal to 0
# 95% confidence interval:
#  -0.1085  0.4275
# sample estimates:
#    cor
# 0.1722
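The printed statistics follow from the correlation and the sample size alone, which makes it easy to see why an r this small is not statistically-significant with 51 observations. A sketch in base R; the confidence interval uses Fisher’s z-transformation, which is what `cor.test` documents for Pearson correlations, and the variable names are illustrative:

```r
# Rebuild the cor.test() summary from r and the degrees of freedom alone.
r  <- 0.1722
df <- 49
n  <- df + 2

t_stat  <- r * sqrt(df) / sqrt(1 - r^2)                 # test statistic
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)  # two-sided p-value

# 95% confidence interval via Fisher's z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 3)
```

The interval comfortably spans zero, so the sign of the relationship is not pinned down by this sample.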


While not statistically-significant, this inverse correlation came as a surprise and seemed worth thinking about more. I have a couple of theories on what could be going on:

  1. it could be an arti­fact and actu­ally bet­ter sleep means bet­ter per­for­mance: I’ve always been con­cerned about the pos­si­bil­ity of off-by-one errors in my data or analy­ses. If bet­ter sleep meant bet­ter per­for­mance (as one would naively sus­pec­t), and either sleep data or per­for­mance data was ‘shifted’ by one day, then you would observe the exact oppo­site.

    One would have to carefully check the data and make sure every field is referring to the time it should. If an entry records 10hrs sleep for 2012-02-03, does that refer to sleep that morning (necessary because you were awake during 2012-02-02), or does it refer to the sleep you engage in that evening (you go to bed at 11pm 2012-02-03 and that is the sleep data being used)?

    This seems unlike­ly, since such an error should screw up all sorts of other analy­ses (for exam­ple such a flip ought to have claimed that potas­sium would help sleep, if days were being reversed).

  2. it could be that on pro­duc­tive days, you leap out of bed; but if you are depressed, unmo­ti­vat­ed, apa­thet­ic, you might hang around in bed for a while after the alarm rings. Depressed peo­ple some­times sleep more than reg­u­lar peo­ple; for pretty much this rea­son, I’d guess.

    This could be checked by look­ing at sleep qual­ity indi­ca­tors in the begin­ning or mid­dle of the night. For exam­ple time to fall asleep (higher on more pro­duc­tive days in this sam­ple), or per­cent­age in deep sleep (mostly done towards the begin­ning and mid­dle of a sleep; seemed to be lower for pro­duc­tive days). One could try to test the slug­gard hypoth­e­sis: how much past an alarm one snoozed.

  3. it’s a tem­po­rary cor­re­la­tion of this time peri­od, per­haps related to the potas­si­um, per­haps not.

    This is testable: with more data, does the cor­re­la­tion shrink or go away?

  4. I have sometimes wondered if I am depressed. One of the curious facts about depression is that sleep deprivation can relieve the symptoms of depression in people who prefer evenings (owls), and I am indeed an owl. What does this imply?

    We can do some back­-of-the-en­ve­lope esti­mates. reports a very high depres­sion inci­dence; we’ll call it a 25% life­time risk. But pre­sum­ably the treat­ment only works if one is actu­ally in a depres­sive episode, and while it’s unclear what the dis­tri­b­u­tion or length of depres­sion period (as opposed to indi­vid­ual episodes) might be, it seems to be closer to years than months or decades, so we’ll put it at ~3 years out of an adult lifes­pan of ~60 years or a per-year risk of . On closer exam­i­na­tion of Selvi et al 2006, the morning/evening split only appears with the total sleep depri­va­tion pro­ce­dure (morn­ing types see their mood wors­en, evening sees it improve) while with par­tial sleep depri­va­tion both groups seem to see an improve­ment in their mood; since I rarely skip sleep entirely and such nights are dropped from the Zeo data, the total sleep depri­va­tion results are irrel­e­vant, but then my chrono­type being evening does­n’t mat­ter. Final­ly, the sleep depri­va­tion papers esti­mate <60% effec­tive­ness in the depressed, so that knocks the pos­si­bil­ity that both I am depressed and par­tial sleep depri­va­tion helps me to <0.025. 2.5% is not a large pos­si­bil­i­ty; and my vague spec­u­la­tion and a small inverse cor­re­la­tion do not seem like they would increase that pos­si­bil­ity a lot.

(If it’s not the­se, I don’t have any sug­ges­tion on why it might be. Why would poor sleep either cause pro­duc­tiv­ity or be caused by some­thing that later also causes pro­duc­tiv­i­ty?)
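The off-by-one worry in the first theory can be checked mechanically by correlating the two series at several lags. The sketch below uses simulated data only (no real Zeo or mood data), and `lag_cor` is a hypothetical helper. In this toy example the nights are independent, so a one-day shift hides the relationship rather than reversing it; what happens to the sign in real data would depend on its autocorrelation.

```r
# Simulated illustration only: no real Zeo or mood data is used here.
set.seed(1)
n     <- 400
sleep <- rnorm(n)                          # hypothetical nightly sleep quality
mood  <- 0.8 * sleep + rnorm(n, sd = 0.5)  # mood truly depends on same-night sleep

# Introduce an off-by-one error: each mood rating is filed under the next day.
mood_recorded <- c(NA, mood[-n])

# Correlation between sleep on day t and recorded mood on day t + lag.
lag_cor <- function(x, y, lag) {
  idx <- seq_len(length(x) - lag)
  cor(x[idx], y[idx + lag], use = "complete.obs")
}

same_day <- lag_cor(sleep, mood_recorded, 0)  # relationship looks absent
next_day <- lag_cor(sleep, mood_recorded, 1)  # relationship reappears at lag 1
round(c(same_day = same_day, next_day = next_day), 2)
```

A lag-1 correlation much stronger than the same-day one would be a hint that the date fields are misaligned.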


But before rashly assum­ing I am depres­sive or engag­ing in per­son­ally costly self­-ex­per­i­ments like sleep depri­va­tion, I decided on 2013-04-26 to check the cor­re­la­tion on a larger dataset.

Typing up my full self-rating dataset of 416 days and cleaning up all the data[16], I rechecked the correlation: r = 0.066.[17] This is noticeably smaller (hence, less practically relevant) than the previous correlation, is also not statistically-significant, and shrinking is what one would expect from a spurious relationship.

To be more sure, I reused some of the techniques from one of my earlier analyses (specifically, logistic regression) and looked for a relationship; the result was similar, an odds which was inverse but close to no effect (1.057[18]). More importantly, when all the other variables are taken into account in the logistic regression, things change[19]: with other data to condition on, the inverse relationship of sleep quality with mood/productivity reverses and becomes the expected relationship (an increase in sleep disturbances predicts lower mood/productivity); many of the other variables turn out to be far stronger predictors (bigger odds); and some of the signs look odd (how can total sleep time predict increased mood/productivity, yet increasing all forms of sleep, REM/light/deep, predicts decreased mood/productivity‽). I attempted to construct a simpler model, which wound up ignoring any metric of sleep disturbance and ignoring all but 3 variables, and concluding that “Morning Feel” was the most important predictor[20], which makes a lot of sense to me and confirms my previous experiments’ focus on the “Morning Feel” variable.

Given this weak­en­ing and in the absence of any cor­rob­o­rat­ing infor­ma­tion, I con­sider it highly unlikely that the orig­i­nal cor­re­la­tion is reflect­ing an anti-de­pres­sant effect due to sleep depri­va­tion. A fol­lowup in a few years may be war­ranted to see if a larger still dataset will shrink the cor­re­la­tion closer to zero.

Phases of the moon

I attempt to repli­cate, using pub­lic Zeo-recorded sleep datasets, a find­ing of a monthly cir­ca­dian rhythm affect­ing sleep in a small sleep lab. I find only small non-s­ta­tis­ti­cal­ly-sig­nifi­cant cor­re­la­tions, despite being well-pow­ered.

Main arti­cle: .

SDr lucid dreaming: exploratory data analysis

In October 2012, an acquaintance offered me an extract from his free-form data on lucid dreaming, which he had been compiling since 2004, to see what insights I could extract. In May 2013, I augmented it with another 60 entries.

Data cleaning

The orig­i­nal text was a seri­ous mess, and I put sev­eral hours into clean­ing it up and orga­niz­ing it into some­thing more sen­si­ble. This was­n’t enough, so I wrote an ugly Haskell pro­gram to parse it into a quasi-CSV file:

import Data.List (isInfixOf, isPrefixOf, intercalate)
import Data.List.Split (splitOn) -- http://hackage.haskell.org/package/split

main :: IO ()
main = do txt <- readFile "2012-sdr-dream.txt"
          let txt' = filter (not . isPrefixOf "#") $ lines txt
          let header = drop 2 $ head $ filter (isPrefixOf "# Sleep Date,") $ lines txt
          let fields = map (splitOn ",") txt'
          let csvs = map convert fields
          putStrLn $ unlines (header : map show csvs)

data CSVEntry = CSVEntry { sleepDate :: String, totalZ :: Int,
                           wakeTime :: String, intensity :: String, recall :: String,
                           emotion :: String, interrupted :: Bool, melatonin :: Bool, lucid :: String }
instance Show CSVEntry where
 show a = intercalate "," [sleepDate a, if totalZ a == 0 then "" else show (totalZ a),
                           wakeTime a, intensity a, recall a, emotion a,
                           if interrupted a then "1" else "0", if melatonin a then "1" else "0", lucid a]

convert :: [String] -> CSVEntry
convert xs = CSVEntry { sleepDate = safeHead $ filter (\x -> isInfixOf "." x || isInfixOf "20" x) xs,
                        totalZ = timeToMinutes $ drop 12 $ safeHead $ filter (isInfixOf "dreamtime: ") xs,
                        wakeTime = drop 7 $ safeHead $ filter (isInfixOf "wake: ") xs,
                        intensity = drop 6 $ safeHead $ filter (isInfixOf "int: ") xs,
                        recall = drop 9 $ safeHead $ filter (isInfixOf "recall: ") xs,
                        emotion = drop 6 $ safeHead $ filter (isInfixOf "emo: ") xs,
                        lucid =  drop 8 $ safeHead $ filter (isInfixOf "lucid: ") xs,
                        interrupted = any (isInfixOf "interrupted") xs,
                        melatonin = any (isInfixOf "melatonin") xs }
  where
    -- first matching field, or "" if nothing matched
    safeHead :: [String] -> String
    safeHead ys = if null ys then "" else head ys

    -- clock hour:minute to total minutes: timeToMinutes "4:30" → 270
    timeToMinutes :: String -> Int
    timeToMinutes a = if null a then 0 else let (x,y) = break (==':') a
                                            in read x * 60 + read (tail y)


This was usable. My next ques­tion was: since none of his rou­tines were ran­dom­ized and cor­re­la­tions were all that one could extract, what cor­re­la­tions were in his data?

table <- read.csv("https://www.gwern.net/docs/zeo/2013-sdr-dream.csv")
summary(table)
      Sleep.Date     Total.Z        Wake.Time     Intensity        Recall         Emotion
 2011.10.02:  2   Min.   : 120           :217   Min.   :0.10   Min.   :0.000   Min.   :-0.50
 2011.11.26:  2   1st Qu.: 480   16:00   :  3   1st Qu.:0.30   1st Qu.:0.200   1st Qu.: 0.00
 2012.02.28:  2   Median : 600   11:00   :  2   Median :0.40   Median :0.300   Median : 0.20
 2012.04.15:  2   Mean   : 613   13:23:00:  2   Mean   :0.44   Mean   :0.367   Mean   : 0.18
 2012.06.21:  2   3rd Qu.: 720   19:17:00:  2   3rd Qu.:0.50   3rd Qu.:0.500   3rd Qu.: 0.40
 2013.01.23:  2   Max.   :1320   4:55:00 :  2   Max.   :7.00   Max.   :1.000   Max.   : 0.70
 (Other)   :316   NA's   :8      (Other) :100   NA's   :94     NA's   :26      NA's   :296
  Interrupted     Melatonin          Lucid      Day.quality
 Min.   :0.00   Min.   :0.0000   Min.   :0.0   Min.   :0.10
 1st Qu.:0.00   1st Qu.:0.0000   1st Qu.:0.1   1st Qu.:0.30
 Median :0.00   Median :0.0000   Median :0.2   Median :0.40
 Mean   :0.07   Mean   :0.0762   Mean   :0.2   Mean   :0.42
 3rd Qu.:0.00   3rd Qu.:0.0000   3rd Qu.:0.2   3rd Qu.:0.52
 Max.   :1.00   Max.   :1.0000   Max.   :0.6   Max.   :0.70
 NA's   :76                      NA's   :319   NA's   :312

# These 2 date fields haven't been turned into anything useful, so we'll just delete them
# (rm() only removes whole objects, so drop the columns by assigning NULL):
table$Wake.Time <- NULL
table$Sleep.Date <- NULL

# Warning: 'Lucid' has just 9 datapoints, and 'Melatonin' just 6!
# Table cleaned up heavily by hand from default R output:
# deleted duplicates, censored any correlation -0.1<x<0.1 etc.
             Recall  Emotion Interrupted Melatonin  Lucid  Day.quality
Total.Z                                    -0.12    -0.43  0.56
Intensity    0.35     0.37                           0.79
Recall                0.16      -0.16       0.14    -0.15
Emotion                          0.28      -0.14
Interrupted                                          0.91
Melatonin                                                  0.25
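A table like this is easier to read alongside the number of observations behind each cell. The following is a sketch of one way to get both, not the call used for the table above (which is not shown): `pairwise_cor` and the toy data frame are hypothetical, and the pairwise-complete handling of missing values is an assumption.

```r
# Pairwise correlations plus the number of rows behind each one.
# `pairwise_cor` and the toy data frame are hypothetical, for illustration only.
pairwise_cor <- function(df, min_abs = 0.1) {
  r <- cor(df, use = "pairwise.complete.obs")
  present <- (!is.na(as.matrix(df))) * 1
  n <- t(present) %*% present            # rows where both columns are observed
  r[which(abs(r) < min_abs)] <- NA       # censor negligible correlations
  list(r = round(r, 2), n = n)
}

toy <- data.frame(a = 1:20,
                  b = c(2 * (1:9), rep(NA, 11)),   # only 9 datapoints, like 'Lucid'
                  c = rep(c(1, -1), 10))
res <- pairwise_cor(toy)
res$r
res$n
```

A large correlation resting on a single-digit count, such as the `a`/`b` cell here, deserves far less weight than its size suggests.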

Much of the data is too impov­er­ished to draw any sug­ges­tions from. The remain­ing cor­re­la­tions are:

  • ‘Inten­sity’/‘Recall’: r = 0.35

    The causal­ity is likely ‘Inten­sity’->‘Recall’; either one is prob­a­bly impos­si­ble to exper­i­men­tally manip­u­late.

  • ‘Inten­sity’/‘Emo­tion’: r = 0.37

    Causal­ity could go either way or to a third fac­tor; ‘Emo­tion’ might be manip­u­la­ble by intend­ing to dream of dis­turb­ing top­ics, but might not.

  • ‘Inter­rupted’/‘Recall’: r=-0.16

  • ‘Inter­rupted’/‘Emo­tion’: r = 0.28

    ‘Interruption’ is experimentally manipulable by eg. an alarm clock or roommate. ‘Recall’ might be improved by some change in journaling, for example doing it at your bedside instead of waiting until you’re on your computer. The positive correlation with ‘Emotion’ suggests that, per the WILD methodology of lucid dreaming (see LaBerge & Rheingold, Exploring the World of Lucid Dreaming), a temporary awakening does increase the chance of a lucid dream (laden with emotion).

  • ‘Melatonin’ correlates with both day quality and reduced sleep; this is interesting because increasing Total.Z also went with increased Day.quality, so it’s not clear how melatonin could do both at the same time if more sleep is otherwise better. The correlations may be statistically-significant but the data is too wretched and the melatonin/day-quality datapoints too few to say anything further.

(One obser­va­tion that came to mind work­ing on clean­ing the data was that col­lec­tion was very sparse, spo­radic, and acci­den­tal-look­ing.)

So these gen­eral points sug­gest 3 future over­lap­ping approach­es:

  1. delib­er­ate use of inter­rup­tions (maybe ran­dom­ized), to inves­ti­gate effect on lucid dream­ing
  2. more sys­tem­atic usage (per­haps ran­dom­ized or blind­ed) of mela­ton­in, to allow cor­re­la­tions or causal infer­ences to other vari­ables
  3. attack­ing the unsys­tem­atic data col­lec­tion (per­haps it’s too much trou­ble to do all those vari­ables each day?) by get­ting a Zeo to han­dle part of the data col­lec­tion for you.

  1. Rel­e­vant papers:

    Also rel­e­vant: “Com­par­ing 10 Sleep Track­ers (2017): How well do they track your sleep? A 9-day min­ute-by-minute com­par­i­son”.↩︎

  2. The cheaper alternative to the Zeo would be the Fitbit, the most popular of the many accelerometers on the market. There aren’t many comparisons; Diana Sherman compared one night, Joe Betts-LaCroix compared ~38 nights of data, and Christopher Winter compared one night of polysomnography, Philips Actiwatch Spectrum (actigraphy), Basis Chrome (movement, heart-rate, others), and the Jawbone Up & FitBit Flex & iPhone+“24/7” (actigraphy). In the previous cases, the Fitbit seemed to be pretty similar to the Zeo at estimating total sleep time (the only thing it can measure). Betts-LaCroix explicitly recommends the Zeo, but I’m not clear on whether that is due to the better data quality or because Fitbit made it hard or impossible for him to extract the detailed Fitbit data while Zeo offers easy exporting. Similarly, in her 2013 Amsterdam talk, Christel De Maeyer presents her sleep data summaries (means) from two disjoint time periods using the Zeo and the accelerometer band which were comparable for total sleep estimates. In any case, I already have the Zeo and I’ve come to like the detailed information.↩︎

  3. I had previously tried huperzine-A and subjectively noticed no effect from it, but I had no way of really noticing any effect on sleep, and Tim Ferriss in his The Four-hour Body claims:

    Tak­ing 200 mil­ligrams of huperzine-A 30 min­utes before bed can increase total REM by 20-30%. Huperzine-A, an extract of Huperzia ser­rata, slows the break­down of the neu­ro­trans­mit­ter acetyl­choline. It is a pop­u­lar nootropic (smart drug), and I have used it in the past to accel­er­ate learn­ing and increase the inci­dence of lucid dream­ing. I now only use huperzine-A for the first few weeks of lan­guage acqui­si­tion, and no more than three days per week to avoid side effects. Iron­i­cal­ly, one doc­u­mented side effect of overuse is insom­nia. The brain is a sen­si­tive instru­ment, and while gen­er­ally well tol­er­at­ed, this drug is con­traindi­cated with some classes of med­ica­tions. Speak with your doc­tor before using.

  4. My own suspicion, given the existence of neuron-level sleep in mice, poor self-monitoring in humans, and anecdotal reports about polyphasic sleep, is that polyphasic sleep is a real & workable phenomenon but that it comes at the price of a large chunk of mental performance.↩︎

  5. Kruschke (“Bayesian estimation supersedes the t test”) argues that there is no need for people to use the old framework of p-values and null hypotheses etc, with their many well-known philosophical difficulties and misleading interpretations (interpretations I, alas, perpetuate in my analyses with my use of statistical significance):

    Nev­er­the­less, some peo­ple have the impres­sion that con­clu­sions from NHST and Bayesian meth­ods tend to agree in sim­ple sit­u­a­tions such as com­par­i­son of two groups: “Thus, if your pri­mary ques­tion of inter­est can be sim­ply expressed in a form amenable to a t-test, say, there really is no need to try and apply the full Bayesian machin­ery to so sim­ple a prob­lem.” (Brooks, 2003, p. 2694) This arti­cle shows, to the con­trary, that Bayesian para­me­ter esti­ma­tion pro­vides much richer infor­ma­tion than the NHST t-test, and that its con­clu­sions can differ from those of the NHST t-test. Deci­sions based on Bayesian para­me­ter esti­ma­tion are bet­ter founded than NHST, whether the deci­sions of the two meth­ods agree or not. The con­clu­sion is bold but sim­ple: Bayesian para­me­ter esti­ma­tion super­sedes the NHST t-test.

    Unfor­tu­nate­ly, while I have no love for NHST, I did find it much eas­ier to use the NHST con­cepts & code when learn­ing how to do these analy­ses. In the future, hope­fully I can switch to Bayesian tech­niques.↩︎

  6. The usual way to correct for the issue of multiple comparisons inflating results (a big problem in epidemiology and why their results are so often false) is to use a Bonferroni correction: if I look at the p-values for 7 Zeo metrics, I wouldn’t consider any to be statistically-significant at ‘p = 0.05’ unless they were actually statistically-significant at 0.05 / 7 ≈ 0.007, which is even more stringent than the rarer ‘p = 0.01’ criterion. With the even stronger criterion ‘p = 0.007’, it’s a safe bet that none of my tests give statistically-significant results. Which may be the right thing to conclude, since all my data is just n = 1 and unreliable in many ways, but still, the Bonferroni correction is not being very helpful here.

    The caveat is that the Bon­fer­roni cor­rec­tion is intended for use on ‘inde­pen­dent’ data, while the Zeo met­rics are all very depen­dent, some by defi­n­i­tion (eg. ZQ is defined partly as what the REM sleep length was, AFAIK). So while the Bon­fer­roni cor­rec­tion will still do the job of only let­ting through really sta­tis­ti­cal­ly-sig­nifi­cant data, it’ll do so by throw­ing out way more poten­tially good results than one has to. (It’ll avoid some false pos­i­tives by mak­ing many false neg­a­tives.) So what should we do?

    Andy McKenzie suggested limiting our false discovery rate by using the method of Benjamini & Hochberg 1995:

    …let’s say that you test 6 hypothe­ses, cor­re­spond­ing to differ­ent fea­tures of your Zeo data. You could use a t-test for each, as above. Then aggre­gate and sort all the p-val­ues in ascend­ing order. Let’s say that they are 0.001, 0.013, 0.021, 0.030, 0.067, and 0.134.

    Assume, arbitrarily, that you want the overall false discovery rate to be 0.05, which is in this context called the q-value. You would then sequentially test, from the last value to the first, whether the current p-value is less than [(rank / number of tests) × q]. You stop when you get to the first true inequality and call the p-values of the rest of the hypotheses [statistically-]significant.

    So in this example, you would stop when you correctly call [0.030 < (4 / 6) × 0.05 ≈ 0.033], and only the hypotheses corresponding to the first four [smallest] p-values would be called [statistically-]significant.

  7. If we cor­rect for mul­ti­ple com­par­isons (see pre­vi­ous foot­note) at q-val­ue=0.05, none of them sur­vive:

    p.adjust(c(0.11,0.77,0.89,0.16,0.63,0.74,0.73,0.63,0.20), method="BH") < 0.05

    Oh well.↩︎

  8. “Block­ing” is a style of vari­a­tion on a sim­ple ran­dom­ized design where instead of con­sid­er­ing each day sep­a­rate and ran­dom­iz­ing a sin­gle day, we instead ran­dom­ize pairs of days, or more; so instead of flip­ping our coin to decide whether ‘this week’ is place­bo, we flip our coin to decide whether ‘this week will be placebo & next active’ or ‘this week active & next placebo’. This has 2 big advan­tages which jus­tify the com­plex­i­ty:

    1. Often, I’m worried about simple randomization leading to an imbalance in control vs experimental; if I’m only getting 20 total datapoints on something, then randomization could easily lead to something like 14 control and 6 experimental datapoints, throwing out a lot of statistical power compared to 10 control and 10 experimental! Why am I losing power? Because data is subject to diminishing returns: each new point reduces the standard error of your estimates less than the previous one did (since the total error shrinks as, roughly, the inverse of the square root of the total sample size; the difference between √1 and √2 is bigger and shrinks error more than √2 vs √3, etc). So the extra 4 control datapoints reduce the error less than the lost 4 experimental datapoints would have, and this leaves me with a final answer less precise than if it had been exactly 10:10. (If diminishing returns isn’t intuitive, imagine taking it to an extreme: is 10:10 just as good as 5:15? As good as 2:18? How about 0:20?) But if I pair days like this, then I know I will get exactly 10:10.
    2. Block­ing is the nat­ural way to han­dle mul­ti­ple-day effects or trends: if I think lithium oper­ates slow­ly, I will pair entire weeks or months, rather than days and hop­ing enough exper­i­men­tal and con­trol days form runs which will reveal any trend rather than wash it out in aver­ag­ing.
  9. The net present value formula is the annual savings divided by the natural log of 1 plus the discount rate, out to eternity. Exponential discounting means that a bond that expires in 50 years is worth a surprisingly similar amount to one that continues paying out forever. For example, a 50 year bond paying $10 a year at a discount rate of 5% is worth sum (map (\t -> 10 / (1 + 0.05)^t) [1..50]) → 182.5 but if that same bond never expires, it’s worth 10 / log 1.05 = 204.9 or just $22.4 more! My own expected longevity is ~50 more years, but I prefer to use the simple natural log formula rather than the more accurate summation. Either way is interesting; Vaniver:

    …pos­si­bly a way to drive it home is to talk about divid­ing by log 1.05, which is essen­tially mul­ti­ply­ing by 20.5. If you can make a one-time invest­ment that pays off annu­ally until you die, that’s worth 20.5 times the annual return, and mul­ti­ply­ing the value of some­thing by 20 can often move it from not worth think­ing about to worth think­ing about.

  10. Vaniver notes that one rea­son I might be less con­fi­dent than you would expect is that many sub­stances or sup­ple­ments lose effect over time as one’s body regains home­osta­sis and com­pen­sates for the sub­stance, build­ing tol­er­ance. Which is quite true, and a major rea­son I tested mela­ton­in—I was sure it worked for me in the past, but did it still work?↩︎

  11. For sim­plic­i­ty, in all my VoI cal­cu­la­tions I assume that I’ll stop buy­ing the sup­ple­ment (or doing the activ­i­ty) if I hit a neg­a­tive result. The proper way a real ana­lyst would do this value of infor­ma­tion ques­tion would be to say that the neg­a­tive result gives us addi­tional infor­ma­tion which changes the expect­ed-value of mela­tonin use.

    In my melatonin article, I calculated that since melatonin saved me close to an hour while each dose cost literally a penny or two, the value was astronomical: $2350.60 a year! By Bayes’ formula, if I started with 80% confidence and had a 95% accurate test, a negative result drops my 80% all the way down to 17%. We get this by using a derivation of Bayes’s theorem: (0.05 × 0.80) / ((0.05 × 0.80) + (0.95 × 0.20)) ≈ 0.17.

    But ironically if I now believed that melatonin only had a 17% chance of doing something helpful rather than nothing at all (as compared to my original 80% belief), well, 17% of $2350 (~$400) is still way more money than the melatonin cost ($10), so I’d use it anyway!

    Would it make sense to iterate again and test melatonin a second time? Well, what does the calculation say? We have a new prior of 17%; what happens if we get a negative result again? The same formula gives \( \frac{0.05 \times 0.17}{(0.05 \times 0.17) + (0.95 \times 0.83)} \approx 0.0107 \), and then the expected value is \( 0.0107 \times 2350 \approx 25 \), which is not much more than the cost of $10, and given the difficult-to-quantify possibility of negative long-term health effects, is not enough of a profit to really entice me.↩︎
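
    The two updates can be sketched in R. This assumes that "95% accurate" means the test gives the wrong answer 5% of the time whether or not melatonin works; the function and variable names are hypothetical, chosen only for illustration:

    ```r
    # Posterior probability that the supplement works after a NEGATIVE result,
    # assuming the same error rate in both directions (an assumption).
    posterior_after_negative <- function(prior, accuracy) {
      miss <- 1 - accuracy                  # P(negative | works)
      (miss * prior) / (miss * prior + accuracy * (1 - prior))
    }

    annual_value <- 2350                    # yearly value of melatonin, from the text

    p1 <- posterior_after_negative(0.80, 0.95)   # after the first negative result
    p2 <- posterior_after_negative(0.17, 0.95)   # after a second, from the rounded 17%

    round(c(p1 = p1, p2 = p2), 4)           # 0.1739 0.0107

    # Expected yearly value remaining after each negative result
    round(c(after_first = p1, after_second = p2) * annual_value)
    ```

    Each further negative result shrinks the posterior by roughly another factor of 19 (0.95 / 0.05) in odds, which is why a second test moves the expected value so close to the cost.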

  12. Technology Review editor Emily Singer noticed the same problem when using her Zeo.↩︎

  13. The R interpreter session, loading a CSV as before:

    R> zeo <- read.csv("https://www.gwern.net/docs/zeo/2011-zeo-oneleg.csv")
    R> colnames(zeo)[24] <- "OneLeg"
    R> l <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM,
                     Time.in.Light, Time.in.Deep, Awakenings, Morning.Feel)
                ~ OneLeg, data=zeo)
    R> summary(manova(l))
    #           Df Pillai approx F num Df den Df Pr(>F)
    # OneLeg     1  0.177     1.37      9     57   0.23
    # Residuals 65
    R> summary(l)
    # Response ZQ :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   96.231      1.712   56.22   <2e-16
    # OneLeg        -1.244      0.883   -1.41     0.16
    # Response Total.Z :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   514.67       8.84    58.2   <2e-16
    # OneLeg         -4.09       4.56    -0.9     0.37
    # Response Time.to.Z :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   14.949      1.373   10.89  2.7e-16
    # OneLeg         0.469      0.708    0.66     0.51
    # Response Time.in.Wake :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   12.821      2.786    4.60    2e-05
    # OneLeg        -0.369      1.436   -0.26      0.8
    # Response Time.in.REM :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   168.72       4.25   39.70   <2e-16
    # OneLeg         -5.33       2.19   -2.43    0.018
    # Response Time.in.Light :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   277.15       6.06   45.75   <2e-16
    # OneLeg          2.76       3.12    0.88     0.38
    # Response Time.in.Deep :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   69.282      1.802   38.44   <2e-16
    # OneLeg        -1.558      0.929   -1.68    0.098
    # Response Awakenings :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   4.1538     0.3690   11.26   <2e-16
    # OneLeg       -0.0513     0.1902   -0.27     0.79
    # Response Morning.Feel :
    # Coefficients:
    #             Estimate Std. Error t value Pr(>|t|)
    # (Intercept)   2.8718     0.1014    28.3   <2e-16
    # OneLeg       -0.0525     0.0523    -1.0     0.32
  14. If we correct for multiple comparisons (see previous footnote on the Bonferroni correction), here with the Benjamini-Hochberg procedure at q-value=0.05, none of them survive:

    p.adjust(c(0.16,0.37,0.51,0.80,0.02,0.38,0.10,0.79,0.32), method="BH") < 0.05

    Oh well! Statistics is a harsh mistress indeed.↩︎
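
    To see how far the p-values are from surviving, the adjusted values themselves can be printed. A sketch, assuming the nine p-values are listed in the same order as the responses in the regression output (the names on the vector are added here for readability). method="BH" is the Benjamini-Hochberg procedure, shown next to Bonferroni for comparison:

    ```r
    # p-values for the nine responses, in the order of the regression output
    p <- c(ZQ = 0.16, Total.Z = 0.37, Time.to.Z = 0.51, Time.in.Wake = 0.80,
           Time.in.REM = 0.02, Time.in.Light = 0.38, Time.in.Deep = 0.10,
           Awakenings = 0.79, Morning.Feel = 0.32)

    bh   <- p.adjust(p, method = "BH")          # controls the false discovery rate
    bonf <- p.adjust(p, method = "bonferroni")  # controls the family-wise error rate

    round(cbind(raw = p, BH = bh, Bonferroni = bonf), 2)

    any(bh < 0.05)   # FALSE: nothing survives
    min(bh)          # 0.18, the adjusted value for Time.in.REM
    ```

    Even the smallest raw p-value (Time.in.REM) is adjusted to 0.18 under either method, well above the 0.05 threshold.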

  15. I don’t use a timer, but instead count 400 full breaths. Depending on how fast and shallowly I breathe, this runs from 20-35 minutes (e.g. 2012-05-16’s meditation ran 33 minutes long). To be conservative, I will assume the meditation is only 20 minutes. In mid-October, I bought a timer which could be set to 15 minutes and began using it instead.↩︎

  16. The exact processing steps, for those curious:

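    # load the Zeo export and parse its month/day/year dates
    zeo <- read.csv("https://www.gwern.net/docs/zeo/gwern-zeodata.csv")
    zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
    # load the daily MP ratings (local file) and attach those falling on Zeo nights
    mp <- read.csv("mp.csv", colClasses=c("Date","factor"))
    zeo$MP <- ordered(mp[mp$Date %in% zeo$Sleep.Date,]$MP)
    # composite disturbance score: sum of the z-scores of three measures
    zeo$Disturbance <- scale(zeo$Time.to.Z) + scale(zeo$Awakenings) + scale(zeo$Time.in.Wake)
    # drop nights missing either the score or the morning rating
    zeo <- zeo[!is.na(zeo$Disturbance) & !is.na(zeo$Morning.Feel),]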
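
    The MP assignment above relies on the two files covering the same nights in the same order. A more defensive variant joins the tables by date instead. This is only a sketch on made-up toy data: the values below are hypothetical and not taken from the real CSVs, and the helper z is introduced here for illustration:

    ```r
    # Toy stand-ins for the two files (hypothetical values)
    zeo <- data.frame(Sleep.Date   = as.Date(c("2012-05-01", "2012-05-02", "2012-05-03")),
                      Time.to.Z    = c(10, 25, 15),
                      Awakenings   = c(2, 6, 4),
                      Time.in.Wake = c(5, 30, 10))
    mp  <- data.frame(Date = as.Date(c("2012-05-03", "2012-05-01")), MP = c(3, 2))

    # Join on the date; nights without an MP rating get NA rather than a shifted value
    joined <- merge(zeo, mp, by.x = "Sleep.Date", by.y = "Date", all.x = TRUE)
    joined$MP <- ordered(joined$MP)

    # Same composite as in the text; as.numeric() drops the matrix wrapper scale() returns
    z <- function(x) as.numeric(scale(x))
    joined$Disturbance <- z(joined$Time.to.Z) + z(joined$Awakenings) + z(joined$Time.in.Wake)
    joined
    ```

    With the join, a missing or out-of-order rating shows up as an explicit NA that the later is.na filtering can handle, instead of silently misaligning rows.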
  17. Load & correlate:

    zeo <- read.csv("https://www.gwern.net/docs/zeo/2013-gwern-sleepdisturbances-productivity.csv")
    cor.test(zeo$Disturbance, as.integer(zeo$MP))
    #     Pearson's product-moment correlation
    # data:  zeo$Disturbance and as.integer(zeo$MP)
    # t = 1.344, df = 414, p-value = 0.1798
    # alternative hypothesis: true correlation is not equal to 0
    # 95% confidence interval:
    #  -0.03045  0.16102
    # sample estimates:
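
    The size of the correlation can be recovered from the reported t statistic and degrees of freedom alone. A minimal sketch using the standard identities for a Pearson correlation; the variable names are illustrative:

    ```r
    t_stat <- 1.344   # reported t statistic
    df     <- 414     # reported degrees of freedom

    # For a Pearson correlation, t = r * sqrt(df / (1 - r^2)), so r = t / sqrt(t^2 + df)
    r <- t_stat / sqrt(t_stat^2 + df)

    # Two-sided p-value from the t distribution
    p <- 2 * pt(-abs(t_stat), df)

    # 95% interval via the Fisher z-transformation (n = df + 2 observations)
    n  <- df + 2
    ci <- tanh(atanh(r) + c(-1, 1) * qnorm(0.975) / sqrt(n - 3))

    round(c(r = r, r_squared = r^2, p = p), 4)
    round(ci, 3)
    ```

    This gives r of roughly 0.066, so the disturbance score would explain well under 1% of the variance in MP even if the correlation were real.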
  18. We regress the ordered categorical outcome on a continuous predictor (ordinal logistic regression):

    library(MASS) # provides polr
    # turn into an ordinal variable
    zeo$MP <- ordered(zeo$MP)
    lmodel <- polr(MP ~ Disturbance, data = zeo); summary(lmodel)
    #              Value Std. Error t value
    # Disturbance 0.0553     0.0429    1.29
    #     Value  Std. Error t value
    # 1|2 -4.413  0.450     -9.808
    # 2|3 -0.990  0.110     -8.965
    # 3|4  1.101  0.113      9.711
    # Residual Deviance: 915.66
    # AIC: 923.66
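
    The polr summary prints t values but no p-values or odds ratios. A rough sketch of how to read the Disturbance coefficient, working only from the reported estimate and standard error and assuming a normal approximation for the t value (the variable names are illustrative):

    ```r
    estimate <- 0.0553   # reported coefficient for Disturbance (log-odds scale)
    se       <- 0.0429   # reported standard error

    # Odds ratio for a 1-unit increase in Disturbance
    odds_ratio <- exp(estimate)

    # Approximate 95% Wald interval on the odds-ratio scale
    wald_ci <- exp(estimate + c(-1, 1) * qnorm(0.975) * se)

    # Approximate two-sided p-value, treating the t value as standard normal
    p_approx <- 2 * pnorm(-abs(estimate / se))

    round(c(odds_ratio = odds_ratio, lower = wald_ci[1],
            upper = wald_ci[2], p = p_approx), 3)
    ```

    The interval spans an odds ratio of 1, in line with the non-significant correlation in the previous footnote.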
  19. Try out more variables:

    almodel <- polr(MP ~ Disturbance + ZQ + Total.Z + Time.to.Z + Time.in.Wake + Time.in.REM +
                         Time.in.Light + Time.in.Deep + Awakenings + Morning.Feel, data = zeo); almodel
    #   Disturbance            ZQ       Total.Z     Time.to.Z  Time.in.Wake   Time.in.REM Time.in.Light
    #     -0.431623     -0.276236      0.307941      0.045819      0.003266     -0.246901     -0.272593
    #  Time.in.Deep  Morning.Feel
    #     -0.227003      0.205541
    #     1|2     2|3     3|4
    # -2.9105  0.5465  2.6902
    # Residual Deviance: 903.01
    # AIC: 927.01
  20. Reduced by cutting out extraneous variables using stepwise selection (step):

    salmodel <- step(almodel); summary(salmodel)
    #                Value Std. Error t value
    # Time.to.Z     0.0163    0.00713    2.29
    # Time.in.Deep -0.0152    0.00823   -1.85
    # Morning.Feel  0.1906    0.12683    1.50
    #     Value  Std. Error t value
    # 1|2 -4.457  0.785     -5.675
    # 2|3 -1.011  0.649     -1.557
    # 3|4  1.113  0.649      1.713
    # Residual Deviance: 907.60
    # AIC: 919.60
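
    The three fits can be compared on AIC, the criterion step uses by default. A small sketch that rebuilds each AIC from its reported residual deviance, counting parameters as the printed slopes plus the three cut-points (the data frame and its labels are illustrative):

    ```r
    # Residual deviances and parameter counts (slopes + 3 cut-points) as reported
    models <- data.frame(
      model    = c("Disturbance only", "all variables", "after step()"),
      deviance = c(915.66, 903.01, 907.60),
      k        = c(1 + 3, 9 + 3, 3 + 3)
    )

    # AIC = deviance + 2 * (number of estimated parameters)
    models$AIC   <- models$deviance + 2 * models$k
    models$delta <- models$AIC - min(models$AIC)   # distance from the best model
    models
    ```

    The recomputed values match the printed AICs (923.66, 927.01, 919.60): the full model fits slightly better but pays for its extra parameters, and the stepwise-reduced model comes out lowest.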