I discuss my beliefs about Quantified Self, and demonstrate with a series of single-subject design self-experiments using a Zeo. A Zeo records sleep via EEG; I have made many measurements and performed many experiments. This is what I have learned so far:

- the Zeo headband is wearable long-term
- melatonin improves my sleep
- one-legged standing does little
- Vitamin D (at night) damages my sleep
- Vitamin D (in morning) does not affect my sleep
- potassium (over the day but not so much the morning) damages my sleep and does not improve my mood/productivity
- small quantities of alcohol appear to make little difference to my sleep quality
- I may be better off changing my sleep timing by waking up somewhat earlier & going to bed somewhat earlier

Quantified Self (QS) is a movement with many faces and as many variations as participants, but the core of everything is this: experiment with things that can improve your life.

# What is QS?

Quantified Self is not expensive devices, or meet-ups, or videos, or even ebooks telling you what to do. Those are means to an end. If reading this page does anything, my hope is to pass on to some readers the Quantified Self *attitude*: a playful thoughtful attitude, of wondering whether this thing affects that other thing and what implications could be easily tested. "Science" without the capital "S" or the belief that only scientists are allowed to think.

That's all Quantified Self is, no matter how simple or complicated your devices, no matter how automated your data collection, no matter whether you found a pedometer lying around or hand-engineered your own EEG headset.

Quantified Self is simply about having ideas, gathering some data, seeing what it says, and improving one's life based on the data. If gathering data is too hard and would make your life worse off - then don't do it! If the data can't make your life better - then don't do it! Not every idea can or should be tested.

The QS cycle is straightforward and flexible:

1. Have an idea
2. Gather data
3. Test the data
4. Make a change; GOTO 1

Any of these steps can overlap: you may be collecting sleep data long before you have the idea (in the expectation that you *will* have an idea), or you may be making the change as part of the data in an experimental design, or you may inadvertently engage in a "natural experiment" before wondering what the effects were (perhaps the baby wakes you up on random nights and lets you infer the costs of poor sleep).

The point is not publishable scientific rigor. If you are the sort of person who wants to run such rigorous self-experiments, fantastic! The point is making your life better, for which scientific certainty is not necessary: imagine you are choosing between two sleep pills of equal price and equal safety; the first will make you go to sleep faster by 1 minute and has been validated in countless scientific trials, while the second has, in the past week, ended the sweaty nightmares that have plagued you every few days since childhood but alas has only a few small trials in its favor - which would you choose? I would choose the second pill!

To put it in more economic/statistical terms, what we want from a self-experiment is for it to give us a confidence just good enough to tell whether the expected value of our idea is more than the idea will cost. But we don't need more confidence unless we want to persuade other people! (So from this perspective, it is possible to do a QS self-experiment which is "too good". Much like one can overpay for safety and buy too much insurance - like extra warranties on electronics such as video game consoles, a notorious rip-off.)

## What QS Is Not: (Just) Data Gathering

One failure mode which is particularly dangerous for QSers is to overdo the data collection and collect masses of data they never *use*. Famous computer entrepreneur & mathematician Stephen Wolfram exemplified this for me in March 2012 with his lengthy blog post "The Personal Analytics of My Life" in which he did some impressive graphing and exploration of data from 1989 to 2012: a third of a million (!) emails, full keyboard logging, calendar, phone call logs (with missed calls included), a pedometer, revision history of his tome *A New Kind of Science*, file types accessed per date, parsing scanned documents for dates, a treadmill, and perhaps more he didn't mention.

Wolfram's dataset is well-depicted in informative graphs, breathtaking in its thoroughness, and even more impressive for its duration. So why do I read his post with sorrow? I am sad for him because I have read the post several times, and as far as I can see, he has not benefited in any way from his data collection, with one minor exception:

> Very early on, back in the 1990s, when I first analyzed my e-mail archive, I learned that a lot of e-mail threads at my company would, by a certain time of day, just resolve themselves. That was a useful thing to know, because if I jumped in too early I was just wasting my time.

Nothing else in his life was better 1989-2012 because he did all this, and he shows no indication that he will benefit in the future (besides having a very nifty blog post). And just reading through his post with a little imagination suggests plenty of experiments he could do:

- He mentions that 7% of his keystrokes are the Backspace key.

  This seems remarkably high and must be slowing down his typing by a nontrivial amount. Why doesn't he try a typing tutor to see if he can improve his typing skill, or learn the keyboard shortcuts in his text editor? If he is wasting >7% of all his typing (because he had to type what he is Backspacing over, of course), then he is wasting typing time, slowing things down, adding frustration to his computer interactions and, worst, putting himself at greater risk of crippling RSI.
- How often does he access old files? Since he records access to all files, he can ask whether all the logging is paying for itself.
- Is there any connection between the steps his pedometer records and things like his mood or emailing? Exercise has been linked to many benefits, both physical and mental, but on the other hand, walking isn't a very quick form of exercise. Which effect predominates? This could have the practical consequence of scheduling a daily walk just as he tries to make sure he can have dinner with his family.
- Does a flurry of emails or phone calls disrupt his other forms of productivity that day? For example, while writing his book would he have been better off barricading himself in solitude or working on it in between other tasks?
- His email counts are astonishingly high in general:

  Is answering so many emails *really* necessary? Perhaps he has put too much emphasis on email communication, or perhaps this indicates he should delegate more - or if running Mathematica is so time-consuming, perhaps he should re-evaluate his life and ask whether that is what he truly wants to do now. I have no idea what the answer to any of these questions is or whether an experiment of any kind could be run on them, but these are key life decisions which could be prompted by the data - but weren't.

Another QS piece ("It's Hard to Stay Friends With a Digital Exercise Monitor") struck me when the author, Jenna Wortham, reflected on her experience with her Nike+ FuelBand motion sensor:

> The forgetfulness and guilt I experienced as my FuelBand honeymoon wore off is not uncommon, according to people who study behavioral science. The collected data is often interesting, but it is hard to analyze and use in a way that spurs change. "It doesn't trigger you to do anything habitually," said Michael Kim, who runs Kairos Labs, a Seattle-based company specializing in designing social software to influence behavior... Mr. Kim, whose résumé includes a stint as director of Xbox Live, the online gaming system created by Microsoft, said the game-like mechanisms of the Nike device and others like it were "not enough" for the average user. "Points and badges do not lead to behavior change," he said.

One thinks of a saying of W. Edwards Deming: "Experience by itself teaches nothing." Indeed. A QS experiment is a 4-legged beast: if any leg is far too short or far too long, it can't carry our burdens.

And with Wolfram and Wortham, we see that 2 legs of the poor beast have been amputated. They collected data, but they had no ideas and they made no changes in their life; and because QS was not part of their life, it soon left their life. Wortham seems to have dropped the approach entirely, and Wolfram may only persevere for as long as the data continues to be useful in demonstrating the abilities of his company's products.

# Zeo QS

On Christmas 2010, I received one of Zeo Inc's (founded 2003, shutting down 2013) Zeo bedside units after long coveting it and dreaming of using it for all sorts of sleep-related questions. (As of February 2013, the bedside unit seems to've been discontinued; the most comparable Zeo Inc. product seems to be the Zeo Sleep Manager Pro, ~$90.) With it, I began to apply my thoughts about Quantified Self.

A Zeo is a scaled-down (one-electrode) EEG sensor-headband, which happens to have an alarm clock attached. The EEG data is processed to estimate whether one is asleep and what stage of sleep one is in. Zeo breaks sleep down into waking, REM, light, and deep. (The phases aren't necessarily that physiologically distinct.) It's been compared with regular polysomnography by Zeo Inc and others (see also Griessenberger et al 2013) and seems to be reasonably accurate. (Since regular sleep tests cost thousands of dollars per session and are of questionable external validity since they are a very different setting than your own bedroom, I am fine with a Zeo being just "reasonably" accurate.)

The data is much better than what you would get from more popular methods like cellphones with accelerometers, since an accelerometer only knows if you are moving or not, which isn't a very reliable indicator of sleep^{1}. (You could just be lying there staring at the ceiling, wide awake. Or perhaps the cat is kneading you while you are in light sleep.) As well, half the interest is how exactly sleep phases are arranged and how long the cycles are; you could use that information to devise a custom polyphasic schedule or just figure out a better nap length than the rule-of-thumb of 20 minutes. And the price isn't *too* bad - $150 for the normal Zeo as of February 2012. (The basic mobile Zeo is much cheaper, but I've seen people complain about it and apparently it doesn't collect the same data as the more expensive mobile version or the original bedside unit.)

# Tests

> "A thinker sees his own actions as experiments & questions - as attempts to find out something. Success and failure are for him *answers* above all." -- Friedrich Nietzsche, *The Happy Science* #41

I personally want the data for a few distinct purposes, but in the best Quantified Self vein, mostly experimenting:

- more thoroughly quantifying the benefits of melatonin
    - and dose levels: 1.5mg may be too much. I should experiment with a variety: 0.1, 0.5, 1.0, 1.5, and 3mg?
- quantifying the costs of modafinil
- testing benefits of huperzine-A^{2}
- designing & starting polyphasic sleep
- assisting lucid dreaming
- reducing sleep time in general (better & less sleep)
- investigating effects of n-backing:
    - do n-backing just before sleep, and see whether percentages shift (more deep sleep as the brain grows/changes?) or whether one sleeps better (fewer awakenings, less light sleep).
    - do n-backing after waking up, to look for correlation between good/bad sleeps and performance (one would expect good sleep ~> good scores).
    - test the costs of polyphasic sleep on memory^{3}
- (positive) effect of Seth Roberts's one-legged standing on sleep depth/efficiency
- possible sleep reductions due to meditation
- serial cable uses:
    - quantifying meditation (eg. length of gamma frequencies)
    - rank music by distractibility?
    - measure focus over the day and during specific activities (eg. correlate frequencies against n-backing performance)
- testing benefit of using Redshift/f.lux to adjust monitor color temperature
- measure negative effect of nicotine on sleep & determine appropriate buffer
- test claims of sleep benefits from magnesium

I have tried to do my little self-experiments as well as I know how to, and hopefully my results are less bogus than the usual anecdotes one runs into online. What I would really like is for other people (especially Zeo owners) to *replicate* my results. To that end I have taken pains to describe my setups in complete detail so others can use them, and provided the data and complete R or Haskell programs used in analysis. If anyone replicates my results in any fashion, please contact me and I would be happy to link your self-experiment here!

# First impressions

## First night

Christmas morning, I unpacked it and admired the packaging, and then looked through the manual. The base-station/alarm-clock seems pretty sturdy and has a large clear screen. The headband seemed comfortable enough that it wouldn't bother me. The various writings with it seemed rather fluffy and preppy, but I had done my technical homework beforehand, so I could ignore their crap.

Late that night (quite late, since the girls stayed up playing *Fable 3* and Xbox Kinect dancing games and what not), I turned in wearily. I had noticed that the alarm seemed to be set for ~3:30 AM, but I was very tired from the long day and from taking my melatonin, and didn't investigate further - I mean, what electronic device would ship with the alarm both enabled and set for a bizarre time? It wasn't worth bothering the other sleeper by turning on the light and messing with it. I put on the headband, verified that the Zeo seemed to be doing stuff, and went to sleep. Come 3 AM, and the damn music goes off! I hit snooze, too discombobulated to figure out how to turn off the alarm.

So that explains the strange Zeo data for the first day:

The major surprise in this data was how quickly I fell asleep: 18 minutes. I had always thought that I took much longer to fall asleep, more like 45 minutes, and had budgeted accordingly; but apparently being deluded about when you are awake and asleep is common - which leads into an interesting philosophical point: if your memories disagree with the Zeo, who should you believe? The rest of the data seemed too messed up by the alarm to learn anything from.

# Uses

## Meditation

One possible application for Zeo was meditation. Most meditation studies are very small & methodologically weak, so it might be worthwhile to verify for oneself any interesting claims. If the Zeo is measuring via EEG, then presumably it's learning something about how relaxed and activity-less one's mind is. I'm not seeking enlightenment, just calmness, which would seem to be in the purview of an EEG signal. (As Charles Babbage said, errors made using insufficient data are still less than errors made using no data at all.) But alas, I meditated for a solid 25 minutes and the Zeo stubbornly read at the same wake level the entire time; I then read my Donald Keene book, *Modern Japanese diaries*, for a similar period with no change at all. It is possible that the 5-minute averaging (Zeo measures every 2 seconds) is hiding useful changes, but probably it's simply not picking up any real differences. Oh well.

## Smart alarm

The second night I had set the alarm to a more reasonable time, and also enabled its smart alarm mode ("SmartWake"), where the alarm will go off up to 30 minutes early if you are ever detected to be awake or in light sleep (as opposed to REM or deep sleep). One thing I forgot to do was take my melatonin; I keep my supplements in the car and there was a howling blizzard outside. It didn't bother me since I am not addicted to melatonin.

In the morning, the smart alarm mode seemed to work pretty well. I woke up early in a good mood, thought clearly and calmly about the situation - and went back to sleep. (It's a holiday, after all.)
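The rule described above can be sketched in a few lines of R. This is only an illustration, not Zeo's actual algorithm: the function name `smart_wake`, the one-label-per-minute input, and the stage labels are all assumptions made for the example.

```r
# Hypothetical sketch of a "smart alarm" rule; NOT Zeo's real implementation.
# `stages`: character vector with one sleep-stage label per minute of the night.
# `alarm_minute`: index of the scheduled alarm; `window`: how early it may fire.
smart_wake <- function(stages, alarm_minute, window = 30) {
  start <- max(1, alarm_minute - window)
  candidates <- start:alarm_minute
  # Minutes inside the window where the sleeper is awake or in light sleep
  easy <- candidates[stages[candidates] %in% c("wake", "light")]
  # Fire at the first such minute, else fall back to the scheduled time
  if (length(easy) > 0) easy[1] else alarm_minute
}

# Toy night: deep sleep, then REM, then light sleep for the final 10 minutes
stages <- c(rep("deep", 400), rep("rem", 70), rep("light", 10))
smart_wake(stages, alarm_minute = 480)
```

With this toy input the alarm fires at the first light-sleep minute inside the 30-minute window; a night that stays in REM or deep sleep throughout the window simply rings at the scheduled time.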

## Replacing headband

Around 15 May 2011, I gave up on the original headband - it was getting too dirty to get good readings - and decided to rip it apart to see what it was made of, and to order a new set of three for $35 (which seems reasonable given the expensive material that the contacts are made of - silver fabric); they then cost $50. A little googling found me a coupon, `FREESHIP`, but apparently it only applied to the Zeo itself and so the pads were actually $40, or ~$13 a piece. I won't say that buying replacement headbands semi-annually is something that *thrills* me, but $20 a year for sleep data is a small sum. Certainly it's more cost-effective than most of the nootropics I have used. (Full disclosure: 9 months after starting this page, Zeo offered me a free set of sensors. I used them and when the news broke about Zeo going out of business, I bought another set.)


In the future, I might try to make my own; eok.gnah claims that buying the silver fabric is apparently cheaper than ordering from Zeo, marciot reports success in making headbands, and it seems one can even hook up other sensors to the headband. Another alternative is, since the Zeo headband is a one-electrode EEG headset, to take an approach similar to the EEG people and occasionally add small dabs of conductive paste, since fairly large quantities are cheap (eg. 12oz for $30). Zeo Inc was also experimenting with disposable adhesive gel ECG electrodes with offset press-stud connections, but they never entered wide use before the company shut down.

# Melatonin

Before writing my melatonin advocacy article, I had used melatonin regularly for 6+ years, ever since I discovered (somewhen in high school or college) that it was useful for enforcing bedtimes and seemed to improve sleep quality; when I posted my writeup to LessWrong people were naturally a little skeptical of my specific claim that it improved the quality of my sleep such that I could reduce scheduled time by an hour or so. Now that I had a Zeo, wouldn't it be a good idea to see whether it did anything, lo these many years later?

The following section represents 5 or 6 months of data (raw CSV data; guide to Zeo CSV). My basic dosage was 1.5mg of melatonin taken 0-30 minutes before going to sleep.

## Graphic

Deep sleep and "time in wake" were both apparently unaffected; "time in wake" apparently had too small a sample to draw much conclusion:

Surprisingly, total REM sleep fell:

While the raw ZQ falls, the regression takes into account the correlated variables and indicates that this is something of an

REM's average fell by 29 minutes, deep sleep fell by 1 minute, but total sleep fell by 54 minutes; this implies that light sleep fell by 24 minutes. (The averages were 254.2 & 233.3) I am not sure what to make of this. While my original heuristic of a one hour reduction turns out to be surprisingly accurate, I had expected light and deep sleep to take most of the time hit. Do I get enough REM sleep? I don't know how I would answer that.

I did feel fine on the days after melatonin use, but I didn't track it very systematically. The best I have is the "morning feel" parameter, which the Zeo asks you on waking up; in practice I entered the values as: a "2" means I woke feeling poor or unrested, "3" was fine or mediocre, and "4" was feeling good. When we graph the average of morning feel against melatonin use or non-use, we find that melatonin was noticeably better (3.17 with melatonin vs 2.95 without):

Graphing some more of the raw data:

Unfortunately, during this period, I didn't regularly do my n-backing either, so there'd be little point trying to graph that. What I spent a lot of my free time doing was editing `gwern.net`, so it might be worth looking at whether nights on melatonin correspond to increased edits the next day. In this graph of edits, the red dots are days without melatonin and the green are days with melatonin; I don't see any clear trend, although it's worth noting almost all of the very busy days were melatonin days:

## Melatonin analysis

The data is very noisy (especially towards the end, perhaps as the headband got dirty) and the response variables are intercorrelated, which makes interpretation difficult, but hopefully the overall conclusions from the multivariate linear analysis are not entirely untrustworthy. Let's look at some averages. Zeo's website lets you enter in a 3-valued variable and then graph the average day for each variable against a particular recorded property like ZQ or total length of REM sleep. I defined one dummy variable, and decided that a "0" would correspond to not using melatonin, "1" would correspond to using it, and "2" would correspond to using a double-dose or more (on the rare occasions I felt I needed sleep insurance). The following additional NHST-style^{4} analyses of *p*-values are done by importing the CSV into R; given all the issues with self-experimentation (these melatonin days weren't even blinded), the *p*-values should be treated as gross guesses, where <0.01 indicates I should take it seriously, <0.05 is pretty good, <0.10 means I shouldn't sweat it, and anything bigger than 0.20 is, at most, interesting while >0.5 means ignore it; we'll also look at correcting for multiple comparisons^{5}, for the heck of it. A mnemonic: *p*-values are about whether the effect exists, and *d*-values are whether we care. For a visualization of effect sizes, see "Windowpane as a Jar of Marbles".

The analysis session in the `R` interpreter:

```
# Read in data w/ variable names in header; uninteresting columns deleted in OpenOffice.org
R> zeo <- read.csv("http://www.gwern.net/docs/zeo/2011-zeo-melatonin.csv")
# "Melatonin" was formerly "SSCF 10";
# I also edited the CSV to convert all '3' to '1' (& so a binary)
R> l <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM,
              Time.in.Deep, Awakenings, Morning.Feel, Time.in.Light)
           ~ Melatonin, data=zeo)
R> summary(manova(l))
          Df Pillai approx F num Df den Df Pr(>F)
Melatonin  1  0.102    0.717      9     57   0.69
Residuals 65
R> summary(l)
Response ZQ :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 83.52 4.13 20.21 <2e-16
Melatonin 2.43 4.99 0.49 0.63
Response Total.Z :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 452.38 22.86 19.79 <2e-16
Melatonin 9.68 27.59 0.35 0.73
Response Time.to.Z :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.48 2.59 7.52 2.1e-10
Melatonin -5.04 3.13 -1.61 0.11
Response Time.in.Wake :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.095 1.521 4.66 1.6e-05
Melatonin -0.247 1.836 -0.13 0.89
Response Time.in.REM :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 144.62 9.38 15.41 <2e-16
Melatonin -3.73 11.32 -0.33 0.74
Response Time.in.Deep :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.33 3.26 16.68 <2e-16
Melatonin 5.56 3.93 1.41 0.16
Response Awakenings :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.095 0.524 5.90 1.4e-07
Melatonin -0.182 0.633 -0.29 0.77
Response Morning.Feel :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.952 0.142 20.78 <2e-16
Melatonin 0.222 0.171 1.29 0.2
Response Time.in.Light :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 253.86 13.59 18.68 <2e-16
Melatonin 7.93 16.40 0.48 0.63
```

The MANOVA indicates no statistically-significant difference between the groups of days, taking all variables into account (*p*=0.69). To summarize the regression:

| Variable | Correlate/Effect | *p*-value | Coefficient's sign is... |
|---|---|---|---|
| `Time.to.Z` | -5.04 | 0.11 | better |
| `Awakenings` | -0.18 | 0.77 | better |
| `Time.in.Wake` | -0.25 | 0.89 | better |
| `Time.in.Deep` | 5.56 | 0.16 | better |
| `Time.in.Light` | 7.93 | 0.63 | worse |
| `Time.in.REM` | -3.73 | 0.74 | worse |
| `Total.Z` | 9.68 | 0.73 | better |
| `ZQ` | 2.43 | 0.63 | better |
| `Morning.Feel` | 0.22 | 0.20 | better |

Part of the problem is that too many days wound up being useless, and each lost day costs us information and reduces our *true* sample size. (None of the metrics are strong enough to survive multiple-comparison correction^{6}, sadly.)
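A quick way to check the multiple-comparison claim is base R's `p.adjust`. The sketch below copies the nine *p*-values from the regression summary; the choice of the Holm and Benjamini-Hochberg methods is an assumption for illustration and is not necessarily the correction used elsewhere on this page.

```r
# p-values copied from the regression summary, one per response variable
p <- c(Time.to.Z = 0.11, Awakenings = 0.77, Time.in.Wake = 0.89,
       Time.in.Deep = 0.16, Time.in.Light = 0.63, Time.in.REM = 0.74,
       Total.Z = 0.73, ZQ = 0.63, Morning.Feel = 0.20)

# Holm controls the family-wise error rate;
# Benjamini-Hochberg ("BH") controls the false discovery rate
adjusted <- cbind(raw  = p,
                  holm = p.adjust(p, method = "holm"),
                  BH   = p.adjust(p, method = "BH"))
round(adjusted[order(p), ], 2)

# TRUE would mean at least one metric survives correction at 0.05
any(adjusted[, c("holm", "BH")] < 0.05)
```

Since the smallest raw *p*-value is already 0.11, no correction can rescue any of the metrics; the adjustment only makes the point explicit.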

And also unfortunately, this data series doesn't distinguish between addiction to melatonin or benefits from melatonin - perhaps the 3.2 is my "normal" sleep quality and the 2.9 comes from a "withdrawal" of sorts. The research on melatonin doesn't indicate any addiction effect, but who knows?

If I were to run further experiments, I would definitely run it double-blind, and maybe even test <1.5mg doses as well to see if I've been taking too much; 3mg turned out to be excessive, and there are one or two studies indicating that <1mg doses are best for normal people. I wound up using 1.5mg doses. (There could be 3 conditions: placebo, 0.75mg, and 1.5mg. For looking at melatonin effect in general, the data on 2 dosages could be combined. Melatonin has a short half-life, so probably there would be no point in random blocks of more than 2-3 days^{7}: we can randomize each day separately and assume that days are independent of each other.)
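The three-condition design with per-day randomization could be generated along the following lines. This is a minimal sketch: the seed, the 90-night length, and the column names are arbitrary illustrative choices, and for real blinding the schedule would have to be drawn up and the pills prepared by someone other than the sleeper.

```r
# Hypothetical schedule for a three-arm experiment; all specifics are illustrative
set.seed(2011)                       # fixed seed so the schedule can be regenerated
arms   <- c("placebo", "0.75mg", "1.5mg")
n_days <- 90

# Simple randomization: each night is drawn independently of the others
schedule <- data.frame(night = seq_len(n_days),
                       dose  = sample(arms, n_days, replace = TRUE))
table(schedule$dose)

# Recode for the pooled "any melatonin vs placebo" comparison
schedule$melatonin <- as.integer(schedule$dose != "placebo")
```

Independent draws will usually leave the arms slightly unequal in size; if exactly balanced arms matter, shuffling a fixed set with `sample(rep(arms, each = 30))` is a common alternative.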

Worth comparing are Jayson Virissimo's preliminary results:

> According to the preliminary [Zeo] data, while on melatonin, I seemed to get more total sleep, more REM sleep, less deep sleep, and wake up about the same number of times each night. Because this isn't enough data to be very confident in the results, I plan on continuing this experiment for at least another 4 months (2 on and 2 off of melatonin) and will analyze the results for the [statistical] significance and magnitude of the effects (if there really are any) while throwing out the outliers (since my sleep schedule is so erratic).

## Value of Information (VoI)

See also the discussion as applied to ordering modafinil and testing nootropics

We all know it's possible to spend more time figuring out how to "save time" on a task than we would actually save, like rearranging books on a shelf or cleaning up in the name of efficiency (*xkcd* even has a cute chart listing the break-even points for various possibilities, "Is It Worth The Time?"), and similarly, it's possible to spend more money trying to "save money" than one would actually save; less appreciated is that the same thing is also possible to do with gaining information.

The value of an experiment is the information it produces. What is the value of information? Well, we can take the economic tack and say value of information is the value of the decisions it *changes*. (Would you pay for a weather forecast about somewhere you are not going to? No. Or a weather forecast about your trip where you *have* to make that trip, come hell or high water? Only to the extent you can make preparations like bringing an umbrella.)

Wikipedia says that for a risk-neutral person, value of perfect information is "value of decision situation with perfect information" - "value of current decision situation". (Imperfect information is just weakened perfect information: if your information was not 100% reliable but 99% reliable, well, that's worth 99% as much.)

The decision is the binary take or not take. Melatonin costs ~$10 a year (if you buy in bulk during sales, as I did). Suppose I had perfect information it worked; I would not change anything, so the value is $0. Suppose I had perfect information it did not work; then I would stop using it, saving me $10 a year in perpetuity, which has a net present value^{8} (at 5% discounting) of $205. So the best-case value of perfect information - the case in which it changes my actions - is $205, because it would save me from blowing $10 every year for the rest of my life. My melatonin experiment is not perfect since I didn't randomize or double-blind it, but I had a lot of data and it was well powered, with something like a >90% chance of detecting the decent effect size I expected, so the imperfection is just a loss of 10%, down to $184. From my previous research and personal use over years, I am highly confident it works - say, 80%^{9}.

If the experiment says melatonin works, the information is useless to me since I continue using melatonin, and if the experiment says it doesn't, then let's assume I decide to quit melatonin^{10} and then save $10 a year or $184 total. What's the expected value of obtaining the information, given these two outcomes? $(0.80 \times \$0) + (0.20 \times \$184) = \$36.8$. Or another way, redoing the net present value: $\frac{10 - 0}{\ln 1.05} \times 0.9 \times 0.2$. At minimum wage opportunity cost of $7 an hour, $36.8 is worth 5.25 hours of my time. I spent much time on screenshots, summarizing, and analysis, and I'd guess I spent closer to 10-15 hours all told.
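The same calculation can be reproduced in R as a check on the arithmetic. The variable names below are introduced only for this sketch; the numbers are the ones given in the text.

```r
annual_cost    <- 10     # dollars per year spent on melatonin
discount_rate  <- 0.05   # annual discount rate
detection_rate <- 0.90   # chance the experiment detects a real effect
p_no_effect    <- 0.20   # prior probability that melatonin does not work
wage           <- 7      # dollars per hour, minimum-wage opportunity cost

# Net present value of saving `annual_cost` every year in perpetuity
npv <- annual_cost / log(1 + discount_rate)

# Only the "does not work" outcome changes the decision, so only it carries value
voi <- npv * detection_rate * p_no_effect

c(npv = npv, voi = voi, break_even_hours = voi / wage)
```

Carried at full precision this gives roughly $205, $36.9, and 5.3 hours; the $36.8 and 5.25 hours quoted above come from rounding the intermediate value to $184 first. Either way, the break-even time is well short of the 10-15 hours actually spent.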

This worked-out example demonstrates that when a substance is cheap and you are highly confident it works, a long costly experiment may not be worth it. (Of course, I would have done it anyway due to factors not included in the calculation: to try out my Zeo, learn a bit about sleep experimentation, do something cool, and have something neat to show everyone.)

## Melatonin data

The data looked much better than the first night, except for a big 2-hour gap where I vaguely recall the sensor headband having slipped off. (I don't think it was because it was uncomfortable but due to shifting positions or something.) Judging from the cycle of sleep phases, I think I lost data on a REM peak. The REM peaks interest me because it's a standard theory of polyphasic sleeping that thriving on 2 or 3 hours of sleep a day is possible because REM (and deep sleep) is the only phase that truly matters, and REM can dominate sleep time through REM rebound and training.

Besides that, I noticed that time to sleep was 19 minutes that night. I also had forgotten to take my melatonin. Hmm...

Since I've begun this inadvertent experiment, I'll try continuing it, alternating days of melatonin usage. I claim in my melatonin article that usage seems to save about 1 hour of sleep/time, but there are several possible avenues. One could be quicker to fall asleep; one could awake fewer times; and one could have greater percentage of REM or deep sleep, reducing light sleep. (Light sleep doesn't seem very useful; I sometimes feel worse after light sleep.)

During the afternoon, I took a quick nap. I'm not a very good napper, it seems - only the first 5 minutes registered as even light sleep.

A dose of melatonin (1.5mg) and off to bed a bit early. I'm a little more impressed with the smart alarm; since I'm hard-of-hearing and audio alarms rarely if ever work, I usually use a Sonic Alert vibrating alarm clock. But in the morning I woke up within a minute of the alarm, despite the lack of vibration or flashing lights. (The chart doesn't reflect this, but as a previous link says, distinguishing waking from sleeping can be difficult and the transitions are the least trustworthy parts of the data.)

The data was especially good today, with no big gaps:

You can see an impressively regular sleep cycle, cycling between REM and light sleep. What's disturbing is the relative lack of deep sleep - down 4-5% (and there wasn't a lot to begin with). I suspect that the lack of deep sleep indicates I wasn't sleeping very well, but not badly enough to wake up, and this is probably due either to light from the Zeo itself - I only figured out how to turn it off a few days later - or my lack of regular blankets and use of a sleeping bag. But the awakenings around 4-6 AM and on other days have made me suspicious that one of the cats is bothering me around here and I'm just forgetting it as I fall asleep.

The next night is another no-melatonin night. This time it took 79 minutes to fall asleep. Very bad, but far from unprecedented; this sort of thing is why I was interested in melatonin in the first place. Deep sleep is again limited in dispersion, with a block at the beginning and end, but mostly a regular cycle between light and REM:

Melatonin night, and 32 minutes to sleep. (I'm starting to notice a trend here.) Another fairly regular cycle of phases, with some deep sleep at the beginning and end; 32 minutes to fall asleep isn't great but much better than 79 minutes.

Perhaps I should try a biphasic schedule where I sleep for an hour at the beginning and end? That'd seem to pick up most of my deep sleep, and REM would hopefully take care of itself with REM rebound. I need to sum my average REM & deep sleep times (that sum seems to differ quite a bit between people, eg. one fellow needs 4+ hours; my own need seems to be similar) so I don't try to pick a schedule doomed to fail.

Another night, no melatonin. Time to sleep, just 18 minutes and the ZQ sets a new record even though my cat Stormy woke me up in the morning^{11}:

I personally blame this on being exhausted from 10 hours working on my transcription of *The Notenki Memoirs*. But a data point is a data point.

I spend New Year's Eve pretty much finishing *The Notenki Memoirs* (transcribing the last of the biographies, the round-table discussion, and editing the images for inclusion), which exhausts me a fair bit as well; the champagne doesn't help, but between that and the melatonin, I fall asleep in a record-setting 7 minutes. Unfortunately, the headband came off somewhere around 5 AM:

A cat? Waking up? Dunno.

Another relatively quick falling asleep night at 20 minutes. Which then gets screwed up as I simply can't stay asleep and then the cat begins bothering the heck out of me in the early morning:

Melatonin night, which subjectively didn't go too badly; 20 minutes to sleep. But lots of wake time (long enough wakes that I remembered them) and 2 or 3 hours not recorded (probably from adjusting my scarf and the headband):

Accidentally did another melatonin night (thought Monday was a no-melatonin night). Very good sleep - set records for REM especially towards the late morning which is curious. (The dreams were also very curious. I was an Evangelion character (Kaworu) tasked with riding that kind of carnival-like ride that goes up and drops straight down.) Also another quick falling asleep:

Rather than 3 melatonin nights in a row, I skipped melatonin this night (and thus will have it the next one). Perhaps because I went to sleep so very late, and despite some awakenings, this was a record-setting night for ZQ and TODO deep sleep or REM sleep? :

I also switched the alarm sounds 2 or 3 days ago to "forest" sounds; they seem somewhat more pleasant than the beeping musical tones. The next night, data is all screwed up. What happened there? It didn't even record the start of the night, though it seemed to be active and working when I checked right before going to sleep. Odd.

Next 2 days aren't very interesting; first is no-melatonin, second is melatonin:

One of my chief Zeo complaints was the bright blue-white LCD screen. I had resorted to turning the base station over and surrounding it with socks to block the light. Then I looked closer at the labels for the buttons and learned that the up-down buttons changed the brightness and the LCD screen could be turned off. And I had read the part of the manual that explained that. D'oh!

Off, but no data on the 22nd. No idea what the problem is - the headset seems to have been on all night.

On with a double-dose of melatonin because I was going to bed early; as you can see, didn't work:

Off, no data on the 24th. On, no data on the 25th. I don't know what went wrong on these two nights.

The 27th (on for melatonin) yielded no data because, frustratingly, the Zeo was printing a "write-protected" error on its screen; I assumed it had something to do with uploading earlier that day - perhaps I had yanked it out too quickly - and put it back in the computer, unmounted and went to eject it. But the memory card splintered on me! It was stuck and the end was splintering and little needles of plastic breaking off. I couldn't get it out and gave up. The next day (I slept reasonably well) I went back with a pair of needle-nose pliers. I had a backup memory card. After much trial and error, I figured out the card had to be FAT-formatted and have a directory structure that looked like `ZEO/ZEOSLEEP.DAT`. So that's that.

- 30: on
- 31: off
- 1: on
- 2: off
- 3: on

Unfortunately, this night continues a long run of no data. Looking back, it doesn't *seem* to have been the fault of the new memory card, since some nights did have enough data for the Zeo website to generate graphs. I suspect that the issue is the pad getting dirty after more than a month of use. I hope so, anyway. I'll look around for rubbing alcohol to clean it. That night initially starts badly - the rubbing alcohol seemed to do nothing. After some messing around, I figure out that the headband seems to have loosened over the weeks and so while the sensor felt reasonably snug and tight and was transmitting, it wasn't snug enough. I tighten it considerably and actually get some decent data:

- 5: on
- 7: on
- 8: off
- 9: on
- 11: on?

The previous night, I began paying closer attention to when it was and was not reading me (usually the latter). Pushing hard on it made it eventually read me, but tightening the headband hadn't helped the previous several nights. Pushing and not pushing, I noticed a subtle click. Apparently the band part with the metal sensor pad connects to the wireless unit by 3 little black metal nubs; 2 were solidly in place, but the third was completely loose. Suspicious, I try pulling on the band *without* pushing on the wireless unit - leaving the loose connection loose. Sure enough, no connection was registered. I push on the unit while loosening the headband - and the connection worked. I felt I finally had solved it. It wasn't a loose headband or me pulling it off at night or oils on the metal sensors or a problem with the SD card. I was too tired to fix it when I had the realization, but resolved the next morning to fix it by wrapping a rubber band around the wireless unit and band. This turned out to not interfere with recharging, and when I took a short nap, the data looked fine and gapless. So! The long data drought is hopefully over.

On the 15th of February, I had a very early flight to San Francisco. That night and every night from then on, I was using melatonin, so we'll just include all the nights for which any sensible data was gathered. Oddly enough, the data and ZQs seem bad (as one would expect from sleeping on a couch), but I wake up feeling fairly refreshed. By this point we have the idea how the sleep charts work, so I will simply link them rather than display them.

Then I took a long break on updating this page; when I had a month or two of data, I uploaded to Zeo again, and buckled down and figured out how to have ImageMagick crop pages. The shell script (for screenshots of my browser, YMMV) is `for file in *.png; do mogrify +repage -crop 700x350+350+285 "$file"; done`

General observations: almost all these nights were on melatonin. Not far into this period, I realized that the little rubber band was not working, and I hauled out my red electrical tape and tightened it but good; and again, you can see the transition from crappy recordings to much cleaner recordings. The rest of February:

March:

- 03-01 / 03-02
- 03-05 / 03-07
- 03-08 / 03-09
- 03-10 / 03-11
- 03-15 / 03-19
- 03-22 / 03-23
- 03-24 / 03-25
- 03-26 / 03-27
- 03-28 / 03-29
- 03-30 / 03-31

April:

April 4th was one of the few nights that I was not on melatonin during this timespan; I occasionally take a weekend and try to drop all supplements and nootropics besides the multivitamins and fish oil, which includes my melatonin pills. This night (or more precisely, that Sunday evening) I also stayed up late working on my computer, getting in to bed at 12:25 AM. You can see how well that worked out. During the 2 AM wake period, it occurred to me that I didn't especially want to sacrifice a day to show that computer work can make for bad sleep (which I already have plenty of citations for in the Melatonin essay), and I gave in, taking a pill. That worked out much better, with a relatively normal number of wakings after 2 AM and a reasonable amount of deep & REM sleep.

# Exercise

## One-legged standing

Seth Roberts found that for him, standing a lot helped him sleep. This seems very plausible to me - more fatigue to repair, closer to ancestral conditions of constant walking - and tallied with my own experience. (One summer I worked at Yawgoog Scout Camp, where I spent the entire day on my feet; I always slept very well though my bunk was uncomfortable.) He also found that stressing his legs by standing on one at a time for a few minutes also helped him sleep. That did not seem as plausible to me. But still worth trying: standing is free, and if it does nothing, at least I got a little more exercise.

Roberts tried a fairly complicated randomized routine. I am simply alternating days as with melatonin (note that I have resumed taking melatonin every day). My standing method is also simple; for 5 minutes, I stand on one leg, rise up onto the ball of my foot (because my calves are in good shape), and then sink down a foot or two and hold it until the burning sensation in my thigh forces me to switch to the other leg. (I seem to alternate every minute.) I walk my dog most every day, so the effect is not as simple as "some moderate exercise that day"; in the next experiment, I might try 5 minutes of dumbbell bicep curls instead.

### One-legged standing analysis

The initial results were promising. Of the first 5 days, 3 are "on" and 2 are off; all 3 on-days had higher ZQs than the 2 off-days. Unfortunately, the full time series did not seem to bear this out. Looking at the ~70 recorded days between 11 June 2011 and 27 August 2011 (raw CSV data), the raw uncorrected averages looked like this (as before, the "3" means the intervention was used, "0" that it was not):

An R analysis using multivariate linear regression^{12} turns in a non-significant value for one-leggedness in general (*p*=0.23); by variable:

Variable | Effect | p-value | Coefficient's sign is... |
---|---|---|---|
`ZQ` | -1.24 | 0.16 | worse |
`Total.Z` | -4.09 | 0.37 | worse |
`Time.to.Z` | 0.47 | 0.51 | worse |
`Time.in.Wake` | -0.37 | 0.80 | better |
`Time.in.REM` | -5.33 | 0.02 | worse |
`Time.in.Light` | 2.76 | 0.38 | worse |
`Time.in.Deep` | -1.56 | 0.10 | worse |
`Awakenings` | -0.05 | 0.79 | better |
`Morning.Feel` | -0.05 | 0.32 | worse |

No *p*-values survived multiple-correction^{13}.
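As a rough check on that claim, the nine per-variable *p*-values from the table can be passed to R's built-in `p.adjust`. This is only a sketch: the correction method behind the footnote is not reproduced here, so Bonferroni and Benjamini-Hochberg are assumed as two common choices.

```r
# p-values copied from the one-legged standing table above
p <- c(ZQ = 0.16, Total.Z = 0.37, Time.to.Z = 0.51, Time.in.Wake = 0.80,
       Time.in.REM = 0.02, Time.in.Light = 0.38, Time.in.Deep = 0.10,
       Awakenings = 0.79, Morning.Feel = 0.32)

# ASSUMPTION: these two methods are illustrative; the original analysis
# may have used a different correction.
adjusted <- sapply(c("bonferroni", "BH"), function(m) p.adjust(p, method = m))
print(round(adjusted, 2))

# Variables still below 0.05 under either correction
survivors <- rownames(adjusted)[apply(adjusted < 0.05, 1, any)]
print(survivors)
```

Under both assumed methods even the smallest raw value (`Time.in.REM`, 0.02) is adjusted well above 0.05, so `survivors` comes back empty.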

While I did not replicate Roberts's setup exactly in the interest of time and ease, and obviously it was not blinded, I tried to compensate with an unusually large sample: 69 nights of data. This was a mixed experiment: there seems to be a negative effect, but none of the changes seem to have large effect sizes or strong *p*-values.

The one-legged standing was not done to the exclusion of melatonin, which I used most every night. I thought I might go on using one-legged standing, perhaps skipping it on nights when I am up particularly late or lack the willpower, but I've abandoned it because it is a lot of work and the result looked weak. In the future, I should look into whether walks before bedtime help.

# Vitamin D

## Background

Seth Roberts has speculated that vitamin D, despite its myriads of other benefits, may harm sleep when taken in the evening and help sleep when taken in the morning based on some anecdotes (with 2 null results). The anecdotes are nearly worthless as sleep is pretty variable (look above or below, and you'll see swings of over 20 ZQ points night to night), and just a little carelessness or selection bias will persuade one that there is a major effect where there is none - especially since they are not using Zeos or accelerometers or even giving basic quantities like "I felt bad in the morning 3/5 days". But I began to wonder. Vitamin D is a chemical intimately involved in circadian rhythms (a "zeitgeber"), with some connections to systems involved in sleep ("The steroid hormone of sunlight soltriol (vitamin D) as a seasonal regulator of biological activities and photoperiodic rhythms"); given its links to the *early* day and sunlight, one would expect it to affect sleep for the worse.

To see what, if any, existing research there was, I checked the 49 hits in PubMed and the first 10 pages of Google Scholar for '"vitamin D" sleep'. For the most part, hits were completely irrelevant, and the most relevant ones like "Vitamins and Sleep: An Exploratory Study" did not cover any relationship between vitamin D and sleep, much less the *timing* of vitamin D consumption. There's some speculation the elderly may sleep badly in part due to lack of vitamin D ("Some new food for thought: The role of vitamin D in the mental health of older adults"), but the only hard results I found were weak or tangential: a correlation with daytime sleepiness in Taiwanese dialysis patients^{14}, a correlation with later sleep in American women^{15}, a correlation with earlier sleep in Japanese women^{16}, a correlation with reduced sleep difficulties in Americans, and a correlation of blood levels with both better and worse sleep in Americans^{17}. This reads like noise.

In June 2012, after I finished my 2 experiments, a preprint appeared for *Medical Hypotheses*: "The world epidemic of sleep disorders is linked to vitamin D deficiency", Gominak & Stumpf 2012; the lead author, unfortunately, had little to tell me when I emailed her, indicating that the use of vitamin D was not systematic or recorded:

An observation of sleep improvement with vitamin D supplementation led to a 2 year uncontrolled trial of vitamin D supplementation in 1500 patients with neurologic complaints who also had evidence of abnormal sleep. Most patients had improvement in neurologic symptoms and sleep but only through maintaining a narrow range of 25(OH) vitamin D3 blood levels of 60-80 ng/ml. Comparisons of brain regions associated with sleep-wake regulation and vitamin D target neurons in the diencephalon and several brainstem nuclei suggest direct central effects of vitamin D on sleep...An uncontrolled trial of continuous positive airway pressure CPAP devices for patients with headache and obstructive sleep apnea was partially successful, but in the fall of 2009 two patients remarked that the serendipitous supplementation of vitamin D, in addition to the use of their CPAP devices had, over a period of weeks, allowed them to wake rested and without headaches. Because the majority of the daily headache sufferers also had vitamin D deficiency the same author went looking for a possible connection between vitamin D and paralysis during sleep. This led to the recognition that several nuclei in the hypothalamus and brainstem that are known to be involved in sleep have high concentrations of vitamin D receptors^{15,16,17}. An uncontrolled clinical trial of vitamin D supplementation in 1500 patients over a 2 year period, maintaining a consistent vitamin D blood level in the range of 60-80 ng/ml over many months, produced normal sleep in most patients regardless of the type of sleep disorder, suggesting that multiple types of sleep disorders might share the same etiology...Like other steroid hormones, Vitamin D is thought to exert its effects in the nucleus of the cell, at the vitamin D receptor, promoting transcription of specific genes. There are also reports of actions unrelated to transcription, possibly mediated by surface membrane receptors, such as Ca++ channels, that produce cellular effects in minutes^{5}. Surprisingly, doses of 20,000 IU/day promote normal sleep without being sedating, and the effect is apparent within the first day of dosing in patients who have had severe sleep disruption and very low 25(OH) vitamin D3 levels...Many of the ideas about normal sleep expressed here grew out of watching patients return to normal sleep cycles, over a period of months, with just the return of the 25(OH) vitamin D3 blood level to 60-80 ng/ml. A totally unexpected observation was that the sleep difficulties produced by vitamin D levels below 50 return, in the same form, as the level goes over 80 ng/ml suggesting a narrower range of "normal" vitamin D levels for sleep than those published for bone health. Also, Vitamin D2, ergocalciferol (widely recommended as an "equivalent" therapy for osteoporosis) prevented normal sleep in most patients, suggesting that D2 may be close enough in structure to act as a partial agonist at some locations, an antagonist at others.

Comments:

- I don't know about the overarching claims (I suspect most of the problem is lighting, and general demands on time), but the trial itself seems really important, especially since neither Roberts nor I had the slightest idea about it but seem to have reached similar results
- the 2 patients suggested it, in an interesting example of the value of self-experimentation
- the authors cover much more specific potential connections between vitamin D and sleep than just "circadian rhythms"
- the methodology section is non-existent; how were these 1500 patients picked? how long did each use vitamin D? Unfortunately, neither I nor Roberts has taken vitamin D blood tests (as far as I know) and so we cannot verify that the authors' 60-80ng/ml range is what we fell into, but it's plausible. How is sleep quality being measured? Are these results consistent or inconsistent with the one case of morning mood/restedness improvement but little else? Although even if they were inconsistent, that could be explained by neither of us being sleep disorder sufferers and the effect being weaker in us

In July 2012, preprints of Huang et al 2012 became available; it is a case series - the authors followed a group of veterans with chronic pain who received vitamin D supplements, finding improvements to pain but also reduction in sleep latency and increase in sleep duration. While I did not observe any effect on latency or duration in my following experiments, this would still be a promising datapoint but unfortunately, the sample had substantial dropout, and had no control group (hence no randomizing or blinding). This renders the study not very useful - the improvements being perhaps just regression toward the mean or a selection bias. In 2013, a review (McCarty et al 2013) came out arguing that "low vitamin D levels increase the risk for autoimmune disease, chronic rhinitis, tonsillar hypertrophy, cardiovascular disease, and diabetes. These conditions are mediated by altered immunomodulation, increased propensity to infection, and increased levels of inflammatory substances, including those that regulate sleep"; this might handle negative effects on sleep from chronically low vitamin D, but doesn't seem relevant to acute effects varying by time of administration.

Blogger Chris L looked back in August 2012 on ~1 year of Zeo data and a quasi-experiment in which he started with 4000IU of vitamin D supplementation, then 5000IU, then none; he took them at night, then switched to morning; the results were that the length of his deep sleep started high, dropped, and then recovered. He interprets this as evidence that too much vitamin D hurts sleep.

## Vitamin D at night hurts?

### Setup

I decided to run a small double-blind experiment much like the Adderall and other trials. My Vitamin D is 360 5000IU softgels by "Healthy Origins", bought on `iHerb.com`. The gel-capsules contain cholecalciferol dissolved in olive oil. This made preparing placebo pills a little more difficult. I wound up puncturing the capsules, squeezing out the olive oil contents into a new capsule (they were too wide to push in) and then pushing in the empty shell; all 20 were topped off with ordinary white baking flour. (I used up the last of my creatine preparing the placebos for the Modalert day trial.) For the 20 placebo pills, I spooned in some olive oil to each and topped them off with flour as well. Each set went into its own identical Tupperware container. The process was a little messier than I had hoped, but the pills seem like they will work.

The procedure at night will be: in the dark^{18} immediately before putting on the Zeo headband and going to bed, I will take my usual melatonin pill; then I will take the two containers blindly; mix them up; select a pill from one to take, and put the selected container on the shelf next to the Zeo. In the morning, I will see which one I took. (The Vitamin D olive oil was distinctly more yellow than the green placebo olive oil.) If I took placebo, I will take my usual daily dose of Vitamin D, and if active, I will skip it. This hopefully will blind me and keep constant my total Vitamin D intake. (This procedure may need to be amended with something more like the modafinil/Adderall procedure: a bag with replacement of the consumed placebos.) If I get a run of one kind of pills, I will re-balance the numbers.

Based on the first 10 days' ZQs, I predict I'll find in the final data set:

- increased sleep latency; probably at least another 10 minutes to fall asleep, as my mind seems to churn away with ideas of things to do
- increased awakenings; not that many, maybe 1 or 2 on average
- decreased ZQ; by around 5-10 points (a large effect, on par with melatonin)

My best guess is that the ZQ hit is coming from reduced deep sleep, or maybe reduced deep & REM sleep. I don't think the total amount of sleep has changed.

Roberts theorizes that besides vitamin D damaging sleep, it could actively improve your sleep if taken in the morning. As it happens, in this setup, on "placebo" days I do take vitamin D in the morning - so wouldn't one expect to see scores improve on the nights following a placebo night (a vitamin D morning), regardless of whether *that* night was vitamin D or placebo? A quick analysis of the first 24 nights showed the lagged nights to average a ZQ of 94.5. My monthly averages for October and November were 96, so there is no obvious improvement here.

One thing I suspect but cannot confirm - since I do not have a heart rate monitor - is that ~10 minutes after taking the vitamin D pills, my heart rate increases. Not to any uncomfortable or worrisome degree, but when one expects one's heart rate to go down after going to bed, even a small increase in the opposite direction is noticeable. On the 12th, I finally got around to writing down this impression; then I searched online a bit and found that low vitamin D levels are associated with arrhythmia and other issues, but so are very high levels, and increased vitamin D levels in the studies and anecdotes are associated with higher heart rates^{19}. I'm not worried about the heart rate, but I am concerned that this is defeating the double-blinding: if all I have to do is notice my heart rate (and lying swaddled in bed in complete silence, it would be hard for me *not* to), then I've unblinded myself *before* falling asleep. Other stimulants like caffeine or sulbutiamine might similarly increase my heart rate, but they'd obviously also interfere with sleep, so I can't create any "active placebo" even if I wanted to start over. (One promising future gadget is the "Basis" wristwatch which measures, among other things, heart-rate; I look forward to the early reviews.)

### Vitamin D data

The data (trimmed CSV), covering January-February 2012:

Date | Pill | Quality^{20} | ZQ | Guess |
---|---|---|---|---|
31D-1J | active | bad | 84 | right 70% |
1-2 | placebo | better | 93 | right 65% |
2-3 | active | well | 94 | 50% |
3-4 | active | poor | 86 | right 60% |
4-5 | placebo | well | 98 | wrong 60% |
5-6 | active | mediocre | 86 | 50% |
6-7 | placebo | OK | ??^{21} | right 65% |
7-8 | placebo | good | 90 | right 60% |
8-9 | active | poor | 84 | right 65% |
9-10 | placebo | good | 95 | right 65% |
10-11 | active | good | 100 | wrong 70% |
11-12 | active | mediocre | 92 | right 70% |
12-13 | active | mediocre | 88 | 50% |
13-14 | active | poor | 100 | right 60% |
14-15 | placebo | poor | 83 | wrong 60% |
15-16 | active | poor | 101 | right 55% |
16-17 | placebo | mediocre | 90 | 50% |
17-18 | placebo | mediocre | 88 | right 60% |
18-19 | placebo | good | 100 | 50% |
19-20 | active | poor | 86 | 50% |
20-21 | active | mediocre | 85 | 50% |
21-22 | placebo | OK | 91 | right 60% |
22-23 | placebo | OK | 106 | right 65% |
23-24 | active | poor | 91 | right 65% |
24-25 | active | 1 | 79 | right 75% |
25-26 | placebo | 3 | 85 | right 65% |
26-27 | active | 2 | ??^{22} | right 55% |
28-29 | active | 3 | 85 | 50% |
29-30 | active | 3 | 93 | wrong 55% |
30-31 | placebo | 3 | 100 | right 60% |
31J-1F | active | 3 | 94 | 50% |
1F-2F | active | 2 | 89 | right 60% |
2-3 | active | 1 | 83 | right 70% |
3-4 | placebo | 2 | 81 | wrong 70% |
5-6 | placebo | 3 | 98 | right 65% |
6-7 | active | 2 | 88 | 50% |
7-8 | active | 2 | 94 | right 55% |
8-9 | active | 3 | 94 | wrong 75% |
9-10 | placebo | 3 | 92 | 50% |
10-11 | placebo | 3 | 95 | right 60% |
11-12 | placebo | 3 | 103 | right 75% |
12-13 | placebo | 3 | 84 | right 70% |

(Data input was for "Other Disruptions 3"; 0 = placebo, 1 = vitamin D.)

### Vitamin D analysis

From a quick look at the prediction confidences, I was usually correct but perhaps underconfident: my proper scoring log score compared to a random guesser is 5.4^{23}, which is even better than my guesses in my Adderall experiment.
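To make the scoring concrete, here is a minimal sketch of one way such a score can be computed from the "Guess" column of the data table. The footnoted calculation is not shown in this section, so the scoring rule is an assumption: a base-2 logarithmic score, taken relative to a guesser who always says 50%. The counts below were tallied by hand from the table (25 right, 6 wrong, 11 guesses at exactly 50%).

```r
# Confidence levels of guesses that turned out right / wrong,
# tallied from the "Guess" column (assumed transcription of the table).
right_conf <- rep(c(0.55, 0.60, 0.65, 0.70, 0.75), times = c(3, 8, 8, 4, 2))
wrong_conf <- rep(c(0.55, 0.60, 0.70, 0.75),       times = c(1, 2, 2, 1))
n_even     <- 11  # 50% guesses contribute nothing against a 50% baseline

# Probability that was assigned to what actually happened on each night
p_truth <- c(right_conf, 1 - wrong_conf, rep(0.5, n_even))

# ASSUMPTION: log score in bits, relative to a random (50%) guesser
score <- sum(log2(p_truth) - log2(0.5))
print(round(score, 1))  # 5.4
```

Under these assumptions the total comes out at about 5.4 bits, which agrees with the figure quoted above; a positive total means the guesses beat coin-flipping overall.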

Looking at the data averages in the Zeo website, it looked like ZQ & total & REM sleep fell, deep increased slightly, time awake & awakenings both increased, and morning feel decreased. The R analysis^{24}:

The MANOVA is tantalizingly close to statistical-significance (*p*=0.07); the variables:

Variable | Effect | p-value | Coefficient's sign is... |
---|---|---|---|
`Total.Z` | -19.73 | 0.084 | worse |
`Time.in.REM` | -14.54 | 0.021 | worse |
`Time.in.Deep` | 2.32 | 0.41 | better |
`Time.in.Wake` | 2.50 | 0.63 | worse |
`Awakenings` | 0.739 | 0.37 | worse |
`Morning.Feel` | -0.524 | 0.0067 | worse |
`Time.to.Z` | 3.47 | 0.46 | worse |

`Morning.Feel` jumps out as having a large effect (-0.5, on a 1-3 rating, is huge) and accordingly, a very low *p*-value which survives multiple-correction^{25}. Apparently I was waking up feeling like crap on the Vitamin D nights.
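The "survives" claim can be sanity-checked against the seven *p*-values in the table. The sketch below assumes a Holm correction via R's `p.adjust`; the footnote's actual method may differ.

```r
# p-values copied from the vitamin D (night) table above
p <- c(Total.Z = 0.084, Time.in.REM = 0.021, Time.in.Deep = 0.41,
       Time.in.Wake = 0.63, Awakenings = 0.37, Morning.Feel = 0.0067,
       Time.to.Z = 0.46)

# ASSUMPTION: Holm's step-down method stands in for the unspecified correction
holm <- p.adjust(p, method = "holm")
survivors <- names(holm)[holm < 0.05]
print(round(holm, 3))
print(survivors)
```

With seven comparisons, `Morning.Feel` is adjusted to roughly 0.047 and stays under 0.05, while `Time.in.REM` does not.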

Going back to my predictions after the first 10 days, they're *sort* of right:

- sleep latency was increased, but not statistically-significantly and only by ~3m, which is less than half the predicted 10 minutes
- awakenings increased by less than 1 additional awakening (compared to the predicted 1-2), which didn't reach statistical significance

My conclusion?

Vitamin D hurts sleep when taken at night. I know of no reason that one would want to take vitamin D late at night, so I will definitely be avoiding it at that time in the future.

### VoI

For background on "value of information" calculations, see the first calculation.

The first experiment I had no opinion on. I actually did sometimes take vitamin D in the evening when I hadn't gotten around to it earlier (I take it for its anti-cancer and SAD effects). There was no research background, and the anecdotal evidence was of very poor quality. Still, it was plausible since vitamin D *is* involved in circadian rhythms, so I gave it 50% and decided to run an experiment. What effect would perfect information that it did negatively affect my sleep have? Well, I'd definitely switch to taking it in the morning and would never take it in the evening again, which would change maybe 20% of my future doses, and what was the negative effect? It couldn't be *that* bad or I would have noticed it already (like I noticed sulbutiamine made it hard to get to sleep). I'm not willing to change my routines very much to improve my sleep, so I would be lying if I estimated that the value of eliminating any vitamin D-related disturbance was more than, say, 10 cents per night; so the total value of affected nights would be $0.10 \times 0.20 \times 365.25 = 7.3$. On the plus side, my experiment design was high quality and ran for a fair number of days, so it would surely detect any sleep disturbance from the randomized vitamin D, so say 90% quality of information. This gives $\frac{7.3 - 0}{\ln 1.05} \times 0.90 \times 0.50 = 67.3$, justifying <9.6 hours. Making the pills took perhaps an hour, recording used up some time, and the analysis took several hours to label & process all the data, play with it in R, and write it all up in a clean form for readers. Still, I don't think it took almost 10 hours of work, so I think this experiment ran at a profit.
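The arithmetic can be replayed in a few lines of R. This is a sketch using only the numbers given in the paragraph, with one exception: the hourly rate of $7 is not stated in this section and is inferred here from 67.3 / 9.6, so treat it as an assumption.

```r
cost_per_night   <- 0.10     # dollars: value of avoiding a disturbed night
fraction_changed <- 0.20     # share of future doses that would move to morning
nights_per_year  <- 365.25

# Annual value, rounded to one decimal as in the text (7.3)
annual_value <- round(cost_per_night * fraction_changed * nights_per_year, 1)

discount_rate <- 0.05        # gives the ln(1.05) in the denominator
info_quality  <- 0.90        # how well the experiment would detect an effect
prior         <- 0.50        # probability assigned to the hypothesis

voi <- (annual_value - 0) / log(1 + discount_rate) * info_quality * prior

hourly_rate     <- 7         # ASSUMPTION: inferred from 67.3 / 9.6, not stated here
hours_justified <- voi / hourly_rate

print(round(voi, 1))              # 67.3
print(round(hours_justified, 1))  # 9.6
```

Dividing the annual value by `log(1.05)` turns a yearly benefit into the present value of an indefinite stream discounted at 5%, which is why a 7.3 dollar yearly figure grows to about 150 before the quality and prior factors are applied.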

## Vitamin D at morn helps?

### Setup

The logical next thing to test is whether there is any benefit to sleep by taking vitamin D in the morning as compared to not taking vitamin D at all, since we have already established that evening is worse than morning. (Besides anecdotes, Seth Roberts reported - after I concluded my experiment - that his own non-blind varying of doses seemed to help his subjective restedness but didn't influence anything else.) I would expect any benefits in the morning to be attenuated compared to the evening effect: the morning is simply many hours away from going to bed again in the evening, giving time for many events to affect the ultimate sleep. So this experiment will run longer than the previous 40 days of 20/20: 56 days of 28/28; per Roberts's suggestion, I will not randomize individual days but 8 paired *blocks* of 7 days. (Multiple days to give any slow effects time to manifest, which seem eminently possible with a fat-soluble vitamin like vitamin D; 7 days, so we don't "cycle around the week" but instead have exactly the same number of eg. active Sundays and placebo Sundays since sleep often varies systematically over the week.)

I prepare 27 placebo pills & 27 actives as before, stored in separate baggies. To randomize blocks of 7 days, I will fill 2 opaque containers with 7 placebo and 7 actives (with a label on the inside of the active container), and pick a container at random to use for the next 7 days. I will take one each morning upon awakening, closing my eyes. On the 8th morning, the first container will be empty, so I set it aside and open the second; when the second is emptied, I will look inside it to see whether it has the label, which lets me infer which one it was, and record whether the 2 weeks were active/placebo or placebo/active. The 2 containers will be refilled as before, and blocks 3-4 will begin. I will do this 4 times, at which point I will analyze the data.
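The shape of this design can be sketched in R. This is only a simulated stand-in for the physical procedure with containers (the seed and column names are made up for illustration); it shows that randomizing the order within each pair of 7-day blocks yields 28 active and 28 placebo days, balanced across days of the week.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

n_pairs    <- 4  # 4 pairs of blocks = 8 blocks in total
block_days <- 7  # one full week per block

schedule <- do.call(rbind, lapply(seq_len(n_pairs), function(pair) {
  # Randomly decide which condition comes first within this pair
  pair_order <- sample(c("active", "placebo"))
  data.frame(pair    = pair,
             block   = rep(1:2, each = block_days),
             weekday = rep(seq_len(block_days), times = 2),
             pill    = rep(pair_order, each = block_days))
}))

print(table(schedule$pill))                    # days per condition
print(table(schedule$pill, schedule$weekday))  # balance across weekdays
```

Whatever order the coin flips produce, each condition ends up with exactly 4 of every weekday, which is the point of using whole weeks as blocks.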

Analysis will be the same Zeo parameters as before, but this time augmented by a simple mood indicator: 1-5, with 3 being an ordinary mildly productive day and 1 being "my car caught on fire and was totaled" day (real data-point), recorded at the end of the day just before bed. (I considered a more complex mood indicator, the BOMS, while setting up my lithium experiment, but rejected it as being too heavy-weight for long-term use, and subjectively, my mood doesn't vary that much.)

### Morning data

Blocks:

- 17-25F: guess: placebo (last pill used morning 25; swapped jars and consumed pill from second jar the morning of 26); actual: placebo
- 26F-8M: skipped multiple days for modafinil (omit March 1, 2); actual: active

Blocks:

- 9M-15M: guess: active; actual: placebo
- 16-25: active (omit March 21)

Blocks:

- 26M-1A: guess: placebo; actual: placebo
- 2A-8: active

Blocks:

- 9A-19: (omit April 11, 12) guess: placebo; actual: placebo
- 20-27: active (omit April 21, 22)

Placebo/active coded as 0/1 in `SSCF.1`^{26} in the CSV export. Mood was coded as fractional integers as the `Mood` column.

### Morning analysis

As before, we fire up `R` and analyze the spreadsheet with the usual assumptions^{27} about independence of the daily observations. The interpreter session:

```
R> zeo <- read.csv("http://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning.csv")
R> # an example of the many intercorrelations which make simple t-tests misleading
R> # and motivate the use of multivariate linear regression:
R> cor(zeo[c(2,3,5:11, 25)], use="complete.obs")
Vitamin.D Mood Total.Z Time.to.Z Time.in.Wake Time.in.REM Time.in.Light
Vitamin.D 1.000000 -0.06210 0.01007 -0.004528 -0.14399 0.01844 -0.02043
Mood -0.062097 1.00000 0.03038 -0.229114 0.13365 -0.05137 0.06783
Total.Z 0.010067 0.03038 1.00000 -0.388734 -0.05258 0.77338 0.82402
Time.to.Z -0.004528 -0.22911 -0.38873 1.000000 0.17821 -0.29690 -0.28948
Time.in.Wake -0.143987 0.13365 -0.05258 0.178211 1.00000 -0.12396 0.15893
Time.in.REM 0.018437 -0.05137 0.77338 -0.296904 -0.12396 1.00000 0.35087
Time.in.Light -0.020427 0.06783 0.82402 -0.289484 0.15893 0.35087 1.00000
Time.in.Deep 0.054670 0.05648 0.57647 -0.299816 -0.35438 0.37922 0.24574
Awakenings -0.074435 0.09076 0.07645 0.142952 0.67797 0.04007 0.21834
Morning.Feel 0.053450 0.11313 0.62368 -0.285966 -0.04032 0.56241 0.51081
Time.in.Deep Awakenings Morning.Feel
Vitamin.D 0.05467 -0.07444 0.05345
Mood 0.05648 0.09076 0.11313
Total.Z 0.57647 0.07645 0.62368
Time.to.Z -0.29982 0.14295 -0.28597
Time.in.Wake -0.35438 0.67797 -0.04032
Time.in.REM 0.37922 0.04007 0.56241
Time.in.Light 0.24574 0.21834 0.51081
Time.in.Deep 1.00000 -0.28355 0.22280
Awakenings -0.28355 1.00000 0.02151
Morning.Feel 0.22280 0.02151 1.00000
R> l <- lm(cbind(Total.Z,Time.in.REM,Time.in.Deep,Time.in.Wake,Awakenings,Morning.Feel,Time.to.Z,Mood)
             ~ Vitamin.D, data=zeo)
R> summary(manova(l))
Df Pillai approx F num Df den Df Pr(>F)
Vitamin.D 1 0.0363 0.213 9 51 0.99
R> summary(l)
Response Total.Z :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 525.21 10.06 52.20 <2e-16
Vitamin.D 1.07 13.89 0.08 0.94
Response Time.in.REM :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 162.172 4.711 34.42 <2e-16
Vitamin.D 0.921 6.505 0.14 0.89
Response Time.in.Deep :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 65.34 2.53 25.85 <2e-16
Vitamin.D 1.47 3.49 0.42 0.68
Response Time.in.Wake :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 27.76 3.10 8.94 1.4e-12
Vitamin.D -4.79 4.29 -1.12 0.27
Response Awakenings :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.000 0.592 13.51 <2e-16
Vitamin.D -0.469 0.818 -0.57 0.57
Response Morning.Feel :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.8276 0.1386 20.40 <2e-16
Vitamin.D 0.0787 0.1913 0.41 0.68
Response Time.to.Z :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.448 2.827 9.00 1.1e-12
Vitamin.D -0.136 3.904 -0.03 0.97
Response Mood :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.0931 0.1127 27.45 <2e-16
Vitamin.D -0.0744 0.1556 -0.48 0.63
```

The MANOVA suggests no statistically-significant difference between days (*p*=0.99), and no variables seem to have changed much:

| Variable | Effect | *p*-value | Coefficient's sign is... |
|---|---|---|---|
| `Total.Z` | 1.07 | 0.94 | better |
| `Time.in.REM` | 0.92 | 0.89 | better |
| `Time.in.Deep` | 1.47 | 0.68 | better |
| `Time.in.Wake` | -4.79 | 0.27 | better |
| `Awakenings` | -0.47 | 0.57 | better |
| `Morning.Feel` | 0.08 | 0.68 | better |
| `Time.to.Z` | -0.14 | 0.97 | better |
| `Mood` | -0.07 | 0.63 | worse |

All the changes are junk, including ones I was fairly sure would change, like "Time to Z" or "Mood". (An earlier version of this analysis found a statistically-significant effect increasing "Morning Feel", but this turns out to be due to the *t*-tests' assumption that variables were not correlated, and the multivariate linear regression reduces the effect to non-significance.) "Mood" arguably was affected by an exogenous event - my car burning ruined that particular week. Graphing the raw data, I notice that when my car burned, my "Mood" takes a clearly visible fall for a week, while my sleep looks like it was affected less - it seems that during that period, waking up was literally the best part of the day...

I conclude that the vitamin D in the morning did not damage any of the measured variables, unlike the vitamin D in the evening.

(This experiment also afforded me a chance to test Seth Roberts's reaction to faked data which contradicted his vitamin D theory; he did not take it gracefully, which is useful to know in weighing his future opinions.)

### Control quality control

Like with melatonin, we might wonder: is taking vitamin D causing effects on the control days as well? With melatonin, the concern I often hear voiced is whether melatonin might in some way be "addictive" or suppress normal melatonin secretion, in which case the observed difference between control and experimental days - which we interpreted as improvement - may actually be the opposite, a negative effect caused by a sort of "withdrawal" (lowered melatonin secretion levels, since the body has not yet adapted to the absence of melatonin supplements and will not when supplementation resumes the next day).

In the case of vitamin D, I find the results (no effect on anything *except* "Morning Feel") sufficiently surprising that I wonder if this fat-soluble vitamin was causing effects over periods even longer than a week; and that the true results were that both control and experimental weeks were better than unsupplemented weeks, but that "Morning Feel" was the only variable which reacted to placebo fast enough to show up as a difference. The previously-mentioned August 2012 report of Chris L that an increase of 1k IU in his vitamin D supplementation reduced his deep sleep with month-long lags reinforces my suspicion: with such a long lag, any reduction in my deep sleep would go unnoticed. A completely "dry" multi-month long control group is necessary.

The solution most obvious to me, although I don't know if it's statistically correct, is to drop the vitamin D or melatonin for a long enough period that any long-term effects should have disappeared, and then compare this abstention period to the supposed "control" weeks. If the abstention weeks are worse than the control weeks, then this supports the long-term interpretation; if the abstention weeks are similar to the control weeks, then we can eliminate the long-term interpretation; and if the abstention weeks are better than the control weeks, then we ought to be puzzled and start thinking about other possibilities. (Not enough data/power? Misinterpreted results? Or, the original morning experiment was in spring, while the abstention periods were summer/autumn - does sleep get worse in summer, perhaps due to heat?)

I won't bother with blinding this one since it's just a double-check of an unlikely possibility. (If one wanted to blind it, the procedure would be the same as before, but with *big* blocks: say, 2 blocks of 62 days, first pick randomized, or blocks of 31 days, with 4 blocks randomized in 2 pairs.) This "experiment" is easy enough to run: simply stop taking vitamin D. To avoid the temptation to cheat on days I am feeling down, it's easiest to just wait until I run out of vitamin D and procrastinate on ordering a fresh supply until a bunch of days have passed.

The vitamin D experiment terminated in April; the last day of vitamin D was 2 July 2012; and I resumed 6 September 2012 with the end of the dataset being 31 October 2012.

#### Analysis

The question is simple: does the "Morning Feel" differ between the control days in the original vitamin D morning experiment and the vitamin-less days of the later sustained abstention period? Was there something funky about the original control days, was there some sort of vitamin D bleed-over or maybe some sort of long-term effect which we could describe as "contamination" or "dependency"?

The short answer is: no. When we compare the two groups of days, the "Morning Feel" ratings have identical means, as we expected.

A Bayesian MCMC analysis^{28} (using the BEST library) produces the following graphical summary, which shows the two groups almost completely overlapping on means, with the key graph in the lower-right corner: there is no visible effect size at all (centered on 0), much less an effect size of *d*>=0.1 which we might take seriously as indicating a real difference:

More precisely, the summary statistics indicate that the difference in means & medians is usually -0.03 (negligibly small), the full range of effect size estimates is -0.4678744 to 0.4142259, and 44.4% of the possibilities were simply zero effect size.

(I did a non-parametric test as well: *p*=0.7103^{29}.)

### VoI

For background on "value of information" calculations, see the first calculation.

With the vitamin D theory partially vindicated by the previous experiment, I became fairly sure that vitamin D in the morning would benefit my sleep somehow: 70%. Benefit how? I had no idea, it might be large or small. I didn't expect it to be a second melatonin, improving my sleep and trimming it by 50 minutes, but I hoped maybe it would help me get to sleep faster or wake up less. The actual experiment turned out to show, with very high confidence, no bad change (and a good change in my mood upon awakening in the morning).

What is the "value of information" for this experiment? Essentially - zero:

- if the experiment had shown any benefit, I obviously would have continued taking it in the morning
- if the experiment had shown no effect, I would have continued taking it in the morning to avoid incurring the evening penalty discovered in the previous experiment
- if the experiment had shown the unthinkable (a negative effect), it would have to be substantial to convince me to stop taking vitamin D altogether and forfeit its many other apparent health benefits, and it's not worth bothering to analyze an outcome I would have given <=5% chance to.

So since I did, was then, and still do supplement vitamin D, why bother? But of course, I did it because it was cool and interesting! (Estimated time cost: perhaps half the evening experiment, since I had to manually record less data, and already had the analysis worked out from before.)
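To make the "zero" concrete, here is a minimal sketch of the calculation. The 70% and <=5% probabilities come from the text above and the 25% for "no effect" is simply the remainder; every payoff number is a made-up placeholder in arbitrary units, chosen only so that a modest harm to sleep does not outweigh vitamin D's other assumed benefits:

```r
## Sketch of a value-of-information calculation with hypothetical payoffs.
outcomes <- data.frame(
    Outcome     = c("benefit", "no effect", "harm"),
    Probability = c(0.70, 0.25, 0.05),  # 0.25 is the remainder, an assumption
    Morning     = c(3, 2, 1),           # assumed payoff of continuing morning vitamin D
    Stop        = c(0, 0, 0))           # assumed payoff of stopping altogether
## without the experiment: commit to whichever single action has the best expected payoff
valueWithoutInfo <- max(sum(outcomes$Probability * outcomes$Morning),
                        sum(outcomes$Probability * outcomes$Stop))
## with the experiment: learn the outcome first, then take the best action for it
valueWithInfo <- sum(outcomes$Probability * pmax(outcomes$Morning, outcomes$Stop))
voi <- valueWithInfo - valueWithoutInfo
voi
```

Since the same action wins under every outcome, learning the outcome cannot change the decision and the difference is zero; the VoI would only become positive if some outcome made stopping the better choice.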

# Potassium

## Potassium day use

In October 2012, I bought some potassium citrate on a lark after noting that the daily RDA and my diet suggested that I was massively deficient. The first night I slept terribly, taking what felt like hours to fall asleep and then waking up frequently - due to either the potassium or a fan left on; the second night with potassium, I turned off the fan but slept poorly again. My suspicions were aroused. I began recording sleep data.

### Background

Partway through the process, I searched Google Scholar and Pubmed (human trials) for "potassium sleep"; I checked the first 70 results of both. A general Google search turned up mostly speculation on the relationship of potassium *deficiency* and sleep. The only useful citation was "Potassium affects actigraph-identified sleep", Drennan et al 1991; actigraphs likely aren't as good as a Zeo, and *n*=6, but the study is directly relevant. Only 2 actigraph results reached statistical significance: a small improvement in sleep efficiency (the percentage of time spent lying in bed and actually sleeping) and a bigger benefit in "WASO" (time awake during sleep time; this probably drove the sleep efficiency).

### Data

The first night (10/12) involved falling asleep in 30 minutes rather than my usual 19.6±11.9, waking up 12 times (5.9±3.4), and spending ~90 minutes awake (18.1±16.2). The next day (10/13) I took a similar dose and double-checked the fan before bed: 25 minutes to fall asleep, 10 awakenings, 35 minutes awake, but I woke fairly rested. So it seems like the fan was only partly to blame. The third day (10/14) I omitted any potassium: 21/8/29. Fourth (10/15) on again with an evening dose: 54/7/24. Fifth (10/16), off: 16/2/6. Sixth (10/17), on with a halved dose: 33/3/6. Seventh (10/18), off: 17/6/7. Eighth (10/20), half: 33/6/15. (At this point I began randomizing consumption between on and off; since this is preliminary, I didn't bother with blinding potassium consumption.) Ninth (10/21), on: 25/7/9. Tenth (10/22), on: 18/8/10. 11th (10/23), off: 26/4/10. 12th (10/24), off: 33/7/16. 13th (10/25), on: 32/7/13. 14th (10/26), on: 21/5/8. 15th, on: 34/2/1. 16th, off: 16/7/15. 17th, on: 29/8/20. 18th, on: 17/10/17. 19th, off: 36/9/24. 20th (11/1), on: 21/4/19. 21st (11/2), off: 29/7/16. 22nd (11/3), on: 26/7/10. 23rd (11/4), on: 16/4/11. 24th (11/5), off: 21/4/17. 25th (11/6), on: 19/9/24.

11 Nov, on: 15/3/08. 13 Nov, off: 11/8/21. 14 Nov, off: 18/8/22. 15 Nov, on: 30/8/16. 16 Nov, off: 20/7/12. 17 Nov, on: 34/8/20. 18 Nov, on: 12/8/22. 19 Nov, off: 24/8/14. 20 Nov, on: 26/4/39. 21 Nov, off: 15/6/14. 22 Nov, on: 26/8/29. 23 Nov, on: 23/4/8. 24 Nov, off: 24/3/5. 25 Nov, on: 27/7/15. 26 Nov, on: 30/10/17. 27 Nov, off: 42/12/13. 28 Nov, off: 40/11/42. 29 Nov, off: 19/14/50. 30 Nov, off: 32/8/39. (Here I counted the sample-sizes and realized the off days were drastically under-represented, reducing statistical power; so I have eliminated randomization and gone off potassium.) 1 Dec, off: 28/10/15. 2 Dec, off: 37/8/20. 3 Dec, off: 36/6/18. 4 Dec, off: 19/9/33. 5 Dec, off: 25/8/27. 6 Dec, off: 30/13/45. (Now balanced, resuming randomization.) 7 Dec, on: 31/9/60. 8 Dec, off: 22/9/23. 9 Dec, off: 11/5/21. 10 Dec, on: 30/4/10. 11 Dec, on: 22/9/50. 13 Dec, off: 20/5/6. 14 Dec, off: 33/13/25. 15 Dec, on: 26/11/22. 16 Dec, off: 33/12/28. 17 Dec, off: 42/9/31. 18 Dec, off: 31/9/61. 19 Dec, on: 23/8/18.

### Analysis

#### Sleep disturbances

If potassium was disturbing my sleep, I didn't necessarily want to wait for any one metric of wakefulness to reach significance; rather, I wanted to combine them into a single metric of sleep problems: time to fall asleep (latency), number of awakenings, and time spent awake. (With all 3, higher is worse.) Number of awakenings tends to vary over a smaller range than time to fall asleep or time spent awake - a normal value for the former might be 5, rather than 30 for the latter; to compensate for that, we convert each metric into a standard deviation indicating how unusual eg. 10 awakenings is and whether it is more unusual than it taking 15 minutes to fall asleep. Then we can do a standard test. We can graph the data at each step, starting with all the data on an overlapping chart^{30} (this is not per day):

Nights off potassium are colored blue and nights on potassium are red; it looks like red dots are higher than blues, overall, but the trend is not clear. So we convert each individual datapoint to its respective standard deviation^{31}:

The trend has become much clearer, but the final step is to add each day's scores to get an overall measure^{32}:

Now the difference has become dramatic: one can almost draw a line separating the two groups without any errors. As one would expect given this graphical evidence, a Bayesian two-group test reports that there is ~0 chance that the true effect size is 0, and the most likely effect size is a dismaying *d*=-1.1^{33}:

A two-sample test agrees:^{34} *p*=0.0002168. (There is no need for multiple correction in this instance.) This confirms my subjective impression.
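The standardize-then-sum procedure can be sketched in a few lines. This is an illustration rather than the footnoted code: it uses only the first 14 nights transcribed from the data section, reads each triplet as latency/awakenings/minutes awake, counts the two half-dose nights as potassium nights, and swaps in a Wilcoxon rank-sum test as one possible two-sample test (all of which are assumptions of this sketch):

```r
## First 14 nights (10/12-10/26) transcribed from the data section.
## Assumptions: triplets are latency / awakenings / minutes awake,
## and "half" dose nights count as potassium nights (Potassium = 1).
nights <- data.frame(
    Potassium    = c( 1,  1,  0,  1,  0,  1,  0,  1,  1,  1,  0,  0,  1,  1),
    Time.to.Z    = c(30, 25, 21, 54, 16, 33, 17, 33, 25, 18, 26, 33, 32, 21),
    Awakenings   = c(12, 10,  8,  7,  2,  3,  6,  6,  7,  8,  4,  7,  7,  5),
    Time.in.Wake = c(90, 35, 29, 24,  6,  6,  7, 15,  9, 10, 10, 16, 13,  8))
metrics <- c("Time.to.Z", "Awakenings", "Time.in.Wake")
## scale() centers each column and divides it by its standard deviation,
## putting all 3 metrics on a common scale
standardized <- scale(nights[, metrics])
## one combined disturbance score per night; higher is worse
nights$Disturbance <- rowSums(standardized)
aggregate(Disturbance ~ Potassium, data=nights, FUN=mean)
## exact=FALSE avoids the warning about exact p-values when scores tie
wilcox.test(Disturbance ~ Potassium, data=nights, exact=FALSE)
```

With so few nights this subset cannot be expected to reproduce the full-dataset result; the point is only the mechanics of building the combined score.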

#### Mood/productivity

A secondary question is whether potassium delivered any waking benefits. I write down at the end of each day my rating (2-4) of how happy and/or productive I felt that day. Does this self-rating show any effect? Here's a plot of each day colored by whether it was a potassium day:

There is little visible effect, and the formal Bayesian^{35} analysis is as weak as the sleep disturbances are strong:

So there is no apparent benefit from the potassium.

### Conclusion

This experiment was hastily done and has several weaknesses, some I mentioned before; in ascending order of importance:

1. dosage was not uniform

    Number of dosages varied from day to day as was convenient and doses were measured approximately with a spoon (since 4 grams is a pretty substantial amount, after all). Here is another objection I don't think matters: lower than average doses may contribute to an underestimate of the effect size... but that implies that the effect size is even more extreme than -1.1! We are interested in problems that would shrink the effect size back to 0, not imply that it's even worse than -1.1.
2. the randomization was incomplete

    As covered in the data section, there was a severe imbalance in sample size for each condition, so I stopped randomization for about a week. Intuitively, I don't think there was anything special about that week in regard to getting very good sleep (as would be necessary to contribute to an overestimated effect size), but if anyone disagreed, it would not be hard to exclude those days and use the rest.
3. no blinding was done

    I am not sure how much this matters. I had no expectation that potassium would affect my sleep at all, one user specifically denied any effect, the only study suggested I'd find improvements, I did not want to find a negative effect much less such a severe effect, and the sheer strength of the effect over a multi-month period is a bit more than I would expect from any expectancy or placebo effect.
4. *timing* was not uniform

    Of the issues, this is the most important. If potassium has some stimulating effects as anecdotes claim, then timing may be causing all the sleep disturbances and not potassium per se. It might be exactly like vitamin D in this respect: taken in the evening, it badly damages sleep but taken in the morning, it does nothing or it improves sleep.

If I were to do a followup experiment, it would be blinded & randomized as usual, with consistent doses (eliminating objections 1-3), but more importantly, the dose would be consumed upon awakening.

I am not sure I will bother with a followup experiment. Potassium is not of particular interest to me, my existing supply is low after months of consumption, I observed no subjective improvements on consumption, and so I am not inclined to run the risk of damaging *more* months of sleep. Other people can do that.

## Potassium morning use

As it happened, I managed to retrieve my pill-making machine and spare gel capsules, and I *do* hate to waste perfectly good potassium citrate powder, so I decided to do a morning experiment. I made 3x24 potassium pills and 3x24 brown rice pills (out of flour); I take one set of 3 pills each morning, randomly picking. This procedure addresses all 4 issues, and will answer the question about whether potassium's sleep disturbance is due to a timing issue like that of caffeine and vitamin D. Analysis will be the same as before: 3 metrics of sleep disturbance, and then daily self-rating. (I didn't devise a paired-blocks setup since my marked containers were in use elsewhere; as often happens I ran out of one set of pills first, the rice placebo pills, on 10 February 2013, and made another batch of 24 rice placebo pills. The last potassium pill was 21 February 2013.)

### Analysis

Subjectively, I noticed nothing on what turned out to be the potassium days, unlike in the first experiment.

#### Sleep disturbances

Running the analysis the same way as before, we get a small increase in sleep disturbances (*d*=0.15, higher is worse) but the effect could easily be nothing^{36}:

I suspect there really is an underlying causal effect: the first experiment indicated a large increase in sleep disturbances, and a much smaller one is in line with my expectations of the effect of a smaller standardized dose first thing upon waking.

But practically speaking, this small disturbance would be acceptable if it came with some benefit.

#### Mood/productivity

The results look almost identical to before^{37}:

### Conclusion

A much higher-quality experiment with more favorable conditions for potassium showed a result consistent with some harm to my sleep, and no benefit. I will not continue using potassium.

# LSD microdosing

In the middle of the five-fold experiment, I paused part of it to run a more interesting self-experiment using LSD microdosing; I included sleep metrics to check for disturbances. It did not seem to affect latency, total sleep, or awakenings, but did improve (*d*=0.42) the "morning feel" non-statistically-significantly (due to the multiple correction). Unfortunately, given that it seemed to negatively affect more important metrics like the self-rating of mood/productivity & creativity, this is not nearly enough to begin to justify further use of LSD microdosing for me.

# Alcohol

Suspicious that alcohol was delaying my sleep and worsening my sleep when I did finally go to bed, I recorded my alcohol consumption for a year. Correlating alcohol use against when I go to bed shows no interesting correlation, nor with any of the other sleep variables Zeo records, even after correcting for a shift in my sleep patterns over that year. So it would seem I was wrong.

In May 2013, I began to wonder if alcohol was damaging my sleep; I don't drink alcohol too often and never more than a glass or two, so I don't have any tolerance built up. I noticed that on nights when I drank some red wine or had some of my mead, it seemed to take me much longer to fall asleep and I would regularly wake up in the middle of the night. So I began noting down days on which I drank any alcohol, to see if it correlated with sleep problems (and probably then just refrain from alcohol in the evening, since I don't care enough to run a randomized experiment).

In May 2014, I ran out of all my mead and also a gallon of burgundy wine I had bought to make beef bourguignon with, so that marked a natural close to the data collection. I compiled the alcohol data along with the Zeo data in the relevant time period, and looked at the key metrics with a multivariate multiple regression. The main complexity here is that I earlier discovered that I had gradually shifted my sleep down and now `Start.of.Night` looks like a sigmoid, so to control for that, I fit a sigmoid to the `Date` using nonlinear least squares, and then plugged the estimated values in. The code, showing only the results for the `Alcohol` boolean:

```
drink <- read.csv("http://www.gwern.net/docs/zeo/2014-gwern-alcohol.csv")
library(minpack.lm)
summary(nlsLM(Start.of.Night ~ Alcohol + as.integer(Date) + (a / (1 + exp(-b * (as.integer(Date) - c)))),
start = list(a = 6.15e+05, b = -1.18e-04, c = -5.15e+04),
control=(nls.lm.control(ftol = sqrt(.Machine$double.eps)/4.9, maxfev=1024, maxiter=1024)),
data=drink))
#
# Parameters:
# Estimate Std. Error t value Pr(>|t|)
# a 5.61e+06 6.49e+09 0.00 1.00
# b -1.00e-03 2.44e-04 -4.10 4.8e-05
# c -8.26e+03 1.16e+06 -0.01 0.99
summary(lm(cbind(Start.of.Night, Time.to.Z, Time.in.Wake, Awakenings, Morning.Feel, Total.Z, Time.in.REM, Time.in.Deep) ~
Alcohol +
as.integer(Date) + I(5.61e+06 / (1 + exp(-(1.00e-03) * (as.integer(Date) - (-8.26e+03))))),
data=drink))
# Response Start.of.Night :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE -8.96e-01 4.75e+00 -0.19 0.85
#
# Response Time.to.Z :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE -2.50e+00 1.41e+00 -1.77 0.077
#
# Response Time.in.Wake :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE -2.04e+00 2.40e+00 -0.85 0.3956
#
# Response Awakenings :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE -2.03e-01 2.85e-01 -0.71 0.48
#
# Response Morning.Feel :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE -5.03e-02 9.16e-02 -0.55 0.5836
#
# Response Total.Z :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE 1.04e+01 7.89e+00 1.32 0.19
#
# Response Time.in.REM :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 7.59e+05 9.83e+05 0.77 0.44
# AlcoholTRUE 1.84e+00 3.58e+00 0.51 0.61
#
# Response Time.in.Deep :
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# AlcoholTRUE 1.14e+00 1.41e+00 0.80 0.42
```

Zilch. No correlation is at all interesting.

So it looks like alcohol - at least in the small quantities I consume - makes no difference.

# Timing

## Bed time for better sleep

Someone asked if I could turn up a better bedtime using their Zeo data. I accepted, but the sleep data comes with quite a few variables and it's not clear which variable is the "best" - for example, I don't think much of the ZQ variable, so it's not as simple as regressing `ZQ ~ Bedtime` and finding what value of Bedtime maximizes ZQ. I decided that I could try finding the optimal bedtime by two strategies:

- look for some underlying factor of good sleep using factor analysis - I'd expect maybe 2 or 3 factors, one for total sleep, one for insomnia, and maybe one for REM sleep - and maximize the good ones and minimize the bad ones, equally weighted
- just do a multivariate regression and weight each variable equally

So, setup:

```
zeo <- read.csv("http://www.gwern.net/docs/zeo/gwern-zeodata.csv")
zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
## convert "05/12/2014 06:45" to "06:45"
zeo$Start.of.Night <- sapply(strsplit(as.character(zeo$Start.of.Night), " "), function(x) { x[2] })
## convert "06:45" to 405 (minutes after midnight)
interval <- function(x) {
    if (!is.na(x)) {
        if (grepl(" s",x)) as.integer(sub(" s","",x))
        else { y <- unlist(strsplit(x, ":")); as.integer(y[[1]])*60 + as.integer(y[[2]]) }
    }
    else NA
}
zeo$Start.of.Night <- sapply(zeo$Start.of.Night, interval)
## correct for the switch to new unencrypted firmware in March 2013;
## I don't know why the new firmware subtracts 15 hours
zeo[(zeo$Sleep.Date >= as.Date("2013-03-11")),]$Start.of.Night <-
    (zeo[(zeo$Sleep.Date >= as.Date("2013-03-11")),]$Start.of.Night + 900) %% (24*60)
## after midnight (24*60=1440), Start.of.Night wraps around to 0, which obscures any trends,
## so we'll map anything before 7AM to time+1440
zeo[zeo$Start.of.Night<420 & !is.na(zeo$Start.of.Night),]$Start.of.Night <-
    (zeo[zeo$Start.of.Night<420 & !is.na(zeo$Start.of.Night),]$Start.of.Night + (24*60))
## keep only the variables we're interested in:
zeo <- zeo[,c(2:10, 23)]
## define naps or nights with bad data as total sleep time under ~1.5 hours (100m) & delete
zeo <- zeo[zeo$Total.Z>100,]
write.csv(zeo, file="bedtime-factoranalysis.csv", row.names=FALSE)
```
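As a small worked example of the wrap-around step in the setup code: the bedtimes below are made up purely for illustration, and `toMinutes` is a hypothetical helper, not part of the setup code above.

```r
## Sketch: convert clock times to minutes after midnight, then shift
## anything before 7AM (420 minutes) past midnight so late bedtimes sort last.
toMinutes <- function(hhmm) {
    parts <- as.integer(strsplit(hhmm, ":")[[1]])
    parts[1]*60 + parts[2]
}
bedtimes <- c("22:48", "23:30", "00:45", "06:45")   # illustrative values only
minutes  <- sapply(bedtimes, toMinutes, USE.NAMES=FALSE)
wrapped  <- ifelse(minutes < 420, minutes + 24*60, minutes)
data.frame(bedtimes, minutes, wrapped)
```

Without the shift, a 00:45 bedtime (45 minutes) would look far earlier than a 23:30 bedtime (1410 minutes), when it is really 75 minutes later; after the shift it becomes 1485 and the ordering is preserved.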

Let's begin with a simple factor analysis, looking for a "good sleep" factor. Zeo Inc apparently was trying for this with the `ZQ` variable but I've always been suspicious of it because it doesn't seem to track `Morning.Feel` or `Awakenings` very well but simply be how long you slept (`Total.Z`):

```
zeo <- read.csv("http://www.gwern.net/docs/zeo/2014-07-26-bedtime-factoranalysis.csv")
library(psych)
nfactors(zeo)
# VSS complexity 1 achieves a maximimum of 0.8 with 6 factors
# VSS complexity 2 achieves a maximimum of 0.94 with 6 factors
# The Velicer MAP achieves a minimum of 0.09 with 1 factors
# Empirical BIC achieves a minimum of 466.5 with 5 factors
# Sample Size adjusted BIC achieves a minimum of 39396 with 5 factors
#
# Statistics by number of factors
# vss1 vss2 map dof chisq prob sqresid fit RMSEA BIC SABIC complex eChisq eRMS eCRMS eBIC
# 1 0.71 0.00 0.090 35 41394 0 6.4648 0.71 0.99 41145 41256 1.0 1.8e+03 0.12926 0.15 1577
# 2 0.77 0.85 0.099 26 40264 0 3.3366 0.85 1.13 40079 40162 1.2 9.4e+02 0.09275 0.12 755
# 3 0.78 0.89 0.139 18 40323 0 2.1333 0.91 1.36 40195 40253 1.4 9.0e+02 0.09075 0.14 772
# 4 0.75 0.89 0.216 11 39886 0 1.3401 0.94 1.73 39808 39843 1.5 8.0e+02 0.08560 0.17 722
# 5 0.78 0.89 0.280 5 39415 0 0.7267 0.97 2.56 39380 39396 1.4 5.0e+02 0.06779 0.20 467
# 6 0.80 0.94 0.450 0 38640 NA 0.3194 0.99 NA NA NA 1.2 2.2e+02 0.04479 NA NA
# 7 0.80 0.92 0.807 -4 37435 NA 0.1418 0.99 NA NA NA 1.2 1.0e+02 0.03075 NA NA
# 8 0.78 0.91 4.640 -7 30474 NA 0.0002 1.00 NA NA NA 1.3 2.5e-02 0.00048 NA NA
# 9 0.78 0.91 NaN -9 30457 NA 0.0002 1.00 NA NA NA 1.3 2.5e-02 0.00048 NA NA
# 10 0.78 0.91 NA -10 30440 NA 0.0002 1.00 NA NA NA 1.3 2.5e-02 0.00048 NA NA
## BIC says 5 factors, so we'll go with that:
factorization <- fa(zeo, nfactors=5); factorization
# Standardized loadings (pattern matrix) based upon correlation matrix
# MR1 MR2 MR5 MR4 MR3 h2 u2 com
# ZQ 0.87 -0.14 -0.01 0.25 -0.04 0.99 0.013 1.2
# Total.Z 0.96 0.04 -0.01 0.07 -0.04 0.99 0.011 1.0
# Time.to.Z 0.05 -0.03 0.92 0.03 0.10 0.84 0.159 1.0
# Time.in.Wake -0.18 0.90 -0.02 0.04 -0.15 0.83 0.168 1.1
# Time.in.REM 0.87 0.05 0.03 0.05 0.09 0.78 0.215 1.0
# Time.in.Light 0.94 0.02 -0.04 -0.20 -0.14 0.84 0.158 1.1
# Time.in.Deep 0.02 0.03 0.01 0.99 -0.02 0.98 0.023 1.0
# Awakenings 0.35 0.75 0.08 -0.03 0.26 0.79 0.209 1.7
# Start.of.Night -0.21 0.00 0.10 -0.05 0.86 0.84 0.162 1.2
# Morning.Feel 0.22 -0.13 -0.55 0.11 0.46 0.66 0.343 2.5
#
# MR1 MR2 MR5 MR4 MR3
# SS loadings 3.65 1.44 1.21 1.16 1.08
# Proportion Var 0.37 0.14 0.12 0.12 0.11
# Cumulative Var 0.37 0.51 0.63 0.75 0.85
# Proportion Explained 0.43 0.17 0.14 0.14 0.13
# Cumulative Proportion 0.43 0.60 0.74 0.87 1.00
#
# With factor correlations of
# MR1 MR2 MR5 MR4 MR3
# MR1 1.00 0.03 -0.18 0.34 -0.03
# MR2 0.03 1.00 0.27 -0.09 0.00
# MR5 -0.18 0.27 1.00 -0.09 0.09
# MR4 0.34 -0.09 -0.09 1.00 0.03
# MR3 -0.03 0.00 0.09 0.03 1.00
#
# Mean item complexity = 1.3
# Test of the hypothesis that 5 factors are sufficient.
#
# The degrees of freedom for the null model are 45 and the objective function was 40.02 with Chi Square of 48376
# The degrees of freedom for the model are 5 and the objective function was 32.69
#
# The root mean square of the residuals (RMSR) is 0.07
# The df corrected root mean square of the residuals is 0.2
#
# The harmonic number of observations is 1152 with the empirical chi square 473.1 with prob < 5.1e-100
# The total number of observations was 1214 with MLE Chi Square = 39412 with prob < 0
#
# Tucker Lewis Index of factoring reliability = -6.359
# RMSEA index = 2.557 and the 90 % confidence intervals are 2.527 2.569
# BIC = 39377
# Fit based upon off diagonal values = 0.97
```

This looks like MR1=overall sleep; MR2=insomnia/bad-sleep; MR5=difficulty-falling-asleep?; MR4=deep-sleep-(not part of MR1!); MR3=dunno. MR1 and MR4 correlate 0.34, and MR2/MR5 0.27, which makes sense. I want to maximize overall sleep and deep sleep (deep sleep seems connected to health), so MR1 and MR4.

Now that we have our factors, we can extract them and plot them over time for a graphical look:

```
MR1 <- predict(factorization, data=zeo)[,1]
MR4 <- predict(factorization, data=zeo)[,4]
par(mfrow=c(2,1), mar=c(4,4.5,1,1))
plot(MR1 ~ I(Start.of.Night/60), xlab="", ylab="Total sleep (MR1)", data=zeo)
plot(MR4 ~ I(Start.of.Night/60), xlab="Bedtime", ylab="Deep sleep (MR4)", data=zeo)
```

This looks like an overall linear decline (later=worse), but *possibly* with a peak somewhere looking like a quadratic.

So we'll try fitting quadratics:

```
factorModel <- lm(cbind(MR1, MR4) ~ Start.of.Night + I(Start.of.Night^2), data=zeo); summary(factorModel)
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -6.63e+01 7.65e+00 -8.67 <2e-16
# Start.of.Night 9.74e-02 1.07e-02 9.13 <2e-16
# I(Start.of.Night^2) -3.56e-05 3.72e-06 -9.57 <2e-16
#
# Residual standard error: 0.829 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.152, Adjusted R-squared: 0.15
# F-statistic: 101 on 2 and 1127 DF, p-value: <2e-16
#
#
# Response MR4 :
#
# Call:
# lm(formula = MR4 ~ Start.of.Night + I(Start.of.Night^2), data = zeo)
#
# Residuals:
# Min 1Q Median 3Q Max
# -3.057 -0.651 -0.017 0.600 4.329
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -5.06e+01 8.97e+00 -5.64 2.1e-08
# Start.of.Night 7.23e-02 1.25e-02 5.79 9.3e-09
# I(Start.of.Night^2) -2.58e-05 4.36e-06 -5.92 4.2e-09
#
# Residual standard error: 0.971 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0384, Adjusted R-squared: 0.0367
# F-statistic: 22.5 on 2 and 1127 DF, p-value: 2.57e-10
## on the other hand, if we had ignored the quadratic term, we'd
## get a much worse fit
summary(lm(cbind(MR1, MR4) ~ Start.of.Night, data=zeo))
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 6.643744 0.653047 10.2 <2e-16
# Start.of.Night -0.004613 0.000457 -10.1 <2e-16
#
# Residual standard error: 0.861 on 1128 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0829, Adjusted R-squared: 0.0821
# F-statistic: 102 on 1 and 1128 DF, p-value: <2e-16
#
# Response MR4 :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.337279 0.747401 3.13 0.0018
# Start.of.Night -0.001627 0.000523 -3.11 0.0019
#
# Residual standard error: 0.986 on 1128 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.00851, Adjusted R-squared: 0.00764
# F-statistic: 9.69 on 1 and 1128 DF, p-value: 0.0019
```

So we want to use the quadratic. Given this quadratic model, what's the optimal bedtime?

```
estimatedFactorValues <- predict(factorModel, newdata=data.frame(Start.of.Night=1:max(zeo$Start.of.Night, na.rm=TRUE)))
## when is MR1 maximized?
which(estimatedFactorValues[,1] == max(estimatedFactorValues[,1]))
# 1368
1368 / 60
# [1] 22.8
## 10:48 PM seems reasonable
## when is MR4 maximized?
which(estimatedFactorValues[,2] == max(estimatedFactorValues[,2]))
# 1401
## 11:21 PM seems reasonable
## summing the factors isn't quite the average of the two times, but it's close:
combinedFactorSums <- rowSums(estimatedFactorValues)
which(combinedFactorSums == max(combinedFactorSums))
# 1382
## 11:02PM
```
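
As a cross-check on the grid search, the peak of a fitted quadratic $y = c + bx + ax^2$ (with $a < 0$) sits at $x = -b/(2a)$. A minimal sketch, plugging in the rounded coefficients printed in the summary above (the helper names `vertex` and `clock` are mine, not part of the original analysis):

```r
# Peak of y = c + b*x + a*x^2; only meaningful when a < 0 (an inverted U).
vertex <- function(b, a) {
  stopifnot(a < 0)
  -b / (2 * a)
}

# Format minutes-after-midnight as HH:MM.
clock <- function(minutes) {
  m <- round(minutes)
  sprintf("%02d:%02d", as.integer(m %/% 60), as.integer(m %% 60))
}

# Rounded coefficients copied from summary(factorModel).
mr1_peak <- vertex(b = 9.74e-02, a = -3.56e-05)
mr4_peak <- vertex(b = 7.23e-02, a = -2.58e-05)

clock(mr1_peak)
clock(mr4_peak)
```

Both peaks round to the same minute as the grid-search answers (1368 and 1401), which is what one would expect from a grid with one-minute resolution.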

Maybe using factors wasn't a good idea? We can try a multivariate regression on the variables directly:

```
quadraticModel <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM,
Time.in.Light, Time.in.Deep, Awakenings, Morning.Feel)
~ Start.of.Night + I(Start.of.Night^2), data=zeo)
summary(quadraticModel)
# Response ZQ :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -7.84e+02 1.06e+02 -7.38 3.1e-13
# Start.of.Night 1.29e+00 1.48e-01 8.68 < 2e-16
# I(Start.of.Night^2) -4.70e-04 5.16e-05 -9.10 < 2e-16
#
# Residual standard error: 11.5 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.139, Adjusted R-squared: 0.137
# F-statistic: 90.9 on 2 and 1127 DF, p-value: <2e-16
#
# Response Total.Z :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -4.48e+03 5.54e+02 -8.08 1.7e-15
# Start.of.Night 7.32e+00 7.73e-01 9.47 < 2e-16
# I(Start.of.Night^2) -2.67e-03 2.69e-04 -9.91 < 2e-16
#
# Residual standard error: 60 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.158, Adjusted R-squared: 0.156
# F-statistic: 106 on 2 and 1127 DF, p-value: <2e-16
#
# Response Time.to.Z :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -6.09e+02 1.22e+02 -4.98 7.3e-07
# Start.of.Night 8.43e-01 1.71e-01 4.94 8.8e-07
# I(Start.of.Night^2) -2.81e-04 5.95e-05 -4.73 2.6e-06
#
# Residual standard error: 13.2 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0431, Adjusted R-squared: 0.0415
# F-statistic: 25.4 on 2 and 1127 DF, p-value: 1.61e-11
#
# Response Time.in.Wake :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.26e+02 1.76e+02 -0.72 0.47
# Start.of.Night 2.15e-01 2.45e-01 0.88 0.38
# I(Start.of.Night^2) -7.83e-05 8.55e-05 -0.92 0.36
#
# Residual standard error: 19.1 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.00149, Adjusted R-squared: -0.000283
# F-statistic: 0.84 on 2 and 1127 DF, p-value: 0.432
#
# Response Time.in.REM :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.43e+03 2.69e+02 -5.32 1.2e-07
# Start.of.Night 2.32e+00 3.75e-01 6.19 8.6e-10
# I(Start.of.Night^2) -8.39e-04 1.31e-04 -6.42 2.0e-10
#
# Residual standard error: 29.1 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0608, Adjusted R-squared: 0.0592
# F-statistic: 36.5 on 2 and 1127 DF, p-value: 4.37e-16
#
# Response Time.in.Light :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -2.45e+03 3.43e+02 -7.15 1.5e-12
# Start.of.Night 4.07e+00 4.78e-01 8.50 < 2e-16
# I(Start.of.Night^2) -1.50e-03 1.67e-04 -9.00 < 2e-16
#
# Residual standard error: 37.2 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.164, Adjusted R-squared: 0.162
# F-statistic: 110 on 2 and 1127 DF, p-value: <2e-16
#
# Response Time.in.Deep :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -5.88e+02 1.10e+02 -5.34 1.1e-07
# Start.of.Night 9.27e-01 1.53e-01 6.04 2.1e-09
# I(Start.of.Night^2) -3.30e-04 5.35e-05 -6.17 9.5e-10
#
# Residual standard error: 11.9 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0398, Adjusted R-squared: 0.0381
# F-statistic: 23.4 on 2 and 1127 DF, p-value: 1.12e-10
#
# Response Awakenings :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.18e+02 2.71e+01 -4.36 1.4e-05
# Start.of.Night 1.68e-01 3.77e-02 4.46 9.0e-06
# I(Start.of.Night^2) -5.67e-05 1.32e-05 -4.31 1.7e-05
#
# Residual standard error: 2.93 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0274, Adjusted R-squared: 0.0256
# F-statistic: 15.9 on 2 and 1127 DF, p-value: 1.62e-07
#
# Response Morning.Feel :
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -2.12e+01 7.02e+00 -3.01 0.00266
# Start.of.Night 3.32e-02 9.79e-03 3.39 0.00073
# I(Start.of.Night^2) -1.15e-05 3.41e-06 -3.37 0.00079
#
# Residual standard error: 0.761 on 1127 degrees of freedom
# (84 observations deleted due to missingness)
# Multiple R-squared: 0.0103, Adjusted R-squared: 0.0085
# F-statistic: 5.84 on 2 and 1127 DF, p-value: 0.00301
## Likewise, what's the optimal predicted time?
estimatedValues <- predict(quadraticModel, newdata=data.frame(Start.of.Night=1:max(zeo$Start.of.Night, na.rm=TRUE)))
# but what time is best? we have so many choices of variable to optimize.
# Let's simply sum them all and say bigger is better
# first, we need to negate 'Time.in.Wake', 'Time.to.Z', 'Awakenings',
# as for those, bigger is worse
estimatedValues[,3] <- -estimatedValues[,3] # Time.to.Z
estimatedValues[,4] <- -estimatedValues[,4] # Time.in.Wake
estimatedValues[,8] <- -estimatedValues[,8] # Awakenings
combinedSums <- rowSums(estimatedValues)
which(combinedSums == max(combinedSums))
# 1362
```

Or 10:42PM, which is almost identical to the MR1 estimate. So just like before.
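
One caveat with summing raw predictions: the columns are on very different scales (`Total.Z` is in hundreds of minutes, `Morning.Feel` is a rating of a few points), so the sum is dominated by the large-scale variables. A possible alternative, sketched here under my own assumptions rather than taken from the original analysis, is to standardize each predicted column with `scale()` before summing. The two curves below reuse the rounded `Total.Z` and `Morning.Feel` coefficients from the summary; the grid of bedtimes is arbitrary.

```r
# Candidate bedtimes in minutes after midnight (arbitrary illustrative grid).
bedtime <- 1200:1500

# Fitted quadratics, using the rounded coefficients from summary(quadraticModel).
pred <- cbind(
  Total.Z      = -4.48e+03 + 7.32e+00 * bedtime - 2.67e-03 * bedtime^2,
  Morning.Feel = -2.12e+01 + 3.32e-02 * bedtime - 1.15e-05 * bedtime^2
)

# Raw sum: effectively just Total.Z, because its values are so much larger.
best_raw <- bedtime[which.max(rowSums(pred))]

# Standardized sum: each column becomes a z-score over the grid first.
best_scaled <- bedtime[which.max(rowSums(scale(pred)))]

c(raw = best_raw, standardized = best_scaled)
```

With these inputs the standardized optimum lands later than the raw one, between the two curves' individual peaks. Whether equal weighting of z-scores is the right objective is itself a judgment call.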

Both approaches suggest that I go to bed somewhat earlier than I do now. This has the same correlation≠causality issue as the rise-time analysis does (perhaps I am especially sleepy on the days I go to bed a bit early and so naturally sleep more), but on the other hand, it's not suggesting I go to bed at 7PM or anything crazy, so I am more inclined to take a chance on it.

## Rise time for productivity

I noticed a claim that for one person, rising at 3-5AM (!) seemed to improve their days "because the morning hours have no distractions" and I wondered whether there might be any such correlation for myself, so I took my usual MP daily self-rating and plotted against rise-time that day:

It looks like a cubic suggesting one peak around 8:30AM and then a later peak, but that's based on so little I ignore it. The causal relationship is also unclear: maybe getting up earlier really does cause higher MP self-ratings, but perhaps on days I don't feel like doing anything I am more likely to sleep in, or some other common cause. The available samples suggest that earlier than that is worse, possibly much worse, so I am not inclined to try out something I expect to make me miserable.

The source code of the graph & analysis; preprocessing:

```
mp <- read.csv("~/selfexperiment/mp.csv", colClasses=c("Date","integer"))
zeo <- read.csv("http://www.gwern.net/docs/zeo/gwern-zeodata.csv")
## we want the date of the day sleep ended, not started, so we ignore the usual 'Sleep.Date' and construct our own 'Date':
zeo$Date <- as.Date(sapply(strsplit(as.character(zeo$Rise.Time), " "), function(x) { x[1] }), format="%m/%d/%Y")
## convert "05/12/2014 06:45" to "06:45"
zeo$Rise.Time <- sapply(strsplit(as.character(zeo$Rise.Time), " "), function(x) { x[2] })
## convert "06:45" to the integer 405 (minutes after midnight)
interval <- function(x) { if (!is.na(x)) { if (grepl(" s",x)) as.integer(sub(" s","",x))
else { y <- unlist(strsplit(x, ":")); as.integer(y[[1]])*60 + as.integer(y[[2]]); }
}
else NA
}
zeo$Rise.Time <- sapply(zeo$Rise.Time, interval)
## doesn't always work, so delete missing data:
zeo <- zeo[!is.na(zeo$Date),]
## correct for the switch to new unencrypted firmware in March 2013;
## I don't know why the new firmware changed things; adjustment of 226 minutes was estimated using:
# library(changepoint); cpt.mean(na.omit(zeo$Rise.Time)); '$mean [1] 566.7 340.2'; 566.7 - 340.2 = 226
newFirmware <- zeo$Date >= as.Date("2013-03-11")
zeo$Rise.Time[newFirmware] <- (zeo$Rise.Time[newFirmware] + 226) %% (24*60)
allData <- merge(mp,zeo)
morning <- data.frame(MP=allData$MP, Rise.Time=allData$Rise.Time)
morning$Rise.Time.Hour <- morning$Rise.Time / 60
write.csv(morning, file="morning.csv", row.names=FALSE)
```
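
The time-of-day handling is the fiddly part of this preprocessing. As a rough illustration (a vectorized sketch of my own, not the `interval` function above, and ignoring its seconds case), this shows what the conversion to minutes-after-midnight and the wrap-around firmware correction do to a few made-up values:

```r
# Convert "HH:MM" strings to minutes after midnight; NA stays NA.
to_minutes <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) < 2 || anyNA(p)) return(NA_integer_)
    as.integer(p[1]) * 60L + as.integer(p[2])
  }, integer(1))
}

# Shift by a fixed offset, wrapping past midnight (24*60 minutes in a day).
shift_minutes <- function(minutes, offset) (minutes + offset) %% (24 * 60)

rise <- to_minutes(c("06:45", "22:48", NA))  # made-up example values
rise
shifted <- shift_minutes(rise, 226)          # 226 = the offset estimated above
shifted
```

The modulo is what keeps a late time such as 22:48 from running past the end of the day once the offset is added.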

Graphing and fitting:

```
morning <- read.csv("http://www.gwern.net/docs/zeo/2014-07-26-risetime-mp.csv")
library(ggplot2)
ggplot(data = morning, aes(x=Rise.Time.Hour, y=jitter(MP, factor=0.2))) +
    xlab("Wake time (24H)") +
    ylab("Mood/productivity self-rating (2/3/4)") +
    geom_point(size=I(4)) +
    ## cross-validation suggests 0.8397 but looks identical to auto-LOESS span choice
    stat_smooth(span=0.8397)
## looks 100% like a cubic function
linear <- lm(MP ~ Rise.Time, data=morning)
cubic <- lm(MP ~ poly(Rise.Time,3), data=morning)
anova(linear,cubic)
# Model 1: MP ~ Rise.Time
# Model 2: MP ~ poly(Rise.Time, 3)
# Res.Df RSS Df Sum of Sq F Pr(>F)
# 1 839 442
# 2 837 437 2 5.36 5.14 0.0061
AIC(linear,cubic)
# df AIC
# linear 3 1852
# cubic 5 1846
summary(cubic)
# ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 3.0571 0.0249 122.70 <2e-16
# poly(Rise.Time, 3)1 -0.9627 0.7225 -1.33 0.183
# poly(Rise.Time, 3)2 -1.4818 0.7225 -2.05 0.041
# poly(Rise.Time, 3)3 1.7795 0.7225 2.46 0.014
#
# Residual standard error: 0.723 on 837 degrees of freedom
# Multiple R-squared: 0.0142, Adjusted R-squared: 0.0107
# F-statistic: 4.02 on 3 and 837 DF, p-value: 0.00749
# plot(morning$Rise.Time,morning$MP); points(morning$Rise.Time,fitted(cubic),pch=19)
which(fitted(cubic) == max(fitted(cubic))) / 60
# 516 631 762
# 8.60 10.52 12.70
```

# Magnesium citrate

Re-analyzing data from a magnesium self-experiment, I find both positive and negative effects of the magnesium on my sleep. It's not clear what the net effect is.

I became interested in magnesium after noting a possible effect on my productivity from TruBrain (which among other things included a magnesium tablet), and then a clear correlation from some magnesium l-threonate. I'd also long heard of magnesium helping sleep, and was curious about that too. So I began a large (~207 days) RCT trying out 136mg then 800mg of elemental magnesium per day in late 2013 - early 2014. (This was not a large enough experiment to definitively answer questions about both productivity and sleep, but since I have all the data on hand, I thought I'd look.)

The results of the main analysis were surprising: it seemed that the magnesium caused an initial large boost to my productivity, *but* the boost began to fade and after 20 days or so, the effect became negative, and the period with the larger dose had a worse effect, suggesting a cumulative overdose.

With the differing effect of the doses in mind, I looked at the effect on my sleep data.

## Analysis

Prep:

```
magnesium <- read.csv("http://www.gwern.net/docs/nootropics/2013-2014-magnesium.csv")
magnesium$Date <- as.Date(magnesium$Date)
zeo <- read.csv("http://www.gwern.net/docs/zeo/gwern-zeodata.csv")
zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
zeo$Date <- zeo$Sleep.Date
## rm() only removes whole objects, so drop the column by assigning NULL:
zeo$Sleep.Date <- NULL
# create an equally-weighted index of bad sleep: a z-score of the 3 bad things
zeo$Disturbance <- scale(zeo$Time.to.Z) + scale(zeo$Awakenings) + scale(zeo$Time.in.Wake)
magnesiumSleep <- merge(zeo, magnesium)
write.csv(magnesiumSleep, file="2014-07-27-magnesium-sleep.csv", row.names=FALSE)
```

(I then hand-edited the CSV to delete unused columns.)

Graphing Disturbance:

```
magnesiumSleep <- read.csv("http://www.gwern.net/docs/zeo/2014-07-27-magnesium-sleep.csv")
magnesiumSleep$Date <- as.Date(magnesiumSleep$Date)
## historical baseline:
magnesiumSleep[is.na(magnesiumSleep$Magnesium.citrate),]$Magnesium.citrate <- -1
library(ggplot2)
ggplot(data = magnesiumSleep, aes(x=Date, y=Disturbance, col=as.factor(magnesiumSleep$Magnesium.citrate))) +
ylab("Disturbance z-score (lower=better)") +
geom_point(size=I(4)) +
stat_smooth() +
scale_colour_manual(values=c("gray49", "grey35", "red1", "red2" ),
name = "Magnesium")
```

Analysis (first disturbances, then all variables):

```
magnesiumSleep <- read.csv("http://www.gwern.net/docs/zeo/2014-07-27-magnesium-sleep.csv")
l0 <- lm(Disturbance ~ as.factor(Magnesium.citrate), data=magnesiumSleep)
summary(l0)
# ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -0.5020571 0.1862795 -2.69518 0.0076218
# as.factor(Magnesium.citrate)136 -0.0566556 0.3101388 -0.18268 0.8552318
# as.factor(Magnesium.citrate)800 -0.5394708 0.3259212 -1.65522 0.0994178
```

So it seems that magnesium citrate may decrease sleep problems.

```
l1 <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM, Time.in.Light,
Time.in.Deep, Awakenings, Morning.Feel)
~ as.factor(Magnesium.citrate),
data=magnesiumSleep)
summary(l1)
# Response ZQ : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 95.85149 1.29336 74.11065 < 2e-16
# as.factor(Magnesium.citrate)136 -3.27254 2.15332 -1.51976 0.13012
# as.factor(Magnesium.citrate)800 1.49545 2.26290 0.66086 0.50945
#
# Response Total.Z : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 536.35644 6.59166 81.36898 < 2e-16
# as.factor(Magnesium.citrate)136 -27.37398 10.97453 -2.49432 0.013414
# as.factor(Magnesium.citrate)800 15.86805 11.53300 1.37588 0.170367
#
# Response Time.to.Z : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 12.59406 1.24108 10.14766 < 2e-16
# as.factor(Magnesium.citrate)136 4.26559 2.06629 2.06437 0.040247
# as.factor(Magnesium.citrate)800 -2.43079 2.17144 -1.11944 0.264269
#
# Response Time.in.Wake : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 24.09901 1.87720 12.83776 < 2e-16
# as.factor(Magnesium.citrate)136 -3.66041 3.12537 -1.17119 0.24289
# as.factor(Magnesium.citrate)800 -4.16023 3.28441 -1.26666 0.20672
#
# Response Time.in.REM : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 171.45545 2.99387 57.26889 < 2e-16
# as.factor(Magnesium.citrate)136 -6.45545 4.98452 -1.29510 0.19675
# as.factor(Magnesium.citrate)800 2.27925 5.23818 0.43512 0.66393
#
# Response Time.in.Light : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 304.54455 4.08746 74.50709 < 2.22e-16
# as.factor(Magnesium.citrate)136 -23.33403 6.80525 -3.42883 0.00073338
# as.factor(Magnesium.citrate)800 20.51667 7.15156 2.86884 0.00455323
#
# Response Time.in.Deep : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 60.88119 1.20888 50.36152 < 2e-16
# as.factor(Magnesium.citrate)136 2.48723 2.01268 1.23578 0.21796
# as.factor(Magnesium.citrate)800 -6.81996 2.11510 -3.22441 0.00147
#
# Response Awakenings : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 6.039604 0.238675 25.30475 < 2e-16
# as.factor(Magnesium.citrate)136 -0.548376 0.397372 -1.38001 0.16910
# as.factor(Magnesium.citrate)800 -0.427359 0.417594 -1.02338 0.30734
#
# Response Morning.Feel : ...Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.7227723 0.0762575 35.70497 < 2e-16
# as.factor(Magnesium.citrate)136 0.1193330 0.1269620 0.93991 0.34837
# as.factor(Magnesium.citrate)800 -0.1513437 0.1334229 -1.13432 0.25799
l2 <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM, Time.in.Light,
Time.in.Deep, Awakenings, Morning.Feel) ~ Magnesium.citrate,
data=magnesiumSleep)
summary(manova(l1))
# Df Pillai approx F num Df den Df Pr(>F)
# as.factor(Magnesium.citrate) 2 0.3265357 4.271083 18 394 2.3902e-08
# Residuals 204
summary(manova(l2))
# Df Pillai approx F num Df den Df Pr(>F)
# Magnesium.citrate 1 0.1815233 4.85456 9 197 7.1454e-06
# Residuals 205
which(p.adjust(c(0.3483,0.2579,0.1752,0.1301,0.5094,0.3344,0.0134,0.1703,0.0632,0.1967,
0.6639,0.4895,0.0007,0.0045,0.0005,0.2179,0.0014,0.0004,0.0402,0.2642,
0.1262,0.2428,0.2067,0.2673,0.1691,0.3073,0.4144),
method="BH")
< 0.05)
# [1] 13 14 15 17 18
```

A table summarizing the results by dose ("all" is the net effect from the non-factor version):

Variable | Dose (mg) | Coef | p | Effect |
---|---|---|---|---|
`Morning.Feel` | 136 | 0.11933 | 0.3483 | better |
`Morning.Feel` | 800 | -0.15134 | 0.2579 | worse |
`Morning.Feel` | all | -0.00022 | 0.1752 | worse |
`ZQ` | 136 | -3.27254 | 0.1301 | worse |
`ZQ` | 800 | 1.49545 | 0.5094 | better |
`ZQ` | all | 0.00270 | 0.3344 | better |
`Total.Z` | 136 | -27.3739 | 0.0134 | worse |
`Total.Z` | 800 | 15.8680 | 0.1703 | better |
`Total.Z` | all | 0.02698 | 0.0632 | better |
`Time.in.REM` | 136 | -6.45545 | 0.1967 | worse |
`Time.in.REM` | 800 | 2.27925 | 0.6639 | better |
`Time.in.REM` | all | 0.00447 | 0.4895 | better |
`Time.in.Light` | 136 | -23.3340 | 0.0007 | worse |
`Time.in.Light` | 800 | 20.5166 | 0.0045 | better |
`Time.in.Light` | all | 0.03202 | 0.0005 | better |
`Time.in.Deep` | 136 | 2.48723 | 0.2179 | better |
`Time.in.Deep` | 800 | -6.81996 | 0.0014 | worse |
`Time.in.Deep` | all | -0.00939 | 0.0004 | worse |
`Time.to.Z` | 136 | 4.26559 | 0.0402 | worse |
`Time.to.Z` | 800 | -2.43079 | 0.2642 | better |
`Time.to.Z` | all | -0.00415 | 0.1262 | better |
`Time.in.Wake` | 136 | -3.66041 | 0.2428 | better |
`Time.in.Wake` | 800 | -4.16023 | 0.2067 | better |
`Time.in.Wake` | all | -0.00449 | 0.2673 | better |
`Awakenings` | 136 | -0.54837 | 0.1691 | better |
`Awakenings` | 800 | -0.42735 | 0.3073 | better |
`Awakenings` | all | -0.00042 | 0.4144 | better |

For the low dose, 4/9 were better; for the high dose, 7/9 were better. After adjusting for multiple comparisons (Benjamini-Hochberg, at *p*<0.05), the surviving effects are:

Variable | Dose (mg) | Coef | p | Effect |
---|---|---|---|---|
`Time.in.Light` | 136 | -23.3340 | 0.0007 | worse |
`Time.in.Light` | 800 | 20.5166 | 0.0045 | better |
`Time.in.Light` | all | 0.03202 | 0.0005 | better |
`Time.in.Deep` | 800 | -6.81996 | 0.0014 | worse |
`Time.in.Deep` | all | -0.00939 | 0.0004 | worse |
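
To make the multiple-comparison step less of a black box: the Benjamini-Hochberg procedure sorts the *m* *p*-values, finds the largest rank *k* with $p_{(k)} \le \frac{k}{m}\alpha$, and keeps the *k* smallest. A short sketch applying that rule by hand to the 27 *p*-values passed to `p.adjust` earlier and comparing against the built-in (variable names are mine):

```r
# The 27 p-values from the analysis, in the same order as the p.adjust call.
p <- c(0.3483, 0.2579, 0.1752, 0.1301, 0.5094, 0.3344, 0.0134, 0.1703, 0.0632, 0.1967,
       0.6639, 0.4895, 0.0007, 0.0045, 0.0005, 0.2179, 0.0014, 0.0004, 0.0402, 0.2642,
       0.1262, 0.2428, 0.2067, 0.2673, 0.1691, 0.3073, 0.4144)
alpha <- 0.05
m <- length(p)

# Step-up rule: compare the k-th smallest p-value against (k/m) * alpha.
ord <- order(p)
passes <- p[ord] <= seq_len(m) / m * alpha
k <- if (any(passes)) max(which(passes)) else 0L
survivors_manual <- sort(ord[seq_len(k)])

# Built-in counterpart: BH-adjusted p-values below alpha.
survivors_builtin <- which(p.adjust(p, method = "BH") < alpha)

survivors_manual
survivors_builtin
```

Both give indices 13, 14, 15, 17, and 18: the three `Time.in.Light` rows plus the 800mg and "all" `Time.in.Deep` rows.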

# In progress

Someone suggested that instead of running experiments serially, with limited sample sizes (because I am impatient to try the next interesting suggestion), I could instead take a step up in statistical sophistication and use a *factorial experiment* design: use multiple experimental interventions simultaneously for a much larger sample size, and then run ANOVA analyses rather than simpler two-sample t-tests. No less than R.A. Fisher praises multifactorial experiments as being more efficient: squeezing more data out of a given sample. Hence, I thought a crazy thought: my lithium experiment was going to run for ~360 days, and so I kept putting it off. But what if I ran multiple experiments for 360 days? If I had 4 or 5, then by the end of the year, I would have 5 results to show, and I would have the statistical equivalent of *more* than *n*=72 ($\frac{360}{5}$) for each experiment. Win-win.

Classic multifactorial designs arrange to have every possible combination of the *n* experiments happen on some day or other (such an arrangement is called a Latin square). However, with 5 experiments, each of which has 2 states (on and off), that means I only have $2^5=32$ possible arrangements, all of which ought to be covered over 360 days, terminating in March 2013. (It actually will take much longer, as I paused the lithium sub-experiment for several months to run the X self-experiment.)
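
To get a feel for the coverage question, here is a small simulation sketch. It enumerates the 32 on/off combinations and then checks how evenly simple independent randomization (a fair coin per intervention per day) would populate them over 360 days. The intervention labels `A` to `E`, the seed, and the 50/50 assignment are placeholders of mine, not the actual randomization used.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

interventions <- c("A", "B", "C", "D", "E")  # placeholder labels

# Every on/off combination of 5 two-level factors: one row per combination.
design <- expand.grid(rep(list(c(0L, 1L)), length(interventions)))
names(design) <- interventions
nrow(design)

# 360 days, each intervention independently on (1) or off (0) with probability 1/2.
days <- matrix(sample(c(0L, 1L), 360 * length(interventions), replace = TRUE),
               ncol = length(interventions),
               dimnames = list(NULL, interventions))

# Count how many days fall into each combination, e.g. "01101".
combo <- apply(days, 1, paste, collapse = "")
coverage <- table(combo)

length(coverage)  # number of distinct combinations that occurred
range(coverage)   # fewest and most days in any occurring combination
```

On average each combination gets 360/32 ≈ 11 days, so the chance of a cell being left empty purely by chance is small, though the cell counts will be uneven.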

So I will be lazy and will independently randomize each experiment. What are my 5 chosen interventions?

## Lithium

Rationale & procedure in the Nootropics page. Randomized in 7-day paired blocks. Blinded.

## Redshift/f.lux

My earlier melatonin experiment found it helped me sleep. Melatonin secretion is also influenced by the color of light (some references can be found in my melatonin article), specifically blue light tends to suppress melatonin secretion while redder light does not affect it. (This makes sense: blue/white light is associated with the brightest part of the day, while reddish light is the color of sunsets.) Electronics and computer monitors frequently emit white or blue light. (The recent trend of bright blue LEDs is particularly deplorable in this regard.) Besides the plausible suggestion about melatonin, reddish light impairs night vision less and is easier to see under dim conditions: you may want a blazing white screen at noon so you can see *something*, but in a night setting, that is like staring for hours straight into a fluorescent light.

Hence, you would like to both dim your monitor and also shift the color temperature towards the cooler redder end of the spectrum with a utility like Redshift.

But does it actually *work*? An experiment is called for!

The suggested mechanism is through melatonin secretion. So we'd look at all the usual sleep metrics plus mood plus an additional one: what time I go to bed. One of the reasons I became interested in melatonin was as a way of getting myself to go to bed rather than stay up until 3 AM - a chemically enforced bedtime - and it seems plausible that if Redshift is reducing the interference of the computer monitor, then going without it will make me stay up later ("but I don't feel sleepy yet").

I randomized 529 days from 11 May 2012 to 4 November 2013.

### Power calculation

The earlier melatonin experiment found somewhat weak effects with >100 days of data, and one would expect that actually consuming 1.5mg of melatonin would be a stronger intervention than simply shifting my laptop screen color. (What if I don't use my laptop that night? What if I'm surrounded by white lights?) 30 days is probably too small, judging from the other experiments; 60 is more reasonable, but 90 feels more plausible.

It may be time to learn some more statistics, specifically how to do statistical power calculations for sample size determination (introduction). As I understand it, a power calculation is an equation balancing your sample size, the effect size, the significance level (eg the old *p*<0.05), and the power (the chance of detecting an effect that really exists); if you have 3, you can deduce the fourth. So if you already knew your sample size, your effect size, and your significance level, you could predict what power your experiment would have. In this specific case, we can specify our significance and power, and we can guess at the effect size, but we want to know what sample size we *should* have.

Let's pin down the effect size: we expect any Redshift effect to be weaker than melatonin supplementation, and the most striking change in melatonin (the reduction in total sleep time by ~50 minutes) had an effect size of 0.37. As usual, R has a bunch of functions we can use. Stealing shamelessly from an R guide, and reusing the means and standard deviations from the melatonin experiment, we can begin asking questions like: "suppose I wanted a 90% chance of my experiment producing a solid result of *p*<0.01 (not 0.05, so I can do multiple correction) if the Redshift data looks like the melatonin data and acts the same way?"

```
install.packages("pwr", dependencies = TRUE)
library(pwr)
pwr.t.test(d=(456.4783-407.5312)/131.4656,power=0.9,sig.level=0.01,type="paired",alternative="greater")
# Paired t test power calculation
# n = 96.63232
# d = 0.3723187
# sig.level = 0.01
# power = 0.9
# alternative = greater
# NOTE: n is number of *pairs*
```

*n* is pairs of days, so each *n* is one day on, one day off; so it requires 194 days! Ouch, but OK, that was making some assumptions. What if we say the effect size was halved?

```
pwr.t.test(d=((456.4783-407.5312)/131.4656)/2,power=0.9,sig.level=0.01,type="paired",alternative="greater")
# Paired t test power calculation
# n = 378.3237
```

That's much worse (as one should expect - the smaller an effect or desired *p*-value or chance you don't have the power to observe it, the more data you need to see it). What if we weaken the power and significance level to 0.5 and 0.05 respectively?

```
pwr.t.test(d=((456.4783-407.5312)/131.4656)/2,power=0.5,sig.level=0.05,type="paired",alternative="greater")
# Paired t test power calculation
# n = 79.43655
# d = 0.1861593
```

This is more reasonable, since *n*=80 or 160 days will fit within the experiment, but look at what it cost us: it's now a coin-flip that the results will show anything, and they may not pass multiple correction either. But it's also very expensive to gain more certainty - if we halve that 50% chance of finding nothing, it basically doubles the number of pairs of days we need from 79 to 157:

```
pwr.t.test(d=((456.4783-407.5312)/131.4656)/2,power=0.75,sig.level=0.05,type="paired",alternative="greater")
# Paired t test power calculation
# n = 156.5859
# d = 0.1861593
```

Statistics is a harsh master. What if we solve the equation for a different variable, power or significance? Maybe I can handle 200 days, what would 100 pairs buy me in terms of power?

```
pwr.t.test(d=((456.4783-407.5312)/131.4656)/2,n=100,sig.level=0.05,type="paired",alternative="greater")
# Paired t test power calculation
# n = 100
# d = 0.1861593
# sig.level = 0.05
# power = 0.5808219
```

Just 58%. (But at *p*=0.01, *n*=100 only buys me 31% power, so it could be worse!) At 120 pairs/240 days, I get 65% power, so it may all be doable. I guess it'll depend on circumstances: ideally, a Redshift trial will involve no work on my part, so the real question becomes what quicker sleep experiments does it stop me from running and how long can I afford to run it? Would it painfully overlap with things like the lithium trial?

Speaking of the lithium trial, the plan is to run it for a year. What would *2* years of Redshift data buy me even at *p*=0.01?

```
pwr.t.test(d=((456.4783-407.5312)/131.4656)/2,n=365,sig.level=0.01,type="paired",alternative="greater")
# Paired t test power calculation
# n = 365
# d = 0.1861593
# sig.level = 0.01
# power = 0.8881948
```

Nice! Of course, we have to expect to lose a good deal of statistical power due to interference/uncertainty from the other simultaneous experiments that will be fed into the ANOVA, but I don't know how to calculate that.
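
For readers without the `pwr` package, base R's `stats::power.t.test` can, as far as I can tell, reproduce these paired one-sided calculations. It takes a raw difference `delta` and standard deviation `sd` rather than a standardized `d`, so passing `delta = d, sd = 1` should be equivalent. A sketch re-deriving three of the numbers above:

```r
# Standardized effect size from the melatonin data, as used above.
d <- (456.4783 - 407.5312) / 131.4656

# Pairs needed for 90% power at p < 0.01 with the full effect size.
full <- power.t.test(delta = d, sd = 1, sig.level = 0.01, power = 0.9,
                     type = "paired", alternative = "one.sided")
full$n

# Power bought by 100 pairs at p < 0.05 with half the effect size.
half100 <- power.t.test(n = 100, delta = d / 2, sd = 1, sig.level = 0.05,
                        type = "paired", alternative = "one.sided")
half100$power

# Power bought by 365 pairs at p < 0.01 with half the effect size.
half365 <- power.t.test(n = 365, delta = d / 2, sd = 1, sig.level = 0.01,
                        type = "paired", alternative = "one.sided")
half365$power
```

If the equivalence holds, these should come out close to the ~97 pairs, 58%, and 89% reported by `pwr.t.test`.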

### Experiment

OK, power calculations aside, how exactly to run it? I don't expect any bleed-over from day to day, so we randomize on a per-day basis. Each day must either have Redshift running or not. Redshift is run from cron every 15 minutes: `*/15 * * * * redshift -o`. (This is to deal with logouts, shutdowns, freezes, etc., that might kill Redshift as a persistent daemon.) We'll change the crontab so that at the beginning of each day it runs:

```
@daily redshift -x; if ((RANDOM \% 2 < 1));
then touch ~/.redshift; echo `date +"\%d \%b \%Y"`: on >> ~/redshift.log;
else rm ~/.redshift; echo `date +"\%d \%b \%Y"`: off >> ~/redshift.log; fi
```

Then the Redshift call simply includes a check for the file's existence:

`*/15 * * * * if [ -f ~/.redshift ]; then redshift -o; fi`

Now we have completely automatic randomization and logging of the experiment. As long as I don't screw things up by deleting either file or uninstalling Redshift, and I keep using my Zeo, all the data is gathered and labeled nicely until I finish the experiment and do the analysis. Non-blinded, or perhaps I should say quasi-blinded - I initially don't know, but I *can* check the logs or file to see what that day was, and obviously I will at some point in the night notice whether the monitor is reddened or not.
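
When the experiment ends, the log has to be turned into a per-day treatment variable. Going by the `echo` commands above, each line of `~/redshift.log` should look like `11 May 2012: on`. A sketch of parsing that format in R, for later merging with the Zeo data by date (the three sample lines are invented for illustration, and `%b` assumes an English locale):

```r
# Invented sample lines in the format written by the cron job above;
# in practice these would come from readLines() on the log file.
log_lines <- c("11 May 2012: on", "12 May 2012: off", "13 May 2012: on")

# Split each line into its date part and its on/off part.
parts <- strsplit(log_lines, ": ", fixed = TRUE)
redshift <- data.frame(
  Date     = as.Date(vapply(parts, `[`, character(1), 1), format = "%d %b %Y"),
  Redshift = vapply(parts, `[`, character(1), 2) == "on"
)
redshift

# Sanity check: under a fair coin, roughly half of a long log should be "on".
mean(redshift$Redshift)
```

A data frame in this shape can then be joined to the sleep data with `merge()` on `Date`, as the other analyses on this page do.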

As it turned out, I received a proof that I was *not* noticing the randomization. On 11 January 2013, due to Internet connectivity problems, I was idling on my computer and thought to myself that I hadn't noticed Redshift turn my screen salmon-colored in a while, and I happened to idly try `redshift -x` (reset the screen to normal) and then `redshift -o` (immediately turn the screen red) - but neither did anything at all. Busy with other things, I set the anomaly aside until, a few days later, I traced the problem to a package I had uninstalled back on 25 September 2012 because my system didn't use it - which it did not, but this had the effect of removing another package which turned out to set the default video driver to the proper driver, and so removing it forced my system to a more primitive driver which apparently did not support Redshift functionality^{38}! And I had not noticed for 3 solid months. This was a frustrating incident, but since it took me so long to notice, I am going to keep the 3 months' data and keep them in the "off" category - this is not nearly as good as if those 3 months had varied (since now the "on" category will be underpopulated), but it seems better than just deleting them all.

So to recap: the experiment is 100+ days with Redshift randomized on or off by a shell script, affecting the usual sleep metrics plus time of bed. The expectation is that lack of Redshift will produce a weak negative effect: increasing awakenings & time awake & light sleep, increasing overall sleep time, and also pushing back bedtime.

### VoI

For background on "value of information" calculations, see the first calculation.

Like the modafinil day trial, this is another value-less experiment justified by its intrinsic interest. I expect the results will confirm what I believe: that red-tinting my laptop screen will result in less damage to my sleep by not forcing lower melatonin levels with blue light. The only outcome that might change my decisions is if the use of Redshift actually worsens my sleep, but I regard this as highly unlikely. It is cheap to run as it is piggybacking on other experiments, and all the randomizing & data recording is being handled by 2 simple shell scripts.

## Push-ups

Rather than dumbbells (might be hard to find in the dark), I decided to try out push-ups since I routinely do 25 push-ups after showering and it ought to be mentally easy to shift those push-ups to before/after bedtime. As before, alternate-day, but with a twist: on-days, I do the push-ups immediately before going to bed, but off-days entail immediately upon awakening. (I don't exercise enough in general.) I began 21 September 2011.

I interrupted the experiment for a long period to run the vitamin D experiments; when I resumed on 8 May 2012, I decided to avoid the alternate-day procedure and instead randomize morning vs evening push-ups with a coin. Non-blinded.

On 13 November 2012, I decided I was sufficiently convinced that exercise immediately before bed was damaging my sleep latency that I didn't want to continue to pay the price of worse sleep, and I discontinued this variable. Hopefully the previous data will be sufficient to confirm or disconfirm any effect.

## Meditation

The practice of meditation can be time-intensive; a claimed anecdotal benefit is that one sleeps less and so the time requirement isn't as bad as it may seem.

Meditation has been linked with sleep changes multiple times; see "Meditation and Its Regulatory Role on Sleep". In particular, "Meditation acutely improves psychomotor vigilance, and may decrease sleep need" found a correlation between long meditation and reduced sleep need. The general link seems plausible - that deliberate relaxation may reduce the need for another kind of relaxation (although I doubt meditation is going as far as reducing synaptic weights as the "synaptic homeostasis" hypothesis predicts which I discuss in Drug heuristics) - but I can think of at least 2 plausible ways the correlation would not be causation (1. those with less sleep need can afford to spend time on meditation; 2. meditation *is* partially sleep so there's no correlation or causation to explain).

Randomized on a daily basis: either 20-30^{39} minutes of meditation or none. (I am not sure what a good placebo would be so I will omit it.) Non-blinded. My meditation is nothing fancy: simple breath-following (based on early chapters of *Mindfulness in Plain English*).

Plausibly, any decrease in sleep need could be due to long-term changes in the brain itself, as meditation is known to affect areas like the prefrontal cortex. Kaul et al 2010 above did not randomize the long-term meditators' use of meditation or apparently investigate whether sleep time averages correlated with meditation. If the changes are long-term, then there will be relatively little variation during the 360 days and instead a gradual trend of less sleep. If no clear effect shows up in the analysis, I'll try a before-after comparison: compare *n* days before the experiment started to *n* days after the experiment and see if there is a difference in the averages.

### Power calculation

Kaul et al 2010 describes the long-term meditators as spending "2-3 hrs/day" in meditation. (Their experiment used novices who meditated for 1 hour.) If meditation indeed reduces sleep time, but I am meditating for only $\frac{1}{3}$ of an hour, can I detect any effect?

The difference between the long-term meditators and their normal Indian counterparts was 5.2 hours of sleep per day versus 7.8. Assuming the worst case of 3 hours of meditation, this implies that meditation is indeed a net cost in time ($5.2 + 3 = 8.2 > 7.8$), but also that each hour of meditation is equivalent to almost an hour of sleep ($\frac{7.8-5.2}{3}=0.866...$). So at that conversion rate, 20 minutes of meditation translates to 17.32 minutes less sleep. We will steal code and data from the previous Redshift power calculation: assume the same control sleep, same standard deviation, and subtract 17.32 from the control to get the true mean of the intervention:

```
# install.packages("pwr")
library(pwr)
# effect size d = (control mean - intervention mean) / SD = 17.32 / 131.4656
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.5,type="paired",alternative="greater")
#     Paired t test power calculation
#              n = 157.237
# we're getting 360 days or 180 pairs; let's ask for more than 50-50 power;
# what does n = 180 buy us? Not much!
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.55,type="paired",alternative="greater")
#     Paired t test power calculation
#              n = 181.9631
# how many pairs *do* we need for good results?
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.75,
           sig.level=0.01,type="paired",alternative="greater")
#     Paired t test power calculation
#              n = 521.5252
pwr.t.test(d=(456.4783 - (456.4783 - 17.32))/131.4656,power=0.56,
           sig.level=0.01,type="paired",alternative="greater")
#     Paired t test power calculation
#              n = 356.2923
```

This is discouraging. With 180 pairs, we only have a 55% chance of seeing anything at *p*=0.05? That's *awful*! But there's no point in looking further into this power calculation: I'm not going to be doing a paired t-test, after all, but some sort of ANOVA, and I'm not sure how much power the interfering experiments cost me. The first calculation is the most important: to satisfy somewhat reasonable criteria, I need less than half the data I will get, which ought to be an adequate margin of safety.
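As a cross-check that needs no extra package, the same question can be put to base R's `power.t.test`, asking directly for the power at 180 pairs rather than solving for *n*. This is only a sketch under the assumptions already stated above (SD borrowed from the Redshift calculation, and the roughly 0.866 hour-for-hour conversion rate); the variable names are mine:

```r
# Assumed inputs, taken from the text above
sleep_saved_per_hour <- (7.8 - 5.2) / 3   # hours of sleep saved per hour of meditation
effect_minutes <- 17.32                   # ~20 minutes of meditation at that rate, as used above
sd_minutes <- 131.4656                    # SD borrowed from the Redshift power calculation

# Power of a one-sided paired t-test with 180 pairs (360 days)
res <- power.t.test(n = 180, delta = effect_minutes, sd = sd_minutes,
                    sig.level = 0.05, type = "paired", alternative = "one.sided")
res$power
```

The result should land close to the 55% figure, since `pwr.t.test` reported that power=0.55 needs about 182 pairs.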

### VoI

For background on "value of information" calculations, see the first calculation.

I find meditation useful when I am screwing around and can't focus on anything, but I don't meditate as much as I might because I lose half an hour. Hence, I am interested in the suggestion that meditation may not be as expensive as it seems because it reduces sleep need to some degree: if for every two minutes I meditate, I need one less minute of sleep, that halves the time cost - I spend 30 minutes meditating, gain back 15 minutes from sleep, for a net time loss of 15 minutes. So if I meditate regularly but there is no substitution, I lose out on 15 minutes a day. Figure I skip 1 day in 3; that's a total lost time of $\frac{15 \times \frac{2}{3} \times 365.25}{60}=61$ hours a year or $427 at minimum wage. I find the theory somewhat plausible (60%), and my year-long experiment has roughly a 55% chance of detecting the effect size (estimated based on the sleep reduction in an Indian sample of meditators). So $\frac{427-0}{\ln 1.05} \times 0.60 \times 0.55=2888$. The experiment itself is unusually time-intensive, since it involves ~180 sessions of meditation, which if I am "overpaying" translates to 45 hours ($\frac{180 \times 15}{60}$) of wasted time or $315. But even including the design and analysis, that's less than the calculated value of information.
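The arithmetic in that paragraph can be replayed in a few lines of R. This is a sketch, not part of the original analysis: the variable names are mine, and the $7 hourly wage is my inference from "61 hours = $427" rather than a number stated in the text.

```r
minutes_lost_per_session <- 15   # net loss per session if there is no sleep substitution
fraction_of_days <- 2/3          # meditating two days out of three
hourly_wage <- 7                 # assumption: the rate implied by 61 hours = $427

hours_per_year <- round(minutes_lost_per_session * fraction_of_days * 365.25 / 60)
annual_cost <- hours_per_year * hourly_wage

# Treat the annual cost as a perpetuity discounted at 5%, then weight it by
# the prior that the theory is true and by the chance of detecting the effect
discount_rate <- 0.05
p_theory_true <- 0.60
p_detect <- 0.55
voi <- (annual_cost - 0) / log(1 + discount_rate) * p_theory_true * p_detect

# Cost of running the experiment: ~180 sessions, 15 "overpaid" minutes each
experiment_cost <- 180 * 15 / 60 * hourly_wage

round(voi)
voi > experiment_cost
```

Writing it out this way makes it easy to see how sensitive the conclusion is: halving either probability still leaves the value of information well above the cost of the sessions.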

This example demonstrates that drugs aren't the only expensive things for which you should do extensive testing.

## Masturbation

Orgasm has been linked occasionally with changes in sleep latency, although one 1985 experimental study found no changes. Schenck et al 2007 covers some inconclusive followup studies on related matters like whether arousal or brief viewing of porn interferes with sleep (no).

Randomized on a daily basis before going to bed; no placebo, but abstinence. Non-blinded. Since the theory has always been about a very short-term effect, there's no need to worry about daytime activities. (This would only matter if I were testing something like the folk wisdom that masturbation reduces testosterone levels, where the timing is not as important as the quantity.)

## Treadmill / walking desk

In June 2012, I acquired a free treadmill. I became interested in using it as a treadmill desk, reasoning that it was an easy way to get more exercise. My initial days of use led me to suspect that the treadmill desk's exercise might come at the expense of some concentration or productivity. While I was able to quickly rule out any noticeable negative correlation of treadmill use with typing speed/accuracy, that still leaves other possible negative effects.

### Power

Starting it part way, I lose potential power: there are only ~330 days left. The effect of most interest is productivity, where I expect a negative effect, but we also need a more stringent *p*-value since we're looking at so many variables; so 330 samples gives a floor on detectable effect size of:

```
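# library(pwr) was loaded above; 330 days = 165 pairs.
# Leaving `d` unspecified makes pwr.t.test solve for the detectable effect size.
pwr.t.test(n=(330/2),power=0.75,sig.level=0.01,type="paired",alternative="less")
#     Paired t test power calculation
#              n = 165
#              d = -0.2355713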
```

Not that great. We may wind up being able to conclude little about the effect on productivity; similarly for sleep - the effect would have to be comparable to vitamin D or melatonin to be detectable.
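The same floor can be recovered in base R, which may be convenient if `pwr` is not installed. A minimal sketch: `power.t.test` solves for whichever of its arguments is left out, and with the default `sd = 1` the returned `delta` is in standard-deviation units, comparable to *d*. Its one-sided test is framed in the positive direction, so the sign is flipped relative to `alternative="less"`.

```r
# Minimum detectable effect for 165 pairs at 75% power and a 0.01 significance level
res <- power.t.test(n = 330 / 2, power = 0.75, sig.level = 0.01,
                    type = "paired", alternative = "one.sided")
res$delta    # magnitude of the smallest detectable standardized effect
```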

### VoI

The VoI calculation for this investigation is very difficult: it may improve sleep and it may improve or worsen productivity but regardless is good for very valuable exercise, scrapping the practice has immediate cash value, but none of this is certain and there are few guides from experimental studies.

If it turns out the treadmill is not helpful, I can probably sell it for ~$100 based on prices listed in Craigslist. If it's helpful, I gain considerable exercise (1MPH implies an 8-hour day could be 8 miles of exercise a day!) with the related benefits. I strongly suspect that this much exercise would influence my sleep for the better, but I'm not sure the treadmill desk really does allow for productivity like regular sitting does. If it does reduce productivity somewhat but I otherwise can adapt, it's probably still a net gain because of the extra exercise. However, a small-to-medium decrease - let's say an effect size of *d*<=-0.4 - would be enough to cause me to scrap the treadmill. This is highly unlikely. The large sample gives a very good shot at detecting it. Running the experiment is relatively easy since the treadmill desk can be set up and put away in ~5 minutes. Without running numbers on this one, my best guess is that the VoI is negative; so this is another experiment I am doing because it is interesting and other people may find it interesting, rather than because running the experiment makes economic sense.

## Morning caffeine pills

With the coming of winter, I, like so many other people, have started to find sleeping in to be too tempting: why get out of bed into the cold air when I can just snuggle under my covers and drowse another hour? This is bad because I was getting sufficient sleep as it was and didn't need more, and because I think it may exacerbate sleep inertia as the waking process is dragged out for a long time. All in all, the days seemed less productive and drearier whenever I crawled out of bed an hour later than usual.

Then I was reminded by Kaj Sotala of an Anders Sandberg blog post I'd seen a while back, "The Early Bird gets the Caffeine Pill":

> I set my alarm to 6:00 and 8:00. At 6:00 I go up, take a 50mg caffeine pill, and go to bed again. Then I sleep and wake up rested and energetic around 8. In my case the time for the pill to start working seems to be 1.5 hours. A dose of one pill ensures that I wake up (but still yawning) while two pills makes me start the day much more quickly. The added benefit is of course a regular sleep schedule.

It sounds logical enough (why *wouldn't* a caffeine pill work?), and he cites a study successfully trying a similar trick with naps. I'd meant to try it out at some point, and winter was as good a reason as any. I already had an ample supply of caffeine pills (technically, piracetam+caffeine+others), so I had just been procrastinating on doing a design & setting up my usual RCT. I decided that I might as well try it out as a simple easy non-blinded alternate-day pilot experiment and if I felt like it after a month or two of data, I might try an RCT.

So on 4 November 2013, I started keeping a little jar of my caffeine+piracetam pills by my bedside and using them on alternate days (specifically, my Zeo SmartWake fires in the 9-9:30AM window and I take it then, while I may or may not snooze on). Thus far they do seem to wake me up. I stopped around April 2014.

### Pilot analysis

The correlational data shows a 15-20 minute difference in rise-time between caffeine & non-caffeine days.

First, does morning caffeine affect total sleep or time awake? I wouldn't expect so, since it's aimed at reducing morning wakefulness:

```
zeo <- read.csv("http://www.gwern.net/docs/zeo/2014-06-28-gwern-zeodata-caffeinecorrelation.csv")
zeo$Morning.Caffeine <- as.logical(zeo$Morning.Caffeine)
wilcox.test(Total.Z ~ Morning.Caffeine, data=zeo)
#
# Wilcoxon rank sum test with continuity correction
#
# data: Total.Z by Morning.Caffeine
# W = 2244, p-value = 0.7168
# alternative hypothesis: true location shift is not equal to 0
wilcox.test(Time.in.Wake ~ Morning.Caffeine, conf.int=TRUE, data=zeo)
#
# Wilcoxon rank sum test with continuity correction
#
# data: Time.in.Wake by Morning.Caffeine
# W = 2090, p-value = 0.7623
# alternative hypothesis: true location shift is not equal to 0
# 95 percent confidence interval:
# -5 3
# sample estimates:
# difference in location
# -1
```

We should be able to see a shift in rise or wake time to an earlier time:

```
# convert "05/12/2014 06:45" to "06:45"
zeo$Rise.Time <- sapply(strsplit(as.character(zeo$Rise.Time), " "), function(x) { x[[2]] })
# convert "06:45" to 405 (minutes after midnight);
# strings like "30 s" pass through as the integer 30
interval <- function(x) {
    if (is.na(x)) return(NA)
    if (grepl(" s", x)) {
        as.integer(sub(" s", "", x))
    } else {
        y <- unlist(strsplit(x, ":"))
        as.integer(y[[1]])*60 + as.integer(y[[2]])
    }
}
zeo$Rise.Time <- sapply(zeo$Rise.Time, interval)
## `hist(zeo$Rise.Time)` looks normally distributed, but there's a big outlier, so we'll use a U-test:
wilcox.test(Rise.Time ~ Morning.Caffeine, conf.int=TRUE, data=zeo)
#
# Wilcoxon rank sum test with continuity correction
#
# data: Rise.Time by Morning.Caffeine
# W = 2705, p-value = 0.01863
# alternative hypothesis: true location shift is not equal to 0
# 95 percent confidence interval:
# 5 40
# sample estimates:
# difference in location
# 20
```

A definite hit! Rising 20 minutes earlier seems like a plausible estimate, too. Let's take a look at the graph of rise-time over time:

```
zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
library(ggplot2)
qplot(Sleep.Date, Rise.Time, color=Morning.Caffeine, data=zeo)
```

Two observations immediately jump out:

- the blue points (caffeine-affected) do seem to generally be below the red points (caffeine-free) and the U-test's claim is believable
- there seem to be very distinct temporal patterns, which make any correlations or analysis treacherous: before/after experiments will be worthless since they will sample from distinct periods of rising-time, so an experiment should definitely be blocked as pairs-of-days to minimize the clear drift or sinusoidal pattern.

A more precise analysis with covariates is possible; for example, how late I went to bed might affect when I get up in the morning. But you have to be careful in what you look at - if you look at something like "total sleep length", well, that's partially *caused by* sleeping in! It must be impossible for the covariates to be affected by sleeping in or not. So `Total.Z`, `Time.in.REM`, etc are all out. I think we can include:

- how long it took to fall asleep;
- what time I went to sleep;

which gives us a smaller estimate of 15 minutes:

```
zeo$Start.of.Night <- sapply(strsplit(as.character(zeo$Start.of.Night), " "), function(x) { x[[2]] })
zeo$Start.of.Night <- sapply(zeo$Start.of.Night, interval)
summary(lm(formula = Rise.Time ~ Morning.Caffeine + Start.of.Night + Time.to.Z, data = zeo))
#
# Residuals:
# Min 1Q Median 3Q Max
# -137.86 -32.13 1.84 32.29 109.22
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 63.982 45.647 1.40 0.163
# Morning.CaffeineTRUE -15.847 8.321 -1.90 0.059
# Start.of.Night 0.519 0.100 5.17 7.7e-07
# Time.to.Z 0.286 0.271 1.05 0.294
```

Finally, let's check for damage to my sleep; it's no good avoiding sleeping in if that then makes me feel like shit:

```
wilcox.test(ZQ ~ Morning.Caffeine, conf.int=TRUE, data=zeo)
#
# Wilcoxon rank sum test with continuity correction
#
# data: ZQ by Morning.Caffeine
# W = 2086, p-value = 0.7491
# alternative hypothesis: true location shift is not equal to 0
# 95 percent confidence interval:
# -4 3
# sample estimates:
# difference in location
# -1
wilcox.test(Morning.Feel ~ Morning.Caffeine, conf.int=TRUE, data=zeo)
#
# Wilcoxon rank sum test with continuity correction
#
# data: Morning.Feel by Morning.Caffeine
# W = 2069, p-value = 0.6568
# alternative hypothesis: true location shift is not equal to 0
# 95 percent confidence interval:
# -1.34e-05 1.98e-05
# sample estimates:
# difference in location
# -5.209e-05
```

These are the 2 main measures of whether sleep quality has degraded, and both look good. So it seems the morning caffeine correlates with earlier risings but not with worse sleep or feeling bad when I get up.

Correlation!=causation; there's a plausible alternative: on days when I feel like sleeping in, I "forget" to take a caffeine pill. So it's worth testing. How long does the experiment need to be for 80% power and a shift of 20 minutes? (Not 15m, since I'm not sure how reliable that estimate is.)

```
## Calculate effect size, plug into power formula:
t.test(Rise.Time ~ Morning.Caffeine, data=zeo)
#
# Welch Two Sample t-test
#
# data: Rise.Time by Morning.Caffeine
# t = 2.746, df = 81.84, p-value = 0.007417
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# 6.23 38.99
# sample estimates:
# mean in group FALSE mean in group TRUE
# 299.9 277.2
sd(zeo$Rise.Time)
# [1] 65.19
(299.9 - 277.2) / 65.19
# [1] 0.3482
power.t.test(delta=0.3482, power=0.80, type="paired", alternative="one.sided")
#
# Paired t test power calculation
#
# n = 52.37
# delta = 0.3482
# sd = 1
# sig.level = 0.05
# power = 0.8
# alternative = one.sided
#
# NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs
```

Using *d*=0.35 as an effect size estimate, a proper blind experiment (blocking pairs of days) will take 100 days total (50 placebo pills, 50 caffeine pills). I began 29 June 2014. I made the placebo pills the usual way with Bisquick, tossed together with the caffeine pills to equalize any coating; I made 120, more than I needed, because it's always annoying to set up & make pills, and it only took 40 minutes from start to cleanup. Unfortunately, a few days into the experiment it became clear that my old caffeine pills had absorbed some ambient moisture and the tossing had not equalized the surface flavor, so the placebo pills could be easily distinguished from the caffeine pills by both flavor & texture, rendering this not a blinded & randomized experiment but just a randomized experiment.
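For concreteness, blocking by pairs of days means each consecutive 2-day block contains exactly one caffeine day and one placebo day, in random order, so slow drifts in rise-time hit both arms about equally. A minimal sketch of generating such a schedule follows; the seed, variable names, and data-frame layout are illustrative assumptions, not the scripts actually used:

```r
set.seed(20140629)   # hypothetical seed, only so the schedule is reproducible
n_pairs <- 50        # 50 blocks of 2 days = 100 days

# Within each 2-day block, shuffle one caffeine day and one placebo day
pills <- unlist(lapply(seq_len(n_pairs),
                       function(block) sample(c("caffeine", "placebo"))))

schedule <- data.frame(day   = seq_along(pills),
                       block = rep(seq_len(n_pairs), each = 2),
                       pill  = pills)
head(schedule)
table(schedule$pill)
```

The analysis would then compare the two days within each block, which is the situation the paired power calculation above assumes.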

# External links

- Vitamin D discussion:

    - Hacker News
    - Reddit: 1, 2
    - Imminst
    - *Business Insider* blog
- Reddit: potassium discussion
- coping with Zeo Inc's shutdown:

# Appendix

### Inverse correlation of sleep quality with productivity?

Curiously, playing around with the full potassium data after the 2013 morning experiment, poor sleep quality seemed to correlate with higher mood/productivity ratings:

```
cor.test(pot$Disturbance, pot$MP)
#     Pearson's product-moment correlation
#
# data:  pot$Disturbance and pot$MP
# t = 1.224, df = 49, p-value = 0.2269
# alternative hypothesis: true correlation is not equal to 0
# 95 percent confidence interval:
#  -0.1085  0.4275
# sample estimates:
#    cor
# 0.1722
```

#### Hypotheses

While not statistically-significant, this inverse correlation comes as a surprise and I thought it worth thinking about more. I have a couple theories on what could be going on:

1. it could be an artifact and actually better sleep means better performance: I've always been concerned about the possibility of off-by-one errors in my data or analyses. If better sleep meant better performance (as one would naively suspect), and either sleep data or performance data was "shifted" by one day, then you would observe the exact opposite.

    One would have to carefully check the data and make sure every field is referring to the time it should. If an entry records 10hrs sleep for 3 February 2012, does that refer to sleep that *morning* which is necessary because you were awake during 2 February 2012, or does it refer to the sleep you engage in that *evening* (you go to bed at 11pm 3 February 2012 and that is the sleep data being used)?

    This seems unlikely, since such an error should screw up all sorts of other analyses (for example such a flip ought to have claimed that potassium would help sleep, if days were being reversed).
2. it could be that on productive days, you leap out of bed; but if you are depressed, unmotivated, apathetic, you might hang around in bed for a while after the alarm rings. Depressed people sometimes sleep more than regular people; for pretty much this reason, I'd guess.

    This could be checked by looking at sleep quality indicators in the beginning or middle of the night. For example time to fall asleep (higher on more productive days in this sample), or percentage in deep sleep (mostly done towards the beginning and middle of a sleep; seemed to be lower for productive days). One could try to test the sluggard hypothesis: how much past an alarm one snoozed.
3. it's a temporary correlation of this time period, perhaps related to the potassium, perhaps not.

    This is testable: with more data, does the correlation shrink or go away?
4. I have sometimes wondered if I am depressed. One of the curious facts about depression is that sleep deprivation can temporarily relieve the symptoms of depression in people who prefer evenings (owls), and I am indeed an owl. What does this imply?

    We can do some back-of-the-envelope estimates. Wikipedia reports a very high depression incidence; we'll call it a 25% lifetime risk. But presumably the treatment only works if one is actually in a depressive episode, and while it's unclear what the distribution or length of depression period (as opposed to individual episodes) might be, it seems to be closer to years than months or decades, so we'll put it at ~3 years out of an adult lifespan of ~60 years or a per-year risk of $\frac{1}{20}=0.05$. On closer examination of Selvi et al 2006, the morning/evening split only appears with the total sleep deprivation procedure (morning types see their mood worsen, evening sees it improve) while with partial sleep deprivation both groups seem to see an improvement in their mood; since I rarely skip sleep entirely and such nights are dropped from the Zeo data, the total sleep deprivation results are irrelevant, but then my chronotype being evening doesn't matter. Finally, the sleep deprivation papers estimate <60% effectiveness in the depressed, so that knocks the possibility that both I am depressed and partial sleep deprivation helps me to <0.025. 2.5% is not a large possibility; and my vague speculation and a small inverse correlation do not seem like they would increase that possibility a *lot*.

(If it's not these, I don't have any suggestion on why it might be. Why would poor sleep either cause productivity or be caused by something that later also causes productivity?)
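The off-by-one worry in the first hypothesis can be probed mechanically: recompute the correlation with one series shifted a day in each direction and see whether the sign or size changes. A minimal sketch, assuming two equal-length vectors ordered by date; `lagged_cor` is a hypothetical helper, and the toy vectors are made up purely for illustration and are not my data:

```r
# Correlate x on day t with y on day t+lag (lag may be negative)
lagged_cor <- function(x, y, lag = 0) {
    n <- length(x)
    stopifnot(length(y) == n, abs(lag) < n - 2)
    if (lag > 0) {
        cor(x[1:(n - lag)], y[(1 + lag):n], use = "complete.obs")
    } else if (lag < 0) {
        cor(x[(1 - lag):n], y[1:(n + lag)], use = "complete.obs")
    } else {
        cor(x, y, use = "complete.obs")
    }
}

# Toy illustration: y simply repeats x one day later
x <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
y <- c(NA, x[-length(x)])
sapply(c(-1, 0, 1), function(l) lagged_cor(x, y, lag = l))
```

In the toy case only the lag of +1 recovers the built-in relationship. On real data, a correlation that is much stronger at a lag of ±1 than at 0 would be a hint that the dates are misaligned.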

#### Analysis

But before rashly assuming I am depressive or engaging in personally costly self-experiments like sleep deprivation, I decided on 26 April 2013 to check the correlation on a larger dataset.

Typing up my full self-rating dataset of 416 days and cleaning up all the data^{40}, I rechecked the correlation: *r*=0.066.^{41} This is noticeably smaller (hence, less practically relevant) than the previous correlation, is also not statistically-significant, and shrinking is what one would expect from a spurious relationship.

To be more sure, I reused some of the techniques from my analysis of the effect of weather on my mood/productivity (specifically, ordinal logistic regression) and looked for a relationship; the result was similar, an odds which was inverse but close to no effect (1.057^{42}). More importantly, when all the other variables are taken into account in the logistic regression, things change^{43}: with other data to condition on, the inverse relationship of sleep quality with mood/productivity reverses and becomes the expected relationship (an increase in sleep disturbances predicts lower mood/productivity); many of the other variables turn out to be far stronger predictors (bigger odds); and some of the signs look odd (how can total sleep time predict increased mood/productivity, yet increasing all forms of sleep - REM/light/deep - predicts *decreased* mood/productivity?). I attempted to construct a simpler model, which wound up ignoring any metric of sleep disturbance and ignoring all but 3 variables, and concluding that "Morning Feel" was the most important predictor^{44} - which makes a lot of sense to me, and confirms my previous experiments' focusing on the "Morning Feel" variable.

Given this weakening and in the absence of any corroborating information, I consider it highly unlikely that the original correlation is reflecting an anti-depressant effect due to sleep deprivation. A followup in a few years may be warranted to see if a larger still dataset will shrink the correlation closer to zero.

## Phases of the moon

Due to its increasing length and complexity, I have split this out to Lunar sleep.

## SDr lucid dreaming: exploratory data analysis

In October 2012, an acquaintance offered me an extract from his free-form data on lucid dreaming which he had been compiling since 2004, to see what insights I could extract. In May 2013, I augmented it with another 60 entries.

### Data cleaning

The original text was a serious mess, and I put several hours into cleaning it up and organizing it into something more sensible. This wasn't enough, so I wrote an ugly Haskell program to parse it into a quasi-CSV file:

```
import Data.List (isInfixOf, isPrefixOf, intercalate)
import Data.List.Split (splitOn) -- http://hackage.haskell.org/package/split

main :: IO ()
main = do txt <- readFile "2012-sdr-dream.txt"
          let txt' = filter (not . isPrefixOf "#") $ lines txt
          let header = drop 2 $ head $ filter (isPrefixOf "# Sleep Date,") $ lines txt
          let fields = map (splitOn ",") txt'
          let csvs = map convert fields
          putStrLn $ unlines (header : map show csvs)

data CSVEntry = CSVEntry { sleepDate :: String, totalZ :: Int,
                           wakeTime :: String, intensity :: String, recall :: String,
                           emotion :: String, interrupted :: Bool, melatonin :: Bool, lucid :: String }

instance Show CSVEntry where
    show a = intercalate "," [sleepDate a, if totalZ a == 0 then "" else show (totalZ a),
                              wakeTime a, intensity a, recall a, emotion a,
                              if interrupted a then "1" else "0", if melatonin a then "1" else "0", lucid a]

convert :: [String] -> CSVEntry
convert xs = CSVEntry { sleepDate = safeHead $ filter (\x -> isInfixOf "." x || isInfixOf "20" x) xs,
                        totalZ = timeToMinutes $ drop 12 $ safeHead $ filter (isInfixOf "dreamtime: ") xs,
                        wakeTime = drop 7 $ safeHead $ filter (isInfixOf "wake: ") xs,
                        intensity = drop 6 $ safeHead $ filter (isInfixOf "int: ") xs,
                        recall = drop 9 $ safeHead $ filter (isInfixOf "recall: ") xs,
                        emotion = drop 6 $ safeHead $ filter (isInfixOf "emo: ") xs,
                        lucid = drop 8 $ safeHead $ filter (isInfixOf "lucid: ") xs,
                        interrupted = any (isInfixOf "interrupted") xs,
                        melatonin = any (isInfixOf "melatonin") xs }
  where
    safeHead :: [String] -> String
    safeHead ys = if null ys then "" else head ys
    -- clock hour:minute to total minutes: timeToMinutes "4:30" ~> 270
    timeToMinutes :: String -> Int
    timeToMinutes a = if null a then 0 else let (x,y) = break (==':') a
                                            in read x * 60 + read (tail y)
```

### Analysis

This was usable. My next question was: since none of his routines were randomized and correlations were all that one could extract, what correlations *were* in his data?

```
table <- read.csv("http://www.gwern.net/docs/zeo/2013-sdr-dream.csv")
summary(table)
# Sleep.Date Total.Z Wake.Time Intensity Recall Emotion
# 2011.10.02: 2 Min. : 120 :217 Min. :0.10 Min. :0.000 Min. :-0.50
# 2011.11.26: 2 1st Qu.: 480 16:00 : 3 1st Qu.:0.30 1st Qu.:0.200 1st Qu.: 0.00
# 2012.02.28: 2 Median : 600 11:00 : 2 Median :0.40 Median :0.300 Median : 0.20
# 2012.04.15: 2 Mean : 613 13:23:00: 2 Mean :0.44 Mean :0.367 Mean : 0.18
# 2012.06.21: 2 3rd Qu.: 720 19:17:00: 2 3rd Qu.:0.50 3rd Qu.:0.500 3rd Qu.: 0.40
# 2013.01.23: 2 Max. :1320 4:55:00 : 2 Max. :7.00 Max. :1.000 Max. : 0.70
# (Other) :316 NA's :8 (Other) :100 NA's :94 NA's :26 NA's :296
# Interrupted Melatonin Lucid Day.quality
# Min. :0.00 Min. :0.0000 Min. :0.0 Min. :0.10
# 1st Qu.:0.00 1st Qu.:0.0000 1st Qu.:0.1 1st Qu.:0.30
# Median :0.00 Median :0.0000 Median :0.2 Median :0.40
# Mean :0.07 Mean :0.0762 Mean :0.2 Mean :0.42
# 3rd Qu.:0.00 3rd Qu.:0.0000 3rd Qu.:0.2 3rd Qu.:0.52
# Max. :1.00 Max. :1.0000 Max. :0.6 Max. :0.70
# NA's :76 NA's :319 NA's :312
# These 2 date fields haven't been turned into anything useful, so we'll just delete them
# (assigning NULL drops a data-frame column; `rm()` only removes whole objects):
table$Wake.Time <- NULL
table$Sleep.Date <- NULL
# Warning: 'Lucid' has just 9 datapoints, and 'Melatonin' just 6!
# Table cleaned up heavily by hand from default R output:
# deleted duplicates, censored any correlation -0.1<x<0.1 etc.
cor(table,use="pairwise.complete.obs")
# Recall Emotion Interrupted Melatonin Lucid Day.quality
# Total.Z -0.12 -0.43 0.56
# Intensity 0.35 0.37 0.79
# Recall 0.16 -0.16 0.14 -0.15
# Emotion 0.28 -0.14
# Interrupted 0.91
# Melatonin 0.25
```

Much of the data is too impoverished to draw any suggestions from. The remaining correlations are:

- "Intensity"/"Recall": *r*=0.35

    The causality is likely "Intensity"->"Recall"; either one is probably impossible to experimentally manipulate.
- "Intensity"/"Emotion": *r*=0.37

    Causality could go either way or to a third factor; "Emotion" *might* be manipulable by intending to dream of disturbing topics, but might not.
- "Interrupted"/"Recall": *r*=-0.16
- "Interrupted"/"Emotion": *r*=0.28

    "Interruption" is experimentally manipulable by eg. an alarm clock or roommate. "Recall" might be improved by some change in journaling, for example doing it at your bed instead of waiting until you're on your computer. The positive correlation with "Emotion" suggests that, per the WILD methodology of lucid dreaming (see LaBerge & Rheingold, *Exploring the World of Lucid Dreaming*), a temporary awakening does increase the chance of a lucid dream (laden with emotion).
- "Melatonin" interestingly correlates with both day quality and with reduced sleep; this is interesting because `Total.Z` increasing also increased `Day.quality`, so it's not clear how melatonin could do both at the same time if more sleep is otherwise better. The correlations may be statistically-significant but the data is too wretched and the melatonin/day-quality variables too few to say anything further.

(One observation that came to mind working on cleaning the data was that collection was very sparse, sporadic, and accidental-looking.)
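With collection this sparse, a correlation matrix computed with `use="pairwise.complete.obs"` hides how few rows stand behind each cell. One way to see that is to tabulate, for every pair of variables, the number of rows where both were recorded. A minimal sketch; `pairwise_n` is a hypothetical helper and the toy data frame is made up only to show the shape of the output:

```r
# Count, for every pair of columns, the rows where both values are present
pairwise_n <- function(df) {
    present <- (!is.na(as.matrix(df))) + 0   # 1 where a value was recorded, else 0
    crossprod(present)                       # entry [i, j] = rows with both i and j
}

# Made-up toy data, not the dream log
toy <- data.frame(Recall    = c(0.2, 0.3, NA, 0.5, 0.4),
                  Lucid     = c(NA, 0.1, NA, 0.2, NA),
                  Melatonin = c(0, NA, 1, 0, NA))
pairwise_n(toy)
```

Read next to the correlation matrix, a cell backed by only a handful of rows (as the "Lucid" and "Melatonin" columns are here) can be discounted accordingly.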

So these general points suggest 3 future overlapping approaches:

- deliberate use of interruptions (maybe randomized), to investigate effect on lucid dreaming
- more systematic usage (perhaps randomized or blinded) of melatonin, to allow correlations or causal inferences to other variables
- attacking the unsystematic data collection (perhaps it's too much trouble to do all those variables each day?) by getting a Zeo to handle part of the data collection for you.

The obvious and cheaper alternative to the Zeo would be the Fitbit, one of the accelerometers. There aren't many comparisons; Diana Sherman compared one night, and Joe Betts-LaCroix compared ~38 nights of data. In both cases, the Fitbit seemed to be pretty similar to the Zeo at estimating total sleep time (the only thing it can measure). Betts-LaCroix explicitly recommends the Zeo, but I'm not clear on whether that is due to the better data quality or because Fitbit made it hard to impossible for him to extract the detailed Fitbit data while Zeo offers easy exporting. In any case, I already have the Zeo and I've come to like the detailed information.

I had previously tried huperzine-A and subjectively noticed no effect from it, but I had no way of really noticing any effect on sleep, and Timothy Ferriss in his *The Four-hour Body* claims:

> Taking 200 milligrams of huperzine-A 30 minutes before bed can increase total REM by 20-30%. Huperzine-A, an extract of *Huperzia serrata*, slows the breakdown of the neurotransmitter acetylcholine. It is a popular nootropic (smart drug), and I have used it in the past to accelerate learning and increase the incidence of lucid dreaming. I now only use huperzine-A for the first few weeks of language acquisition, and no more than three days per week to avoid side effects. Ironically, one documented side effect of overuse is insomnia. The brain is a sensitive instrument, and while generally well tolerated, this drug is contraindicated with some classes of medications. Speak with your doctor before using.

My own suspicion, given the existence of neuron-level sleep in mice, poor self-monitoring in humans, and anecdotal reports about polyphasic sleep, is that polyphasic sleep is a real & workable phenomenon but that it comes at the price of a large chunk of mental performance.

Kruschke 2012 argues that there is no need for people to use the old framework of

*p*-values and null hypotheses etc, with their many well-known philosophical difficulties and misleading interpretations - interpretations I, alas, perpetuate in my analyses with my use of statistical significance:Nevertheless, some people have the impression that conclusions from NHST and Bayesian methods tend to agree in simple situations such as comparison of two groups: âThus, if your primary question of interest can be simply expressed in a form amenable to a t-test, say, there really is no need to try and apply the full Bayesian machinery to so simple a problem.â (Brooks, 2003, p.Â 2694) This article shows, to the contrary, that Bayesian parameter estimation provides much richer information than the NHST t-test, and that its conclusions can differ from those of the NHST t-test. Decisions based on Bayesian parameter estimation are better founded than NHST, whether the decisions of the two methods agree or not. The conclusion is bold but simple: Bayesian parameter estimation supersedes the NHST t-test.

Unfortunately, while I have no love for NHST, I *did* find it much easier to use the NHST concepts & code when learning how to do these analyses. In the future, hopefully I can switch to Bayesian techniques.

The usual way to correct for the issue of multiple comparisons inflating results (a big problem in epidemiology and why their results are so often false) is to use a Bonferroni correction - if I look at the *p*-values for 7 Zeo metrics, I wouldn’t consider any to be statistically-significant at “*p*=0.05” unless they were actually statistically-significant at $\frac{0.05}{7} = 0.00714 \approx 0.007$, which is even more stringent than the rarer “*p*=0.01” criterion. With the even stronger criterion “*p*=0.007”, it’s a safe bet that *none* of my tests give statistically-significant results. Which may be the right thing to conclude, since all my data is just *n*=1 and unreliable in many ways, but still, the Bonferroni correction is not being very helpful here.

The caveat is that the Bonferroni correction is intended for use on “independent” data, while the Zeo metrics are all very dependent, some by *definition* (eg. ZQ is defined partly as what the REM sleep length was, AFAIK). So while the Bonferroni correction will still do the job of only letting through *really* statistically-significant data, it’ll do so by throwing out way more potentially good results than one has to. (It’ll avoid some false positives by making many false negatives.) So what should we do?

Andy McKenzie suggested limiting our false discovery rate by using the method of Benjamini & Hochberg 1995:

> …let’s say that you test 6 hypotheses, corresponding to different features of your Zeo data. You could use a t-test for each, as above. Then aggregate and sort all the *p*-values in ascending order. Let’s say that they are 0.001, 0.013, 0.021, 0.030, 0.067, and 0.134.
>
> Assume, arbitrarily, that you want the overall false discovery rate to be 0.05, which is in this context called the *q*-value. You would then sequentially test, from the last value to the first, whether the current *p*-value is less than $\frac{\text{the current index} \times \text{the false discovery rate}}{\text{the overall number of hypotheses}}$. You stop when you get to the first true inequality and call the *p*-values of the rest of the hypotheses [statistically-]significant.
>
> So in this example, you would stop when you correctly call $0.030 < \frac{4 \times 0.05}{6}$, and only the hypotheses corresponding to the first four [smallest] *p*-values would be called [statistically-]significant.

If we correct for multiple comparisons (see previous footnote) at *q*-value=0.05, none of them survive:

    R> p.adjust(c(0.11,0.77,0.89,0.16,0.63,0.74,0.73,0.63,0.20), method="BH") < 0.05
    [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Oh well.
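
The step-up procedure in the McKenzie quote above can be checked by hand against R's built-in `p.adjust`. What follows is a minimal sketch using the six example *p*-values from the quote; the variable names (`bonferroni`, `bh_manual`, `bh_builtin`) are illustrative labels, not anything from the original analyses.

```r
# Example p-values and false discovery rate from the quoted explanation
p <- c(0.001, 0.013, 0.021, 0.030, 0.067, 0.134)
q <- 0.05
m <- length(p)

# Bonferroni: every p-value must clear the single threshold q/m
bonferroni <- p < q / m

# Benjamini-Hochberg step-up: find the largest rank k whose sorted p-value
# is below k*q/m, then call the k smallest p-values significant
sorted <- sort(p)
passes <- sorted < seq_len(m) * q / m
k <- if (any(passes)) max(which(passes)) else 0
bh_manual <- rank(p, ties.method = "first") <= k

# Built-in equivalent: compare BH-adjusted p-values against q
bh_builtin <- p.adjust(p, method = "BH") < q

sum(bonferroni)  # how many survive Bonferroni
k                # how many survive Benjamini-Hochberg
```

On these six values Bonferroni keeps only the smallest *p*-value, while the step-up rule keeps the first four, which is the sense in which it throws away fewer potentially good results.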

“Blocking” is a variation on a simple randomized design where, instead of considering each day separately and randomizing a single day, we randomize pairs of days, or more; so instead of flipping our coin to decide whether “this week” is placebo, we flip our coin to decide whether “this week will be placebo & next active” or “this week active & next placebo”. This has 2 big advantages which justify the complexity:

- Often, I’m worried about simple randomization leading to an imbalance in control vs experimental; if I’m only getting 20 total datapoints on something, then randomization could easily lead to something like 14 control and 6 experimental datapoints - throwing out a lot of statistical power compared to 10 control and 10 experimental! Why am I losing power? Because data is subject to diminishing returns: each new point reduces the standard error of your estimates less than the previous one did (since the total error shrinks as, roughly, the inverse of the square root of the total sample size; the difference between √1 and √2 is bigger and shrinks error more than √2 vs √3, etc). So the extra 4 control datapoints reduce the error less than the lost 4 experimental datapoints would have, and this leaves me with a final answer less precise than if it had been exactly 10:10. (If diminishing returns isn’t intuitive, imagine taking it to an extreme: is 10:10 just as good as 5:15? As good as 2:18? How about *0*:20?) But if I pair days like this, then I *know* I will get exactly 10:10.
- Blocking is the natural way to handle multiple-day effects or trends: if I think lithium operates slowly, I will pair entire weeks or months, rather than pairing days and hoping that enough experimental and control days form runs which will reveal any trend rather than wash it out in averaging.
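
The two points above can be sketched in a few lines of R. This is only an illustration under simple assumptions: 20 datapoints in blocks of 2, equal variance in both arms, and a standard error for the difference in means that scales as $\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$. The seed and the names `simple`, `blocked`, and `se_factor` are arbitrary choices for the example.

```r
set.seed(2013)  # arbitrary seed, only so the sketch is reproducible

n_pairs <- 10   # 10 blocks of 2 days each = 20 datapoints

# Simple randomization: flip a coin for every day independently
simple <- sample(c("placebo", "active"), size = 2 * n_pairs, replace = TRUE)

# Blocked randomization: within each pair, one coin flip decides the order
blocked <- unlist(lapply(seq_len(n_pairs),
                         function(i) sample(c("placebo", "active"))))

table(simple)   # split varies from run to run
table(blocked)  # always 10 placebo and 10 active

# Relative standard error of a difference in means for a given split
se_factor <- function(n1, n2) sqrt(1 / n1 + 1 / n2)
se_factor(10, 10)  # balanced split
se_factor(14, 6)   # unbalanced split: larger, so less precise
```

The gap between the two `se_factor` values is the power thrown away by an unlucky 14:6 split, and it grows quickly as the split becomes more lopsided.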

The net present value formula is the annual savings divided by the natural log of 1 plus the discount rate, out to eternity. Exponential discounting means that a bond that expires in 50 years is worth a surprisingly similar amount to one that continues paying out forever. For example, a 50 year bond paying $10 a year at a discount rate of 5% is worth

`sum (map (\t -> 10 / (1 + 0.05)^t) [1..50]) ~> 182.5`

but if that same bond never expires, it’s worth

`10 / log 1.05 = 204.9`

or just $22.4 more! My own expected longevity is ~50 more years, but I prefer to use the simple natural log formula rather than the more accurate summation. Either way is interesting; Vaniver:

> …possibly a way to drive it home is to talk about dividing by `log 1.05`, which is essentially multiplying by 20.5. If you can make a one-time investment that pays off annually until you die, that’s worth 20.5 times the annual return, and multiplying the value of something by 20 can often move it from not worth thinking about to worth thinking about.

Vaniver notes that one reason I might be less confident than you would expect is that many substances or supplements lose effect over time as one’s body regains homeostasis and compensates for the substance, building tolerance. Which is quite true, and a major reason I tested melatonin - I was sure it worked for me in the past, but did it *still* work?

For simplicity, in all my VoI calculations I assume that I’ll stop buying the supplement (or doing the activity) if I hit a negative result. The *proper* way a real analyst would do this value of information question would be to say that the negative result gives us additional information which changes the expected-value of melatonin use.

In my melatonin article, I calculated that since melatonin saved me close to an hour while each dose cost literally a penny or two, the value was astronomical - $2350.60 a year! By Bayes’s formula, if I started with 80% confidence and had a 95% accurate test, a negative result drops my 80% all the way down to 17%. We get this by using a derivation of Bayes’s theorem:

$P(a \mid b) = \frac{P(b \mid a) \times P(a)}{(P(b \mid a) \times P(a)) + (P(b \mid \neg a) \times P(\neg a))} = \frac{0.05 \times 0.8}{(0.05 \times 0.8) + (0.95 \times 0.2)} = 0.174$

But ironically if I now believed that melatonin only had a 17% chance of doing something helpful rather than nothing at all (as compared to my original 80% belief), well, 17% of $2350 (roughly $400) is still way more money than the melatonin cost ($10), so I’d use it *anyway*!

Would it make sense to iterate again and test melatonin a second time? Well, what does the calculation say? We have a new prior of 17%; what happens if we get a negative result again? $\frac{0.05 \times 0.174}{(0.05 \times 0.174) + (0.95 \times 0.826)} \approx 0.011$ and then the expected value is $0.01096 \times 2350 \approx 25.8$, which is not much more than the cost of $10, and given the difficult-to-quantify possibility of negative long-term health effects, is not enough of a profit to really entice me.
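
The repeated Bayesian update above is easy to reproduce. A minimal sketch, assuming (as the text does) a test that is 95% accurate in both directions and an annual value of about $2350; the function name `posterior_after_negative` is hypothetical.

```r
# Posterior probability that the intervention works after a negative result.
# prior    = P(works) before the experiment
# accuracy = P(test gives the right answer), assumed equal in both directions
posterior_after_negative <- function(prior, accuracy = 0.95) {
  false_negative <- 1 - accuracy  # P(negative result | it works)
  true_negative  <- accuracy      # P(negative result | it does not work)
  (false_negative * prior) /
    (false_negative * prior + true_negative * (1 - prior))
}

annual_value <- 2350  # dollars per year, from the melatonin estimate above

p1 <- posterior_after_negative(0.80)  # belief after one negative result
p2 <- posterior_after_negative(p1)    # belief after a second negative result

round(c(p1, p2), 3)              # posterior probabilities
round(c(p1, p2) * annual_value)  # expected annual value at each stage
```

Comparing the second line of output against the roughly $10 cost of the supplement gives the stop-or-continue decision described in the text.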

*Technology Review* editor Emily Singer noticed the same problem when using her Zeo.

The `R` interpreter session, loading a CSV as before:

    R> zeo <- read.csv("http://www.gwern.net/docs/zeo/2011-zeo-oneleg.csv")
    R> colnames(zeo)[24] <- "OneLeg"
    R> l <- lm(cbind(ZQ, Total.Z, Time.to.Z, Time.in.Wake, Time.in.REM, Time.in.Light, Time.in.Deep, Awakenings, Morning.Feel) ~ OneLeg, data=zeo)
    R> summary(manova(l))
              Df Pillai approx F num Df den Df Pr(>F)
    OneLeg     1  0.177     1.37      9     57   0.23
    Residuals 65
    R> summary(l)
    Response ZQ :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   96.231      1.712   56.22   <2e-16
    OneLeg        -1.244      0.883   -1.41     0.16
    Response Total.Z :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   514.67       8.84    58.2   <2e-16
    OneLeg         -4.09       4.56    -0.9     0.37
    Response Time.to.Z :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   14.949      1.373   10.89  2.7e-16
    OneLeg         0.469      0.708    0.66     0.51
    Response Time.in.Wake :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   12.821      2.786    4.60    2e-05
    OneLeg        -0.369      1.436   -0.26      0.8
    Response Time.in.REM :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   168.72       4.25   39.70   <2e-16
    OneLeg         -5.33       2.19   -2.43    0.018
    Response Time.in.Light :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   277.15       6.06   45.75   <2e-16
    OneLeg          2.76       3.12    0.88     0.38
    Response Time.in.Deep :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   69.282      1.802   38.44   <2e-16
    OneLeg        -1.558      0.929   -1.68    0.098
    Response Awakenings :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   4.1538     0.3690   11.26   <2e-16
    OneLeg       -0.0513     0.1902   -0.27     0.79
    Response Morning.Feel :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   2.8718     0.1014    28.3   <2e-16
    OneLeg       -0.0525     0.0523    -1.0     0.32

If we correct for multiple comparisons (see previous footnote on the Bonferroni correction) at *q*-value=0.05, none of them survive:

    R> p.adjust(c(0.16,0.37,0.51,0.80,0.02,0.38,0.10,0.79,0.32), method="BH") < 0.05
    [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Oh well! Statistics is a harsh mistress indeed.

“Sleep Behavior Disorders in a Large Cohort of Chinese (Taiwanese) Patients Maintained by Long-Term Hemodialysis” (Chen et al 2006):

> …The increased odds of high PSQI score for greater hemoglobin level and for high ESS score for use of vitamin D analogues were unexpected results for which we cannot speculate about the cause or association and that may simply be spurious findings arising from statistical analysis.

“Relationships among dietary nutrients and subjective sleep, objective sleep, and napping in women” (Grandner et al 2010):

> This study found a [statistically-]significant relationship between circadian phase of sleep and dietary Vitamin D intake. Later sleep acrophase, an indicator of sleep timing, was associated with more dietary Vitamin D. For most people, most Vitamin D is obtained through sunlight(44), though dietary Vitamin D is usually obtained through supplementation, usually in pills or in dairy products(44). It is currently unknown why those who consumed more Vitamin D would demonstrate a sleep phase delay, especially since in this same subject group, those exposed to more light had earlier circadian acrophases(45).

“The midpoint of sleep is associated with dietary intake and dietary behavior among young Japanese women” (Sato-Mito et al 2011):

> Late midpoint of sleep was [statistically-]significantly negatively associated with the percentage of energy from protein and carbohydrates, and the energy-adjusted intake of cholesterol, potassium, calcium, magnesium, iron, zinc, vitamin A, vitamin D, thiamin, riboflavin, vitamin B(6), folate, rice, vegetables, pulses, eggs, and milk and milk products.

“Low vitamin D levels in adults with longer time to fall asleep: US NHANES, 2005-2006”, Shiue 2013:

> …Table 2 shows associations of serum 25(OH)D concentrations and sleep characteristics. After adjusting for age, sex, ethnicity, high blood pressure, body mass index, active smoking, depressive symptoms, and survey weighting, no association between serum 25(OH)D concentrations and sleeping hours was observed (beta 0.19, 95% CI −0.40 0.77, *p* = 0.51) while a significant inverse association was found between serum 25(OH)D concentrations and minutes to fall asleep (beta −3.13, 95% CI −5.62 to −0.64, *p* = 0.02). Moreover, people with higher vitamin D levels could be more likely to complain sleep problems (OR 1.60, 95% CI 1.20 to 2.14, *p* = 0.004)…. It was observed that serum 25(OH)D concentrations were significantly associated with minutes to fall asleep, indicating that people with lower vitamin D levels tended to have longer time to fall asleep. On the other hand, it was also observed that people with higher vitamin D levels had more sleep complaints, although the reason is unclear.

The problem was the original vitamin D3 capsule: I couldn’t squeeze out *all* the oil, so I settled for squeezing out most, and then pushing the original capsule into the new capsule. So they contain everything they should, but they have a visible “bubble” inside them (the original capsule). Hence, the need for literal blinding. Otherwise, they’re pretty good: identical shape and weight.

See the general remarks in LiveStrong, “Vitamin D warning: Too much can harm your heart”, and the 2009 study “Relation of serum 25-hydroxyvitamin D to heart rate and cardiac work (from the National Health and Nutrition Examination Surveys)”.

For “Quality” & “ZQ”: higher = better.

Headband came loose at some point, data useless.

Headband came loose at some point, data useless.

The preponderance of `True` is because while recording the scores, I normalized them; in retrospect, I shouldn’t’ve bothered:

    logBinaryScore = sum . map (\(result,p) -> if result then 1 + logBase 2 p else 1 + logBase 2 (1-p))
    logBinaryScore [(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),
                    (True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.55),(True,0.55),(True,0.55),
                    (True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),
                    (True,0.60),(True,0.65),(True,0.65),(True,0.65),(True,0.65),(True,0.65),(True,0.65),
                    (True,0.65),(True,0.65),(True,0.70),(True,0.70),(True,0.70),(True,0.70),(True,0.75),
                    (True,0.75),(False,0.55),(False,0.6),(False,0.6),(False,0.7),(False,0.7),(False,0.75)]
    5.4

The usual session:

    R> zeo <- read.csv("http://www.gwern.net/docs/zeo/2012-zeo-vitamind.csv")
    R> colnames(zeo)[26] <- "Vitamin.D"
    R> l <- lm(cbind(Total.Z, Time.in.REM, Time.in.Deep, Time.in.Wake, Awakenings, Morning.Feel, Time.to.Z) ~ Vitamin.D, data=zeo)
    R> summary(manova(l))
              Df Pillai approx F num Df den Df Pr(>F)
    Vitamin.D  1   0.31     2.12      7     33   0.07
    Residuals 39
    R> summary(l)
    Response Total.Z :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   533.37       8.16   65.37   <2e-16
    Vitamin.D     -19.73      11.14   -1.77    0.084
    Response Time.in.REM :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   175.63       4.44    39.5   <2e-16
    Vitamin.D     -14.54       6.07    -2.4    0.021
    Response Time.in.Deep :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    55.00       2.04   26.98   <2e-16
    Vitamin.D       2.32       2.78    0.83     0.41
    Response Time.in.Wake :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    26.32       3.83    6.88  3.2e-08
    Vitamin.D       2.50       5.22    0.48     0.63
    Response Awakenings :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    7.579      0.598    12.7  2.1e-15
    Vitamin.D      0.739      0.817     0.9     0.37
    Response Morning.Feel :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    2.842      0.134   21.21   <2e-16
    Vitamin.D     -0.524      0.183   -2.86   0.0067
    Response Time.to.Z :
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)    17.58       3.43    5.12  8.6e-06
    Vitamin.D       3.47       4.69    0.74     0.46

Correcting for multiple comparisons at *q*-value=0.05, of our 7 pessimistic *p*-values, 1 survives:

    R> p.adjust(c(0.084,0.021,0.41,0.63,0.37,0.0067,0.46), method="BH") < 0.05
    [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE

Remarkable - the first time a *p*-value survived. (That was the `Morning.Feel` one.)

I originally input the data as “Other Disruptions 4” through the Zeo web interface, since I assumed that if “Other Disruptions 3” was `SSCF.12`, that would put the data into `SSCF.13` - but it turns out that does not get *exported in the CSV*! Apparently the CSV is limited to 1-3. So I edited the exported CSV and just reused `SSCF.1`. Hopefully Zeo Inc. will fix the export functionality, since it’s very frustrating to be able to see the data used in the “Cause & Effect” tool, for example, but not export it.

Gustavo Lacerda wondered if the two-sample *t*-test (or linear regressions in general) were really justifiable to use - could days be correlated, in which case the *p*-values would be overstated and my results actually weaker than they look? He suggested testing my full Zeo dataset to see whether `Morning Feel` can be predicted from day to day by a (relatively) simple linear autocorrelation regression looking at all previous recorded days:

    R> zeo <- read.csv("http://www.gwern.net/docs/zeo/gwern-zeodata.csv")
    # Master Zeo export file is periodically updated; your results may not be identical
    R> n <- length(zeo$Morning.Feel); n
    [1] 1050
    R> reg <- lm(Morning.Feel[2:n] ~ Morning.Feel[1:(n-1)], data=zeo)
    R> summary(reg)
    Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)               2.5727     0.0943    27.3   <2e-16
    Morning.Feel[1:(n - 1)]   0.0689     0.0329     2.1    0.036
    Residual standard error: 0.771 on 918 degrees of freedom
      (129 observations deleted due to missingness)
    Multiple R-squared: 0.00476, Adjusted R-squared: 0.00368
    F-statistic: 4.39 on 1 and 918 DF, p-value: 0.0364
    # Given that pretty much all the ratings are 2, 3, or 4, and the r^2 is <0.01
    # with a residual error of 0.75, that doesn't seem very correlated.
    # although the _p_ does indicate there's a real (but very small) correlation from
    # day to day, so I guess the p-values may be a *little* overstated
    R> cor(zeo$Morning.Feel[2:n], zeo$Morning.Feel[1:(n-1)], use = "complete.obs")
    [1] 0.069
    # we can also graph the lags:
    R> acf(zeo$Morning.Feel, na.action=na.pass, main="Do days predict subsequent days at various temporal distances?")
    # incidentally - 129 observations missing? What's going on?
    R> zeo$Morning.Feel
    [1] NA 2 3 3 4 3 3 2 NA NA 4 4 NA 3 NA 2 4 4 NA 4 3 3 3 4 2 3 2 3 NA 3 NA
    [32] NA 4 NA 4 NA NA NA NA NA NA NA NA NA NA NA NA NA 3 4 NA NA 4 4 3 4 NA NA NA NA NA NA
    [63] NA 4 NA 2 3 3 NA NA 3 NA 3 3 NA 2 NA NA NA NA 3 NA NA NA NA NA NA NA 3 4 NA 4 3
    [94] 3 3 4 4 3 3 3 2 3 3 2 3 3 3 2 NA 3 3 4 3 NA 3 NA 3 NA 3 3 3 NA 3 3
    [125] NA NA NA NA NA 2 NA NA 3 2 3 NA NA NA NA NA NA 3 2 3 2 2 2 2 2 3 3 3 3 NA 3
    [156] 3 2 2 3 3 2 3 2 3 NA 2 NA NA 4 3 3 3 2 3 NA 4 3 2 3 3 3 3 3 3 4 3
    [187] 4 3 3 3 3 3 2 3 2 3 3 3 NA 3 1 4 NA 3 2 4 4 2 2 3 3 3 3 3 3 3 3
    [218] 3 3 4 3 3 2 2 3 3 2 3 3 3 2 2 3 3 3 3 3 4 3 3 2 2 2 1 2 3 3 NA
    [249] 3 3 3 3 3 3 3 3 2 3 2 3 2 3 3 3 2 3 3 2 3 3 3 3 4 3 3 4 3 4 2
    [280] 3 NA 3 3 2 2 2 3 3 3 3 2 3 3 2 2 2 3 3 2 2 3 2 3 3 3 3 3 3 2 3
    [311] 3 2 1 3 4 3 2 3 3 2 2 3 3 3 1 2 NA 2 3 2 2 3 3 2 3 3 NA 3 NA 3 3
    [342] 2 3 2 2 3 3 3 3 1 3 3 3 2 1 3 NA 2 3 3 3 3 2 1 2 2 3 2 2 3 3 3
    [373] 3 3 4 3 2 3 3 3 2 2 3 NA 3 2 3 4 4 3 3 2 4 3 2 3 3 4 3 4 3 3 NA
    [404] 2 2 3 3 3 4 4 3 1 3 3 2 4 3 3 3 2 3 2 4 2 4 3 3 3 4 NA 2 3 3 3
    [435] 3 2 1 2 2 3 2 3 1 4 3 3 4 3 3 2 2 2 2 3 1 3 3 3 4 3 3 2 3 3 4
    [466] 4 2 2 3 3 2 2 4 3 3 3 2 3 2 2 3 2 3 2 3 2 3 2 3 2 3 3 3 2 3 3
    [497] 2 3 1 2 3 3 3 3 2 2 3 3 1 3 2 3 3 4 1 3 4 1 4 3 4 3 3 2 3 2 NA
    [528] 3 4 2 4 3 3 3 4 4 1 3 2 3 3 3 2 3 4 3 3 2 3 3 3 4 2 2 2 3 3 3
    [559] 4 4 1 3 3 3 4 3 4 3 3 1 1 2 3 2 3 3 4 3 3 3 2 2 3 4 4 1 4 4 3
    [590] 4 3 3 3 3 3 2 3 3 2 3 3 2 3 4 2 2 3 1 3 3 2 3 3 2 2 3 4 3 2 1
    [621] 3 3 3 3 2 4 2 3 3 3 3 4 3 3 3 NA 3 NA 4 3 2 2 2 2 3 3 3 4 3 2 3
    [652] 2 3 3 1 3 4 3 3 4 4 4 2 3 2 1 4 2 4 3 2 3 3 3 3 2 3 4 2 2 2 2
    [683] 3 4 3 4 2 2 3 4 2 3 3 3 2 2 2 3 2 2 2 4 3 3 3 2 2 1 2 4 3 3 3
    [714] 3 3 2 2 2 3 3 3 3 1 1 2 3 3 4 3 3 3 4 3 4 3 3 3 3 3 3 3 2 2 2
    [745] 2 3 2 3 3 2 1 3 3 2 3 3 3 3 2 3 4 4 2 3 3 4 4 2 4 4 4 3 3 3 1
    [776] 3 3 2 3 3 4 4 3 1 4 4 4 3 3 3 2 1 2 2 3 3 3 2 4 3 2 4 3 3 4 4
    [807] 1 2 3 2 3 4 2 3 4 2 4 2 3 3 2 3 2 3 3 3 2 3 2 2 3 4 2 0 3 2 2
    [838] 1 3 3 4 4 3 2 3 2 3 3 2 1 2 3 3 1 0 3 3 2 3 2 3 3 3 2 3 3 2 2
    [869] 3 2 3 2 3 3 3 0 2 3 2 2 2 2 2 3 3 3 2 3 2 3 3 2 2 3 4 3 3 3 2
    [900] 3 3 3 3 4 2 3 3 2 3 0 1 3 2 3 3 3 2 2 3 3 3 3 3 2 2 3 4 0 3 3
    [931] 3 2 3 4 2 3 3 3 3 3 4 2 3 3 2 3 2 3 4 4 3 3 1 3 4 3 0 3 4 3 3
    [962] 4 2 2 3 1 2 4 4 3 3 3 2 3 0 3 4 3 2 4 2 3 0 3 3 3 2 4 2 3 3 2
    [993] 3 3 3 3 3 3 4 3 4 3 3 3 4 3 3 3 2 3 3 3 2 2 3 3 4 3 4 2 3 3 3
    [1024] 3 3 2 3 2 3 3 3 3 3 3 3 3 4 4 3 3 3 0 4 3 2 2 3 3 3 2
    # ah, I just wasn't good about recording "Morning Feel" early on, and since then
    # there have been occasional slips (literally, with the headband)

Gustavo comments:

> And by the way, instead of regressing `Morning.Feel[n]` on `Drug[n]` (a discrete variable taking values in {0,1}), it would make more sense to regress on an Exponentially-Weighted Moving Average of `Drug`, such as $Drug[n-1] + (\frac{1}{2} \times Drug[n-2]) + (\frac{1}{4} \times Drug[n-3]) + ...$ which is modeling how much drug is present on the body. In the above example, I’m assuming a half-life of 1 day, so lambda=$\frac{1}{2}$. You could arguably select the lambda that gives you the best fit; just be wary of multiple testing.

The BEST analysis is powerful and provides much more information than a simple *t*-test would, but the various parameters in the table or the image are not self-explanatory; the curious should read “Bayesian estimation supersedes the *t* test” (Kruschke 2012).

In the CSV, an SSCF.1 of 0 indicates membership in the original experiment, 1 indicates the dry period July-September, 2 indicates the vitamin D resumption post-original-experiment, and 3 indicates the vitamin D resumption post-September. So:

    # set up data
    mydata <- read.csv("http://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning-control.csv")
    originalcontrol <- subset(mydata, SSCF.1==0)
    newcontrol <- subset(mydata, SSCF.1==1)
    # clean missing data
    originalcontrol <- originalcontrol$Morning.Feel[!is.na(originalcontrol$Morning.Feel)]
    newcontrol <- newcontrol$Morning.Feel[!is.na(newcontrol$Morning.Feel)]
    # run BEST MCMC group estimations
    source("BEST.R")
    mcmc = BESTmcmc(originalcontrol, newcontrol)
    BESTplot(originalcontrol, newcontrol, mcmc, TRUE, ROPEeff=c(-0.1,0.1))
               SUMMARY.INFO
    PARAMETER          mean      median        mode     HDIlow     HDIhigh pcgtZero
      mu1        2.82199912  2.82184675  2.82109419  2.5425634   3.1008251       NA
      mu2        2.84712376  2.84744246  2.84233569  2.6205415   3.0777439       NA
      muDiff    -0.02512464 -0.02542602 -0.03361140 -0.3874754   0.3339228 44.43593
      sigma1     0.72900731  0.71760315  0.69447083  0.5330477   0.9474278       NA
      sigma2     0.88825472  0.88350888  0.87346099  0.7192899   1.0690516       NA
      sigmaDiff -0.15924742 -0.16410108 -0.17383105 -0.4269052   0.1171290 12.08159
      nu        41.98417254 33.62743916 17.74077514  3.2649758 104.0648983       NA
      nuLog10    1.51048794  1.52669380  1.57284008  0.8699835   2.1138309       NA
      effSz     -0.03198943 -0.03143175 -0.04438195 -0.4678744   0.4142259 44.43593
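
Returning to Gustavo's suggestion above of regressing on an exponentially-weighted moving average of `Drug` rather than on `Drug[n]` itself: a minimal sketch of how such a regressor could be built, assuming a half-life of 1 day (lambda = 1/2) as in his example. The dosing vector and the function name `ewma_exposure` are hypothetical and exist only for illustration.

```r
# Hypothetical 0/1 dosing record: 1 = drug taken that day (illustrative values)
drug <- c(1, 0, 0, 1, 1, 0, 1, 0, 0, 0)

# Exposure on day n = Drug[n-1] + lambda*Drug[n-2] + lambda^2*Drug[n-3] + ...
# computed recursively: exposure[n] = drug[n-1] + lambda * exposure[n-1]
ewma_exposure <- function(drug, lambda = 0.5) {
  exposure <- numeric(length(drug))  # day 1 has no prior doses, so it stays 0
  for (n in seq_along(drug)) {
    if (n > 1) {
      exposure[n] <- drug[n - 1] + lambda * exposure[n - 1]
    }
  }
  exposure
}

exposure <- ewma_exposure(drug)
data.frame(day = seq_along(drug), drug, exposure)
```

The resulting `exposure` column could then stand in for the 0/1 indicator in a regression on `Morning.Feel`; trying several values of `lambda` and keeping the best fit would raise exactly the multiple-testing concern he mentions.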

As usual:

    mydata <- read.csv("http://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning-control.csv")
    originalcontrol <- subset(mydata, SSCF.1==0)
    newcontrol <- subset(mydata, SSCF.1==1)
    wilcox.test(originalcontrol$Morning.Feel, newcontrol$Morning.Feel)
        Wilcoxon rank sum test with continuity correction
    data:  originalcontrol$Morning.Feel and newcontrol$Morning.Feel
    W = 886, p-value = 0.7103

The generating R code (see later analysis footnote for definitions of data variables like `offtimeawake` etc):

    plot(c(1:32), offtimeawake, col="blue", xlab="nth", ylab="latency/awakenings/awake (raw)")
    points(c(1:32), offlatency, col="blue")
    points(c(1:32), offawakenings, col="blue")
    points(c(1:30), ontimeawake, col="red")
    points(c(1:30), onlatency, col="red")
    points(c(1:30), onawakenings, col="red")

After running `zscore` on each data variable, we repeat the previous code but with `ylab="latency/awakenings/awake (standardized)"` in the call to `plot`.

Assuming the `zscore` conversion has been done:

    plot(c(1:32), offtimeawake+offlatency+offawakenings, col="blue", xlab="nth", ylab="standardized sleep disturbance score")
    points(c(1:30), ontimeawake+onlatency+onawakenings, col="red")

The previously described composite measure and BEST test:

    # all the non-potassium days
    offlatency <- c(11,15,16,16,17,18,20,21,21,24,24,26,29,33,36,42,40,19,32,28,37,36,19,25,
                    30,22,11,20,33,33,42,31)
    offawakenings <- c(8,6,2,7,6,8,7,4,8,3,8,4,7,7,9,12,11,14,8,10,8,6,9,8,13,9,5,5,13,12,9,9)
    offtimeawake <- c(21,14,6,15,7,22,12,17,29,5,14,10,16,16,24,13,42,50,39,15,20,18,33,27,45,
                      23,21,6,25,28,31,61)
    # all the potassium days
    onlatency <- c(12,15,16,17,18,19,21,21,23,25,25,26,26,26,27,29,30,30,32,33,33,34,34,
                   54,30,31,30,22,26,23)
    onawakenings <- c(8,3,4,10,8,9,4,5,4,10,7,4,7,8,7,8,12,8,7,3,6,2,8,7,10,9,4,9,11,8)
    ontimeawake <- c(22,08,11,17,10,24,19,8,8,35,9,39,10,29,15,20,90,16,13,6,15,1,20,24,
                     17,60,10,50,22,18)
    # normalize
    zscore <- function(x,y) mapply(function(a) (a - mean(y))/sd(y), x)
    offlatency <- zscore(offlatency, c(offlatency, onlatency))
    onlatency <- zscore(onlatency, c(offlatency, onlatency))
    offawakenings <- zscore(offawakenings, c(offawakenings, onawakenings))
    onawakenings <- zscore(onawakenings, c(offawakenings, onawakenings))
    offtimeawake <- zscore(offtimeawake, c(offtimeawake, ontimeawake))
    ontimeawake <- zscore(ontimeawake, c(offtimeawake, ontimeawake))
    # zip together with sum to get a single measure of how deviate a night was
    off <- offlatency + offawakenings + offtimeawake
    on <- onlatency + onawakenings + ontimeawake
    # usual Bayesian two-group test
    source("BEST.R")
    mcmcChain = BESTmcmc(off, on)
    postInfo = BESTplot(off, on, mcmcChain) # graph
    postInfo
               SUMMARY.INFO
    PARAMETER      mean  median    mode   HDIlow HDIhigh pcgtZero
      mu1        0.1664  0.1655  0.1421 -0.71894  1.0555       NA
      mu2        2.4256  2.4210  2.4035  1.81175  3.0478       NA
      muDiff    -2.2592 -2.2592 -2.2318 -3.34666 -1.1853    0.006
      sigma1     2.3939  2.3607  2.2695  1.78291  3.0915       NA
      sigma2     1.6189  1.5988  1.5786  1.11009  2.1614       NA
      sigmaDiff  0.7750  0.7606  0.7341 -0.03236  1.6317   97.205
      nu        32.0045 23.2730  9.6599  2.33645 88.0997       NA
      nuLog10    1.3607  1.3669  1.4214  0.67234  2.0337       NA
      effSz     -1.1141 -1.1107 -1.0959 -1.69481 -0.5433    0.006
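
A side note on the normalization step above: each `off*` vector is overwritten with its z-scores before the matching `on*` vector is standardized, so the pool passed to the second `zscore` call mixes z-scores with raw values, which may shift the `on` scores. Below is a minimal sketch of a variant, assuming the intent was to standardize both groups against the same pooled raw data. The helper `pooled_z` and the `*_z` names are hypothetical, only the latency vectors are shown, and the BEST summary above has not been re-run on these values.

```r
# Raw sleep-latency values copied from the analysis above
offlatency <- c(11,15,16,16,17,18,20,21,21,24,24,26,29,33,36,42,40,19,32,28,37,36,19,25,
                30,22,11,20,33,33,42,31)
onlatency  <- c(12,15,16,17,18,19,21,21,23,25,25,26,26,26,27,29,30,30,32,33,33,34,34,
                54,30,31,30,22,26,23)

# Standardize x against a fixed pool of raw values
pooled_z <- function(x, pool) (x - mean(pool)) / sd(pool)

# Build the pool once, before anything is overwritten
latency_pool <- c(offlatency, onlatency)
offlatency_z <- pooled_z(offlatency, latency_pool)
onlatency_z  <- pooled_z(onlatency, latency_pool)

# Sanity check: pooled z-scores should have mean 0 and SD 1 across both groups
mean(c(offlatency_z, onlatency_z))
sd(c(offlatency_z, onlatency_z))
```

Writing the results to new variables rather than reassigning the inputs is the design choice that matters here; the same pattern would apply to the awakenings and time-awake vectors.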

Reusing the standardized data from before:

    wilcox.test(off, on)
        Wilcoxon rank sum test
    data:  off and on
    W = 224, p-value = 0.0002168

As before, we use BEST (the self-rating is mostly normal):

    Potassium <- c(1,1,0,1,0,1,0,0,1,1,1,0,0,1,1,1,0,1,1,0,1,0,1,1,0,1,0,0,0,0,1,0,0,0,1,0,1,1,
                   0,1,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1)
    MP <- c(4,4,3,4,4,3,3,2,3,3,3,3,4,4,3,4,2,2,2,3,4,3,4,3,4,3,4,4,3,3,2,3,2,4,4,3,4,2,3,4,2,
            3,3,2,2,2,3,2,3,3,4,2,3,4,3,4,3,3,2,2,3,4,4,3,4,2,2,3,2)
    pot <- data.frame(Potassium, MP)
    # first graph:
    library(ggplot2)
    qplot(data=pot, y=MP, color=Potassium)
    # analysis:
    source("BEST.R")
    off <- pot$MP[pot$Potassium == 0]
    on <- pot$MP[pot$Potassium == 1]
    mcmcChain = BESTmcmc(off, on)
    postInfo = BESTplot(off, on, mcmcChain) # graph
    postInfo
               SUMMARY.INFO
    PARAMETER       mean   median     mode  HDIlow  HDIhigh pcgtZero
      mu1        3.02651  3.02686  3.03576  2.7780   3.2677       NA
      mu2        3.10432  3.10390  3.07921  2.7939   3.4127       NA
      muDiff    -0.07782 -0.07736 -0.07786 -0.4728   0.3119    34.96
      sigma1     0.75685  0.74855  0.73261  0.5834   0.9427       NA
      sigma2     0.83168  0.81845  0.79169  0.6133   1.0677       NA
      sigmaDiff -0.07483 -0.07033 -0.05617 -0.3755   0.2195    31.15
      nu        47.52944 39.43237 23.78338  4.6350 111.4156       NA
      nuLog10    1.58217  1.59585  1.63348  0.9931   2.1316       NA
      effSz     -0.09844 -0.09761 -0.10476 -0.5879   0.3897    34.96
    wilcox.test(off, on)
        Wilcoxon rank sum test with continuity correction
    data:  off and on
    W = 552.5, p-value = 0.6789

See previously for explanation:

    pot <- read.csv("http://www.gwern.net/docs/zeo/2013-gwern-potassium-morning.csv")
    # standardize & combine into a single equally-weighted synthetic index z-score
    pot$Disturbance <- scale(pot$Time.to.Z) + scale(pot$Awakenings) + scale(pot$Time.in.Wake)
    on <- pot[pot$Potassium==1,]$Disturbance
    off <- pot[pot$Potassium==0,]$Disturbance
    source("BEST.R")
    mcmcChain = BESTmcmc(off, on)
    postInfo = BESTplot(off, on, mcmcChain) # graph
    postInfo
               SUMMARY.INFO
    PARAMETER      mean   median     mode  HDIlow HDIhigh pcgtZero
      mu1        0.1329  0.13224  0.11468 -0.6505  0.9203       NA
      mu2       -0.2626 -0.26479 -0.22430 -1.1154  0.5966       NA
      muDiff     0.3956  0.39838  0.37996 -0.7724  1.5327    75.39
      sigma1     1.9961  1.96663  1.89699  1.3978  2.6302       NA
      sigma2     1.9403  1.90682  1.86314  1.2797  2.6697       NA
      sigmaDiff  0.0558  0.06166  0.04212 -0.8615  0.9499    55.85
      nu        33.0593 24.28680  9.49415  1.7036 90.8230       NA
      nuLog10    1.3674  1.38537  1.47058  0.6392  2.0655       NA
      effSz      0.2054  0.20334  0.18368 -0.3619  0.8119    75.39

`on`/`off` defined and BEST loaded in previous analysis:

    mcmcChain = BESTmcmc(off$MP, on$MP)
    postInfo = BESTplot(off$MP, on$MP, mcmcChain) # graph
    postInfo
               SUMMARY.INFO
    PARAMETER        mean   median     mode  HDIlow  HDIhigh pcgtZero
      mu1        2.999866  2.99993  2.99749  2.7134   3.2884       NA
      mu2        2.955535  2.95571  2.95990  2.6391   3.2689       NA
      muDiff     0.044331  0.04465  0.05384 -0.3831   0.4669    58.29
      sigma1     0.739736  0.72787  0.71017  0.5371   0.9685       NA
      sigma2     0.731523  0.71670  0.68979  0.5081   0.9827       NA
      sigmaDiff  0.008212  0.01087  0.01340 -0.3210   0.3419    52.76
      nu        41.545632 33.20153 18.29201  2.5717 103.6089       NA
      nuLog10    1.502165  1.52116  1.55933  0.8486   2.1209       NA
      effSz      0.060755  0.06100  0.07764 -0.5064   0.6339    58.29

The geeky details: I found an error line in the X logs which appeared only when I invoked Redshift; the driver was `fbdev` and not the correct `radeon`, which mystified me further, until I read various bug reports and forum problems and wondered *why* `radeon` was not loading when the only non-`fbdev` error message indicated that some driver called `ati` was failing to load instead. Then I read that `ati` was the default wrapper over `radeon`, saw that the package was not installed, installed it, noticed it was pulling in useless Mach64 drivers as a dependency, and had a flash: perhaps I had uninstalled the useless Mach64 drivers, forcing the package providing `ati` to be uninstalled too, and had permitted its uninstallation because I knew it was not the package providing `radeon`; this then caused the `ati` load to fail and `radeon` never to load, while X succeeded in loading `fbdev`, which does not support Redshift, leading to a permanent failure of all uses of Redshift. Phew! I was right.

I don’t use a timer, but instead count 400 full breaths. Depending on how fast and shallowly I breathe, this runs from 20-35 minutes (eg. 16 May 2012’s meditation ran 33 minutes long). To be conservative, I will assume the meditation is only 20 minutes. In mid-October, I bought and began using instead a timer which could be set to 15 minutes.

The exact processing steps, for those curious:

    zeo <- read.csv("~/wiki/docs/zeo/gwern-zeodata.csv")
    zeo$Sleep.Date <- as.Date(zeo$Sleep.Date, format="%m/%d/%Y")
    mp <- read.csv("mp.csv", colClasses=c("Date","factor"))
    zeo$MP <- ordered(mp[mp$Date %in% zeo$Sleep.Date,]$MP)
    zeo$Disturbance <- scale(zeo$Time.to.Z) + scale(zeo$Awakenings) + scale(zeo$Time.in.Wake)
    zeo <- zeo[!is.na(zeo$Disturbance) & !is.na(zeo$Morning.Feel),]

Load & correlate:

    zeo <- read.csv("http://www.gwern.net/docs/zeo/2013-gwern-sleepdisturbances-productivity.csv")
    cor.test(zeo$Disturbance, as.integer(zeo$MP))
        Pearson's product-moment correlation
    data:  zeo$Disturbance and as.integer(zeo$MP)
    t = 1.344, df = 414, p-value = 0.1798
    alternative hypothesis: true correlation is not equal to 0
    95% confidence interval:
     -0.03045  0.16102
    sample estimates:
        cor
    0.06589

We regress the ordinal outcome (`MP`) on a continuous predictor (`Disturbance`):

    # turn into an ordinal variable
    zeo$MP <- ordered(zeo$MP)
    library(MASS)
    lmodel <- polr(MP ~ Disturbance, data = zeo); summary(lmodel)
    ...
    Coefficients:
                 Value Std. Error t value
    Disturbance 0.0553     0.0429    1.29
    Intercepts:
        Value  Std. Error t value
    1|2 -4.413  0.450     -9.808
    2|3 -0.990  0.110     -8.965
    3|4  1.101  0.113      9.711
    Residual Deviance: 915.66
    AIC: 923.66
    exp(lmodel$coefficients)
    Disturbance
          1.057
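
To make the fitted numbers above more concrete, here is a minimal sketch of what they imply, using only the printed point estimates (so it ignores their uncertainty, and the coefficient's *t* value of 1.29 is small). `MASS::polr` parameterizes the model as logit P(MP ≤ k) = ζ_k − β·x; the names `beta`, `zeta`, and `category_probs` are illustrative.

```r
# Point estimates copied from the polr summary above
beta <- 0.0553                    # Disturbance coefficient (log-odds scale)
zeta <- c(-4.413, -0.990, 1.101)  # cutpoints 1|2, 2|3, 3|4

# Odds ratio for a one-unit increase in the Disturbance index
odds_ratio <- exp(beta)

# Implied probability of each MP rating (1-4) at a given Disturbance value,
# from logit P(MP <= k) = zeta_k - beta * x
category_probs <- function(x) {
  cumulative <- c(plogis(zeta - beta * x), 1)  # P(MP <= 1), ..., P(MP <= 4)
  diff(c(0, cumulative))                       # P(MP = 1), ..., P(MP = 4)
}

odds_ratio
round(category_probs(0), 3)  # Disturbance index of 0
round(category_probs(3), 3)  # Disturbance index 3 units higher
```

Comparing the two probability vectors shows how little the predicted ratings move even for a sizable change in the index, which is consistent with the weak correlation found earlier.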

Try out more variables:

    almodel <- polr(MP ~ Disturbance + ZQ + Total.Z + Time.to.Z + Time.in.Wake + Time.in.REM + Time.in.Light + Time.in.Deep + Awakenings + Morning.Feel, data = zeo); almodel
    Coefficients:
      Disturbance            ZQ       Total.Z     Time.to.Z  Time.in.Wake   Time.in.REM Time.in.Light
        -0.431623     -0.276236      0.307941      0.045819      0.003266     -0.246901     -0.272593
     Time.in.Deep  Morning.Feel
        -0.227003      0.205541
    Intercepts:
        1|2     2|3     3|4
    -2.9105  0.5465  2.6902
    Residual Deviance: 903.01
    AIC: 927.01

Reduced by cutting out extraneous variables using stepwise regression:

    salmodel <- step(almodel); summary(salmodel)
    ...
    Coefficients:
                   Value Std. Error t value
    Time.to.Z     0.0163    0.00713    2.29
    Time.in.Deep -0.0152    0.00823   -1.85
    Morning.Feel  0.1906    0.12683    1.50
    Intercepts:
        Value  Std. Error t value
    1|2 -4.457  0.785     -5.675
    2|3 -1.011  0.649     -1.557
    3|4  1.113  0.649      1.713
    Residual Deviance: 907.60
    AIC: 919.60