Vitamin D sleep experiments

Self-experiment on vitamin D effects on sleep: harmful when taken at night, no effect or a slight benefit when taken in the morning. (experiments, biology, statistics, Zeo, R, Bayes)
created: 1 January 2012; modified: 10 Feb 2018; status: finished; confidence: highly likely;

Vitamin D is a hormone endogenously created by exposure to sunlight; due to historically low outdoors activity levels, it has become a popular supplement and I use it. Some anecdotes suggest that vitamin D may have circadian and zeitgeber effects due to its origin, and is harmful to sleep when taken at night. I ran a blinded randomized self-experiment on taking vitamin D pills at bedtime. The vitamin D damaged my sleep and especially how rested I felt upon wakening, suggesting vitamin D did have a stimulating effect which obstructed sleep. I conducted a followup blinded randomized self-experiment on the logical next question: if vitamin D is a daytime cue, then would vitamin D taken in the morning show some beneficial effects? The results were inconclusive (but slightly in favor of benefits). Given the asymmetry, I suggest that vitamin D supplements should be taken only in the morning.

Background

Seth Roberts has speculated that vitamin D, despite its myriad other benefits, may harm sleep when taken in the evening and help sleep when taken in the morning, based on some anecdotes (with 2 null results). The anecdotes are nearly worthless: sleep is pretty variable (look above or below, and you’ll see swings of over 20 ZQ points night to night), and just a little carelessness or selection bias will persuade one that there is a major effect where there is none - especially since the reporters are not using Zeos or accelerometers or even giving basic quantities like “I felt bad in the morning 3/5 days”. But I began to wonder. Vitamin D is a chemical intimately involved in circadian rhythms (a zeitgeber), with some connections to systems involved in sleep (“The steroid hormone of sunlight soltriol (vitamin D) as a seasonal regulator of biological activities and photoperiodic rhythms”); given its links to the early day and sunlight, one would expect it to affect sleep for the worse when taken late in the day.

To see what existing research there was, if any, I checked the 49 hits in PubMed and the first 10 pages of Google Scholar for “vitamin D sleep”. For the most part, hits were completely irrelevant, and the most relevant ones like “Vitamins and Sleep: An Exploratory Study” did not cover any relationship between vitamin D and sleep, much less the timing of vitamin D consumption. There’s some speculation that the elderly may sleep badly in part due to lack of vitamin D (“Some new food for thought: The role of vitamin D in the mental health of older adults”, Cherniack et al 2009), but the only hard results I found were weak or tangential: a correlation with daytime sleepiness in Taiwanese dialysis patients[1], a correlation with later sleep in American women[2], a correlation with earlier sleep in Japanese women[3], a correlation with reduced sleep difficulties in Americans, and a correlation of blood levels with both better and worse sleep in Americans[4]. This reads like noise.

In June 2012, after I finished my 2 experiments, a preprint appeared for Medical Hypotheses: “The world epidemic of sleep disorders is linked to vitamin D deficiency”, Gominak & Stumpf 2012. The lead author, unfortunately, had little to tell me when I emailed her, indicating that the use of vitamin D was not systematic or recorded. My observations on the paper:

• I don’t know about the overarching claims (I suspect most of the problem is lighting, and general demands on time), but the trial itself seems really important, especially since neither Roberts nor I had the slightest idea about it but seem to have reached similar results
• the 2 patients suggested it, in an interesting example of the value of self-experimentation
• the authors cover much more specific potential connections between vitamin D and sleep than just circadian rhythms
• the methodology section is non-existent: how were these 1500 patients picked? How long did each use vitamin D? Unfortunately, neither I nor Roberts has had a vitamin D blood test (as far as I know), so we cannot verify that we fell into the authors’ 60-80ng/ml range, but it’s plausible. How is sleep quality being measured? Are these results consistent or inconsistent with the one case of morning mood/restedness improvement but little else? Even if they were inconsistent, that could be explained by neither of us being sleep disorder sufferers and the effect being weaker in us.

In July 2012, preprints of Huang et al 2012 became available; it is a case series: the authors followed a group of veterans with chronic pain who received vitamin D supplements, finding improvements in pain but also a reduction in sleep latency and an increase in sleep duration. While I did not observe any effect on latency or duration in my experiments below, this would still be a promising datapoint; unfortunately, the sample had substantial dropout and no control group (hence no randomizing or blinding). This renders the study not very useful, since the improvements may be just regression toward the mean or selection bias.

Blogger Chris L looked back in August 2012 on ~1 year of Zeo data and a quasi-experiment in which he started with 4000IU of vitamin D supplementation, then 5000IU, then none; he took them at night, then switched to morning. The result was that the length of his deep sleep started high, dropped, and then recovered. He interprets this as evidence that too much vitamin D hurts sleep.

In 2013, a review (McCarty et al 2013) came out arguing that low vitamin D levels increase the risk for autoimmune disease, chronic rhinitis, tonsillar hypertrophy, cardiovascular disease, and diabetes. These conditions are mediated by altered immunomodulation, increased propensity to infection, and increased levels of inflammatory substances, including those that regulate sleep; this might account for negative effects on sleep from chronically low vitamin D, but doesn’t seem relevant to acute effects varying by time of administration.

A 2017 vitamin D sleep RCT (Majid et al 2017) in people diagnosed with sleep disorders found a large benefit in self-rated sleep quality after the equivalent of 3570IU/daily, but for compliance the vitamin D was administered as a single shot every two weeks; so while it provides evidence that vitamin D may help with sleep disorders, it doesn’t address the benefits in otherwise healthy people or the question of timing of doses.

Vitamin D at night hurts?

Setup

I decided to run a small double-blind experiment much like the Adderall and other trials. My Vitamin D is 360 5000IU softgels by Healthy Origins, bought on iHerb.com. The gel-capsules contain cholecalciferol dissolved in olive oil. This made preparing placebo pills a little more difficult. I wound up puncturing the capsules, squeezing out the olive oil contents into a new capsule (they were too wide to push in) and then pushing in the empty shell; all 20 were topped off with ordinary white baking flour. (I used up the last of my creatine preparing the placebos for the Modalert day trial.) For the 20 placebo pills, I spooned in some olive oil to each and topped them off with flour as well. Each set went into its own identical Tupperware container. The process was a little messier than I had hoped, but the pills seem like they will work.

The procedure at night will be: in the dark[5] immediately before putting on the Zeo headband and going to bed, I will take my usual melatonin pill; then I will pick up the two containers blindly, mix them up, select a pill from one to take, and put the selected container on the shelf next to the Zeo. In the morning, I will see which one I took. (The Vitamin D olive oil was distinctly more yellow than the green placebo olive oil.) If I took placebo, I will take my usual daily dose of Vitamin D, and if active, I will skip it. This will blind me and keep constant my total Vitamin D intake. (This procedure may need to be amended with something more like the modafinil/Adderall procedure: a bag with replacement of the consumed placebos.) If I get a run of one kind of pill, I will re-balance the numbers.

Based on the first 10 days’ ZQs, I predict I’ll find in the final data set:

1. increased sleep latency; probably at least another 10 minutes to fall asleep, as my mind seems to churn away with ideas of things to do
2. increased awakenings; not that many, maybe 1 or 2 on average
3. decreased ZQ; by around 5-10 points (a large effect, on par with melatonin)

My best guess is that the ZQ hit is coming from reduced deep sleep, or maybe reduced deep & REM sleep. I don’t think the total amount of sleep has changed.

Roberts theorizes that besides vitamin D damaging sleep, it could actively improve your sleep if taken in the morning. As it happens, in this setup, on placebo days I do take vitamin D in the morning - so wouldn’t one expect to see scores improve on the nights following a placebo night (a vitamin D morning), regardless of whether that night was vitamin D or placebo? A quick analysis of the first 24 nights showed the lagged nights to average a ZQ of 94.5. My monthly averages for October and November were 96, so there is no large improvement here.
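To make “lagged nights” concrete, here is a minimal R sketch of the calculation: it averages ZQ over the nights that follow a placebo night. The helper name `lagged_zq_mean` is hypothetical, and the values are transcribed from the first 24 rows of the trimmed table given under “Vitamin D data”, so the result need not match the 94.5 quoted above exactly.

```r
# Pill taken at bedtime and ZQ for the first 24 nights, transcribed from the
# trimmed data table (NA = the night the headband came loose).
pill <- c("active", "placebo", "active", "active", "placebo", "active",
          "placebo", "placebo", "active", "placebo", "active", "active",
          "active", "active", "placebo", "active", "placebo", "placebo",
          "placebo", "active", "active", "placebo", "placebo", "active")
zq <- c(84, 93, 94, 86, 98, 86, NA, 90, 84, 95, 100, 92,
        88, 100, 83, 101, 90, 88, 100, 86, 85, 91, 106, 91)

# Mean ZQ on nights that follow a placebo night, i.e. a vitamin D morning.
# `lagged_zq_mean` is a hypothetical helper, not part of the original analysis.
lagged_zq_mean <- function(pill, zq) {
  previous <- c(NA, head(pill, -1))                     # pill taken the night before
  after_placebo <- !is.na(previous) & previous == "placebo"
  mean(zq[after_placebo], na.rm = TRUE)
}

lagged_zq_mean(pill, zq)
```

The comparison of interest is then between this figure and the ordinary monthly ZQ averages.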

One thing I suspect but cannot confirm - since I do not have a heart rate monitor - is that ~10 minutes after taking the vitamin D pills, my heart rate increases. Not to any uncomfortable or worrisome degree, but when one expects one’s heart rate to go down after going to bed, even a small increase in the opposite direction is noticeable. On the 12th, I finally got around to writing down this impression; then I searched online a bit and found that low vitamin D levels are associated with arrhythmia and other issues, but so are very high levels, and the studies and anecdotes tie vitamin D levels to heart rate[6]. I’m not worried about the heart rate, but I am concerned that this is defeating the double-blinding: if all I have to do is notice my heart rate (and lying swaddled in bed in complete silence, it would be hard for me not to), then I’ve unblinded myself before falling asleep. Other stimulants like caffeine or sulbutiamine might similarly increase my heart rate, but they’d also interfere with sleep, so I can’t create any active placebo even if I wanted to start over. (One promising future gadget is the Basis wristwatch which measures, among other things, heart-rate; I look forward to the early reviews.)

Vitamin D data

The data (trimmed CSV), covering January-February 2012:

Date     Pill     Quality[7]  ZQ     Guess  Confidence
31D-1J   active   bad         84     right  70%
1-2      placebo  better      93     right  65%
2-3      active   well        94            50%
3-4      active   poor        86     right  60%
4-5      placebo  well        98     wrong  60%
5-6      active   mediocre    86            50%
6-7      placebo  OK          ?[8]   right  65%
7-8      placebo  good        90     right  60%
8-9      active   poor        84     right  65%
9-10     placebo  good        95     right  65%
10-11    active   good        100    wrong  70%
11-12    active   mediocre    92     right  70%
12-13    active   mediocre    88            50%
13-14    active   poor        100    right  60%
14-15    placebo  poor        83     wrong  60%
15-16    active   poor        101    right  55%
16-17    placebo  mediocre    90            50%
17-18    placebo  mediocre    88     right  60%
18-19    placebo  good        100           50%
19-20    active   poor        86            50%
20-21    active   mediocre    85            50%
21-22    placebo  OK          91     right  60%
22-23    placebo  OK          106    right  65%
23-24    active   poor        91     right  65%
24-25    active   1           79     right  75%
25-26    placebo  3           85     right  65%
26-27    active   2           ?[9]   right  55%
28-29    active   3           85            50%
29-30    active   3           93     wrong  55%
30-31    placebo  3           100    right  60%
31J-1F   active   3           94            50%
1F-2F    active   2           89     right  60%
2-3      active   1           83     right  70%
3-4      placebo  2           81     wrong  70%
5-6      placebo  3           98     right  65%
6-7      active   2           88            50%
7-8      active   2           94     right  55%
8-9      active   3           94     wrong  75%
9-10     placebo  3           92            50%
10-11    placebo  3           95     right  60%
11-12    placebo  3           103    right  75%
12-13    placebo  3           84     right  70%

(In the Zeo CSV, the pill was entered in the Other Disruptions 3 field: 0 = placebo, 1 = vitamin D.)

Vitamin D analysis

From a quick look at the prediction confidences, I was usually correct but perhaps underconfident: my log score (a proper scoring rule) compared to a random guesser is 5.4[10], which is even better than my guesses in my Adderall experiment.
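The scoring rule behind this figure is short enough to restate in R. The sketch below ports the Haskell one-liner from the footnotes; the function name `log_binary_score` is illustrative, and it assumes the same normalization as the footnote (every guess recorded with a confidence of at least 50%, 36 right and 6 wrong).

```r
# Binary log score relative to a 50% guesser: each guess earns
# 1 + log2(p) if right and 1 + log2(1 - p) if wrong, so always guessing
# 50% scores exactly 0. `log_binary_score` is an illustrative name.
log_binary_score <- function(correct, p) {
  sum(ifelse(correct, 1 + log2(p), 1 + log2(1 - p)))
}

# Confidences as normalized in the footnote: 36 right, 6 wrong.
p_right <- rep(c(0.50, 0.55, 0.60, 0.65, 0.70, 0.75), times = c(11, 3, 8, 8, 4, 2))
p_wrong <- c(0.55, 0.60, 0.60, 0.70, 0.70, 0.75)

score <- log_binary_score(
  correct = c(rep(TRUE, length(p_right)), rep(FALSE, length(p_wrong))),
  p       = c(p_right, p_wrong)
)
round(score, 1)  # 5.4
```

A positive total means the guesses carried more information than coin-flipping; the 11 guesses at 50% contribute nothing either way.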

Looking at the data averages in the Zeo website, it looked like ZQ & total & REM sleep fell, deep increased slightly, time awake & awakenings both increased, and morning feel decreased. The R analysis[11]:

The MANOVA is tantalizingly close to statistical-significance (p=0.07); the variables:

Variable      Effect   p-value  Coefficient’s sign is…
Total.Z       -19.73   0.084    worse
Time.in.REM   -14.54   0.021    worse
Time.in.Deep  2.32     0.41     better
Time.in.Wake  2.50     0.63     worse
Awakenings    0.739    0.37     worse
Morning.Feel  -0.524   0.0067   worse
Time.to.Z     3.47     0.46     worse

Morning.Feel jumps out as having a large effect (-0.5, on a 1-3 rating, is huge) and, accordingly, a very low p-value which survives multiple-comparison correction[12]. Apparently I was waking up feeling like crap on the Vitamin D nights.
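One way to check that claim is to adjust the 7 per-variable p-values from the table above. The sketch below uses R's built-in `p.adjust`; Bonferroni and Holm are shown as common choices and are an assumption here, since the correction used in the original footnote may differ.

```r
# Per-variable p-values from the evening experiment's regression table.
p <- c(Total.Z = 0.084, Time.in.REM = 0.021, Time.in.Deep = 0.41,
       Time.in.Wake = 0.63, Awakenings = 0.37, Morning.Feel = 0.0067,
       Time.to.Z = 0.46)

# Assumption: Bonferroni and Holm are illustrative choices; the
# correction in the original footnote may be a different one.
adjusted <- cbind(raw        = p,
                  bonferroni = p.adjust(p, method = "bonferroni"),
                  holm       = p.adjust(p, method = "holm"))
round(adjusted, 3)

# Variables still below 0.05 after the Bonferroni correction.
names(p)[adjusted[, "bonferroni"] < 0.05]
```

Under either method only Morning.Feel stays below 0.05 (0.0067 × 7 ≈ 0.047), while the REM result does not.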

Going back to my predictions after the first 10 days, they’re sort of right:

1. sleep latency was increased, but not statistically-significantly and only by ~3m, which is less than half the predicted 10 minutes
2. increased awakenings was less than 1 additional awakening (compared to predicted 1-2) and didn’t reach statistical significance

My conclusion?

Vitamin D hurts sleep when taken at night. I know of no reason that one would want to take vitamin D late at night, so I will definitely be avoiding it at that time in the future.

VoI

For background on value of information calculations, see the first calculation.

The first experiment I had no opinion on. I actually did sometimes take vitamin D in the evening when I hadn’t gotten around to it earlier (I take it for its anti-cancer and SAD effects). There was no research background, and the anecdotal evidence was of very poor quality. Still, it was plausible since vitamin D is involved in circadian rhythms, so I gave it 50% and decided to run an experiment. What effect would perfect information that it did negatively affect my sleep have? Well, I’d definitely switch to taking it in the morning and would never take it in the evening again, which would change maybe 20% of my future doses, and what was the negative effect? It couldn’t be that bad or I would have noticed it already (like I noticed sulbutiamine made it hard to get to sleep). I’m not willing to change my routines very much to improve my sleep, so I would be lying if I estimated that the value of eliminating any vitamin D-related disturbance was more than, say, 10 cents per night; so the total value of affected nights would be $0.10 \times 0.20 \times 365.25 = 7.3$. On the plus side, my experiment design was high quality and ran for a fair number of days, so it would surely detect any sleep disturbance from the randomized vitamin D, so say 90% quality of information. This gives $\frac{7.3 - 0}{\ln 1.05} \times 0.90 \times 0.50 = 67.3$, justifying <9.6 hours. Making the pills took perhaps an hour, recording used up some time, and the analysis took several hours to label & process all the data, play with it in R, and write it all up in a clean form for readers. Still, I don’t think it took almost 10 hours of work, so I think this experiment ran at a profit.
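The arithmetic can be replayed in a few lines of R. This is only a sketch of the calculation as stated above: the variable names are mine, the `ln 1.05` divisor is read as a 5% annual discount rate, and the $7 hourly value of time is inferred from 67.3 / 9.6 rather than stated in this section.

```r
# Inputs as stated in the paragraph above.
value_per_night   <- 0.10    # dollars: worth of avoiding one disturbed night
fraction_affected <- 0.20    # share of doses that would move to the morning
nights_per_year   <- 365.25
discount_rate     <- 0.05    # assumption: annual rate behind the ln(1.05) divisor
info_quality      <- 0.90    # chance the experiment detects a real effect
prior             <- 0.50    # prior probability that the effect exists

annual_value <- round(value_per_night * fraction_affected * nights_per_year, 1)  # 7.3
voi <- (annual_value - 0) / log(1 + discount_rate) * info_quality * prior
round(voi, 1)  # 67.3

# Assumption: an hourly value of time of about $7, inferred from 67.3 / 9.6.
hourly_rate <- 7
round(voi / hourly_rate, 1)  # 9.6
```

Changing any single input (say, valuing a disturbed night at 5 cents rather than 10) scales the justified hours proportionally, which makes it easy to see how sensitive the "ran at a profit" conclusion is.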

Vitamin D at morn helps?

Setup

The logical next thing to test is whether there is any benefit to sleep from taking vitamin D in the morning as compared to not taking vitamin D at all, since we have already established that evening is worse than morning. (Besides anecdotes, Seth Roberts reported - after I concluded my experiment - that his own non-blind varying of doses seemed to help his subjective restedness but didn’t influence anything else.) I would expect any benefits in the morning to be attenuated compared to the evening effect: the morning is simply many hours away from going to bed again in the evening, giving time for many events to affect the ultimate sleep. So this experiment will run longer than the first one’s 40 days of 20/20: 56 days of 28/28. Per Roberts’s suggestion, I will not randomize individual days but 8 blocks of 7 days, in 4 pairs. (Multiple days, to give any slow effects time to manifest, which seems eminently possible with a fat-soluble vitamin like vitamin D; 7 days, so we don’t cycle around the week but instead have exactly the same number of eg. active Sundays and placebo Sundays, since sleep often varies systematically over the week.)

I prepare 27 placebo pills & 27 actives as before, stored in separate baggies. To randomize blocks of 7 days, I will fill 2 opaque containers, one with 7 placebos and one with 7 actives (with a label on the inside of the active container), and pick a container at random to use for the next 7 days. I will take one pill each morning upon awakening, with my eyes closed. On the 8th morning, the first container will be empty, so I set it aside and open the second; when the second is emptied, I will look inside it to see whether it has the label, which lets me infer which one it was, and record whether the 2 weeks were active/placebo or placebo/active. The 2 containers will be refilled as before, and blocks 3-4 will begin. I will do this 4 times, at which point I will analyze the data.
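The same design can be written down as a schedule. The R sketch below is a software stand-in for the two-container procedure, not what was actually used: a coin flip decides the order within each pair of 7-day blocks, and the helper name `make_schedule` and the seed are hypothetical.

```r
# 4 pairs of 7-day blocks; within each pair a coin flip decides which block
# is active (the stand-in for picking one of the two opaque containers).
# `make_schedule` is a hypothetical helper.
make_schedule <- function(n_pairs = 4, block_days = 7) {
  do.call(rbind, lapply(seq_len(n_pairs), function(pair) {
    first_active <- sample(c(TRUE, FALSE), 1)
    arms <- if (first_active) c(1, 0) else c(0, 1)    # 1 = vitamin D, 0 = placebo
    data.frame(pair      = pair,
               block     = rep(2 * pair - c(1, 0), each = block_days),
               day       = seq_len(2 * block_days),   # day within the pair
               Vitamin.D = rep(arms, each = block_days))
  }))
}

set.seed(2012)   # arbitrary seed, only to make the example reproducible
schedule <- make_schedule()
table(schedule$Vitamin.D)                            # 28 placebo days, 28 active days
table(schedule$Vitamin.D, (schedule$day - 1) %% 7)   # every weekday slot is balanced
```

Because each pair contains one full week of each arm, every day of the week appears exactly 4 times under placebo and 4 times under vitamin D, whatever the coin flips turn out to be.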

Analysis will use the same Zeo parameters as before, but this time augmented by a simple mood indicator: 1-5, with 3 being an ordinary mildly productive day and 1 being a “my car caught on fire and was totaled” day (real data-point), recorded at the end of each day just before bed. (I considered a more complex mood indicator, the BOMS, while setting up my lithium experiment, but rejected it as being too heavy-weight for long-term use, and subjectively, my mood doesn’t vary that much.)

Morning data

1. Blocks:

• 17-25F: guess: placebo; actual: placebo (last pill used morning 25; swapped jars and consumed pill from second jar the morning of 26)
• 26F-8M: actual: active (skipped multiple days for modafinil; omit March 1, 2)

2. Blocks:

• 9M-15M: guess: active; actual: placebo
• 16-25: actual: active (omit March 21)

3. Blocks:

• 26M-1A: guess: placebo; actual: placebo
• 2A-8: actual: active

4. Blocks:

• 9A-19: guess: placebo; actual: placebo (omit April 11, 12)
• 20-27: actual: active (omit April 21, 22)

Placebo/active is coded as 0/1 in the SSCF.1[13] column of the CSV export. Mood was recorded in the Mood column, with fractional values allowed.

Morning analysis

As before, we fire up R and analyze the spreadsheet with the usual assumptions[14] about independence of the daily observations. The interpreter session:

zeo <- read.csv("https://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning.csv")

# an example of the many intercorrelations which make simple t-tests misleading
# and motivate the use of multivariate linear regression:
cor(zeo[c(2,3,5:11, 25)], use="complete.obs")
#               Vitamin.D     Mood  Total.Z Time.to.Z Time.in.Wake Time.in.REM Time.in.Light
# Vitamin.D      1.000000 -0.06210  0.01007 -0.004528     -0.14399     0.01844      -0.02043
# Mood          -0.062097  1.00000  0.03038 -0.229114      0.13365    -0.05137       0.06783
# Total.Z        0.010067  0.03038  1.00000 -0.388734     -0.05258     0.77338       0.82402
# Time.to.Z     -0.004528 -0.22911 -0.38873  1.000000      0.17821    -0.29690      -0.28948
# Time.in.Wake  -0.143987  0.13365 -0.05258  0.178211      1.00000    -0.12396       0.15893
# Time.in.REM    0.018437 -0.05137  0.77338 -0.296904     -0.12396     1.00000       0.35087
# Time.in.Light -0.020427  0.06783  0.82402 -0.289484      0.15893     0.35087       1.00000
# Time.in.Deep   0.054670  0.05648  0.57647 -0.299816     -0.35438     0.37922       0.24574
# Awakenings    -0.074435  0.09076  0.07645  0.142952      0.67797     0.04007       0.21834
# Morning.Feel   0.053450  0.11313  0.62368 -0.285966     -0.04032     0.56241       0.51081
#               Time.in.Deep Awakenings Morning.Feel
# Vitamin.D          0.05467   -0.07444      0.05345
# Mood               0.05648    0.09076      0.11313
# Total.Z            0.57647    0.07645      0.62368
# Time.to.Z         -0.29982    0.14295     -0.28597
# Time.in.Wake      -0.35438    0.67797     -0.04032
# Time.in.REM        0.37922    0.04007      0.56241
# Time.in.Light      0.24574    0.21834      0.51081
# Time.in.Deep       1.00000   -0.28355      0.22280
# Awakenings        -0.28355    1.00000      0.02151
# Morning.Feel       0.22280    0.02151      1.00000

l <- lm(cbind(Total.Z,Time.in.REM,Time.in.Deep,Time.in.Wake,Awakenings,Morning.Feel,Time.to.Z,Mood)
~ Vitamin.D, data=zeo)
summary(manova(l))
#           Df Pillai approx F num Df den Df Pr(>F)
# Vitamin.D  1 0.0363    0.213      9     51   0.99
summary(l)
# Response Total.Z :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   525.21      10.06   52.20   <2e-16
# Vitamin.D       1.07      13.89    0.08     0.94
#
# Response Time.in.REM :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  162.172      4.711   34.42   <2e-16
# Vitamin.D      0.921      6.505    0.14     0.89
#
# Response Time.in.Deep :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    65.34       2.53   25.85   <2e-16
# Vitamin.D       1.47       3.49    0.42     0.68
#
# Response Time.in.Wake :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    27.76       3.10    8.94  1.4e-12
# Vitamin.D      -4.79       4.29   -1.12     0.27
#
# Response Awakenings :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    8.000      0.592   13.51   <2e-16
# Vitamin.D     -0.469      0.818   -0.57     0.57
#
# Response Morning.Feel :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   2.8276     0.1386   20.40   <2e-16
# Vitamin.D     0.0787     0.1913    0.41     0.68
#
# Response Time.to.Z :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   25.448      2.827    9.00  1.1e-12
# Vitamin.D     -0.136      3.904   -0.03     0.97
#
# Response Mood :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   3.0931     0.1127   27.45   <2e-16
# Vitamin.D    -0.0744     0.1556   -0.48     0.63

The MANOVA suggests no statistically-significant difference between days (p=0.99), and no variables seem to have changed much:

Variable      Effect  p-value  Coefficient’s sign is…
Total.Z       1.07    0.94     better
Time.in.REM   0.92    0.89     better
Time.in.Deep  1.47    0.68     better
Time.in.Wake  -4.79   0.27     better
Awakenings    -0.47   0.57     better
Morning.Feel  0.08    0.68     better
Time.to.Z     -0.14   0.97     better
Mood          -0.07   0.63     worse

All the changes are junk, including ones I was fairly sure would change, like Time to Z or Mood. (An earlier version of this analysis found a statistically-significant effect increasing Morning Feel, but this turns out to be due to the t-tests’ assumption that variables were not correlated, and the multivariate linear regression reduces the effect to non-significance.) Mood arguably was affected by an exogenous event - my car burning ruined that particular week. Graphing the raw data, I notice that when my car burned, my Mood takes a clearly visible fall for a week, while my sleep looks like it was affected less - it seems that during that period, waking up was literally the best part of the day…

I conclude that the vitamin D in the morning did not damage any of the measured variables, unlike the vitamin D in the evening.

(This experiment also afforded me a chance to test Seth Roberts’s reaction to faked data which contradicted his vitamin D theory; he did not take it gracefully, which is useful to know in weighing his future opinions.)

Control quality control

Like with melatonin, we might wonder: is taking vitamin D causing effects on the control days as well? With melatonin, the concern I often hear voiced is whether melatonin might in some way be addictive or suppress normal melatonin secretion, in which case the observed difference between control and experimental days - which we interpreted as improvement - may actually be the opposite, a negative effect caused by a sort of withdrawal (lowered melatonin secretion levels, since the body has not yet adapted to the absence of melatonin supplements and will not when supplementation resumes the next day).

In the case of vitamin D, I find the results (no effect on anything except Morning Feel) sufficiently surprising that I wonder if this fat-soluble vitamin was causing effects over periods even longer than a week; and that the true results were that both control and experimental weeks were better than unsupplemented weeks, but that Morning Feel was the only variable which reacted to placebo fast enough to show up as a difference. The previously-mentioned August 2012 report of Chris L that an increase of 1k IU in his vitamin D supplementation reduced his deep sleep with month-long lags reinforces my suspicion: with such a long lag, any reduction in my deep sleep would go unnoticed. A completely dry multi-month long control group is necessary.

The simplest solution, although I don’t know if it’s statistically correct, is to drop the vitamin D or melatonin for a long enough period that any long-term effects should have disappeared, and then compare this abstention period to the supposed control weeks. If the abstention weeks are worse than the control weeks, then this supports the long-term interpretation; if the abstention weeks are similar to the control weeks, then we can eliminate the long-term interpretation; and if the abstention weeks are better than the control weeks, then we ought to be puzzled and start thinking about other possibilities. (Not enough data/power? Misinterpreted results? Or, the original morning experiment was in spring, while the abstention periods were summer/autumn - does sleep get worse in summer, perhaps due to heat?)

I won’t bother with blinding this one since it’s just a double-check of an unlikely possibility. (If one wanted to blind it, the procedure would be the same as before, but with big blocks: say, 2 blocks of 62 days, first pick randomized, or blocks of 31 days, with 4 blocks randomized in 2 pairs.) This experiment is easy enough to run: simply stop taking vitamin D. To avoid the temptation to cheat on days I am feeling down, it’s easiest to just wait until I run out of vitamin D and procrastinate on ordering a fresh supply until a bunch of days have passed.

The vitamin D experiment terminated in April; the last day of vitamin D was 2 July 2012; and I resumed 6 September 2012 with the end of the dataset being 31 October 2012.

Analysis

The question is simple: does the Morning Feel differ between the control days in the original vitamin D morning experiment and the vitamin-less days of the later sustained abstention period? Was there something funky about the original control days: some sort of vitamin D bleed-over, or maybe some sort of long-term effect which we could describe as contamination or dependency?

The short answer is: no. When we compare the two groups of days, the Morning Feel ratings have identical means, as we expected.

A Bayesian MCMC analysis[15] (using the BEST library) produces the following graphical summary, which shows the two groups almost completely overlapping on means, with the key graph in the lower-right corner: there is no visible effect size at all (centered on 0), much less an effect size of d>=0.1 which we might take seriously as indicating a real difference:

More precisely, the summary statistics indicate that the difference in means & medians is usually -0.03 (negligibly small), the full range of effect size estimates is -0.4678744 to 0.4142259, and 44.4% of the possibilities were simply zero effect size.

(I did a non-parametric test as well: p=0.710316.)
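For readers who want a quick frequentist counterpart to this comparison, here is a minimal R sketch. It is not the original analysis: `compare_groups` is a hypothetical helper, the Wilcoxon rank-sum test is assumed as one common non-parametric choice (the test actually used is not named here), and the ratings in the usage example are placeholders rather than real data.

```r
# `compare_groups` is a hypothetical helper, not the original analysis code.
compare_groups <- function(control, abstention) {
  control    <- control[!is.na(control)]
  abstention <- abstention[!is.na(abstention)]
  n1 <- length(control)
  n2 <- length(abstention)
  pooled_sd <- sqrt(((n1 - 1) * var(control) + (n2 - 1) * var(abstention)) /
                    (n1 + n2 - 2))
  difference <- mean(abstention) - mean(control)
  list(mean_difference = difference,
       cohens_d        = difference / pooled_sd,
       # exact = FALSE because ordinal ratings are full of ties
       wilcoxon_p      = wilcox.test(abstention, control, exact = FALSE)$p.value)
}

# Placeholder ratings, NOT the real data: substitute the Morning.Feel values
# of the original control days and of the vitamin-less days.
control_days    <- c(3, 2, 3, 3, 2, 3, 4, 3)
abstention_days <- c(3, 3, 2, 3, 3, 2, 3, 4)
compare_groups(control_days, abstention_days)
```

A mean difference and Cohen's d near zero together with a large p-value would match the pattern reported above; unlike the Bayesian analysis, though, this gives no posterior distribution over effect sizes.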

VoI

For background on value of information calculations, see the first calculation.

With the vitamin D theory partially vindicated by the previous experiment, I became fairly sure that vitamin D in the morning would benefit my sleep somehow: 70%. Benefit how? I had no idea, it might be large or small. I didn’t expect it to be a second melatonin, improving my sleep and trimming it by 50 minutes, but I hoped maybe it would help me get to sleep faster or wake up less. The actual experiment turned out to show, with very high confidence, no bad change (and a good change in my mood upon awakening in the morning).

What is the value of information for this experiment? Essentially - zero:

1. If the experiment had shown any benefit, I would have continued taking it in the morning.
2. If the experiment had shown no effect, I would have continued taking it in the morning to avoid incurring the evening penalty discovered in the previous experiment.
3. If the experiment had shown the unthinkable (a negative effect), it would have to be substantial to convince me to stop taking vitamin D altogether and forfeit its many other apparent health benefits, and it’s not worth bothering to analyze an outcome I would have given <=5% chance to.

So since I did, was then, and still do supplement vitamin D, why bother? But of course, I did it because it was cool and interesting! (Estimated time cost: perhaps half the evening experiment, since I had to manually record less data, and already had the analysis worked out from before.)

1. …The increased odds of high PSQI score for greater hemoglobin level and for high ESS score for use of vitamin D analogues were unexpected results for which we cannot speculate about the cause or association and that may simply be spurious findings arising from statistical analysis.

2. This study found a [statistically-]significant relationship between circadian phase of sleep and dietary Vitamin D intake. Later sleep acrophase, an indicator of sleep timing, was associated with more dietary Vitamin D. For most people, most Vitamin D is obtained through sunlight(44), though dietary Vitamin D is usually obtained through supplementation, usually in pills or in dairy products(44). It is currently unknown why those who consumed more Vitamin D would demonstrate a sleep phase delay, especially since in this same subject group, those exposed to more light had earlier circadian acrophases(45).

3. Late midpoint of sleep was [statistically-]significantly negatively associated with the percentage of energy from protein and carbohydrates, and the energy-adjusted intake of cholesterol, potassium, calcium, magnesium, iron, zinc, vitamin A, vitamin D, thiamin, riboflavin, vitamin B(6), folate, rice, vegetables, pulses, eggs, and milk and milk products.

4. …Table 2 shows associations of serum 25(OH)D concentrations and sleep characteristics. After adjusting for age, sex, ethnicity, high blood pressure, body mass index, active smoking, depressive symptoms, and survey weighting, no association between serum 25(OH)D concentrations and sleeping hours was observed (beta 0.19, 95% CI −0.40 to 0.77, p = 0.51) while a significant inverse association was found between serum 25(OH)D concentrations and minutes to fall asleep (beta −3.13, 95% CI −5.62 to −0.64, p = 0.02). Moreover, people with higher vitamin D levels could be more likely to complain sleep problems (OR 1.60, 95% CI 1.20 to 2.14, p = 0.004)….It was observed that serum 25(OH)D concentrations were significantly associated with minutes to fall asleep, indicating that people with lower vitamin D levels tended to have longer time to fall asleep. On the other hand, it was also observed that people with higher vitamin D levels had more sleep complaints, although the reason is unclear.

5. The problem was the original vitamin D3 capsule: I couldn’t squeeze out all the oil, so I settled for squeezing out most, and then pushing the original capsule into the new capsule. So they contain everything they should, but they have a visible bubble inside them (the original capsule). Hence, the need for literal blinding. Otherwise, they’re pretty good: identical shape and weight.

6. See the general remarks in LiveStrong, Vitamin D warning: Too much can harm your heart, and the 2009 study Relation of serum 25-hydroxyvitamin D to heart rate and cardiac work (from the National Health and Nutrition Examination Surveys).

7. For Quality & ZQ: higher = better

8. Headband came loose at some point, data useless

9. Headband came loose at some point, data useless

10. The preponderance of True is because while recording the scores, I normalized them; in retrospect, I shouldn’t’ve bothered:

logBinaryScore = sum . map (\(result,p) -> if result then 1 + logBase 2 p else 1 + logBase 2 (1-p))
logBinaryScore [(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.50),
(True,0.50),(True,0.50),(True,0.50),(True,0.50),(True,0.55),(True,0.55),(True,0.55),
(True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),(True,0.60),
(True,0.60),(True,0.65),(True,0.65),(True,0.65),(True,0.65),(True,0.65),(True,0.65),
(True,0.65),(True,0.65),(True,0.70),(True,0.70),(True,0.70),(True,0.70),(True,0.75),
(True,0.75),(False,0.55),(False,0.6),(False,0.6),(False,0.7),(False,0.7),(False,0.75)]
5.4

11. The usual session:

zeo <- read.csv("https://www.gwern.net/docs/zeo/2012-zeo-vitamind.csv")
colnames(zeo)[26] <- "Vitamin.D"
l <- lm(cbind(Total.Z, Time.in.REM, Time.in.Deep, Time.in.Wake,
Awakenings, Morning.Feel, Time.to.Z)
~ Vitamin.D, data=zeo)
summary(manova(l))
#           Df Pillai approx F num Df den Df Pr(>F)
# Vitamin.D  1   0.31     2.12      7     33   0.07
# Residuals 39
summary(l)
# Response Total.Z :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   533.37       8.16   65.37   <2e-16
# Vitamin.D     -19.73      11.14   -1.77    0.084
#
# Response Time.in.REM :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   175.63       4.44    39.5   <2e-16
# Vitamin.D     -14.54       6.07    -2.4    0.021
#
# Response Time.in.Deep :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    55.00       2.04   26.98   <2e-16
# Vitamin.D       2.32       2.78    0.83     0.41
#
# Response Time.in.Wake :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    26.32       3.83    6.88  3.2e-08
# Vitamin.D       2.50       5.22    0.48     0.63
#
# Response Awakenings :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    7.579      0.598    12.7  2.1e-15
# Vitamin.D      0.739      0.817     0.9     0.37
#
# Response Morning.Feel :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    2.842      0.134   21.21   <2e-16
# Vitamin.D     -0.524      0.183   -2.86   0.0067
#
# Response Time.to.Z :
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)    17.58       3.43    5.12  8.6e-06
# Vitamin.D       3.47       4.69    0.74     0.46

12. Correcting for multiple comparisons (Benjamini-Hochberg) at q-value=0.05, of our 7 pessimistic p-values, 1 survives:

p.adjust(c(0.084,0.021,0.41,0.63,0.37,0.0067,0.46), method="BH") < 0.05
# [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE

Remarkable - the first time a p-value survived. (That was the Morning.Feel one.)
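
For the curious, the Benjamini-Hochberg adjustment that `p.adjust` performs can be spelled out by hand. The following is only a minimal sketch: `bhAdjust` is a hypothetical helper written for illustration (it is not part of the original analysis), and the p-values are simply copied from the regression output above.

```r
# p-values copied from the per-response regressions in the previous footnote
p <- c(Total.Z=0.084, Time.in.REM=0.021, Time.in.Deep=0.41, Time.in.Wake=0.63,
       Awakenings=0.37, Morning.Feel=0.0067, Time.to.Z=0.46)

# Hypothetical helper: Benjamini-Hochberg by hand.
# Scale each p-value by m/rank (rank 1 = smallest p), then enforce
# monotonicity with a running minimum taken from the largest p downwards.
bhAdjust <- function(p) {
    m <- length(p)
    o <- order(p, decreasing=TRUE)                 # largest p first, i.e. rank m
    adjusted <- pmin(1, cummin(p[o] * m / (m:1)))  # cap adjusted values at 1
    adjusted[order(o)]                             # restore the original order
}

manual  <- bhAdjust(p)
builtin <- p.adjust(p, method="BH")
all.equal(unname(manual), unname(builtin))
# [1] TRUE
names(p)[manual < 0.05]
# [1] "Morning.Feel"
```

The smallest p-value is multiplied by 7/1, giving 0.0469, which is why Morning.Feel only just clears the 0.05 threshold.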

13. I originally input the data as Other Disruptions 4 through the Zeo web interface, since I assumed that if Other Disruptions 3 was SSCF.12, that would put the data into SSCF.13 - but it turns out that does not get exported in the CSV! Apparently the CSV is limited to 1-3. So I edited the exported CSV and just reused SSCF.1. Hopefully Zeo Inc. will fix the export functionality, since it’s very frustrating to be able to see the data used in the Cause & Effect tool, for example, but not export it.

14. Gustavo Lacerda wondered if the two-sample t-test (or linear regressions in general) were really justifiable to use - could days be correlated, in which case the p-values would be overstated and my results actually weaker than they look? He suggested testing my full Zeo dataset to see whether Morning Feel can be predicted from day to day by a (relatively) simple linear autocorrelation regression looking at all previous recorded days:

zeo <- read.csv("https://www.gwern.net/docs/zeo/gwern-zeodata.csv")
## Master Zeo export file is periodically updated; your results may not be identical
n <- length(zeo$Morning.Feel); n
# [1] 1050
reg <- lm(Morning.Feel[2:n] ~ Morning.Feel[1:(n-1)], data=zeo)
summary(reg)
# Coefficients:
#                         Estimate Std. Error t value Pr(>|t|)
# (Intercept)               2.5727     0.0943    27.3   <2e-16
# Morning.Feel[1:(n - 1)]   0.0689     0.0329     2.1    0.036
#
# Residual standard error: 0.771 on 918 degrees of freedom
#   (129 observations deleted due to missingness)
# Multiple R-squared: 0.00476, Adjusted R-squared: 0.00368
# F-statistic: 4.39 on 1 and 918 DF, p-value: 0.0364

## Given that pretty much all the ratings are 2, 3, or 4, and the r^2 is <0.01
## with a residual error of 0.77, that doesn't seem very correlated.
## although the _p_ does indicate there's a real (but very small) correlation from
## day to day, so I guess the p-values may be a *little* overstated
cor(zeo$Morning.Feel[2:n], zeo$Morning.Feel[1:(n-1)], use = "complete.obs")
# [1] 0.069
## we can also graph the lags:
acf(zeo$Morning.Feel, na.action=na.pass,
    main="Do days predict subsequent days at various temporal distances?")

## incidentally - 129 observations missing? What's going on?
zeo$Morning.Feel
# [1] NA 2 3 3 4 3 3 2 NA NA 4 4 NA 3 NA 2 4 4 NA 4 3 3 3 4 2 3 2 3 NA 3 NA
# [32] NA 4 NA 4 NA NA NA NA NA NA NA NA NA NA NA NA NA 3 4 NA NA 4 4 3 4 NA NA NA NA NA NA
# [63] NA 4 NA 2 3 3 NA NA 3 NA 3 3 NA 2 NA NA NA NA 3 NA NA NA NA NA NA NA 3 4 NA 4 3
# [94] 3 3 4 4 3 3 3 2 3 3 2 3 3 3 2 NA 3 3 4 3 NA 3 NA 3 NA 3 3 3 NA 3 3
# [125] NA NA NA NA NA 2 NA NA 3 2 3 NA NA NA NA NA NA 3 2 3 2 2 2 2 2 3 3 3 3 NA 3
# [156] 3 2 2 3 3 2 3 2 3 NA 2 NA NA 4 3 3 3 2 3 NA 4 3 2 3 3 3 3 3 3 4 3
# [187] 4 3 3 3 3 3 2 3 2 3 3 3 NA 3 1 4 NA 3 2 4 4 2 2 3 3 3 3 3 3 3 3
# [218] 3 3 4 3 3 2 2 3 3 2 3 3 3 2 2 3 3 3 3 3 4 3 3 2 2 2 1 2 3 3 NA
# [249] 3 3 3 3 3 3 3 3 2 3 2 3 2 3 3 3 2 3 3 2 3 3 3 3 4 3 3 4 3 4 2
# [280] 3 NA 3 3 2 2 2 3 3 3 3 2 3 3 2 2 2 3 3 2 2 3 2 3 3 3 3 3 3 2 3
# [311] 3 2 1 3 4 3 2 3 3 2 2 3 3 3 1 2 NA 2 3 2 2 3 3 2 3 3 NA 3 NA 3 3
# [342] 2 3 2 2 3 3 3 3 1 3 3 3 2 1 3 NA 2 3 3 3 3 2 1 2 2 3 2 2 3 3 3
# [373] 3 3 4 3 2 3 3 3 2 2 3 NA 3 2 3 4 4 3 3 2 4 3 2 3 3 4 3 4 3 3 NA
# [404] 2 2 3 3 3 4 4 3 1 3 3 2 4 3 3 3 2 3 2 4 2 4 3 3 3 4 NA 2 3 3 3
# [435] 3 2 1 2 2 3 2 3 1 4 3 3 4 3 3 2 2 2 2 3 1 3 3 3 4 3 3 2 3 3 4
# [466] 4 2 2 3 3 2 2 4 3 3 3 2 3 2 2 3 2 3 2 3 2 3 2 3 2 3 3 3 2 3 3
# [497] 2 3 1 2 3 3 3 3 2 2 3 3 1 3 2 3 3 4 1 3 4 1 4 3 4 3 3 2 3 2 NA
# [528] 3 4 2 4 3 3 3 4 4 1 3 2 3 3 3 2 3 4 3 3 2 3 3 3 4 2 2 2 3 3 3
# [559] 4 4 1 3 3 3 4 3 4 3 3 1 1 2 3 2 3 3 4 3 3 3 2 2 3 4 4 1 4 4 3
# [590] 4 3 3 3 3 3 2 3 3 2 3 3 2 3 4 2 2 3 1 3 3 2 3 3 2 2 3 4 3 2 1
# [621] 3 3 3 3 2 4 2 3 3 3 3 4 3 3 3 NA 3 NA 4 3 2 2 2 2 3 3 3 4 3 2 3
# [652] 2 3 3 1 3 4 3 3 4 4 4 2 3 2 1 4 2 4 3 2 3 3 3 3 2 3 4 2 2 2 2
# [683] 3 4 3 4 2 2 3 4 2 3 3 3 2 2 2 3 2 2 2 4 3 3 3 2 2 1 2 4 3 3 3
# [714] 3 3 2 2 2 3 3 3 3 1 1 2 3 3 4 3 3 3 4 3 4 3 3 3 3 3 3 3 2 2 2
# [745] 2 3 2 3 3 2 1 3 3 2 3 3 3 3 2 3 4 4 2 3 3 4 4 2 4 4 4 3 3 3 1
# [776] 3 3 2 3 3 4 4 3 1 4 4 4 3 3 3 2 1 2 2 3 3 3 2 4 3 2 4 3 3 4 4
# [807] 1 2 3 2 3 4 2 3 4 2 4 2 3 3 2 3 2 3 3 3 2 3 2 2 3 4 2 0 3 2 2
# [838] 1 3 3 4 4 3 2 3 2 3 3 2 1 2 3 3 1 0 3 3 2 3 2 3 3 3 2 3 3 2 2
# [869] 3 2 3 2 3 3 3 0 2 3 2 2 2 2 2 3 3 3 2 3 2 3 3 2 2 3 4 3 3 3 2
# [900] 3 3 3 3 4 2 3 3 2 3 0 1 3 2 3 3 3 2 2 3 3 3 3 3 2 2 3 4 0 3 3
# [931] 3 2 3 4 2 3 3 3 3 3 4 2 3 3 2 3 2 3 4 4 3 3 1 3 4 3 0 3 4 3 3
# [962] 4 2 2 3 1 2 4 4 3 3 3 2 3 0 3 4 3 2 4 2 3 0 3 3 3 2 4 2 3 3 2
# [993] 3 3 3 3 3 3 4 3 4 3 3 3 4 3 3 3 2 3 3 3 2 2 3 3 4 3 4 2 3 3 3
# [1024] 3 3 2 3 2 3 3 3 3 3 3 3 3 4 4 3 3 3 0 4 3 2 2 3 3 3 2
## ah, I just wasn't good about recording "Morning Feel" early on, and since then
## there have been occasional slips (literally, with the headband)

Gustavo comments:

And by the way, instead of regressing Morning.Feel[n] on Drug[n] (a discrete variable taking values in {0,1}), it would make more sense to regress on an Exponentially-Weighted Moving Average of Drug, such as $Drug[n-1]+(\frac{1}{2} \times Drug[n-2])+(\frac{1}{4} \times Drug[n-3])+...$ which is modeling how much drug is present on the body. In the above example, I’m assuming a half-life of 1 day, so lambda=$\frac{1}{2}$. You could arguably select the lambda that gives you the best fit; just be wary of multiple testing.

15. The BEST analysis is powerful and provides much more information than a simple t-test would, but the various parameters in the table or the image are not self-explanatory; the curious should read Bayesian estimation supersedes the t test (Kruschke 2012). In the CSV, an SSCF.1 of 0 indicates membership in the original experiment, 1 indicates the dry period July-September, 2 indicates the vitamin D resumption post-original-experiment, and 3 indicates the vitamin D resumption post-September.
So:

# set up data
mydata <- read.csv("https://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning-control.csv")
originalcontrol <- subset(mydata, SSCF.1==0)
newcontrol <- subset(mydata, SSCF.1==1)
# clean missing data
originalcontrol <- originalcontrol$Morning.Feel[!is.na(originalcontrol$Morning.Feel)]
newcontrol <- newcontrol$Morning.Feel[!is.na(newcontrol$Morning.Feel)]
# run BEST MCMC group estimations
source("BEST.R")
mcmc = BESTmcmc(originalcontrol, newcontrol)
BESTplot(originalcontrol, newcontrol, mcmc, TRUE, ROPEeff=c(-0.1,0.1))
# SUMMARY.INFO
# PARAMETER        mean      median        mode     HDIlow     HDIhigh pcgtZero
# mu1        2.82199912  2.82184675  2.82109419  2.5425634   3.1008251       NA
# mu2        2.84712376  2.84744246  2.84233569  2.6205415   3.0777439       NA
# muDiff    -0.02512464 -0.02542602 -0.03361140 -0.3874754   0.3339228 44.43593
# sigma1     0.72900731  0.71760315  0.69447083  0.5330477   0.9474278       NA
# sigma2     0.88825472  0.88350888  0.87346099  0.7192899   1.0690516       NA
# sigmaDiff -0.15924742 -0.16410108 -0.17383105 -0.4269052   0.1171290 12.08159
# nu        41.98417254 33.62743916 17.74077514  3.2649758 104.0648983       NA
# nuLog10    1.51048794  1.52669380  1.57284008  0.8699835   2.1138309       NA
# effSz     -0.03198943 -0.03143175 -0.04438195 -0.4678744   0.4142259 44.43593

16. As usual:

mydata <- read.csv("https://www.gwern.net/docs/zeo/2012-zeo-vitamind-morning-control.csv")
originalcontrol <- subset(mydata, SSCF.1==0)
newcontrol <- subset(mydata, SSCF.1==1)
wilcox.test(originalcontrol$Morning.Feel, newcontrol$Morning.Feel)
#  Wilcoxon rank sum test with continuity correction
#
# data:  originalcontrol$Morning.Feel and newcontrol$Morning.Feel
# W = 886, p-value = 0.7103
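
An aside on footnote 14: Gustavo's suggestion of regressing on an exponentially-weighted moving average of the drug indicator, rather than on the raw 0/1 variable, can be sketched roughly as follows. This is not something I ran on the real data; `ewmaExposure` is a hypothetical helper whose name and signature are made up for illustration, the randomization sequence and outcome below are simulated toy data, and lambda=0.5 corresponds to his assumed half-life of 1 day.

```r
# Hypothetical helper (not from the original analysis): past-dose exposure,
#   exposure[n] = drug[n-1] + lambda*drug[n-2] + lambda^2*drug[n-3] + ...
# computed with the equivalent recursion
#   exposure[n] = drug[n-1] + lambda*exposure[n-1]
ewmaExposure <- function(drug, lambda=0.5) {
    exposure <- numeric(length(drug))   # exposure[1] stays 0: no earlier doses
    for (n in seq_along(drug)[-1]) {
        exposure[n] <- drug[n-1] + lambda * exposure[n-1]
    }
    exposure
}

# Toy sequence (made up, not experimental data) to show the decay:
ewmaExposure(c(1, 0, 0, 1, 1))
# [1] 0.000 1.000 0.500 0.250 1.125

# Simulated example of the regression itself; all numbers here are invented
set.seed(1)
drug <- rbinom(40, 1, 0.5)                                # simulated randomization
feel <- 3 - 0.5 * ewmaExposure(drug) + rnorm(40, sd=0.7)  # simulated Morning Feel
coef(lm(feel ~ ewmaExposure(drug)))
```

Fitting several values of lambda and keeping the best one would be easy to add, but as Gustavo warns, each extra lambda tried is another comparison to correct for.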