LessWrong and cryonics

How does LessWrong usage correlate with cryonics attitudes and signup rates?
statistics, transhumanism, survey, R
2013-01-01–2016-03-05 in progress certainty: possible importance: 5


Back in December 2012, Yvain noticed something odd in the 2012 LessWrong survey's cryonics responses: splitting respondents into LW 'veterans' and 'newbies', the newbies estimated a low probability that cryonics would work and none were signed up (as one would expect), but among the veterans, despite a sixth of them being signed up (an astonishingly high rate compared to the general population), their estimated probability was not higher than the newbies' but lower (the opposite of what one would expect). This is surprising since you would expect the estimated probability and the signup rates in various subgroups to move in the same direction: if one group believes cryonics is sure to work, then they will be more likely to decide the expense and social stigma are worth it; while if another group is certain cryonics cannot work, none of them will be signed up. So this result was a bit odd. This pattern also seemed to replicate in the 2013 survey results:

Proto-rationalists thought that, on average, there was a 21% chance of an average cryonically frozen person being revived in the future. Experienced rationalists thought that, on average, there was a 15% chance of same. The difference was marginally significant (p < 0.1).

…On the other hand, 0% of proto-rationalists had signed up for cryonics compared to 13% of experienced rationalists. 48% of proto-rationalists rejected the idea of signing up for cryonics entirely, compared to only 25% of experienced rationalists. So although rationalists are less likely to believe cryonics will work, they are much more likely to sign up for it. Last year's survey shows the same pattern.

Yvain's explanation for this anomaly is that it reflects the veterans' greater willingness to 'play the odds' and engage in activities with positive expected value even if the odds are against those activities paying off, and that this may be a causal effect of spending time on LW. (Although of course there are other possibilities: eg older LWers may be drawn more from hardcore transhumanists with good STEM or statistical experience, explaining both aspects of the pattern; newer LWers, especially ones brought in by Yudkowsky's Harry Potter and the Methods of Rationality novel, may be more likely to be from the general population or humanities.)

The figures Yvain uses are the population averages (a 2×2 contingency table of experienced vs signed-up), however, and not the result of individual-level regressions. That is, as pointed out by Mitchell Porter, these figures may be driven by issues like the ecological fallacy or Simpson's paradox. For example, perhaps among the veterans there are two populations - some optimists with very high probabilities who all sign up, but also many pessimists with very low probabilities (low enough to slightly more than counterbalance the optimists) who never sign up; in this hypothetical, there is no one who is both pessimistic and signed up (as predicted by the averages), yet the numbers come out the same way. Just from the overall aggregate numbers, we can't see whether something like this is happening, and after a quick look at the 2013 data, Mitchell notes:

…"experienced rationalists" who don't sign up for cryonics have an average confidence in cryonics of 12%, and "experienced rationalists" who do sign up for cryonics, an average confidence of 26%.

Is this right?
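To see concretely how this can happen, here is a toy example (all numbers invented purely for illustration): if a minority of veterans are optimists who all sign up while the rest are pessimists who don't, the veterans' average probability can fall below the newbies' even though, within the veteran group, probability and signup are positively related.

## toy illustration of how group averages can mask the individual-level relationship
newbies  <- data.frame(group="newbie",  p=rep(0.25, 100), signedUp=FALSE)
veterans <- data.frame(group="veteran", p=c(rep(0.60, 15), rep(0.10, 85)),
                       signedUp=c(rep(TRUE, 15), rep(FALSE, 85)))
toy <- rbind(newbies, veterans)
aggregate(p        ~ group, data=toy, FUN=mean) # veterans' mean (0.175) < newbies' (0.25)
aggregate(signedUp ~ group, data=toy, FUN=mean) # yet 15% of veterans are signed up vs 0% of newbies
with(subset(toy, group=="veteran"), cor(p, as.numeric(signedUp))) # within veterans: positive (here, +1)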

Statistical considerations

A few things make this question a bit challenging to investigate. We don't have as much data as we think we have, and analysis choices can reduce that a great deal:

  1. being signed up for cryonics is rare (Alcor has a grand total of 1,062 people signed up worldwide, and 144 cryopreserved members), and while there are relatively a lot of cryonicists on LW, they're still rare. Rare things are hard to investigate and prone to sampling error
  2. high-karma users aren't that common either
  3. the survey datasets are not huge: generally less than a thousand responses each
  4. the datasets are heavily riddled with missing data: people didn't answer a lot of questions. The usual stats-program response is that if a particular subject is missing any of the variables looked at, it will be dropped entirely. So if we're looking at a regression involving time-on-LW, probability-cryonics-will-work, and cryonics-something-or-other, from each survey we won't get 1000 usable responses; we may instead have ~300 complete cases.
  5. dichotomizing continuous variables (such as time & karma into veteran vs newbie, and cryonics status into signed-up vs not) loses information
  6. percentages are not a good unit to work in, as linear regressions do not respect the range 0-100%, and the extremes get crushed (0.1 looks much the same as 0.001)
  7. our analysis question is not exactly obvious: we are not interested in a simple test of difference, nor a regression of two independent variables (such as glm(SignedUp ~ Karma + Probability, family=binomial)), but in comparing two alternate models of the three variables: one model in which karma predicts both probability (-) and signed-up status (+) and probability predicts signed-up (+) as well, and a second model in which karma predicts probability (-) but only probability predicts signed-up status (+).
  8. these variables may have considerable measurement error in them: the total karma variable may reflect time-wasting as much as contributions, but using length of LW/OB involvement instead has a similar problem.

Some fixes for this:

  • 1-3: we can scrape together as much data as possible by combining all the surveys together (2009, 2011, 2012, 2013, 2014; there was no 2010 or 2015 survey), dropping responses where they indicated they had answered a previous survey, and including a year variable to model heterogeneity
  • 4: we can reduce missingness under a missing-completely-at-random assumption by using a multiple-imputation package like MICE, or any imputation options in our chosen library (see the sketch after this list)
  • 5: we can avoid dichotomizing variables at all
  • 6: the probability/percentages can be transformed into logit units, which will work much better in linear regressions
  • 7: the two competing models can be easily written down and compared using the generalization of linear models, structural equation modeling (SEM). We can then see which one fits better and quantify the relative Bayesian strength of the two models
  • 8: we can pull in additional variables related to community participation and use them, karma, and length of involvement to extract a latent variable hopefully corresponding meaningfully to experience/veteran-ness; if we're using a SEM package and can deal with missingness, this will be easier than it sounds.
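For reference, a minimal sketch of the multiple-imputation route, using mice's default imputation methods on the transformed variables constructed later in the analysis (the SEM fits below end up relying on full-information maximum likelihood, missing="fiml", instead):

library(mice)
## impute the core triplet m=5 times under mice's defaults, fit the naive signup regression
## in each completed dataset, and pool the estimates across the imputations:
imp    <- mice(cryonics[, c("CryonicsStatusN", "PCryonicsLogit", "KarmaScoreLog")], m=5, seed=2016)
pooled <- pool(with(imp, lm(CryonicsStatusN ~ PCryonicsLogit + KarmaScoreLog)))
summary(pooled)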

Analysis

Data preparation

PCryonics, KarmaScore, CryonicsStatus were asked every year and have not changed AFAIK. The community-related survey variables have changed from year to year, though; matching up the 2009-2014 variables we have available:

2009:                  OB.posts,   Time.in.Community,  Time.per.day
2011:                  Sequences,  TimeinCommunity,    TimeonLW,     Meetups
2012: LessWrongUse,    Sequences,  TimeinCommunity,    TimeonLW,     Meetups,  HPMOR
2013: Less.Wrong.Use,  Sequences,  TimeinCommunity,    TimeonLW,     Meetups,  HPMOR
2014: LessWrongUse,    Sequences,  TimeinCommunity,    TimeonLW,     Meetups,  HPMOR
  • LessWrongUse: consistent
  • OB.posts=Sequences; both are %s, and the phrasing is consistent:
    • OB.posts:

      What percentage of the Overcoming Bias posts do you think you've read? (Eliezer has about 700, and Robin probably around the same. If you've read all of Eliezer's but none of Robin's, or vice versa, please mention that in the answer)

    • Sequences:

      About how much of the Sequences - the collection of Eliezer Yudkowsky's original posts - have you read? You can find a list of them at http://wiki.lesswrong.com/wiki/Sequences

  • TimeinCommunity: this variable turns out to be unusable, as reading through responses, the data is too dirty and inconsistent to be used at all without massive editing (everyone formatted their response differently, some didn't understand the units, etc. Reminder to anyone running a survey: free response is the devil.)
  • TimeonLW=Time.per.day: consistent; phrasing:
    • Time.per.day:

      In an average day, how many minutes do you spend on Overcoming Bias and Less Wrong?

    • TimeonLW:

      How long, in approximate number of minutes, do you spend on Less Wrong in the average day?

  • Meetups: inconsistent:
    • 2009, 2011, 2012: True/False
    • 2013, 2014: "Yes, regularly"/"Yes, once or a few times"/"No"
  • HPMOR: consistent

Reading in, merging, cleaning, and writing back out the survey data:

# 2009: http://lesswrong.com/lw/fk/survey_results/
# https://docs.google.com/forms/d/1X-tr2qzvvHzWpRtZNeXHoeBr30uCms7SlMkhbCmuT4Q/viewform?formkey=cF9KNGNtbFJXQ1JKM0RqTkxQNUY3Y3c6MA
survey2009 <- read.csv("https://www.gwern.net/docs/lwsurvey/2009.csv", header=TRUE)
s2009 <- with(survey2009, data.frame(Year=2009, PCryonics=Probability..Cryonics, KarmaScore=LW.Karma, CryonicsStatus=Cryonics, Sequences=OB.posts, TimeonLW=Time.per.day))
s2009$Meetups <- NA; s2009$LessWrongUse <- NA; s2009$HPMOR <- NA
s2009$Year <- 2009

# 2011: http://lesswrong.com/lw/8p4/2011_survey_results/
# https://docs.google.com/forms/d/1f2oOFHPjcWG4SoT57LsWkYsnNXgY1gkbISk4_FDQ3fc/viewform?formkey=dHlYUVBYU0Q5MjNpMzJ5TWJESWtPb1E6MQ
survey2011 <- read.csv("https://www.gwern.net/docs/lwsurvey/2011.csv", header=TRUE)
survey2011$KarmaScore <- survey2011$Karma # rename for consistency
## no 'PreviousSurveys' question asked, err on the side of inclusion
s2011 <- subset(survey2011, select=c(PCryonics, KarmaScore, CryonicsStatus, Sequences, TimeonLW, Meetups))
s2011$LessWrongUse <- NA; s2011$HPMOR <- NA
s2011$Year <- 2011

# 2012: http://lesswrong.com/lw/fp5/2012_survey_results/
# https://docs.google.com/spreadsheet/viewform?formkey=dG1pTzlrTnJ4eks3aE13Ni1lbV8yUkE6MQ#gid=0
survey2012 <- read.csv("https://www.gwern.net/docs/lwsurvey/2012.csv", header=TRUE)
s2012 <- subset(survey2012, PreviousSurveys!="Yes", select=c(PCryonics, KarmaScore, CryonicsStatus, LessWrongUse, Sequences, TimeonLW, Meetups, HPMOR))
s2012$Year <- 2012

# 2013: http://lesswrong.com/lw/jj0/2013_survey_results/
# https://docs.google.com/spreadsheet/viewform?usp=drive_web&formkey=dGZ6a1NfZ0V1SV9xdE1ma0pUMTc1S1E6MA#gid=0
survey2013 <- read.csv("https://www.gwern.net/docs/lwsurvey/2013.csv", header=TRUE)
survey2013$LessWrongUse <- survey2013$Less.Wrong.Use # rename for consistency
survey2013$PCryonics <- survey2013$P.Cryonics.
survey2013$KarmaScore <- survey2013$Karma.Score
survey2013$CryonicsStatus <- survey2013$Cryonics.Status
survey2013$TimeonLW <- survey2013$Time.on.LW
s2013 <- subset(survey2013, Previous.Surveys.1!="Yes", select=c(PCryonics, KarmaScore, CryonicsStatus, LessWrongUse, Sequences, TimeonLW, Meetups, HPMOR))
s2013$Year <- 2013

# 2014: http://lesswrong.com/lw/lhg/2014_survey_results/
# https://docs.google.com/forms/d/1h4IisKq7p8CRRVT_UXMSiKW6RE5U5nl1PLT_MvpbX2I/viewform
survey2014 <- read.csv("https://www.gwern.net/docs/lwsurvey/2014.csv", header=TRUE)
s2014 <- subset(survey2014, PreviousSurveys!="Yes", select=c(PCryonics, KarmaScore, CryonicsStatus, LessWrongUse, Sequences, TimeonLW, Meetups, HPMOR))
s2014$Year <- 2014

all <- rbind(s2009, s2011, s2012, s2013, s2014)

# Clean up:
all[!is.na(all$HPMOR) & (all$HPMOR == "" | all$HPMOR == " "),]$HPMOR <- NA
all[!is.na(all$HPMOR) & (all$HPMOR == "Yes all of it"),]$HPMOR <- "Yes, all of it"
all$HPMOR <- as.factor(all$HPMOR)

all$TimeonLW <- as.numeric(as.character(all$TimeonLW))

all$Meetups <- grepl("Yes", all$Meetups)

all[!is.na(all$TimeonLW) & all$TimeonLW>300,]$TimeonLW <- NA

Sequences <- regmatches(all$Sequences, regexec("[[:digit:]].", as.character(all$Sequences)))
all$Sequences <- as.integer(unlist({Sequences[sapply(Sequences, length)==0] <- NA; Sequences}))

all[!grepl("^I", all$LessWrongUse),]$LessWrongUse <- NA
all$LessWrongUse <- sub(",", "", all$LessWrongUse)

## PCryonics is *supposed* to be a percentage written down as a naked number, but some people include "%" as well or other text;
## so remove '%', convert to numeric, and then convert to decimal probability, rounding >100 down to 100 & <0 to 0
probabilityRange <- function(x) { if (is.na(x)) { return(NA);} else {if (x>100) { return(100); } else { if (x<0) { return(0); } else {return(x)}}}}
all$PCryonics <- sapply(as.numeric(sub("%", "", as.character(all$PCryonics))), probabilityRange) / 100
## Karma score is relatively clean (note: it can be negative, but karma is integral, not real):
all$KarmaScore <- as.integer(round(as.numeric(as.character(all$KarmaScore))))
## CryonicsStatus has 14 levels and is tricky; first, code the empty responses as missing data
all[!is.na(all$CryonicsStatus) & (all$CryonicsStatus == "" | all$CryonicsStatus == " "),]$CryonicsStatus <- NA

# Done:
write.csv(all, file="~/wiki/docs/lwsurvey/2009-2015-cryonics.csv", row.names=FALSE)

Analysis

First we need to load the data and convert the textual multiple-choice responses to ordinal factors which we can treat as numeric values:

cryonics <- read.csv("https://www.gwern.net/docs/lwsurvey/2009-2015-cryonics.csv",
 colClasses=c("factor", "numeric", "numeric", "factor", "numeric", "numeric","logical","factor","factor"))
## now, express as ordinal, ranging from most extreme no to most extreme yes;
## so we can treat it as categorical, ordinal, or integer; we have to manually specify this metadata unless we want to drop down to
## integer-coding and delete the character responses, oh well.
cryonics$CryonicsStatus <- ordered(cryonics$CryonicsStatus,
 levels=c("No, and don't plan to", "No, and not planning to", "No - and do not want to sign up for cryonics", "No", "Never thought about it / don't understand", "No, never thought about it", "No, but considering it", "No - still considering it", "No - would like to sign up but haven't gotten around to it","No - would like to sign up but unavailable in my area", "Yes - signed up or just finishing up paperwork", "Yes"))
cryonics$LessWrongUse <- ordered(cryonics$LessWrongUse,
 levels=c("I lurk but never registered an account", "I've registered an account but never posted", "I've posted a comment but never a top-level post", "I've posted in Discussion but not Main", "I've posted in Main"))
cryonics$HPMOR <- ordered(cryonics$HPMOR,
 levels=c("No", "Started it but haven't finished","Yes, all of it"))
summary(cryonics)
#    Year        PCryonics           KarmaScore
#  2009: 153   Min.   :0.0000000   Min.   :  -20.000
#  2011: 930   1st Qu.:0.0100000   1st Qu.:    0.000
#  2012: 649   Median :0.1000000   Median :    0.000
#  2013: 962   Mean   :0.2129401   Mean   :  192.535
#  2014:1226   3rd Qu.:0.3000000   3rd Qu.:   43.000
#              Max.   :1.0000000   Max.   :15000.000
#              NA's   :565         NA's   :1120
#                                                     CryonicsStatus   Sequences
#  No - still considering it                                 :902    Min.   : 0.00000
#  No - and do not want to sign up for cryonics              :610    1st Qu.:25.00000
#  No, but considering it                                    :594    Median :25.00000
#  No - would like to sign up but haven't gotten around to it:393    Mean   :41.75323
#  No, and not planning to                                   :330    3rd Qu.:50.00000
#  (Other)                                                   :532    Max.   :99.00000
#  NA's                                                      :559    NA's   :1598
#     TimeonLW          Meetups                                                  LessWrongUse
#  Min.   :  0.00000   Mode :logical   I lurk but never registered an account          :1071
#  1st Qu.:  5.00000   FALSE:3174      I've registered an account but never posted     : 370
#  Median : 10.00000   TRUE :746       I've posted a comment but never a top-level post: 611
#  Mean   : 16.27418   NA's :0         I've posted in Discussion but not Main          : 233
#  3rd Qu.: 20.00000                   I've posted in Main                             :  95
#  Max.   :300.00000                   NA's                                            :1540
#  NA's   :703
#                              HPMOR
#  No                             : 516
#  Started it but haven't finished: 319
#  Yes, all of it                 :1018
#  NA's                           :2067
total <- nrow(cryonics); total
# [1] 3920
full <- nrow(cryonics[!is.na(cryonics$PCryonics) & !is.na(cryonics$KarmaScore) & !is.na(cryonics$CryonicsStatus),]); full
full / total
# [1] 0.6420918367
cryonics$CryonicsStatusN <- as.integer(cryonics$CryonicsStatus)
cor(subset(cryonics, select=c(CryonicsStatusN, PCryonics, KarmaScore)), use="complete.obs")
#                 CryonicsStatusN      PCryonics     KarmaScore
# CryonicsStatusN   1.00000000000  0.27390817380  0.05904632281
# PCryonics         0.27390817380  1.00000000000 -0.04139505356
# KarmaScore        0.05904632281 -0.04139505356  1.00000000000

So for the fundamental triplet of probability/karma/signup, we lose 36% of responses to missingness in one or more of the 3 variables. We can see in the complete cases the first-order correlations Yvain was spotting: karma has the predicted positive correlation with signup but the predicted negative correlation with probability. The sizes of the correlations are not impressive, but we can guess why not: karma is heavily skewed (median 0, mean 192, max 15,000) and probability has a misleadingly narrow range (0-1), so one variable is grossly variable while the other is not variable enough.

To deal with the karma problem, we shift that -20 minimum up to 0, and then we shrink the scores with a log-transform. This gives us a more comprehensible distribution of 0-10. To deal with the probability, we want to do the opposite: expand it out so 0.99 is meaningfully smaller than 0.9999 etc, with a logit transform. The logit transform has a wrinkle: a lot of people (particularly those hostile to cryonics) are bad at probabilities and gave 0/1 values, which is unhelpful, so I truncate the 0/1s to extreme but finite values; after that, the logit transform works nicely.
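For instance, base R's qlogis() gives the logit, and shows how the transform spreads out differences that raw probabilities crush together:

## the logit spreads out the extremes: 0.99 vs 0.9999 is a visible gap on this scale
qlogis(c(0.5, 0.9, 0.99, 0.9999))
# [1] 0.000000 2.197225 4.595120 9.210240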

Now when we graph the triplet, the plot shows us something interesting:

cryonics$KarmaScoreLog <- log1p(cryonics$KarmaScore + abs(min(cryonics$KarmaScore, na.rm=TRUE)))
### 0 & 1 are not real subjective probabilities; truncate probabilities
cryonics[!is.na(cryonics$PCryonics) & cryonics$PCryonics<1e-15,]$PCryonics <- exp(-20)
cryonics[!is.na(cryonics$PCryonics) & cryonics$PCryonics==1,]$PCryonics <- 1 - exp(-10) # truncate 1s to just under 1 (logit ~ +10), paralleling the exp(-20) floor
### now that the probabilities are real, convert them into logits to avoid decimal's distortion of extremes
cryonics$PCryonicsLogit <- log(cryonics$PCryonics / (1 - cryonics$PCryonics))

## visualize
library(ggplot2)

qplot(KarmaScoreLog, jitter(PCryonicsLogit), color=CryonicsStatusN, data=cryonics,
      ylab="Probability of cryonics working (logits)", xlab="LessWrong karma (logged)") + geom_point(size=I(3))
LW karma vs estimated probability of cryonics working vs degree to which one wants to sign up (2009-2014 LW survey data)

We see a striking decrease in variance with karma - a funnel shape - which looks as if, with increasing LW karma, there is a convergence on a mean logit of roughly -2, or ~12%. Such a convergence would be a strike against the idea that high-karma users sign up only because they are overconfident about cryonics - the heteroscedasticity should then work in the other direction, with higher variance in the high-karma users, permitting the low-probability & non-signup respondents to overcompensate for the high-probability & signup users. The data is not visually clear about whether among high-karma users (say, 6+) a high probability predicts a higher signup variable; maybe it does, maybe it doesn't, I can see it either way.
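As a rough check on the funnel (exploratory, and sensitive to the truncated 0/1 responses at the extremes), we can look at the spread of the logit-probabilities within bins of logged karma, and at the probability-signup correlation among just the high-karma users:

## spread of logit-probability within karma bins: does the variance shrink as karma increases?
cryonics$KarmaBin <- cut(cryonics$KarmaScoreLog, breaks=0:10, include.lowest=TRUE)
aggregate(PCryonicsLogit ~ KarmaBin, data=cryonics, FUN=sd)
## among high-karma users only, does estimated probability still track signup?
with(subset(cryonics, KarmaScoreLog > 6),
     cor(PCryonicsLogit, CryonicsStatusN, use="complete.obs"))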

So to go back to our interpretations: we want to predict cryonics signup. We have 2 competing models: in one model, signup is influenced only by the estimated probability (with karma acting on signup solely through that probability); in the other, karma influences signup directly as well as through the probability.

We fit the Bayesian versions of these models with blavaan; its prerequisites are not on CRAN but on Bioconductor, so you need to do:

source("http://bioconductor.org/biocLite.R"); biocLite(c("graph", "Rgraphviz"))

# turn ordered factors into integers, and Year into dummy variables (Year2009, Year2011, Year2012, Year2013, Year2014) because Blavaan isn't smart enough to do that on its own:
cryonicsExpanded <- with(cryonics, data.frame(model.matrix(~Year+0), CryonicsStatusN, KarmaScoreLog, PCryonicsLogit, Sequences, TimeonLW, Meetups, LessWrongUse, HPMOR))
cryonicsExpanded$Meetups <- as.integer(cryonicsExpanded$Meetups)
cryonicsExpanded$LessWrongUse <- as.integer(cryonicsExpanded$LessWrongUse)
cryonicsExpanded$HPMOR <- as.integer(cryonicsExpanded$HPMOR)
cryonicsExpanded$SequencesLog <- log1p(cryonicsExpanded$Sequences) # log1p, since Sequences can be 0
cryonicsExpanded$TimeonLWLog <- log1p(cryonicsExpanded$TimeonLW)

library(blavaan)
Cryonics.model1 <- '
                    PCryonicsLogit  ~                  KarmaScoreLog
                    CryonicsStatusN ~ PCryonicsLogit + KarmaScoreLog
                   '
b <- bsem(model = Cryonics.model1, data = na.omit(cryonicsExpanded[,1:8]), dp=dpriors(beta="dnorm(0,1e-1)", nu = "dnorm(0,1e-2)")); summary(b)

## the same model again, with blavaan's default priors:
l <- bsem(model = Cryonics.model1, data = na.omit(cryonicsExpanded[,1:8])); summary(l)


Cryonics.model.latent <- '
                    Experience =~ KarmaScoreLog + SequencesLog + TimeonLWLog + Meetups + LessWrongUse + HPMOR
                    PCryonicsLogit  ~                  Experience
                    CryonicsStatusN ~ PCryonicsLogit + Experience
                   '
blat <- bsem(model = Cryonics.model.latent, data = na.omit(cryonicsExpanded), dp=dpriors(beta="dnorm(0,1e-1)", nu = "dnorm(0,1e-2)")); summary(blat)


## visualize the fitted latent-variable model (requires the semPlot package):
library(semPlot)
semPaths(blat, "est", exoVar = FALSE, exoCov = FALSE, nCharNodes=10, layout="tree2", sizeMan=10, sizeMan2=10, residuals=FALSE, label.prop=5, edge.label.cex=1.3, mar=c(1.5,1.5,1.5,1.5))


## Bayes-factor comparison of the two competing models (the model strings Cryonics.model1.c & Cryonics.model2.c are defined in the lavaan section below):
Cryonics.fit1.cb <- bsem(model = Cryonics.model1.c, data = na.omit(cryonicsExpanded[,1:8]), jagcontrol=list(method="rjparallel"))
Cryonics.fit2.cb <- bsem(model = Cryonics.model2.c, data = na.omit(cryonicsExpanded[,1:8]), jagcontrol=list(method="rjparallel"))
BF(Cryonics.fit1.cb, Cryonics.fit2.cb)
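The broad/narrow dichotomizations of cryonics status used below (CryonicsStatusPbroad, CryonicsStatusPnarrow) are never constructed above; here is a plausible reconstruction from the ordinal coding, with the cutpoints being my guess rather than a definition taken from the original analysis:

## assumed cutpoints on the 12-level ordinal CryonicsStatusN (see the ordered() levels above):
## - broad:  "would like to sign up" or better (levels 9-12)
## - narrow: actually signed up or finishing paperwork (levels 11-12)
cryonics$CryonicsStatusPbroad  <- as.integer(cryonics$CryonicsStatusN >= 9)
cryonics$CryonicsStatusPnarrow <- as.integer(cryonics$CryonicsStatusN >= 11)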

## turn Year into dummy variables (Year2009, Year2011, Year2012, Year2013, Year2014) because Lavaan isn't smart enough to do that on its own:
cryonics <- with(cryonics, data.frame(model.matrix(~Year+0), CryonicsStatusN, KarmaScoreLog, PCryonicsLogit, CryonicsStatusPbroad,  CryonicsStatusPnarrow, Sequences, TimeonLW, Meetups, LessWrongUse, HPMOR))

library(lavaan)

## the two models, continuous/ordinal response to cryonics status:
Cryonics.model1.c <- '
                    PCryonicsLogit  ~ KarmaScoreLog   + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusN ~ PCryonicsLogit  + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit1.c <- sem(model = Cryonics.model1.c, missing="fiml", data = cryonics)
summary(Cryonics.fit1.c)
Cryonics.model2.c <- '
                    PCryonicsLogit ~ KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusN ~ PCryonicsLogit + KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit2.c <- sem(model = Cryonics.model2.c, missing="fiml",  data = cryonics)
summary(Cryonics.fit2.c)
anova(Cryonics.fit1.c, Cryonics.fit2.c)
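Point 7 above asked for the relative Bayesian strength of the two models; a cheap approximation on top of the likelihood-ratio test is the BIC difference between the two ML fits (BF ≈ exp(ΔBIC/2)):

## approximate Bayes factor in favor of the simpler model 1 via the BIC approximation:
deltaBIC <- fitMeasures(Cryonics.fit2.c, "bic") - fitMeasures(Cryonics.fit1.c, "bic")
exp(deltaBIC / 2)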

## broad dichotomization:
cryonics$CryonicsStatusPbroad <- as.integer(cryonics$CryonicsStatusPbroad)
Cryonics.model1.d.b <- '
                    PCryonicsLogit        ~ KarmaScoreLog   + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusPbroad ~ PCryonicsLogit  + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit1.d.b <- sem(model = Cryonics.model1.d.b, link="logit", missing="fiml", data = cryonics)
summary(Cryonics.fit1.d.b)
Cryonics.model2.d.b <- '
                    PCryonicsLogit ~ KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusPbroad ~ PCryonicsLogit + KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit2.d.b <- sem(model = Cryonics.model2.d.b, link="logit",  missing="fiml",  data = cryonics)
summary(Cryonics.fit2.d.b)
anova(Cryonics.fit1.d.b, Cryonics.fit2.d.b)

## narrow dichotomization:
Cryonics.model1.d.n <- '
                    PCryonicsLogit        ~ KarmaScoreLog   + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusPnarrow ~ PCryonicsLogit  + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit1.d.n <- sem(model = Cryonics.model1.d.n, link="logit", missing="fiml", data = cryonics)
summary(Cryonics.fit1.d.n)
Cryonics.model2.d.n <- '
                    PCryonicsLogit ~ KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusPnarrow ~ PCryonicsLogit + KarmaScoreLog + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit2.d.n <- sem(model = Cryonics.model2.d.n, link="logit",  missing="fiml",  data = cryonics)
summary(Cryonics.fit2.d.n)
anova(Cryonics.fit1.d.n, Cryonics.fit2.d.n)


cryonics$LessWrongUse <- as.integer(cryonics$LessWrongUse)
cryonics$HPMOR <- as.integer(cryonics$HPMOR)

Cryonics.model3.d.n <- '
                    Experience =~ KarmaScoreLog + Sequences + TimeonLW + Meetups + LessWrongUse + HPMOR + Year2009 + Year2011 + Year2012 + Year2013
                    PCryonicsLogit ~ Experience + Year2009 + Year2011 + Year2012 + Year2013
                    CryonicsStatusPnarrow ~ PCryonicsLogit + Experience + Year2009 + Year2011 + Year2012 + Year2013
                   '
Cryonics.fit3.d.n <- sem(model = Cryonics.model3.d.n, link="logit",  missing="fiml",  data = cryonics)
summary(Cryonics.fit3.d.n)
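To pull out just the structural paths of interest from the latent-variable fit - whether Experience still predicts signup once the probability estimate is controlled for - the standard lavaan accessors suffice:

## structural regressions only (standardized), from the latent-variable model:
subset(parameterEstimates(Cryonics.fit3.d.n, standardized=TRUE), op == "~")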

TODO:

  • age, degree, physics specialty? http://slatestarscratchpad.tumblr.com/post/114103150216/su3su2u1-i-was-more-referring-to-the-further http://slatestarscratchpad.tumblr.com/post/114100766461/su3su2u1-slatestarscratchpad-when-i-defined http://slatestarscratchpad.tumblr.com/post/114097328806/thinkingornot-su3su2u1-thinkingornot-maybe-you (can steal degree unfolding from )