Online education like Khan Academy has been hailed as a major innovation which will revolutionize higher & lower education, educate students better, and cut costs. MOOCs are an interesting idea and worth trying, though overall I take a fairly skeptical attitude towards them: they seem like a clear example of Amara’s Law (“We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.”), and in general they are not doing a good job of exploiting the possibilities of the Web (no MOOC I’ve seen provides a learning tool even a tenth as good as Bret Victor’s “Up and Down the Ladder of Abstraction”).

One question that interests me is what the long-term effects might be. In general, changes do not preserve all relative positions or ratios - someone benefits disproportionately, someone else benefits only a little. It seems highly unlikely to me that online education will reduce all costs equally, or educate all students better by the same degree.

So what differentials can we expect from online education? Hoary articles from the ‘90s about the ‘digital divide’ might make one predict that it will benefit middle and upper-class whites; on the other hand, proponents love to talk about favored minorities (eg a foreign black female, like a girl in an African village) who can now access online education through cheap cellphones, so one might instead predict that online education will help level playing fields. No longer will there be a big gap between receiving essentially no education and receiving a real education, a gap that perpetuates cycles of poverty. As Internet access becomes more common than access to quality schools, quality schooling delivered through the Internet will have an equalizing effect (the elites will be no better off than before, and the non-elites now have the chance to obtain a prerequisite to becoming an elite).

# Success factors

It may help to ask what causes success in education and see how online education affects it. To a first approximation, ignoring environment, one earns educational success through:

1. IQ/g

IQ obviously predicts a huge chunk of educational success (leading to the ironic accusation that IQ tests are only academic questions), since the smarter one is, the easier it is to learn anything, one’s schoolwork included.
2. Conscientiousness (one of the Big Five personality traits: hard work, grit, effort)

If one is not smart enough that one can simply inhale lessons and pass tests, one still has the option of working hard: doing extra practice problems, asking for help, etc. Success will not come easy, but it will still come. These 2 factors together will correlate somewhere around 0.7 with educational success: someone who is smart and hard-working will go to the top, and someone who is stupid and lazy will not.
3. Miscellaneous

The rest of the correlation is made up of socioeconomic status, culture (eg. East Asian?) and random other things: random life events or hard-to-measure environmental factors like an extra-inspiring teacher, etc.

## Conscientiousness

IQ is very well studied, with a thorough literature going back nearly a century; it correlates with scads of good outcomes. But Conscientiousness is far more obscure, so it’s worth giving background on why we might mention it in the same breath as IQ.

A famous & much-cited 1991 meta-analysis, Barrick & Mount’s “The Big Five Personality Dimensions and Job Performance: A Meta-Analysis”, found that Conscientiousness correlated (~0.2; possibly ~0.31) with various job performance measurements even after controlling for all the obvious things like IQ & education, as did a followup survey in 1996. Conscientiousness correlates weakly with IQ in the first place (but maybe not); correlates with success in medical school or as a teacher or in spelling bees, along with all the correlations with educational success (Noftle & Robins 2007; Poropat 2009; Hsu & Schombert 2010; Chamorro-Premuzic & Furnham 2008), and in particular may determine one’s success in online education (Elvers et al 2003); correlates with educational credentials after mental ability has been controlled; correlates with not having been in jail and predicts later criminal records; correlates more strongly (summary) than IQ with socioeconomic status (SES) and lifetime income, and almost as strongly as IQ with occupational status (and predicts employment); like IQ, Conscientiousness correlates with being thinner, reduced mental & physical disease, and longevity (both as children and adults; see also Bogg & Roberts 2004, Hampson & Goldberg 2006, Turiano et al 2013); correlates (0.4) with ‘overall quality of life’ and (0.25) ‘happiness’ (Steel et al 2008, “Refining the relationship between personality and subjective well-being”). Some studies show correlations with divorce rates, SES, and longevity, or simply with nearly every behavior relevant to longevity (see Roberts & Bogg 2004). A study of the gifted Terman kids (similar to SMPY results) found that for these bright-to-brilliant kids, Conscientiousness affects lifetime earnings (usually $2-3 million) even more than IQ (although only a bit more than Extraversion); this is not due solely to it increasing how much education the participants got. Eyeballing the graphed correlations on page 45, it seems that going from the 10th percentile of Conscientiousness to the 90th was worth ~$800,000. (It’s worth noting that there is a trait, ‘Grit’, which is very similar to Conscientiousness with a longer-term perspective and less feedback, but which seems to correlate better with GPA, military academy graduation, and spelling bee performance.) Along with Openness to Experience, Conscientiousness is one of the main correlates of being a creative scientist (stronger than Introversion!).

And there is one key difference between IQ and Conscientiousness: increasing IQ is a tricky and often impossible task, but there is weak evidence that Conscientiousness can be improved by trying harder tasks. (There is an irony here - it’s hard tedious work to develop the ability to do hard tedious work, so how does one start?) Interestingly, Conscientiousness increases steadily over a lifetime (in contradistinction to IQ’s steady fall), which is a hopeful observation. I like how Richard Hamming put it in “You and Your Research”:

…Now for the matter of drive. You observe that most great scientists have tremendous drive. I worked for ten years with John Tukey at Bell Labs. He had tremendous drive. One day about three or four years after I joined, I discovered that John Tukey was slightly younger than I was. John was a genius and I clearly was not. Well I went storming into Bode’s office and said, “How can anybody my age know as much as John Tukey does?” He leaned back in his chair, put his hands behind his head, grinned slightly, and said, “You would be surprised Hamming, how much you would know if you worked as hard as he did that many years.” I simply slunk out of the office!

What Bode was saying was this: “Knowledge and productivity are like compound interest.” Given two people of approximately the same ability and one person who works 10% more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity - it is very much like compound interest. I don’t want to give you a rate, but it is a very high rate. Given two people with exactly the same ability, the one person who manages day in and day out to get in one more hour of thinking will be tremendously more productive over a lifetime. I took Bode’s remark to heart; I spent a good deal more of my time for some years trying to work a bit harder and I found, in fact, I could get more work done. I don’t like to say it in front of my wife, but I did sort of neglect her sometimes; I needed to study. You have to neglect things if you intend to get what you want done. There’s no question about this.

Or to quote some Harry Potter fanfiction instead:

[Harry:] “Where would I go, if not Ravenclaw?”

[Sorting Hat:] “Ahem. ‘Smart kids in Ravenclaw, evil kids in Slytherin, wannabe heroes in Gryffindor, and everyone who does the actual work in Hufflepuff.’ This indicates a certain amount of respect. You are well aware that Conscientiousness is just about as important as raw intelligence in determining life outcomes, you think you will be extremely loyal to your friends if you ever have some, you are not frightened by the expectation that your chosen scientific problems may take decades to solve…”

# Online education’s factors

How does online education affect each of these factors - does it reduce the need for that factor to reach a certain level of attainment, leave it alone, or increase the need for it?

1. IQ seems like it could go any way:

• Any effects could roughly cancel out, perhaps in some sort of compensating mechanism where students only aim at particular levels of mastery or performance and better or worse methods only change how much time they need to invest before they go off to play video games.
• It could increase the need for IQ, because now all the extraneous time-wasting ‘gunk’ like sharpening pencils or doing roll-call can be cleared away by the technical solutions, leaving more time for pure learning. By eliminating all the environmental hindrances and variation, the only variation left will come from the student’s innate intellectual abilities: IQ. Students will race through courses until they hit their natural limits; even Sal Khan’s videos can’t make a dim bulb calculate solutions to Schrödinger’s equation.

It has been noted in the psychometric literature that successful attempts to eliminate socio-economic penalties and provide quality environments for all children would necessarily increase the apparent contribution of heredity: if every child is in an environment that lets them develop and flourish to their fullest extent, then any remaining differences in their development will be due to hereditary factors! If variations in IQ are the joint product of variations in heredity and environment, then eliminating all variation in environment (setting the environmental variance to 0) means the remaining variation will be just the variation in heredity.
• It could reduce the need for IQ, since online education will lead to a marketplace of lessons where only the clearest, most insightful, easily understood lessons survive. In ordinary classrooms staffed by ordinary teachers, extemporaneous lectures or explanations are necessarily more opaque and lower-quality compared to a lecture that the world-class presenter has spent months or years honing.

But it is a utopian thought that perhaps everyone will be successful at education; so the question becomes, what trait or environmental factor would then become the best predictor of attainment? If you reduce the need for brains, then perhaps you still need motivation and appetite for work, which in conjunction with the previous point about joint products leads us to the next observations…
2. Conscientiousness is the joker. There is one clear possible change: online education will increase demand for Conscientiousness compared to offline education.

This has been suggested on more than one occasion[^1][^2]. This tallies with my personal experience with online courses and with classes that have online assignment components, like computer science classes (where class attendance may be optional and programming projects or homework are submitted remotely). I had a good deal of trouble just sitting down to do the course or assignment, even though it was not necessarily that difficult. The distractions on my laptop beckoned: I would go use crufty old Solaris boxes in the computer labs just to avoid the distractions and get something done. Other experiences were more dramatic: one CS exam was done on computers, with a built-in test suite you could run to get your exact grade, so one could spend hours working on it until one had a perfect 100 (which wasn’t terribly hard), which of course I did - so I was shocked when the teacher showed us the grade distribution and it looked like a normal CS exam distribution, with plenty of <100 scores and outright failures!
3. Miscellaneous is too varied and heterogeneous to be predictable, so we won’t discuss it further.
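
The variance argument in the IQ item above can be made concrete with a toy simulation. This is a minimal sketch under a deliberately simple assumption that is not in the text: each student's outcome is the sum of an independent hereditary term and an environmental term, both normally distributed. The function name `hereditary_share` and its parameters are hypothetical, chosen only for illustration.

```python
import random
import statistics


def hereditary_share(env_sd, heredity_sd=1.0, n=20000, seed=0):
    """Fraction of outcome variance attributable to the hereditary term.

    Assumes (for illustration only) a purely additive model:
    outcome = heredity + environment, with the two terms independent.
    """
    rng = random.Random(seed)
    heredity = [rng.gauss(0.0, heredity_sd) for _ in range(n)]
    # env_sd == 0 models a world where every child gets the same environment.
    outcome = [h + rng.gauss(0.0, env_sd) for h in heredity]
    return statistics.pvariance(heredity) / statistics.pvariance(outcome)


if __name__ == "__main__":
    for env_sd in (1.0, 0.5, 0.0):
        print(f"environment SD {env_sd}: hereditary share "
              f"{hereditary_share(env_sd):.2f}")
```

As the environmental spread shrinks toward 0, the hereditary share of the remaining variance climbs toward 1, even though nothing about heredity itself has changed.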

## Existing research

The general background of online education, demonstrated in a large DoE 2009 meta-analysis, is that a lot of studies are poor or not randomized controlled trials (unsurprisingly), but in the quality studies, online learning slightly outperforms regular classes and mixed classes outperform both online & offline classes[^3] (see also the similar results of Zhao et al 2005 & Bowen et al 2012). The age of students doesn’t seem to matter much, although the meta-analysis seemed to see a shrinking effect going from undergraduate college students to professionals or postgraduates[^4]. This meta-analytic result is broadly consistent with the picture previously painted: if we accept that online education should be better, then a small increase in average scores is consistent with some students benefiting much more and some losing a little; the mixed classes, with their face-to-face elements, compensate for lacks in Conscientiousness, giving more students the best of both worlds; and independent study hurts results for the exact same reason.

There is also some academic research directly examining personality factors, which compares online and offline performance and also collects personality data[^5]; I currently know of these relevant studies:

1. “Procrastination in Online Courses: Performance and Attitudinal Differences”, Elvers et al 2003; result:

There were no reliable differences between the 2 sections of the class on the measures of procrastination, exam performance, or attitudes toward the class. Yet, procrastination was negatively related with exam scores and with attitudes toward the class for the online students, but not for the lecture students. This difference may partially explain why online courses designed to increase the educational efficacy of a course often show no difference in performance when compared to lecture classes.

Background:

If procrastination is a problem in online classes, it would be desirable to know which students are most at risk for procrastination. Instructors could then offer the at-risk students interventions designed to reduce dilatory behaviors. Watson (2001) and Schouwenburg and Lay (1995) correlated self-reported procrastination with five factors of personality. Both found a reliable relation between self-reported procrastination and low conscientiousness. Watson found a reliable relation between procrastination and neuroticism. Schouwenburg and Lay also found some, but not all, facets of neuroticism to be related to procrastination.

What did the students say and what difference was found in their scores?

One question asked in the end-of-semester questionnaire was whether the student disliked the class because it was easy to get behind in the class. In the online class, 19 of 21 students reported that they disliked the class because it was easy to get behind. Only 13 of 23 students in the lecture class reported that they disliked the class because it was easy to get behind…However, the magnitude of the relation between procrastination and class performance and attitudes seemed to be larger for the online class than for the traditional class. Procrastination was a good predictor of performance for each of the five tests in the class for the online students, but not a good predictor of performance for any of the five tests for the lecture students.

Finally, the quote that really sums it all up:

Pedagogy suggests that activities such as online discussions, group writing projects, and immediate feedback on performance should lead to better performance. Thus, students in online classes, which often contain these activities, should have better performance in the class compared to traditional lecture classes, which often lack these activities. However, this is rarely the case. Russell (1999) cited more than 300 studies that failed to find any reliable difference in performance between traditional classes and classes at a distance (including correspondence courses, online courses, and telecourses). The observation that the magnitude of the relation between procrastination and exam scores was larger in this online class than in the lecture class could be a possible explanation for these null results. The additional activities in online classes that should increase performance may do just that. However, the decrements associated with dilatory behaviors in online classes may attenuate the increments associated with the additional activities. By reducing dilatory behaviors, the benefits of online classes may become more apparent.

2. Irani et al 2004, “Personality type and its relationship to distance education students’ course perceptions and performance”: non-randomized case study using MBTI
3. Kim & Schniederjans 2004, “The role of personality characteristics in web-based distance education courses”: in its sample of 140 students, online education worked best for those high on the Wonderlic PCI Success Scales for ‘Commitment to Work’ (“The tendency to remain on a job for a long time, and not be undependable, irresponsible, impulsive, disorganized, or lack persistence.”) and ‘Learning Orientation’ (“The tendency of an individual to be willing to engage in activities to acquire knowledge, skills, and behaviors and to learn new methods and procedures to improve job effectiveness, how interested they are in developing themselves, seek opportunities to learn new and different ways of doing things, and enrolled in training programs that they are likely to be active and fully engaged participants.”) Unfortunately, these are not exactly equivalent to Conscientiousness.
4. Schniederjans & Kim 2005, “Relationship of Student Undergraduate Achievement and Personality Characteristics in a Total Web-Based Environment: An Empirical Study”; similar to Kim & Schniederjans 2004, with 260 students. It found Conscientiousness statistically-significant, but also 3 others (Agreeableness, Neuroticism, and Openness) and not Extraversion. (Neither seems to include any effect size or whether Conscientiousness out-predicts the other factors; this may be due to my inability to interpret some of the provided statistics.)
5. Bassili 2006, “Promotion and prevention orientations in the choice to attend lectures or watch them online” measured only Neuroticism and Openness, so cannot tell us anything about Conscientiousness.
6. Bishop-Clark et al 2007, “The effects of personality type on web-based distance learning”; MBTI, unfortunately
7. Berenson et al 2008, “Emotional Intelligence as a Predictor for Success in Online Learning”; correlational study, collapses Conscientiousness with other items into a “persuasiveness” item which does correlate with higher online grades.
8. Bolliger & Avgerinou 2009, “Student Satisfaction with Online Courses Based on Personality Type”; just an abstract (and MBTI). More importantly, it’s not clear we can learn anything from surveys of satisfaction or happiness or enjoyment: Nemanich et al 2009, a quasi-experiment, found that in classrooms, higher enjoyment = higher scores, but that correlation was much weaker in their online setting
9. Avgerinou 2010, “Teacher vs. student satisfaction with online learning experiences based on personality type”: MBTI, does not report detailed information.
10. Abzug 2010, “E-conscientiousness and e-performance in online undergraduate management education”; did not measure Conscientiousness via standard questionnaire but via activity in the online course (a performance measure similar to that of Hedengren & Stratmann 2012’s item non-response way of measuring Conscientiousness which seemed to correlate well with traditional questionnaires)
11. Chahino 2011, “An exploration of personality type success in online classes”; uses DISC assessment (not Big Five or MBTI), finding no correlation with DISC results
12. Mellish 2011; correlational, using MBTI; 102 online students (83 female) were equally distributed among personality types (no offline control/comparison), no obvious personality correlation with performance
13. Varela et al 2012, “Online learning in management education: an empirical study of the role of personality traits”; quasi-experimental comparison of offline & online:

In testing H2, learning was regressed on conscientiousness. Results support the ability of conscientiousness to explain learning variance across groups (β=4.11, SE=1.41, p<.05). Then, learning was regressed on conscientiousness, initially, in the face-to-face sample and then, in the online group. While the regressor coefficient was not statistically-significant for the face-to-face group (β=3.49, SE=2.01, p>.05), the regression coefficient exhibits a stronger and [statistically-]significant effect size for the online group (β=4.59; SE=1.96, p<.05). Consistent with the expectations in H2, results corroborate that conscientiousness has a stronger ability to account for learning variance in online settings ($R^2$=.079) than in face-to-face contexts ($R^2$=.040).

14. Ellis & Howard 2012, “The Effects of Gender and Dominant Mental Processes on Hypermedia Learning”: MBTI, no offline
15. Yang et al 2012, “The impact of social capital and personality traits on students’ e-learning experience”; no randomization or comparison group, notable mainly that in their online marketing class, “Contradictory to this common belief, our findings show that the conscientiousness trait does not influence students’ e-learning experience. However, the social orientation trait does. Furthermore, this positive influence from the social orientation trait becomes stronger when larger social capital exists.”
16. Punnoose 2012, “Determinants of Intention to Use eLearning Based on the Technology Acceptance Model”; Thai master’s degree students; did not investigate any correlates of achievement but did find Conscientiousness had small correlations with attitudes towards the course.
17. Keller & Karau 2013, “The importance of personality in students’ perceptions of the online learning experience”: “The current research examined the relationship between the Big Five personality dimensions and five specific types of online course impressions (engagement, value to career, overall evaluation, anxiety/frustration, and preference for online courses). Results revealed that conscientiousness was the most consistent predictor of an individual’s impressions of online courses.” They did not record any grades or exam scores.
18. Fariba 2013, “Academic Performance Of Virtual Students Based On Their Personality Traits, Learning Styles And Psychological Well Being: A Prediction”; survey of self-selected online students which found large negative correlation between grades & Neuroticism, and smaller correlations with Conscientiousness/Extraversion/Openness.
19. Shih et al 2013, “The Relationship Among Tertiary Level EFL Students’ Personality, Online Learning Motivation And Online Learning Satisfaction”: “extraversion and conscientiousness were the two important traits among the Big Five in predicting motivation and satisfaction” (but no measure of grades or online/offline experimental design)
20. Santo 2001, “Virtual learning, personality, and learning styles”; Dissertation Abstracts International Section A: Humanities & Social Sciences, 62, p. 137
21. Zobdeh-Asadi 2004, “Differences in personality factors and learners’ preference for traditional versus online education”; Dissertation Abstracts International Section A: Humanities & Social Sciences, 65(2-A), p. 436
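
As a rough reading aid for the Varela et al 2012 coefficients quoted in item 13, the sketch below recomputes each coefficient's ratio to its standard error, and the same ratio for the gap between the online and face-to-face coefficients. It assumes the two groups are independent samples and uses a normal approximation for two-sided p-values; neither assumption is stated in the quoted passage, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a z statistic under a normal approximation."""
    return 2.0 * NormalDist().cdf(-abs(z))


def z_for_difference(b1, se1, b2, se2):
    """z statistic for b1 - b2, assuming the two estimates are independent."""
    return (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)


# Coefficients and standard errors as quoted from Varela et al 2012.
online = (4.59, 1.96)
face_to_face = (3.49, 2.01)

if __name__ == "__main__":
    for label, (beta, se) in (("online", online), ("face-to-face", face_to_face)):
        z = beta / se
        print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.3f}")
    z_gap = z_for_difference(*online, *face_to_face)
    print(f"online minus face-to-face: z = {z_gap:.2f}, p = {two_sided_p(z_gap):.3f}")
```

Under these assumptions the per-group results match the quoted significance pattern, while the gap between the two coefficients is small relative to its own standard error, so the online-versus-offline comparison is better read as suggestive than as conclusive.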

# Factor changes

Now, we discarded #3 as being impossible to generalize about, and #2 suggests that Conscientiousness will increase in its correlation with success, while to me the more plausible outcome for #1 is that it will reduce the need. But to be conservative, let’s assume the need for IQ remains unchanged. This suggests the following argument:

1. Material presented in an online education format requires the same amount of IQ to understand[^6]
2. Material presented in an online education format also requires more Conscientiousness than the same material presented in a classroom
3. There are no other factors; then
4. Less of the general population will be able to learn it.

To belabor the obvious and dress it up in mathematical garb: for a particular static set/population $Z$, the number of $Z$ members which satisfy the requirements $IQ+C$ is less than or equal to the number which satisfy $IQ$ alone, because the fraction of the population with both the necessary IQ and the necessary Conscientiousness must be equal to or smaller than the fraction with just the necessary IQ; for any properties $a$ and $b$, $P\left(a\wedge b\right)\le P\left(a\right)$. See also the conjunction fallacy. When it comes to normally distributed traits like IQ, modest selection pressure can drive down the fraction of eligible people to near-zero rates; for example, far less than 1% of the population will be 2 standard deviations above the mean on both IQ and Conscientiousness, and this holds true even if we assume that both traits are highly correlated with each other (they’re not); see the appendix for formulas & calculations.
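
The "far less than 1%" figure can be checked numerically. The sketch below is one possible calculation, not the appendix's own formulas: it treats IQ and Conscientiousness as standard bivariate normal with an assumed correlation `rho`, and integrates the conditional tail probability with a plain midpoint rule. The function name `joint_tail`, the integration cutoff at 10 standard deviations, and the example correlation of 0.5 are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_tail(threshold, rho, upper=10.0, steps=20000):
    """P(X > threshold and Y > threshold) for standard bivariate normal X, Y.

    Uses P = integral over x > threshold of pdf(x) * P(Y > threshold | X = x),
    where Y | X = x is normal with mean rho * x and variance 1 - rho**2.
    Requires -1 < rho < 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be strictly between -1 and 1")
    cond_sd = sqrt(1.0 - rho ** 2)
    width = (upper - threshold) / steps
    total = 0.0
    for i in range(steps):
        x = threshold + (i + 0.5) * width  # midpoint of the i-th slice
        cond_tail = 1.0 - STD_NORMAL.cdf((threshold - rho * x) / cond_sd)
        total += STD_NORMAL.pdf(x) * cond_tail * width
    return total


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        print(f"rho = {rho}: P(both > 2 SD) = {joint_tail(2.0, rho):.5f}")
```

With independent traits the joint probability is just the square of the single-trait tail (about 0.02275 squared, or roughly 0.05%); a positive correlation raises it, but at the assumed 0.5 it stays under 1%.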

(A major caveat here is that the premises really do need absolute values of IQ and Conscientiousness. If you only have correlations, I believe it is possible for IQ’s correlation with educational success to remain the same and Conscientiousness’s correlation to go up while the fraction of the general population succeeding goes up also. For example, if online education reduced the need for Conscientiousness, but reduced the need for IQ even more, more people will pass by the opposite of our conjunctive reasoning, but any attempt to predict success will benefit less from information about IQ than about Conscientiousness.)

Now, to discuss claim #2 in more detail. The first study cited previously on online education stressing Conscientiousness, Elvers et al 2003, is particularly interesting (see the quotes above). Given the evidence from this study that online education scores correlate with Conscientiousness, it seems very likely that #2 is true. However, the result that the online students had the same average as the offline students indicates that the conclusion #4 is not true; the obvious candidate to reject via modus tollens is assumption #1. As one would hope! But if #1 is not true, it could be untrue to a very large degree - as already mentioned, computerized education could make education a lot less correlated with your raw IQ because it’s presented better or whatever (to listen to the most rapturous users of Khan Academy). However, the equality in scores between the online and offline classes indicates that whatever the drop in IQ requirements, it was offset by the increase in Conscientiousness requirements. This tradeoff suggests two things:

1. First, it suggests that blended learning will be intermediate in results: I’d expect partial online education to be ‘weaker’ than full online education in loading on Conscientiousness.

You have to force yourself to go to class, but then it’s still easier to learn without burdening your willpower/Conscientiousness. (You can always, say, not bring your laptop to class - difficult or impossible with online education!) I’d expect the effect of non-mandatory to be intermediate, much like I’d expect frequent mandatory deadlines in online education to help only a little.
2. Second, if one lone course shows such a hit from lack of Conscientiousness, what happens as ever more material goes online and students might be expected to do entire semesters just online?

Will we see the correlation go up, as students expend all their willpower and run completely dry (see eg. Baumeister & Tierney 2010, Willpower)? (You may be able to lift 1 weight up to your head and do that 10 times in a row, but if given 10 weights simultaneously to lift, you’ll drop them all.) It seems that the tradeoff might extend well beyond a single course to all courses.

# Consequences

Is loading outcome more on Conscientiousness a bad thing? I think it is, for a few reasons, some of which follow directly from the tradeoff and some of which are speculation about future consequences:

1. There is no particular reason to favor Conscientiousness as an additional reward for ‘good’ people. Whether we should favor it over IQ depends on the consequences, such as what mental traits we need more of in our elites.

Conscientiousness is not a ‘virtue’ in the sense that the (non-existent) homunculus in your brain is ‘good’ or ‘bad’ for choosing to be Conscientious or not, any more than it is morally laudable to be high IQ rather than low IQ. Despite folk psychology & moralizing, the Big Five personality traits are stable over lifetimes like IQ, are turning out to be influenced by heredity like IQ, and progress is being made on tracing the traits to the underlying neurological factors like IQ. You can no more ‘try hard to be able to try hard’ (how circular) than you can try hard to be more intelligent.

Even if we find that Conscientiousness is not affected or Conscientiousness does not correlate with any problematic traits like psychopathy, that doesn’t exclude other personality traits: Varela et al 2012 finds that online performance was also correlated with being low on “gregariousness”, a subfactor of Extraversion matching on “individuals who confine themselves from social settings” (rather than just being quiet and reserved in social settings) - is this a good, bad, or neutral thing, morally? Or practically?
2. As already observed, the school system rewards Conscientious grinds and oppresses creativity. Do we need to make the former even more true? Think of how this will penalize nerds who are bright, creative, potential future great scientists but are uninterested in forcing themselves to do mandated drudge-work. We have all heard stories of geniuses like Einstein or Darwin or Jung who despised lower or higher education, or did their best to ignore it while educating themselves - Simonton’s 1994 Greatness: Who makes history and why estimates that this is not a few anecdotes but 60% of his sample. (Conscientiousness is necessary for scientific greatness, but not that much.)

If we stand idly by and let Conscientiousness shifts happen, saying that it must be a good thing since it is happening, I believe we are guilty of status quo bias. A useful thought here is Bostrom’s reversal test: why do we think that the current demands for Conscientiousness are optimal? Or the double-reversal test: suppose some alien or technology suddenly intervened in our educational system and made it load even more heavily on IQ but someone came up with a simple way to place burden back on Conscientiousness - would we accept their solution? I suspect in both cases, we would be unable to produce any good answer to this important issue.

It’s worth noting that a little-appreciated property of the bell curve or normal distribution is that a very small shift in the average can have unintuitively large consequences at the tails: a shift of 1 IQ point in the general population can result in considerable changes in the population all the way out at, say, 160+ IQs. We can grasp this by looking at the changes in rarity by standard deviation: someone at 2 deviations is 1 in 22, 3 deviations 1 in 370, 4 = 1 in 15,787 (42x fewer than 3), and 5 = 1 in 1,744,277 (110x fewer than 4). A common standard deviation for particular IQ tests is 15 points, so 3 deviations out is ~145 IQ, which is around the observed minimum for great Nobel-winning scientific or mathematical work[^7]. This is interesting because some interventions like iodization can have shockingly large effects on IQ in the worst-off environments - such as an average increase of as much as 15 IQ points - which would suggest that if a hypothetical intervention moved a population a standard deviation from 85 to 100 IQ, its subpopulation at 145 goes from being 4 deviations away to 3 - which increases that subpopulation’s ranks by 42x or 4200%![^8] (One meta-analysis of iodine effects, Scrimshaw 1998, apparently did find a shift of the overall bell curve; and further, iodization benefits females more than males, so there may have been nontrivial consequences there…)

The implications are obvious for any academic system that forced its membership’s average IQ down a few points in exchange for researchers higher on Conscientiousness: it may have an outsized effect on how many of its members are among the very smartest researchers around.
3. The tradeoff resulting in online education favoring Conscientiousness was neither designed in nor realized by the designers; it is purely accidental and undesired. Wouldn’t it be extraordinary if an accidental tradeoff turned out to be exactly optimal? How very convenient!

This is a good time to apply the status-quo reversal test: suppose online education did not result in any such tradeoff but a Khan Academy staffer unilaterally made some changes meant solely to make KA scores reflect Conscientiousness more (perhaps your progress would be deleted if you did not Conscientiously log in every week and do a few problems). Would you approve of this change? Suppose further online education actually reduced the need for Conscientiousness (maybe because the service pings your cellphone with a quick practice problem every so often); would you approve of the staffer’s change then? If you would not approve in the latter scenario where the shift along the tradeoff curve is intentional, why would you approve of a shift caused accidentally?
4. The cheapness of online education may prove irresistible and a case of worse is better: the cost of human teachers is nontrivial and may be increasing (whether this is due to backloaded pension compensation, growth of the education sector & diminishing returns, Baumol’s cost disease etc.), and this has prompted reactions like the death of university tenure & wholesale use of adjuncts, attacks on unions, and interest in automated methods of teaching… like online education. Already cuts have begun. Even if online education is worse, there may be no choice about whether to use it or not - a sort of educational enclosure movement. This shift may or may not be economically efficient (if the public sector is able to force the losses onto the public which is not organized enough to avoid it, perhaps due to ideological divisions).
5. Economic growth is increasingly captured in the US by the most-educated, with income growth going mostly to graduate degree holders. So anything which may lessen the ranks of the most highly educated seems like it would exacerbate the inequality of returns to education. Is some general increase in the net wealth of the economy worth it? People do not eat absolute wealth increases, they eat relative increases - more egalitarian economies have happier populaces. (Note the same question can be asked of other ‘cheaper’ things like globalization and outsourcing, and the answer in those other cases is not trivial. Pareto-efficient does not mean everyone is better off, just that no one is worse off, and this assumes humans do not care about their rankings or place - a patently false approximation.)

What other consequences may there be?

This prodigious event is still on its way, still wandering; it has not yet reached the ears of men. Lightning and thunder require time, the light of the stars requires time, deeds, though done, still require time to be seen and heard. This deed is still more distant from them than the most distant stars - and yet they have done it themselves.

# Appendices

## Selection on multiple normally distributed traits

### Simple questions

Suppose an elite university like Harvard decided to set a new admissions standard: they will only admit people who are 2 standard deviations above the mean on both IQ and Conscientiousness. If the two variables are perfectly correlated (r=1, ie. identical), then passing one filter means passing the other, and 2.3% of the population will pass; if the variables are uncorrelated (r=0), then 2.3% of 2.3% (a fraction of 0.000529, or ~0.05%) of the population will pass.
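The two boundary cases can be computed without any simulation. This is a small Python sketch, assuming both traits are exactly normal; `upper_tail` is an illustrative helper name. With the unrounded single-trait pass rate of ~2.275%, the independent case comes to roughly 0.05% of the population.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = upper_tail(2)          # pass rate on a single trait
both_identical = p         # r = 1: passing one filter means passing the other
both_independent = p * p   # r = 0: the two pass rates multiply

print(f"{both_identical:.4%} {both_independent:.4%}")
```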

### Correlated

But what about intermediate values? For example, the psychology literature has reported a correlation of -0.21 between Conscientiousness & IQ, so we would expect an even tinier fraction of the population to pass, but what if we were optimistic and thought there was a positive correlation?

I consulted Wikipedia on bivariate normal distributions, but I didn’t understand much of it. The closest I found was sum of correlated normal random variables, but in this case what I want is closer to a min function.

#### Simulation

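I was able to work up an R simulation to see how that worked, and it seemed in line with my intuitions (note that since x is a 2-column matrix, length() counts both cells of each row, so the raw counts printed below are doubled, while the percentages and the ratio are unaffected):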

# install.packages("fMultivar")
library ("fMultivar")

x <- rnorm2d(10000000, rho=0.5)

xgreater <- length(subset(x, x[,1] > mean(x[,1])+2*sd(x[,1])))
xandygreater <- length(subset(x, x[,1] > mean(x[,1])+2*sd(x[,1]) & x[,2] > mean(x[,2])+2*sd(x[,2])))

c(xgreater, xandygreater); c(xgreater / length(x), xandygreater / length(x), xgreater / xandygreater) * 100

# example results for different values of 'rho='
0.1
[1] 454,664  17,570
[1] 2.273e+00 8.785e-02 2.588e+03

0.2
[1] 458,284  82,552
[1]   2.2914   0.4128 555.1458

0.5
[1] 454,484  80,872
[1]   2.2724   0.4044 561.9794

0.9
[1] 455,242 267,912
[1]   2.276   1.340 169.922

0.95
[1] 455,162 321,024
[1]   2.276   1.605 141.784

0.99
[1] 455,260 394,448
[1]   2.276   1.972 115.417
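The same kind of simulation can be sketched in plain Python, counting each simulated person once. This is a hedged illustration rather than a port of the R session: it assumes standard normal traits, builds the second trait as a weighted mix of the first plus independent noise (a standard construction for a target correlation), and uses the true mean and SD for the cutoff instead of sample estimates. The function name `joint_tail_fraction` and the sample size are my own choices.

```python
import math
import random

def joint_tail_fraction(rho, n=200_000, cutoff=2.0, seed=0):
    """Estimate P(X > cutoff and Y > cutoff) for standard normals with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1 - rho * rho)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y shares a fraction rho of x's variation, so corr(x, y) = rho
        y = rho * x + scale * rng.gauss(0, 1)
        if x > cutoff and y > cutoff:
            hits += 1  # count each simulated person once
    return hits / n

estimate = joint_tail_fraction(0.9)
print(estimate)
```

With only 200,000 draws the estimate is noisy in the third significant digit; increasing `n` tightens it at the cost of runtime.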

#### Exact calculation

##### Bivariate min

I really was hoping for more of a precise analytic solution, so some more searching eventually turned up a paper, “Exact Distribution of the Max/Min of Two Gaussian Random Variables”, which gives a definition for the min of 2 correlated normal variables. This seems to be what I want; top of pg1, second column:

…where $\varphi \left(.\right)$ and $\Phi \left(.\right)$ are, respectively, the pdf and the cumulative distribution function (cdf) of the standard normal distribution. It is known that the pdf of $Y=\mathrm{min}\left({X}_{1},{X}_{2}\right)$ is $f\left(y\right)={f}_{1}\left(y\right)+{f}_{2}\left(y\right)$, where

1. $f_1(y) = \frac{1}{\sigma_1} \varphi\left(\frac{y-\mu_1}{\sigma_1}\right) \times \Phi\left(\frac{\rho(y-\mu_1)}{\sigma_1\sqrt{1-\rho^2}} - \frac{y-\mu_2}{\sigma_2\sqrt{1-\rho^2}}\right)$
2. $f_2(y) = \frac{1}{\sigma_2} \varphi\left(\frac{y-\mu_2}{\sigma_2}\right) \times \Phi\left(\frac{\rho(y-\mu_2)}{\sigma_2\sqrt{1-\rho^2}} - \frac{y-\mu_1}{\sigma_1\sqrt{1-\rho^2}}\right)$

They give an R implementation on pg6 (first column); it seems to have a pnorm typo, but I fixed that. Once it was working, I tried generating a slightly (0.1) correlated bivariate distribution, which looks OK:

fmin <- function(y, mu1, mu2, sigma1, sigma2, rho) {
    # density of X1 at y, times P(X2 > y | X1 = y)
    t1 <- dnorm(y, mean=mu1, sd=sigma1)
    tt <- rho*(y-mu1) / (sigma1*sqrt(1-rho*rho))
    tt <- tt - (y-mu2) / (sigma2*sqrt(1-rho*rho))
    t1 <- t1 * pnorm(tt)
    # density of X2 at y, times P(X1 > y | X2 = y)
    t2 <- dnorm(y, mean=mu2, sd=sigma2)
    tt <- rho*(y-mu2) / (sigma2*sqrt(1-rho*rho))
    tt <- tt - (y-mu1) / (sigma1*sqrt(1-rho*rho))
    t2 <- t2 * pnorm(tt)
    return(t1 + t2)
}

fmin(c(1:200),100,100,15,15,0.1)
[1] 1.849e-11 2.864e-11 4.418e-11 6.784e-11 1.037e-10 1.578e-10 2.392e-10 3.608e-10 5.418e-10
...

Now, I understand the PDF to be “a function that describes the relative likelihood for this random variable to take on a given value. The probability for the random variable to fall within a particular region is given by the integral of this variable’s density over the region”. So I suppose I should sum up every point in the pdf from 130 up (since 130 is 2 standard deviations up, by construction when I specified SD=15), and that’s my probability that the minimum of a random pair of deviates will be at least 130, which is to say that both of them are. So what’s the total probability someone will be over 130 on both variables? I think that would be:

sum(fmin(c(1:200),100,100,15,15,0.1)[130:200])
[1] 0.001004

If I increase the r to 0.9, the result is 0.01455 which is satisfyingly larger.

A sanity check - as the correlation goes to 1.0, the two variables become identical, so the result should match that for a single variable. So we ask the same question of a single normal distribution defined the same way:

sum(dnorm(c(1:200), 100, 15)[130:200])
[1] 0.02459

# the function blows NaN chunks on 1.0, so we'll try a lot of 9s:
sum(fmin(c(1:200),100,100,15,15,0.9999999999)[130:200])
[1] 0.02459
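Summing the density at integer points is a Riemann sum with step size 1, which as far as I can tell behaves roughly like integrating from 129.5 rather than from 130, and so somewhat overstates the tail. The following Python sketch compares that integer-grid sum against a finer numerical integral of the same min density. It assumes the formula quoted above, uses Simpson's rule as the integrator (my choice, not the paper's), and treats 250 as a stand-in for infinity; all function names are illustrative.

```python
import math

def norm_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2))

def fmin(y, mu1, mu2, sigma1, sigma2, rho):
    """Density of min(X1, X2) for a bivariate normal, following the quoted formula."""
    s = math.sqrt(1 - rho * rho)
    t1 = norm_pdf(y, mu1, sigma1) * norm_cdf(rho * (y - mu1) / (sigma1 * s) - (y - mu2) / (sigma2 * s))
    t2 = norm_pdf(y, mu2, sigma2) * norm_cdf(rho * (y - mu2) / (sigma2 * s) - (y - mu1) / (sigma1 * s))
    return t1 + t2

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def both_above(cutoff, rho, upper=250):
    # P(min > cutoff); `upper` is an assumed stand-in for infinity (10 SDs above the mean)
    return simpson(lambda y: fmin(y, 100, 100, 15, 15, rho), cutoff, upper)

def integer_sum(rho):
    # the step-1 sum over y = 130..200, as in the R session
    return sum(fmin(y, 100, 100, 15, 15, rho) for y in range(130, 201))

print(both_above(130, 0.9), integer_sum(0.9))
```

The two numbers differ by a few percent at r=0.9, so the integer sum is fine for a sanity check but the integral is the one to quote.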
##### Bivariate double integral

An acquaintance gave me a double-integral formula:

${\int }_{2}^{\infty }{\int }_{2}^{\infty }\frac{1}{2\pi \sqrt{1-{\rho }^{2}}}\mathrm{exp}\left(-\frac{1}{2\left(1-{\rho }^{2}\right)}\left[{x}^{2}+{y}^{2}-2\rho xy\right]\right)dxdy$

To calculate our 2-standard deviation minimum bivariate problem with r=0.9 in R:

llim <- 2
ulim <- Inf
rho <- 0.9

f <- function(x,y) {
    (1 / (2 * pi * sqrt(1 - rho^2))) * exp(-(1 / (2 * (1 - rho^2))) * (x^2 + y^2 - 2 * rho * x * y))
}

# double-integration
integrate(function(y) {
    sapply(y, function(y) {
        integrate(function(x) f(x,y), llim, ulim)$value
    })
}, llim, ulim)

0.01336 with absolute error < 1.6e-05

In Python using SciPy:

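import numpy as np
from scipy.integrate import dblquad
import matplotlib.pyplot as plt

def func(x, y, rho):
    # standard bivariate normal density with correlation rho
    return 1.0/(2.0*np.pi*np.sqrt(1.0-rho**2.0)) * np.exp(-0.5/(1.0-rho**2.0) * (x**2.0 + y**2.0 - 2.0*rho*x*y))

def integral(rho):
    # P(X > 2 & Y > 2), with 100 standing in for infinity
    # (the wrapper's name is assumed; only its body survived)
    return dblquad(func, 2, 100, lambda x: 2, lambda x: 100, args=(rho,))[0]

x = np.arange(-0.99, 0.99, 0.1)
y = np.zeros(len(x))
for i in range(len(x)):
    y[i] = integral(x[i])
    print(x[i], y[i])

plt.plot(x, y)
plt.show()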
###### Monster

Incidentally, we could also use this code for more frivolous purposes; for example, the critically-acclaimed manga Monster centers around two fraternal twins, one a psychopathic genius, and one might wonder how frequently pairs of fraternal twins come as pairs of geniuses (~3 standard deviations up), given that fraternal twins’ IQs correlate at r=0.5-0.7. We modify the R parameters:

llim <- 3
ulim <- Inf
rho <- 0.6
...
0.0001397 with absolute error < 3.4e-05

Twins in general make up 1-2% of the population, so one can tack on another two zeroes to get an estimate of genius twins as being 0.0001397% of the global population or ~9,779 ($7000000000 \times 0.0001397 \times 0.01$); this is a bit of an underestimate since identical twins have much higher correlations like r=0.86, but could also be an overestimate since twins may have IQs lower by a third of a standard deviation (although not all studies are consistent) and this implicitly assumes a global average IQ of 100 (actual mean is more like 89). Finally, how many of those ~9,779 might we expect to be psychopathic? The correlation between IQ and psychopathy has been found to be weakly positive, non-correlated, or weakly negative (once selection effects like imprisonment are dealt with), ie. no apparent correlation overall; so we can simply multiply the 9,779 against the estimated population prevalence of ~1% for a final estimate of 98 genius psychopathic twins worldwide. (What fraction of those twins might be raised in abusive orphanages and go on to star in manga is impossible to estimate.)
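The chain of multiplications is easy to lose track of, so here it is spelled out as a short Python sketch. Every input is one of the rough figures quoted above (the low-end 1% twin rate, the ~1% psychopathy prevalence, a 7 billion world population); the variable names are mine.

```python
world_population = 7_000_000_000
p_both_genius = 0.0001397      # joint tail from the R integration (3 SDs, rho = 0.6)
twin_fraction = 0.01           # twins as ~1% of the population (low end of 1-2%)
psychopathy_prevalence = 0.01  # ~1% population prevalence

genius_twins = world_population * p_both_genius * twin_fraction
psychopathic_genius_twins = genius_twins * psychopathy_prevalence

print(round(genius_twins), round(psychopathic_genius_twins))
```

Doubling `twin_fraction` to the high-end 2% simply doubles both outputs, which gives a sense of how loose the estimate is.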

1. One might say it’s the obvious challenge for distance/online learners, especially to anyone who has tried. Eg. Coombs-Richardson 2007:

Personality types and learning styles also may affect student performance in distance learning. Participants with an extraverted personality type - who enjoy the physical interaction of working with others (Meisgeier and Richardson 1996) - may feel isolated from the human experience and become disillusioned. Considering learning styles, Elkins et al. (2002) found in a two-year study of Web-assisted courses that divergent learners - who seek broad elaborate ideas prompted by a problem or stimulus - did not perform nearly as well as convergent learners - who are able to bring material from a variety of sources to solve a problem…What does an online course demand that a face-to-face class does not? Online learning requires self-discipline and a greater amount of work than a face-to-face course. Students must demonstrate a high degree of autonomy and motivation (Ladyshewsky 2004).

• Meisgeier, C., and R. C. Richardson. 1996. “Personality types of interns in alternative teacher certification programs”. The Educational Forum 60(4): 350-60.
• Elkins, V., C. Rafter, R. Eckart, E. Rutz, and C. Maltbie. 2002. “Investigating learning and technology using the MBTI and Kolb’s LSI”. Paper presented at the 2002 ASEE Annual Conference and Exposition, June 16-19, Montreal, Quebec, Canada.
• Ladyshewsky, R. 2004. “Online learning versus face to face learning: What is the difference?” Paper presented at the 2004 Teaching and Learning Forum, February 9-10, Murdoch University, Murdoch, Western Australia.
2. Coombs-Richardson 2007:

…Successful distance learners share some distinctive features in their mode of study (Littlefield 2005):

• They work independently, are self-motivated and persistent, and do better without people giving them constant guidance.
• They seldom procrastinate, realizing that timelines are important and that neglecting to turn in their work on schedule may end up delaying completion of their studies.
• They demonstrate good reading and writing skills, which are essential for acquiring most of the course information. Though some distance learning courses offer video recordings and audio clips, these are not sufficient to master the competencies.
• They are able to remain on task in spite of relentless distractions, such as frequent interruptions while learning at home.
3. In an interesting comment on the “possibilities” argument for online learning (that such courses can add material and media that offline courses cannot or will not), identity of courses turns out to be an important moderator:

Studies in which analysts judged the curriculum and instruction to be identical or almost identical in online and face-to-face conditions had smaller effects than those studies where the two conditions varied in terms of multiple aspects of instruction (+0.13 compared with +0.40, respectively)…In many of the studies showing an advantage for blended learning, the online and classroom conditions differed in terms of time spent, curriculum and pedagogy. It was the combination of elements in the treatment conditions (which was likely to have included additional learning time and materials as well as additional opportunities for collaboration) that produced the observed learning advantages. At the same time, one should note that online learning is much more conducive to the expansion of learning time than is face-to-face instruction.

4. This might be evidence against: since Conscientiousness increases modestly with age/experience, we would expect older people (professional/post-grad) to benefit more than younger (undergrads). This may reflect differences in the kinds of subjects or the material - perhaps older people taking training are different from the studied undergrads or perhaps their more advanced/specialized material has not been as pedagogically polished as more common undergrad material, etc.

5. In studies which don’t collect the necessary information on Conscientiousness, I believe it may be possible to observe this effect by looking at standard deviations in the scores: the online class should have a greater range, as the un-Conscientious flunk out by doing less while the Conscientious thrive on the optimized presentation. On the other hand, it could be that the online class has higher averages and similar standard deviations, and the un-Conscientious just tend to make up the lower half of the test scores, so standard deviations don’t seem like a reliable indicator. Unfortunate, since there are many more studies simply comparing online and offline education than comparing them while also collecting personality data on subjects.

6. One wonders how much hope we can place in the falsity of #1. Just how much can education’s IQ requirements be brought down? Advocates seem optimistic, but is the current material all that bad? How dumb can you be before even the highest-quality presentation of calculus becomes unlearnable with feasible amounts of time and effort on your part? How close are existing online courses to this lower bound on IQ? At what point does #1 resume being true?

7. When this is mentioned, some wit often tries to bring up Richard Feynman’s supposed IQ score in the 130s; while this is obvious bunk (a case of “one man’s modus ponens is another man’s modus tollens”), a closer look at the source of the anecdote reveals many reasons why the score is either false or unreliable.

8. Arthur Jensen, discussing the slight deviations from the bell curve in real-world populations at the extremes (<70 and >140) points out the consequence of shifting the means:

Examination of this normal curve can be instructive if one notes the consequences of shifting the total distribution up or down the IQ scale. The consequences of a given shift become more extreme out toward the “tails” of the distribution. For example, shifting the mean of the distribution from 100 down to 90 would put 50% instead of only 25% of the population below IQ 90; and it would put 9% instead of 2% below IQ 70. And in the upper tail of the distribution, of course, the consequences would be the reverse; instead of 25% above IQ 110, there would be only 9%, and so on. The point is that relatively small shifts in the mean of the IQ distribution can result in very large differences in the proportions of the population that fall into the very low or the very high ranges of intelligence. A 10 point downward shift in the mean, for example, would more than triple the percentage of mentally retarded (IQs below 70) in the population and would reduce the percentage of the intellectually “gifted” (IQs above 130) to less than one-sixth of their present number. It is in these tails of the normal distribution that differences become most conspicuous between various groups in the population that show mean IQ differences, for whatever reason, of only a few IQ points. From a knowledge of relatively slight mean differences between various social class and ethnic groups, for example, one can estimate quite closely the relatively large differences in their proportions in special classes for the educationally retarded and for the “gifted” and in the percentages of different groups receiving scholastic honors at graduation. It is simply a property of the normal distribution that the effects of group differences in the mean are greatly magnified in the different proportions of each group that we find as we move further out toward the upper or lower extremes of the distribution.

I indicated previously that the distribution of intelligence is really not quite “normal,” but shows certain systematic departures from “normality.” These departures from the normal distribution are shown in Figure 2 in a slightly exaggerated form to make them clear. The shaded area is the normal distribution; the heavy line indicates the actual distribution of IQs in the population. We note that there are more very low IQs than would be expected in a truly normal distribution and also there is an excess of IQs at the upper end of the scale. Note, too, the slight excess in the IQ range between about 70 and 90.

…The “excess” of IQs at the high end of the scale is certainly a substantial phenomenon, but it has not yet been adequately accounted for. In his multifactorial theory of the inheritance of intelligence, Burt (1958) has postulated major gene effects that make for exceptional intellectual abilities represented at the upper end of the scale, just as other major gene effects make for the subnormality found at the extreme lower end of the scale. One might also hypothesize that superior genotypes for intellectual development are pushed to still greater superiority in their phenotypic expression through interaction with the environment. Every recognition of superiority leads to its greater cultivation and encouragement by the individual’s social environment. This influence is keenly evident in the developmental histories of persons who have achieved exceptional eminence (Goertzel & Goertzel, 1962). Still another possible explanation of the upper-end “excess” lies in the effects of assortative mating in the population, meaning the tendency for “like to marry like.” If the degree of resemblance in intelligence between parents in the upper half of the IQ distribution were [substantially] greater than the degree of resemblance of parents in the below average range, genetic theory would predict the relative elongation of the upper tail of the distribution. This explanation, however, must remain speculative until we have more definite evidence of whether there is differential assortative mating in different regions of the IQ distribution.

…The reason is simply that assortative mating increases the genetic variance in the population. By itself this will not affect the mean of the trait in the population, but it will have a great effect on the proportion of the population falling in the upper and lower tails of the distribution. Under present conditions, with an assortative mating coefficient of about .60, the standard deviation of IQs is 15 points. If assortative mating for intelligence were reduced to zero, the standard deviation of IQs would fall to 12.9. The consequences of this reduction in the standard deviation would be most evident at the extremes of the intelligence distribution. For example, assuming a normal distribution of IQs and the present standard deviation of 15, the frequency (per million) of persons above IQ 130 is 22,750. Without assortative mating the frequency of IQs over 130 would fall to 9,900, or only 43.5% of the present frequency. For IQs above 145, the frequency (per million) is 1,350 and with no assortative mating would fall to 241, or 17.9% of the present frequency. And there are now approximately 20 times as many persons above an IQ of 160 as we would find if there were no assortative mating for intelligence.
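Jensen's per-million figures can be roughly reproduced from the normal tail alone. The Python sketch below assumes an exactly normal distribution with mean 100 and compares standard deviations of 15 and 12.9; `per_million_above` is an illustrative helper name, and small discrepancies from the quoted numbers are to be expected from rounding in the original.

```python
import math

def per_million_above(iq, mean=100, sd=15):
    """Expected number of people per million above `iq` under a normal curve."""
    z = (iq - mean) / sd
    return 1e6 * 0.5 * math.erfc(z / math.sqrt(2))

# Compare the present SD (15) with the SD absent assortative mating (12.9)
for cutoff in (130, 145, 160):
    now = per_million_above(cutoff, sd=15)
    without = per_million_above(cutoff, sd=12.9)
    print(cutoff, round(now), round(without), round(now / without, 1))
```

The last column is the ratio of present to hypothetical frequency, which grows rapidly with the cutoff, as the quote describes.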