Open Questions

Some anomalies/questions which are not necessarily important, but do puzzle me or where I find existing explanations to be unsatisfying.
topics: biology, cats, politics, history, genetics, nootropics, psychology, sociology
created: 17 Oct 2018; modified: 13 Aug 2019; status: finished; confidence: possible; importance: 3


A list of some questions which are not necessarily important, but do puzzle me or where I find existing ‘answers’ to be unsatisfying, categorized by subject (along the lines of & ; see also ).

AI

  • What, algorithmically, are mathematicians doing when they do math which explains how their proofs can so often be wrong while their results are usually right?

    Is it equivalent to a kind of tree search like Monte Carlo tree search (MCTS), or something else? They wouldn’t seem to be doing a literal tree search, because then there would almost never be mistakes in the proof (as the built-up tree of theorems only explores valid inferential steps); but if they’re not, then how are they handling ‘logical uncertainty’? Are they doing something like MCTS’s random playouts, where lemmas are not proven but simply heuristically assigned a truth value to shortcut exploration, and the heuristic is accurate enough to usually guess correctly, which is why the proofs are wrong but the results are right?

  • NN overparameterization: We can train large deep slow neural networks to human-level performance on many tasks, and we can then train small shallow fast versions of those NNs to save energy/enable mobile deployment, so why can’t we train small shallow fast NNs in the first place? And what would happen if we did figure it out?
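One common way of doing the "train big, then shrink" step is knowledge distillation, where the small network is trained to match the large network's softened output distribution. That the question above has distillation in mind is an assumption, since the text does not name a method. A minimal sketch of the loss in plain Python; the function names, the temperature, and the logits are all made up for illustration:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened predictions (the 'soft targets')."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Hypothetical logits for a single 3-class example.
teacher = [4.0, 1.0, -2.0]
good_student = [3.5, 1.2, -1.8]   # roughly agrees with the teacher
bad_student = [-2.0, 1.0, 4.0]    # ranks the classes in reverse

loss_good = distillation_loss(teacher, good_student)
loss_bad = distillation_loss(teacher, bad_student)
print(f"agreeing student: {loss_good:.3f}, disagreeing student: {loss_bad:.3f}")
```

The loss is smallest when the student reproduces the teacher's distribution exactly. The puzzle is why this indirect target works better for a small network than training it on the original labels.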

Biology

  • Why do humans, , and even kept in controlled lab conditions on standardized diets appear to be increasingly obese over the 20th century? What could explain all of them simultaneously becoming obese? (Is it something literally in the water?)
  • Does moderate alcohol or consumption have any health benefits, or not?

Jeanne Calment

Jeanne Calment holds the verified record for human longevity at ~122.5 years at her death over 22 years ago: Calment is history’s first & only 122 year old; and also the first & only 121 year old; and also the first & only 120 year old. No challenging centenarian has come close to her record, and arithmetically, they will not for years to come: she will have held the record for a minimum of 3 decades, despite countless countervailing factors. Some statistical simulations suggest that Calment-like record gaps are not expected from the distribution of human life expectancies, and as time passes, her record becomes increasingly anomalous.

This truly remarkable longevity raises the question of whether Calment’s longevity is due to the same factors as that of all other centenarians: did she benefit from some unique factor like genetic mutations, or, as was alleged in late 2018, is she in fact merely a fraud who escaped previous verification?

Why did Jeanne Calment live so many more years than other centenarians (to 122 years & 164 days), breaking all records and setting a longevity record which decades later has not just not been broken, but not even been approached? As of 17 August 2019, the oldest living person record is held by Kane Tanaka, then age 116 years, 227 days (2,128 days less than Calment), who would have to survive until 14 June 2025 to match Calment; in other words, even if Tanaka turned out to be the first person to break Calment’s record, Calment’s record would have stood from ~1995 to 2025, a remarkable minimum of 30 years.
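The day counts above can be checked with ordinary date arithmetic. This is a small sketch; the birth and death dates are not given in the text and are assumed here from public records (Calment 21 February 1875 to 4 August 1997, Tanaka born 2 January 1903), and the variable names are illustrative:

```python
from datetime import date, timedelta

# Assumed dates from public records; they are not stated in the text.
CALMENT_BORN = date(1875, 2, 21)
CALMENT_DIED = date(1997, 8, 4)
TANAKA_BORN = date(1903, 1, 2)
AS_OF = date(2019, 8, 17)

calment_days = (CALMENT_DIED - CALMENT_BORN).days  # Calment's full lifespan in days
tanaka_days = (AS_OF - TANAKA_BORN).days           # Tanaka's age in days on AS_OF
gap_days = calment_days - tanaka_days              # how far Tanaka trails Calment
# Date on which Tanaka's age in days would equal Calment's lifespan in days.
match_date = TANAKA_BORN + timedelta(days=calment_days)

print(f"Calment lifespan: {calment_days} days")
print(f"Gap on {AS_OF}: {gap_days} days")
print(f"Tanaka would tie Calment on: {match_date}")
```

Counting in days, this reproduces the 2,128-day gap and the 14 June 2025 date.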

Graph of time each “oldest person” record holder held record before dying (2013 Gerontology Research Group data); outlier is Jeanne Calment (her predecessor Florence Knapp died in 1988, she died in 1997)

This is extraordinary considering that she smoked, medicine has continuously advanced, the global population has increased, life expectancy in general has increased, and the Gompertz curve implies that, with annual mortality rates approaching 50%, centenarians should die like flies and ever closer in age to each other, and not have occasional enormous permanent >3 year gaps between the record setter (Calment) and everyone since then. (I did some Gompertz curve simulations, and Calment-like records are not expected under them.)
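A rough illustration of what such a simulation might look like (not the simulation referred to above): draw ages at death for a cohort from a Gompertz hazard h(x) = a·exp(b·x) by inverse-transform sampling, and count how often the oldest person outlives the runner-up by more than a given gap. The parameter values and all function names are hypothetical, and a single cohort snapshot is a simplification of how records accumulate over time.

```python
import math
import random

def gompertz_excess_age(u, a, b):
    """Inverse-transform sample of years survived past the starting age,
    for hazard h(x) = a * exp(b * x) and a uniform draw u in (0, 1]."""
    return math.log(1.0 - (b / a) * math.log(u)) / b

def gompertz_survival(x, a, b):
    """Probability of surviving at least x more years under the same hazard."""
    return math.exp(-(a / b) * (math.exp(b * x) - 1.0))

def record_gap_fraction(n_people, n_trials, gap_years, a, b, seed=0):
    """Fraction of simulated cohorts whose oldest member outlives the
    second-oldest by more than gap_years."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # 1 - random() lies in (0, 1], which avoids log(0).
        ages = sorted(gompertz_excess_age(1.0 - rng.random(), a, b)
                      for _ in range(n_people))
        if ages[-1] - ages[-2] > gap_years:
            hits += 1
    return hits / n_trials

# Hypothetical parameters: 35%/year hazard at age 100, doubling every 8 years.
A, B = 0.35, math.log(2) / 8
frac_1yr = record_gap_fraction(n_people=500, n_trials=300, gap_years=1.0, a=A, b=B)
frac_3yr = record_gap_fraction(n_people=500, n_trials=300, gap_years=3.0, a=A, b=B)
print(f"gap > 1 year: {frac_1yr:.3f}, gap > 3 years: {frac_3yr:.3f}")
```

Because the hazard keeps rising with age, the top ages bunch together, which is the intuition behind expecting small gaps between record holders. The exact fractions depend entirely on the assumed parameters.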

It isn’t necessarily odd that the first well-validated longest-lived person might exceed previous records from sparse poorly-kept datasets by a large margin (much as it is not odd now to see Olympic sports or weather records shattered by large margins1), but it is odd that decades are passing and still no validated centenarians have reached, much less surpassed, Calment’s record. (I have a similar question about the “Dream Market” darknet market, as its longevity is extremely anomalous, especially when one looks at how .) Typically, if one looks at record datasets such as the , as one would expect from order statistics, the ‘gap’ between each successive record holder is smaller and smaller, particularly as the number of ‘competitors’ increases; in running, the number of runners has increased dramatically over time, it has become a major sport/profession with concomitant improvements in training and so on, and this resulted in records being regularly set but by smaller intervals each time, as the extreme of what is humanly possible is approached. Similarly, with longevity, we should see early on large gaps between successive verified record holders as a small number of reasonably-reliably verified super-centenarians from the most industrialized & bureaucratized countries (as opposed to the enormous number of frauds/errors pre-documentation: Newman 2019) reach the longevity frontier, with gaps regularly shrinking as the rest of the world ‘comes online’ with proper documentation, hundreds of millions of people start competing for the record, improved medicine pushes out the average life expectancy & makes it much more probable to reach an extreme, there is greater scientific & public interest in tracking the extremes, and so on. Instead, what we see is this steady order statistic effect of shrinking record gaps, except for Jeanne Calment, who smashes the record and continues to hold it despite decades of challengers from an exponentially growing population.

The easiest answer is that she is a fake like so many supposed centenarians, but against that, she doesn’t fit the usual fake profile of existing only on paper like the , being male, or being in a Third World illiterate country where old age is extremely culturally valued, dates exhibit blatant , no contemporary paper records exist or their paper trail only began late in life, etc; she was female, born in Third Republic France in a highly bureaucratic well-organized well-documented literate society which did not especially value extreme old age, was apparently fairly social & not an unknown recluse, was known for longevity in her lifetime (as opposed to afterwards), was vetted by the & others, etc.

On the other hand, & Yuri Deigin (/) in 2018 accused Calment of having been a fraud, specifically, having died and been replaced by her young daughter Yvonne Calment who supposedly died unexpectedly in 1934. The motive for the fraud would be evading the estate taxes which would have been due (on top of the estate taxes paid due to two deaths in the family just 3 years before) & Jeanne Calment’s later annuity (which would’ve been considerably underpriced since she was supposedly much older); aside from the observation that Calment is such an outlier and was remarkably healthy & youthful-looking for her ostensible ages (but more consistent with how old the daughter Yvonne would’ve been), Novoselov notes the suspiciousness of the Calment family archives being destroyed by them, some anomalies in Calment’s passport, oddities in family arrangements, apparent inconsistency of Calment’s recollections & timing of events & photos, facial landmarks like ear features not seeming to match up between young/old photos, and an obscure 2007 accusation in a French book that a French bureaucrat and/or the insurance company had uncovered the fraud but the French state quietly suppressed the findings because of Calment’s national fame. has criticized some of the points. Presumably DNA testing offers a definitive answer, if the Calment family cooperates, and allows access to .

Cats & Earwax

While petting cats, I accidentally discovered cats are fascinated by the smell & taste of earwax, particularly that of humans, and this interest can last indefinitely. Dogs & humans, for comparison, are not. A number of anecdotes have reported this over the years, but no formal research appears to have been done on it. What makes earwax attractive to cats? Pheromones? Some nutrient?

Massaging cat ears while petting them, I accidentally discovered that cats can enjoy fingers inserted into their ears, perhaps because, as in humans, earwax can build up to uncomfortable levels (and I once discovered undiagnosed ear mites in a kitten this way); after testing about 7 cats, I then discovered that cats are fascinated by the smell and taste of their own earwax. (I’m not entirely sure if they exhibit a Flehmen response, as Kolb 1991 claims. Most of my tests occurred before I learned what a Flehmen response was.) This has held true of most cats I have tested. Searching, I’ve found a number of comments in publications and online also noting this phenomenon. There are not many contexts in which a cat owner would notice cats’ interest in earwax, and many of them are actively discouraged (a cat licking your ear hurts!), but when there are, it appears that earwax interest is often noted. I’ve found that cat earwax is not even the most interesting earwax: cats are more interested in dog earwax, and human earwax most of all. The reaction can be quite strong: given the opportunity, a cat will lick a hearing aid for quite a while, and I’ve had to take away hearing aids from my cats because the intensity of their licking made me worry about them damaging the devices. Aside from the cost of replacement, this is a safety concern: hearing aids, earphones, or (especially) disposable foam earplugs are small enough to be eaten & endanger cats’ health.

I first thought that it might be like sniffing butts, a way to learn about the health/status of another cat, but that doesn’t explain why dog & human earwax is more interesting, and cats don’t seem to seek out earwax even after they know about it (with the exception of one of my cats who’d sometimes try to lick my ears when I was in bed)—I find that I sometimes have to touch their noses with a waxy finger before they abruptly become interested (suggesting that whatever the odor is, it doesn’t travel far). Dogs, on the other hand, typically neither appreciate fingers in ears nor show much interest in smelling or licking earwax.

What is it about earwax that fascinates cats? Is there a particular chemical responsible, like a fat or salt or ?2 Earwax in humans comes in ‘dry’ and ‘wet’ forms, differing by race and I would describe cat/dog earwax as being like ‘wet’ human earwax; do cats like ‘dry’ earwax as well? (While there is no shortage of confident assertions that the reason is the “incredibly high concentration of fatty acids and cholesterol” or salt in earwax and similar claims, exactly zero evidence is ever offered for these explanations.)

Searching Google Scholar/Google (cat earwax OR "ear wax" smell OR taste -"CAT scan"), I’ve found the following anecdotes:

  • Arny 1990, : letter to Nature asking if anyone knew anything after noting

    I dismissed this as a curiosity until I found that our second Siamese cat also liked it. In fact, the second cat leaps on the bed in the morning hoping to be offered some. I mentioned this odd behaviour to 3 other people and have learned from them that their cats also liked the wax.

    I emailed Arny in November 2019, and he said he “got a few replies from people who had noticed the same behavior with their pet.” but otherwise nothing useful.

  • Kolb 1991, “Chapter 25: Animal models for human PFC-related disorders”, in The Prefrontal Cortex: Its Structure, Function and Pathology, a paper which notes in passing that:

    The most common components of the response pattern include approaching, sniffing, and touching the urine source with the nose, flicking the tip of the tongue repeatedly against the anterior palate behind the upper incisors, withdrawing the head from the urine, and opening the mouth in a gape or ‘Flehmen response’, and licking the nose. This behavioural pattern apparently allows olfactory stimuli to reach the secondary olfactory system, which appears to be specialized to analyse odours that are species-relevant. Cats show this response to urine of other cats, and oddly enough to humans, but they do not show it to urine of rhesus monkeys, dogs, rats, or hamsters. They also do not show it to cat fecal matter or cat fur, although they do show it to cat earwax!

  • 2007, , an interview with an artist who mentions:

    At one time we had 13 cats and 7 dogs. I began being very interested in communications with cats and dogs and the subtle body languages that animals use to communicate. We had a cat named Que tu bu who loved to lick the earwax out of our ears, which was a strange scratchy affair, though clearly a cat showing affection and love toward a human. Later, as a teenager I became interested in Marine Biology…

    Rinaldo also mentioned this in a 2016 essay:

    After minutes of stroking, Catabu would suddenly pop up on his back paws and place his front paws on my shoulder. He would then begin to probe my inner ear with his scratchy tongue. His whiskers tickled as he dug further, licking my ear slowly and deliberately. This was somehow a pleasurable experience, though his tongue was sticky. Cat behaviorists, would speculate he was claiming me as litter-mate. I think we were exchanging love and affection. This was my first trans-species experience. Here was a cat, finding pleasure in the taste of my earwax while we provided mutual affection. This cat/human relationship left a lasting legacy and deep-probing questions for me about animal-human communication, symbiosis and the contemporary notion of the computer interface.

  • Lynch 2007, “‘No Writer Nor Scholar Need Be Dull’: Recollections Of Paul J. Korshin”, a memoir of English professor Paul J. Korshin, recollects:

    At the Osage house, Paul revealed himself as a doting cat lover. He and Debra had adopted brother and sister tabbies he’d named Oscar and Sherwin for his undergraduate mentor at City College. Oscar, the color of an orange creamsicle, would jump into Paul’s lap and purr during the seminars. Paul would cradle him, saying, “He likes to be made much of.” Both cats had free run of his exquisite suits, though he kept lint tape handy to pick up the fur. When Paul met my tuxedo cat Edgar, I mentioned that Edgar was partial to earwax. Intrigued, Paul said, “I don’t know if I have any,” but he put a finger to his ear and allowed Edgar to lick off the spoils. In time, Gaylord and Holly would join Paul and Debra’s cat family. We exchanged Christmas cards over the years “from our cat house to yours.”

On the general Internet, some cat owners have noted this behavior:

Genetics

Psychology

  • What is “personal productivity” and why does it vary from day to day so much? And why does it not seem to correlate with environmental variables like or sleep quality (at least using my non-sleep-deprived ), nor manifest as the usual kind of latent variable in my factor analyses? Is it something much weirder than the usual kind of latent variable, like a set of ?

  • Does listening to music while working serve as a ?

  • Nicotine is one of the best stimulants on the market: legal, cheap, effective, relatively safe, with a half-life much less than 6 hours. It also affects one of the most important and well-studied receptors. Why are there so few attempts to improve on it, eg by making it somewhat longer-lasting or less blood-pressure-raising, when there are so many variants on other stimulants like amphetamines or modafinil or caffeine? (The one exception I currently know of is a biotech company, Targacept, which attempted to develop for //Alzheimer’s/bladder problems such as variants on , but their drugs failed in clinical trials and they . Given the highly risky nature of drug development, it’s unclear how much to infer from their failure about whether better nicotines exist: Alzheimer’s disease is where exciting drugs go to die, and a useful stimulant may not have so large a benefit as to be compelling in trials for ADHD or depression. I doubt caffeine or modafinil could justify large Phase III trials on the basis of their effects on ADHD!)

  • Does modafinil build tolerance, or not? The academic literature’s consistent claim that it doesn’t completely contradicts the equally consistent anecdotes from most modafinil users that it does, and seems a priori implausible.

  • Why does writing in the morning (anecdotally so far) seem to be so effective for writers, even ones who are not morning persons, while programmers, whose occupation seems similar, are invariably owls?

  • Richard Feynman made a famous critique of poor experimental controls in psychology exemplified by flaws/side-channels in mouse experiments as demonstrated by a Mr Young; . It’s not like Feynman to make things up, but all attempts to find the original research in question have failed and it’s unclear who Young was.

  • in 1935, the psychologist David Wechsler compiled a dataset of human performance on everything from running to punch-card processing, where absolute/cardinal measurements were possible (rather than ordinal ones like IQ) and observed that the absolute range of human capabilities is ~2–3x (best/worst out of 1000 healthy people): . Looking through the rare citations of it, his generalization does not appear to have been meaningfully gainsaid since.

    Since running across this in, I believe, Epstein 2013’s , I have felt like this is a neglected observation that should tell us something important about human biology or genetics or intelligence—why only 3x? and so consistently 2–3x?—but nothing has ever gelled.

  • how common are, and what is going on psychologically, in the occasional eruption of large shared fantasy worlds () among children & adolescents?

    There are many cases of a (typically pubescent, typically female) child or adolescent building such an intense fantasy-world that they wind up sucking in & convincing friends/classmates. They typically go unreported except in extreme cases (such as the 3, the , the ), often reported only in passing4 or via anecdotes—I have been told of 3 cases (2 from acquaintances, one indirectly), all of which follow the same pattern of a young female teenager building up a fantasy world (with heavy input from dreams) and engrossing friends/classmates.

    But there doesn’t seem to be any recognized name for this pattern (“”? “ complex”? ) or discussion of epidemiology. Is it an expansion of ? Is prevalence underestimated due to (similar to how s are not anomalous but may be had by the majority of children, though they forget as adults)? Are the dynamics the same as proto-religions (the ways in which the paracosms are extended, particularly by dreaming, bear a great deal of resemblance to the origins of religions like Christianity)?

Mouse Utopia

One of the most famous experiments in psychology & sociology was John Calhoun’s Mouse Utopia experiments in the 1960s–1970s. In the usual telling, Mouse Utopia created ideal mouse environments in which the mouse population was permitted to increase as much as possible; however, the overcrowding inevitably resulted in extreme levels of physical & social dysfunctionality, and eventually population collapse & even extinction. Looking more closely into it, there are reasons to doubt the replicability of the growth & pathological behavior & collapse, and if it does happen, whether it is driven by the social pressures as claimed by Calhoun or by other causal mechanisms at least as consistent with the evidence like disease or mutational meltdown.

What really happened in John B. Calhoun’s “Mouse Utopia” experiments? Mouse Utopia is a legendary experiment in which mice were put in a high-density enclosure (“Universe 25”) with unlimited food, a ‘mouse utopia’, only to see the initial population growth be followed by a population collapse generations later, while the late mouse population exhibited bizarre physical & social abnormalities such as autistic-like behavior & homosexuality & failure to reproduce. Calhoun and others interpreted Mouse Utopia as illustrating the damaging effects of the environment & overcrowding. After he published an extremely popular article in Scientific American in 1962 describing the first phase of Mouse Utopia experiments, it became a stock example employed by liberals in application to human populations, particularly for global & urban population growth and any human problem that might be caused by environments, such as the urban decay and riots and spiking crime rates of that era.

As WP puts it, describing the most famous Mouse Utopia, Universe 25:

Initially, the population grew rapidly, doubling every 55 days. The population reached 620 by day 315, after which the population growth dropped markedly, doubling only every 145 days. The last surviving birth was on day 600, bringing the total population to a mere 2200 mice, even though the experiment setup allowed for as many as 3840 mice in terms of nesting space. This period between day 315 and day 600 saw a breakdown in social structure and in normal social behavior. Among the aberrations in behavior were the following: expulsion of young before weaning was complete, wounding of young, inability of dominant males to maintain the defense of their territory and females, aggressive behavior of females, passivity of non-dominant males with increased attacks on each other which were not defended against.[2]

After day 600, the social breakdown continued and the population declined toward extinction. During this period females ceased to reproduce. Their male counterparts withdrew completely, never engaging in courtship or fighting and only engaging in tasks that were essential to their health. They ate, drank, slept, and groomed themselves – all solitary pursuits. Sleek, healthy coats and an absence of scars characterized these males. They were dubbed “the beautiful ones.” Breeding never resumed and behavior patterns were permanently changed. The conclusions drawn from this experiment were that when all available space is taken and all social roles filled, competition and the stresses experienced by the individuals will result in a total breakdown in complex social behaviors, ultimately resulting in the demise of the population.

Calhoun saw the fate of the population of mice as a metaphor for the potential fate of man. He characterized the social breakdown as a “second death,” with reference to the “second death” mentioned in the Biblical book of Revelation 2:11.[1] His study has been cited by writers such as Bill Perkins as a warning of the dangers of living in an “increasingly crowded and impersonal world.”[3]

If Calhoun had merely found that rat/mouse populations had an optimal equilibrium population density which they naturally reached when permitted, and that if a population was forced beyond this density, various things began to get worse to some degree, I do not think anyone would have been too surprised or his research so world-famous & textbook material. What he found was more dramatic: the mouse population was not self-regulating and would grow to unsustainable levels, resulting in not just moderate decrease in quality of life, but an explosion of all sorts of strange & novel pathologies followed by total population collapse and possibly extinction. This narrative of growth→pathology→collapse→extinction fed into anxieties over the apparent meltdown of American cities and widespread fears like Ehrlich’s 1968 that the had totally failed, human populations were increasing exponentially without bound, and within years there would be global mass famine deaths of “hundreds of millions of people”.

So Mouse Utopia quickly became one of the most famous experiments in psychology (and highly influential on not just psychology but sociology, urban planning, American politics, and science fiction, inspiring eg The Rats of NIMH), and continues to be discussed (eg by ); as , Ramsden & Adams 2008/ put it:

Calhoun published the results of his early experiments with the rats at NIMH in a 1962 edition of Scientific American. That paper, , went on to be cited upwards of 150 times a year.5 It has since been included as one of “Forty Studies that Changed Psychology,” joining papers by such figures as Freud, Pavlov, Milgram, Rorschach, Skinner, and Watson ([pg249, , ] Hock 2004). Like Pavlov’s dogs or Skinner’s pigeons, Calhoun’s rats came to assume a near-iconic status as emblematic animals, exemplary of the ways in which behavioral experimentation at once marks and violates the human-animal distinction. The macabre spectacle of crowded psychopathological rats and the available comparisons with human life in the densely-packed inner cities ensured the experiments were quickly adopted as “scientific evidence” of social decay. Referenced far outside of the fields of ecology and mental health, Calhoun’s rats have—or certainly had—come to seem part of the common cultural stock, shorthand for the problems of urban crowding just as Pavlov’s dogs were for respondent conditioning. Along with their public popularity, the experiments played a critical role in the development of disciplines and research fields, so much so that sociologist and human ecologist would remark that the extent of their influence was itself a “curious phenomenon.”

Like any symbol, it has shown adaptability: as of 2015, some reinterpret it to reflect contemporary political debates and explain it not as being about crowding & social breakdown but as being about :

…Today, the experiment remains frightening, but the nature of the fear has changed. A recent study pointed out that Universe 25 was not, if looked at as a whole, too overcrowded.6 Pens, or “apartments” at the very end of each hallway had only one entrance and exit, making them easy to guard.7 This allowed more aggressive territorial males to limit the number mice in that pen, overcrowding the rest of the world, while isolating the few “beautiful ones” who lived there from normal society. Instead of a population problem, one could argue that Universe 25 had a fair distribution problem.

However, there are red flags:

  • the immediate and long-enduring popularity in liberal politics & pop culture was fed by Calhoun’s own highly-anthropomorphized description of the various kinds of mice, and he fully endorsed the grand applications of Mouse Utopia to contemporary American problems (eventually culminating in an angry NIMH resignation letter in 1986 heavy on references to George Orwell’s 1984).8

    One might note that historically, a number of high-profile ideologically-friendly psychology results dating ~1950–1970, often used to justify policy, have proven to be seriously flawed (even more so than one would expect from the contemporary social psychology replication crisis): the Pygmalion effect, the Robbers Cave experiment, the bystander effect/Kitty Genovese, the Third Wave, Zimbardo’s Stanford Prison Experiment, the double-bind & refrigerator-mother theories of schizophrenia, etc frequently fail to replicate and often involve heavy analytic bias or outright interference by the experimenter to make the experiment ‘work’ and tell the desired story.

  • animal studies in general are often methodologically weaker than similar human studies: even smaller n, large between-strain genetic differences (in addition to all the between-species differences), pseudo-replication from group housing/relatedness, typically non-blinded ratings, heavy publication bias, etc.

  • Mouse Utopia is almost completely unpublished. Despite working on it and similar experiments with NIMH funding for decades (he ), Calhoun appears to have published almost nothing substantive about his research, limited to a handful of short summary articles or passing references. (Calhoun’s 1963 book describes only his “quarter-acre” experiments which ended in June 1949, before the NIH experiments on overcrowding.) The major citation for his Mouse Utopia experiments is the aforementioned 1962 Scientific American article (published 33 years before his death), which consists of 9 pages of popular writing, of which about half is generic illustrations of mice (rather than data-based figures or plots or tables). Calhoun 1963, touches briefly on behavioral sinks & mortality in some of the earlier experiments. Calhoun 1971, presents a brief display of data from “Universe 14” and “Universe 15” but goes into more detail about 25 unspecified universes done to followup the Kessler 1966 thesis and mentions that they are following Universe 25, still waiting for it to actually collapse.9 The article Calhoun 1973, “Death Squared: The Explosive Growth and Demise of a Mouse Population”, begins with an extended analogy to the Book of Revelation, and presents some limited information on Universe 25, which has only halved in population at this point. His 1970 article, for example, is just a redaction of the 1962 one, and despite apparently consulting Calhoun’s manuscripts at the NLM, Ramsden & Adams 2009 shed little light on Calhoun’s research or cite much beyond the 1962 article. (Considering how little he published, it’s surprising that NIMH funded Calhoun until 1983, leading to what Ramsden & Adams 2009 describe as a “forced retirement” in 1986—apparently Calhoun was unable to get funding anywhere else.)

    This causes considerable confusion in reading, since it’s unclear which papers refer to which experiments. The 1962 article describes high infant mortality but not collapse in unidentified ‘universes’, which are not the famous Universe 25, which was started later; Calhoun 1971 had not yet seen a collapse in Universe 25; Marsden 1972 describes a population in slight decline and extrapolates out to possible collapse in a single unspecified universe; while the Calhoun 1973 article shows a graph of a universe’s population definitely decreasing and apparently doomed by sterility & aging, which is identified as Universe 25. Are these all the same population, and if not, how many different ‘series’ or ‘universes’ are being described? How many exhibited the ‘senescence’ phase, much less population collapse? (Indeed, how many were done in total?)

  • it is unclear just how many experiments Calhoun had to run to get the one result which is always talked about; the name “Universe 25” implies at least 24 prior experiments, and Calhoun speaks vaguely of multiple “series” of experiments, referencing earlier experiments with stable populations (unlike Universe 25), some which were apparently controlled to fixed population sizes and some which apparently were not. Nor did all of the overpopulated universes develop the “behavioral sink” phenomenon Calhoun lays so much stress on, which he attributes to an otherwise-unexplained change in the food type. The number of experiments Calhoun ran implies that variance in outcomes was high, and in Kessler 1966, the two experimental group replicates were nevertheless different on many measures.

  • aside from 2 studies on brain hormones prior to 1973 in “mice selected (by Dr Calhoun, Dr Marsden, and their associates) to represent specific behavioural states existing during the declining crowded populations”, there appear to be no followup or secondary analyses of any kind, so there are no archived biological samples anywhere which could be checked; Calhoun’s 1973 claim that removing mice for analysis would disturb the colony dynamics suggests that few or no samples were kept in the first place

  • only 2 partial replications have ever been done by third parties that I know of10; likewise, if unique aspects of Calhoun’s experiment like the “beautiful ones” have been reported since, I have not encountered any references to them. They do not convincingly support the Universe 25 Mouse Utopia narrative. They are:

    1. Kessler 1966 thesis, : Kessler used 4 strains of mice simultaneously (rather than Calhoun’s use of a single kind of mouse), in 2 separate experimental high-density groups (plus a control) and achieved a remarkably high & apparently stable population density. ’s summary does not mention any population collapse nor whether there were Calhoun’s pathologies like the ‘beautiful ones’. Kessler’s abstract reports that he achieved densities “several times greater” than prior experiments, with stable populations maintained by low pregnancy & high infant mortality rates; while Kessler notes “aberrations of sexual behavior”, the high-density mouse behavior normalized (eg they were able to reproduce) when transplanted to lower-density environments or when environments were connected in an ‘emigration’ experiment. The two experimental groups showed variance, differing from each other in many ways (“Cohorts in Pop A and B differed with respect to reproduction physiology, mortality, and behavior, and intercohort differences persisted at all levels of population density.”), despite being generated the same way & put into the same kind of environment. Kessler further saw signs of natural selection, as indicated by changes in genetically-influenced coat color proportions (which were consistent in both groups). Kessler sums up as:

      The large sizes and unusual degree of crowding attained by the freely growing populations in this study compared with previous studies may be related to the types of animals used, to the number of individuals in the founder nuclei, and to the physical structure of the enclosures. Extreme crowding was compatible with general physical health. The decline of fertility and fecundity, the decreased survival of newborns, and the appearance of behavioral aberrations—rather than disease or an increase in adult mortality—represented the major self-regulatory mechanisms that eventually limited population growth. The growth of individuals was not inhibited. Social withdrawal and the decline of social interaction rather than a rise of interaction characterized the populations. Such findings cast doubt about the generality of the so-called “Stress” theory of social ecology that emphasizes increased interaction and pituitary-adrenal hyperactivity as the principal mechanisms involved in self-regulation of vertebrate populations.

      Overall, despite achieving a density far higher and one that would be expected to have a far larger harmful effect, Kessler 1966 only somewhat resembles Calhoun’s results: while Kessler does describe deviant mouse behavior driven by density (such as homosexual matings) and high infant mortality/cannibalism, on the other hand, there are no population crashes or cessation of reproduction but stable populations after initial growth, there are no behavioral sinks, any ‘beautiful ones’ or ‘drinkers’ or ‘autistic’ mice are not described as such by Kessler, the mice are healthy overall, and transplanted mice revert. Further, Kessler’s observation of considerable between-population variance & genetic changes raises questions about statistical power & interpretation of any effects.

    2. Hammock 1971, : uses a different mouse strain, Swiss Webster, and in the primary experiment, following up a pilot, obtained “a total lack of overpopulation.”

      The groups reached a certain population and then maintained it, bouncing back after any culling (and raising questions about Calhoun’s claim that a population which had stopped reproducing after reaching an equilibrium must be doomed). Hammock notes extensive pathology in the pilot similar but not identical to Calhoun’s (eg no ‘beautiful ones’ but instead the pilot mice began to groom only their head), some indication of a population decline during the short duration, and no appearance of harems/territories/behavioral sinks. In the main experiment, however, the experimental population quickly reached a low-density equilibrium and no pathologies were observed other than high infant mortality (primarily from cannibalism, maintaining the equilibrium). Hammock notes “No other experiment reviewed had this phenomenon occur. In all other research, the populations first overpopulated then reduced their numbers. This experiment suggests an inborn population control mechanism based upon the density available per mouse…”

  • other research on animal social dynamics & population density finds that there are (of course) relationships between them, and changes in social patterns with density, but nothing like Calhoun’s results of explosive population growth, utter social decay, widespread sterility, uniquely pathological types emerging, and complete collapse/extinction (eg compare Mouse Utopia with the changes observed in for a colony of rhesus monkeys).

  • Calhoun fails to seriously consider any hypothesis other than purely-density-based social breakdown:

    • disease: for example, the sterility noted is also a side-effect of many contagious diseases or parasite load, which are greatly assisted by density in spreading, and density fosters “evolution towards virulence” of existing diseases as diseases can be more lethal to spread faster (while infections in more isolated individuals must be more careful to not kill their hosts before infecting another host). Nothing was done to prevent disease nor to check for its presence, and Calhoun simply denies it could be a factor.11

    • genetics: the described collapse closely resembles experiments (which also feature sterility and subsequent population collapse); that is typically demonstrated in asexual organisms by removing all reproductive constraints like resources and so eliminating natural selection as much as possible, allowing the continuous buildup of mutations until finally organisms are no longer even able to reproduce, but should be possible in sexually-reproducing organisms as well.

      Meltdown should be much harder to induce in sexual organisms (the recombination theoretically allows much greater selection and is part of the justification for the extremely complex, expensive, error-prone process of sexual reproduction) and it’s unclear if Universe 25 ran enough generations to plausibly generate mutational meltdown, but it will be faster in tiny populations (eg Calhoun mentions that some used 56 rats as a seed but not which strain—many laboratory strains are unhealthy & reproductively unfit to begin with, highly adapted to the lab environment, and highly inbred or even clonal)12. Universe 25 appears to have begun with just “4 pair of mice”, based on figure 2 in Calhoun 1973. (Based on , they were probably mice; the WP article describes them as inbred & notes that they tend towards anxiety & males towards aggression.) Further increasing inbreeding, Calhoun 1962 describes ‘harem’-like behavior where the dominant male could ensure near-exclusive access to all the females in one subdivision of the cage, dubbed “brood pens”, and force out rival males. Calhoun appears to admit in discussions that they would be highly inbred but denies any possibility of relevant genetic change.13

      As well, highly social organisms with complex colony mechanisms, dependent on subtle interactions between members (eg proper use of alarm pheromones and border guarding), where members can inflict a great deal of harm on each other, may be especially sensitive to genetic mutations, as the genes of individual mice affect cage mates (, Baud et al 2018), causing “indirect genetic effects” (IGEs) or “social epistasis”.

      Calhoun did not do anything to check or avoid these alternative mechanisms, such as running fostering experiments with the survivors (if the problem is genetic, the offspring of the survivors would, even if fostered into a normal healthy mouse colony, still be unhealthy, while if it’s a contagious disease, introducing a few survivors into a healthy colony should result in noticeable colony-wide damage); Calhoun notes a quasi-fostering experiment in his 1962 paper (8 of the healthiest from one unspecified universe were spared culling, had fewer litters & no surviving offspring), but does not note that this more strongly supports a genetic rather than social dysfunctionality explanation, as the rest of the colony had been removed and could no longer exert any negative effects. describes the Kessler 1966 thesis, “Interplay between social ecology and physiology, genetics and population dynamics of mice” as using 4 different strains (16 pairs total) as founders (increasing total genetic variance greatly) and as having had “unusual attainment of very high density” without any collapse despite “less than three square inches per mouse”; Calhoun assumes that it is again due to the environment, related purely to social effects stemming from the number of founders (rather than the great increase in genetic variance from using more individuals from more strains), ignoring Kessler’s other finding of natural selection operating on the mouse populations (showing that noticeable genetic change is possible within a single experiment). In describing his followup experiments to Kessler (with unclear use of strains but almost certainly only 1 strain, as Calhoun’s papers seem to typically only use the BALB/c mice, so changes in founder population would not be as effective as in Kessler 1966), he finds little effect from variation in founder size, and again does a quasi-fostering experiment where, again despite the absence of their toxic environment, the surviving mice had only a few pups & were often sterile & unable to even get pregnant by normal mice.

      One wonders what Calhoun would have found if the universes had been run with wild-type mice in a fully-sterilized environment, universes followed until actual extinction, and all universes were fully reported.

Overall, Mouse Utopia is a sketchy and unreliable result: it is selectively and scantily reported, it is unclear how often the claimed behavioral sinks or population collapses happen even just within Calhoun’s experiments, whether any such problems are due to exogenously-forced density increases rather than the colonies naturally regulating population density close to their optimum, the few replications replicate only parts of it (if at all), it is entirely possible that it is a fluke of that particular mouse colony or mouse strain, and if the experiment ever was replicated exactly (assuming the unpublished materials are adequately informative), it would be unclear what the actual causal mechanism of the collapse would be as the design & analysis are ambiguous and Calhoun tested no hypotheses (much less the most likely ones of disease or genetics, which he resolutely ignored)14. I am left confused about what happened in Mouse Utopia and to what extent it reflects any real natural dynamics involving population growth & density, and extremely doubtful of the perennial attempt to link it to humans.

Psychiatry

Sociology

  • Face-to-face meetings, even brief ones, appear to cement personal connections of trust and liking to an extent not achieved by even years of more mediated contact like phone calls or Internet text discussions / emails / chat; this appears to be true in almost every context, even ones like British inventors meeting their heroes (in a different field) just once, with large step functions in connections despite the apparent near-zero marginal information conveyed by a brief physical visit after long-term interactions & track records. (This might be related to 18.)

    Is there something qualitatively different about personal meetings, and if so, where is it? Is it eye contact? Body language? (It’s probably not .)19 Is it mere physical proximity and a certain “inability to suspend disbelief” about a technologically mediated person? Can large wall-sized TV screens for teleconferencing achieve the same effects as regular conferencing? Or do they need to be 3D? What about VR headsets, are they adequate already with avatars and hand-tracking gestural control, or do they require eyetracking, or facial expression mapping? How much is enough?

  • Given the crucial role of trust and shared interests in success stories like Xerox PARC or the Apollo Project or creative collaborations in general, why are there so few extremely successful pairs of identical twins, and relatively few examples of duos like the , or Hollywood’s & ? The reader will struggle to think of more than a handful, or even any other examples (the , over half a century ago? some random football or baseball people?). As identical twins are ~0.5% of the population, and a large fraction of the population has at least one sibling, and the benefits seem so clear (thus leading to enormous elite overrepresentation by the usual tail/order statistic effects eg Jews/East Asians which have similar base-rates as identical twins)—where are they?

    Identical twins should have collaborative superpowers, between shared genetics & upbringing, in their much-envied abilities to completely implicitly trust each other, predict what the other would agree to or be interested in, and so on (collaboration taken to the point of identical twins reportedly sometimes developing a private language or creole in childhood); siblings should also have similar (but much smaller) advantages in collaboration compared to working with strangers. Is the answer something relatively boring like “the slight health/IQ penalty for being an identical twin plus the low base-rate of identical twins plus their remaining variance meaning that one of the pair won’t clear various thresholds means you wouldn’t expect to see many and this is consistent with what we see” or is there some deeper lesson here about greatness/creativity/risk-taking? (The most amusing explanation, of course, would be “most successful people are in fact secretly identical twins”.)

  • Why did it take until the late 20th century for to develop and then crush almost all other unarmed martial arts at the start of (or perhaps ), when humans have engaged in unarmed combat for millions of years and every major country has long lineages of specialized competitive martial arts and tremendous incentive to find martial arts which worked and quick feedback loops? (Regardless of whether the Gracies’ early achievements were overhyped, it still seems like MMA had an enormous impact on the practice of traditional martial arts and that MMA continues to resemble BJJ much more than most things pre-MMA.)

  • Is physical beauty relative or absolute and if the latter, is it objectively increasing over time? Photographs of exceptionally beautiful women from the 1800s or early 1900s, or nude/erotic paintings from before then, strike most people as being drab and unattractive. Given the stability and cross-cultural consistency of beauty ratings (), it seems unlikely that it is merely a matter of shifting norms or preferences or fashion but represents a real ‘absolute’ gain in attractiveness.

    What is going on? Have cosmetics and hairdressing really advanced that much or should we look at explanations like vastly superior vaccines, elimination of childhood disease, superior nutrition, elimination of hard (especially agricultural) labor20, poverty etc? (Large gains in means would not be unprecedented: when we look at photos of children or people from those time periods, one common observation is how short, scrawny, and stunted they look—and indeed, as an objective fact about an accurately-measured cardinal measure with absolute values, they were short & scrawny, and things really have improved that much.) If physical beauty is not zero-sum, how far can it go? Can we expect weird effects akin to ‘the tails come apart’ or the Spearman effect where after sufficient baseline gains, ‘beauty’ starts to diverge in orthogonal directions/specialized types? Or might, like the Flynn effect and height, we already be experiencing a reversal due to the obesity crisis or other factors like mutation load and we have already seen ‘Peak Beauty’ (at least for the average person, of course CGI/growing populations/cosmetic tech implies that models & actors will continue their evolution into superstimuli)?

Miscellaneous

  • Who committed the 2013 and why? Further, why have there been no similar attacks since?

  • Whatever happened to Blake Benthall (“Defcon”) of Silk Road 2? In almost all other cases, arrested DNM staff/operators have been extradited, tried, plea-bargained or convicted, and largely done with within a few years and were well-documented publicly throughout. In the case of Benthall, however, 4 years later, not only is the resolution of his case unknown, his PACER docket hasn’t updated since shortly after his arrest though the case remains open & charges pending. In leaks finally indicated Benthall was still alive and it seemed like he would be prosecuted only for tax evasion‽ If he has been cooperating with LE, what on earth did he have to offer them all this time when the SR2 server was seized in its entirety, and SR2 quickly became ancient history for the DNMs and any personal connections or inside info have long since gone stale?

    • On a similar note, how did the FBI really find the Silk Road 1 server in Iceland—which was so key to finding the Pennsylvania backup server and then Ross Ulbricht himself in SF? Agent Tarbell’s story never made sense (sounding suspiciously like an obfuscated SQLi attack, raising questions about legality) and he decamped bizarrely quickly for the private sector after what should have been a career-defining triumph, nor has the FBI ever gone into any detail about it (it did not come up at trial due to major strategic errors by the defense). It is also highly suspicious that some fake IDs Ross Ulbricht bought to rent servers were intercepted & he was interviewed in SF by LE not long before the server was supposedly located—quite a coincidence in timing. The SR1 investigation was riddled with corruption and questionable actions, and the finding of the SR1 server smells like another case, of a rogue agent or perhaps parallel construction. What really happened in Iceland?
  • How does the , where any advertisement on a website appears to reduce broadly-defined usage by ~10%, work when most users cannot be bothered to install adblock and don’t seem to care? Is there a subtle average effect on all users, who are simply unaware of the irritation or have never experienced the alternative and so are simply mistaken in claiming to not mind & not using adblock, or is there heterogeneity where a relatively small fraction of users do mind intensely, and that drives the effect?

Appendix

Physical Beauty

Is , masculine or feminine, a negative-sum, zero-sum (positional) or positive good? And has beauty increased or decreased over time? Thinking over various anecdotes and examples and changes in public health and environmental factors like nutrition and infectious disease and dentistry, I suggest that physical attractiveness of men & women in the West is not purely positional & relative, but has increased in an absolute sense over the past few centuries (albeit possibly decreasing recently as a consequence of trends like obesity).

“Your teeth are like a flock of sheep just shorn, coming up from the washing. Each has its twin; not one of them is alone.”

4:2 (praising the beauty of the beloved for still having all her teeth)21

In looking at historical paintings & statues, I’ve always been struck by how, even in erotic artwork or work meant to depict the epitome of human beauty or artwork intended to flatter a patron (or serve as an advertisement for a possible betrothal), they just aren’t that beautiful. (Yes, them being ‘Rubenesque’ may be part of it but the modern age of obesity should have long ago negated that.) The disparity gets worse when you look at American photographs from the 1800s onward, such as in biographies; a woman might be described as stunningly beautiful but look quite average in the provided photograph. Or when reading about classic Hollywood starlets such as , after making allowance for the fashions like hideous eyebrows and frying their hair, I can only find them odd looking; was really ? Or when highschool/college class photos are provided from the early 1900s, I can compare them to my own high school class photos, and the sets are almost disjoint in attractiveness—perhaps the top quarter of the old photos overlaps with the bottom quarter of the new photos. But on the other hand, American material from the 1970s or 1980s does not strike me as any worse than in the 1990s or 2000s (perhaps even better), with most of the increase being perhaps in the 1920–1960 time range. (There may have been increases before then, but while related things like adult life expectancy & height can be documented to have increased considerably before the 1920s, there are no high-quality photographs from before then to judge beauty by.) So if I can see such a clear trend in increasing beauty over time, does that mean that beauty is increasing?

Few would deny that Olympic athletes have, objectively, become much better over the past few centuries—the runners run far faster, the powerlifters lift far heavier weights, and so on, due to professionalization, better equipment, better training, larger populations to recruit from, and many other points of progress. Similarly, boxers and bodybuilders are objectively far more impressive than they were less than a century ago in the 1930s (thanks to ultra-cheap protein and gyms everywhere and drugs and improved training): who would bet a bent penny on boxing world champ , who against a Mike Tyson, much less an MMA star? (Sullivan hardly even looks like he ‘lifts’—because he didn’t.) puzzle solvers have dropped solve times from minutes to seconds, and video game players or speedrunners have achieved similar improvements, and mountain climbers or cliff climbers make impossible climbs now, and all of these are quite objective and difficult to dispute. If all of these can improve so much, why not beauty? Surely physical attractiveness should benefit from many of the same things: more knowledge about physical fitness and diet, cheaper food and travel, better communications, the spread of ‘tricks’ (like lubing a Rubik’s cube for speed), a larger population to draw from, etc.

If it has, then there are many possible reasons. The 20th century in particular saw major progress in nutrition (eg iodization eliminating goiters, which surely are not beautiful), vaccinations eliminating harmful and disfiguring diseases like smallpox, an almost total shift from outdoors work to indoors work (bringing with it protection from the sun and the elements), delayed entry into the workforce, far less manual labor22, cheaper clothing and cosmetics (not to mention a radical expansion in the kinds of cosmetics available such as the creation from almost nothing of the plastic surgery industry), lower lifetime birth rates etc. Many of these changes happened during the 1920–1960 time window, in which iodization went nationwide, key vaccines like polio were rolled out or used to eradicate diseases in the USA, almost doubled, per capita GDP doubled, etc.

All of these could be expected to improve physical beauty, and we can see first-hand proof of how ‘aging’ life in poor countries can be when we look at photographs of women: for example, there is a famous photograph “Migrant Mother” from the Great Depression of a , who one might guess was in her 40s or 50s—she was 32. An interesting datapoint comes from American high school yearbooks (, Ginosar et al 2015); high school yearbooks are homogenous portraits that students prepare for, which haven’t changed much over time, offering a relatively controlled comparison, particularly using composite/average faces, and the difference in attractiveness over time is striking. The main argument of Ginosar et al 2015 is that smiling has increased, but looking at them, I am convinced that the difference between the 1900 average and, say, 1970, is not merely a matter of smiling, and of course, why did smiling or longer hair length become popular? ‘Photographic improvements’ aren’t an answer since cameras got better rapidly and were effectively instantaneous for most of that sample. Improved nutrition and overall health, and optometry & dentistry especially, or cost/quality improvements of soap & indoor plumbing, might have had something to do with that… (Possibly because they could—someone missing most of their teeth, or unable to grow more than scraggly clumps of hair, is not going to be so eager to smile or adopt long styles.)

Overseas, a striking example is provided by the before/after of the famous : from the original photograph, one might guess at her 20s (she was 12), and when she was refound 17 years later at age 30, one might guess she was in her 60s from how haggard and worn her face is. , traveling in impoverished central Japan in 1878, was struck in the mountains by the sight of the people: “The married women look as if they have never known youth, and their skin is apt to be like tanned leather. At Kayashima I asked the house-master’s wife, who looked about 50, how old she was (a polite question in Japan), and she replied 22—one of many similar surprises.” (Unbeaten Tracks in Old Japan, pg94, Letter XII) comparing them unfavorably to the women of the , who “look cheerful, and even merry when they smile, and are not like the Japanese, prematurely old, partly perhaps because their houses are well ventilated, and the use of charcoal is unknown.” One can also see this phenomenon in other countries like Russia with jokes about how ‘devushkas’ turn into ‘babushkas’ overnight on their 30th birthday. In the 1800s, King collected a , a collection of portrait paintings of the most beautiful women he could find regardless of station, ranging from an accountant or cobbler or pawnshop clerk’s daughter to his own daughter, including several mistresses famed for their beauty, such as or ; a similar 1600s gallery, the , depicts many mistresses of (eg , “one of the most beautiful of the Royalist women”), and there is the somewhat later ()—my own impression is that they are clearly trying towards beauty consistent with modern standards but don’t get too far, despite Ludwig in particular casting a wide net. 
I was struck watching by how the carefully-restored video footage of WWI-era England revealed many of the drafted men—those who were not rejected for reasons of health—were stunted and short, with teeth already missing (perhaps because of—shades of —all that jam on white bread we see them eating), and draftees reportedly gained “1 stone” of weight on average due to being fed real food & exercise. Even as late as 1968 in England, 36% of the population aged >16yo were “edentulous” ie had no natural teeth left; this is not merely driven by the elderly, either, since 25–34yos average ~8%, and by the 35–44yo age bracket, the rate reaches ~20% (); this makes the occasional claim of not so implausible. (Needless to say, English dental health .) In the US, only came about sometime later as a result of draftees not fitting in their uniforms due to the prevalence of goiters (never mind the cretinism); France was little better, with travelers noting whole villages of retarded cretins23, where a quarter of young (relatively) healthy men were rejected by the military and many men were insane, hunchback, bow-legged, or club-footed due to conditions which were little kinder to young rural women either, who one contemporary called often, .24 Life expectancy increases appear to have relatively little to do with headline medical treatments like cancer, and more to do with public health measures like reductions in pandemics, with reductions in childhood illnesses predicting increases in adult life expectancy; and diseases like dementia have been in remarkable decline. All of this points to large improvements in overall “bodily integrity”: everything is more robust and better due to less accumulated damage from lifestyle and childhood infections and pollutants like indoor fires and increased protein consumption.

This accelerated aging, incidentally, turns out to be relevant to contemporary politics, as many wealthy countries grant special immigration privileges to people under 18 years, but older people in poor countries can claim to be much younger than they are and proving otherwise is difficult. Jean Harlow herself furnishes an interesting example, as after long-running health problems such as weight gain/fatigue/paleness, she died aged 26 of kidney disease (now mostly treatable) which was probably the sequela of a childhood infection by (now curable & occurrence largely suppressed by antibiotics).

Some objections come to mind:

  • with an increasingly large population, the most extreme models and actresses will be much more beautiful than early on, similar to sports. The USA was a smaller population in 1900 than in 2016, and Hollywood & advertising have likewise expanded enormously, in addition to recruiting globally. Early Hollywood starlets were big fish in small national pools. Or perhaps modern advertisements and media are increasingly manipulated with Photoshop.

    But then why does it also hold true when we compare photographs of ordinary people, and why would the artwork, whose artists were little constrained by reality, have been exceeded as well? And can we really say that the elimination of things like smallpox scarring makes no difference?

  • beauty is purely relative

There are at least 2 possibilities for how beauty works:

  1. beauty is (mostly) relative/ordinal and is perceived as relative: a beautiful person is merely someone above the average on some arbitrary cultural measurements which are caused by no important objective attributes like health or strength; in another group of people, the same person would be rated by the same raters as ugly rather than beautiful. Particularly good examples of the relativism include the centuries of tooth-blackening and eyebrow-plucking among the Japanese aristocracy, Chinese foot-binding, tanning vs white skin, gavage in Mauritania etc.

    Changes in beauty, therefore, indicate no gains to the possessors of beauty, cause no additional pleasure/displeasure in those around them (as they will perceive the same average level of beauty regardless), will vary wildly from culture to culture, and beauty itself is a harmful construct in that the biases in favor of beauty can disproportionately harm subgroups and in general causes wasteful arms races in time & money spent on tactics like cosmetics, clothing, or surgery, which leaves the group worse off.

  2. beauty is (mostly) objective/cardinal and is perceived as objective: a beautiful person is above average on objective attributes like , long hair, smooth undiseased skin, height, energy & health, personality, intelligence etc. Hence, entire groups of people can increase or decrease in their average beauty, and ratings of individuals will not shift based on reference group.

    Changes in beauty, therefore, may be due to objective improvements or it may be due to cosmetics etc. However, since perceptions are not relative, people will enjoy more what they see, so the arms races may be worthwhile in the same way that any decoration or artwork is worthwhile—because it looks nicer. On the other hand, to the extent that beauty serves as an indicator for objective things, this may be harmful: for example, if beauty & reproductive fitness are to reduce genetic , use of cosmetics is harmful as it hides the harm being done by bad genes & prevents them from being purged.

If #1 is right, then there should be high levels of disagreement about whether a photograph of an individual is ugly or beautiful between raters (who will have been raised in different social groups and have different standards), higher still across ethnic groups, and almost total global disagreement across cultures; and beauty should correlate minimally with traits because social treatment has little effect on stable traits like height or health or intelligence or personality.

Langlois et al 2000 meta-analyzes a variety of studies and, on the first point, finds that ratings of beauty are remarkably consistent and that agreement actually increases with distance: within-culture, r=.9/.85; cross-ethnic, r=.88; cross-culture, r=.94. (Given the limits of such inventories, this might imply that agreement on beauty cross-culturally approaches identity.) Langlois et al 2000 also finds that more attractive adults are more employed, date & have sex more and are more socially skilled & extraverted, are in better mental & physical health, and are slightly more intelligent. Unsurprisingly, beliefs that the beautiful are treated better by other people also turn out to be true. (Given that sex did not strongly moderate the results, this suggests that either men pay too little attention to their appearances or women too much.) Combined with the other evidence for things like fluctuating symmetry, #1 can be rejected. (Theory #2 is also more consistent with my personal observations.)

The past is a foreign country, so it seems like a safe assumption that the beauty ratings of someone in, say, 1920 would correlate r=.94 with ours. Then ratings will still be similar—eg someone rated at the 84th percentile (+1SD) by us would on average be rated 82nd percentile (+0.94SD) by them. So we would expect that the modern mean of beauty would be higher as long as it’s at least 0.06SDs higher, which is not much at all.

That would assume the difference is random, though, and not systematic: in the worst case, if that remaining 0.06 reflects a consistent cultural preference & fashion of the moment, then someone in 1920 will rate all people from 1920 higher, and someone from 2016 will rate all people from 2016 higher. How large would this rating bonus have to be to produce an overall correlation of r=.94? The variance explained is 0.94² ≈ 0.88, so a binary variable totally explaining the remaining variance must have the effect b = √(1 − 0.94²) ≈ 0.34. So in the worst case, we would have to demonstrate an increase by our standards of +0.34SDs before we could be sure that people from 1920 would agree there had been an increase. The implication of this increase is that our 50th percentile would have to match their 63rd percentile; or to put it another way, in random pairs, ~59.5% of modern people would have to be judged the more beautiful. I think this is a bar that could definitely be met, so even in the worst case, beauty has increased over time.
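
The arithmetic in the last two paragraphs can be checked with a few lines of Python. This is only a sketch under my own assumptions, not a reanalysis of Langlois et al 2000: ratings are treated as standard normal (mean 0, SD 1) in both eras, the worst-case era bonus is read as b = √(1 − r²), and the variable names are illustrative rather than taken from any source.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

std_normal = NormalDist()  # assumption: ratings are standard normal in both eras

r = 0.94  # cross-cultural agreement, assumed to carry over to cross-era agreement

# Regression to the mean: someone we rate at +1 SD is expected at +r SD by 1920 raters.
our_percentile = std_normal.cdf(1.0)
their_percentile = std_normal.cdf(r * 1.0)

# Worst case: every bit of disagreement is a single systematic era bonus.
unexplained_variance = 1 - r**2
b = sqrt(unexplained_variance)

# Percentile (in a standard normal population) of someone sitting b SDs above the mean.
shifted_median_percentile = std_normal.cdf(b)

# Chance that a random modern person outranks a random 1920 person when the means
# differ by b SDs; the difference of two unit-variance scores has SD sqrt(2).
p_modern_wins = std_normal.cdf(b / sqrt(2))

print(f"+1 SD by us: {our_percentile:.1%}; by them: {their_percentile:.1%}")
print(f"worst-case bonus b = {b:.3f} SD")
print(f"shifted median sits at the {shifted_median_percentile:.1%} point")
print(f"P(modern judged more beautiful) = {p_modern_wins:.1%}")
```

Under these assumptions the bonus comes out at roughly 0.34 SD, the shifted median lands near the 63rd percentile, and the pairwise figure is about 59.5%, in line with the numbers quoted in the text. Dropping the normality assumption would change the percentile conversions but not the value of b.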


  1. And in the case of sports, we also know why it might not be odd if some records set in the 1960s–1980s haven’t been broken yet…↩︎

  2. Both butyric acid and salt would also explain the interest in licking sweaty hands/armpits.↩︎

  3. Perhaps more representative than outright murder is the loosely-inspired-by-Parker-Hulme Simpsons episode, .↩︎

  4. An example is :

    As Wang narrates the Slenderman story, she revisits her own memory of a fraught childhood imagination. Her young mind has been captivated by the world of , a 1984 film depicting a fantasy world that eventually includes its reader in the narrative. Wang describes convincing her best friend Jessica that their life, too, was just another thread in the story, crafting a complicated universe of rules to dictate their time together. “We’re just playing, right?” Jessica finally asks, bemused and a little frightened; Wang’s childhood self disagrees, telling Jessica that the imaginary world was, in fact, real: “With my every denial, she became increasingly hysterical while I remained calm. I watched her leave in sobs; I remained grounded in the world of my imagination.”

    ↩︎
  5. Calhoun reflects on this in: Calhoun, J. B. C. 1979. “Employee’s contribution to the Performance Assessment of his Scientific Service. [Draft.]” 4 December. John B. Calhoun Papers, National Library of Medicine (NLM), Bethesda, MD. n.p.↩︎

  6. The study alluded to by Inglis-Arkell here appears to actually be the discussion of the behavioral sink in chapter 32 of the Hock 2004 textbook.↩︎

  7. Another example of this interpretation would be Moore 1999, .↩︎

  8. Calhoun appears to have maintained this position up to his death in 1995, according to his obituary: “But his work had its frustrations as well, she [a colleague] noted, because its implications for the future of the human rat race were often met with studied disregard. But Dr. Calhoun was convinced that his mice and rat populations were an accurate model for humans. ‘He didn’t regard it as hypothesis any more, he regarded it as factual,’ Mrs. Kerr said.”↩︎

  9. Given the disparate results of all these universes, it seems, contrary to the claims of that Calhoun “had been building utopian environments for rats and mice since the 1940s, with thoroughly consistent results. Heaven always turned into hell.”, many of them did no such thing.↩︎

  10. I’ve been told that a UK university quashed a third Mouse Utopia proposal on, ironically, ‘ethics’ grounds. (The same reason Zimbardo gives for why no one should ever try to replicate his Stanford Prison Experiment, incidentally.)↩︎

  11. “Dr Calhoun said that they (the investigators) were not very sanitary in their husbandry, if that was the kind of pollution inferred. The environment was cleaned, most feces and soiled bedding removed, every six weeks or two months, but nothing was ever sterilized. He did not consider this necessary in such a closed system and the mice had better survival than in most laboratory colonies.” In claiming Mouse Utopia mortality rates superior to ‘most’ regular lab mice, Calhoun is presumably excluding the extremely high infant or youth mortality.↩︎

  12. See also the .↩︎

  13. Calhoun 1973: “Dr Calhoun felt that there probably was some mutation. Mice which continually circled, about a dozen, had been noted, but these might have been ‘vestibular’ mice and a result of an infection, not mutation. Even if mutation rates were known, the first generation would have been very much like the last. So the real conclusion was that tremendous behavioural differentiation could occur as a result of social environmental influences even given a high degree of genetic homozygosity.” Marsden 1972, presenting a ‘synthesis’ based on a year with Calhoun, denies any possibility of genetic change: “The genetic potential for exploiting this mouse paradise was qualitatively and quantitatively present in equal proportion in each of the original eight colonizers and would remain essentially unchanged even to the _n_th generation.”↩︎

  14. A , as I doubt that the urban planners or demographers or Democratic politicians who took an interest in Mouse Utopia would be as interested if the causal mechanism turned out to be “urban densities increase STDs or genetic mutations to the point of collapse”. And if it turned out that Mouse Utopia replicated in mice but never humans (early attempts to correlate population density with social decay in humans apparently did not do well, incidentally), I also doubt if most people citing it, aside from a few zoologists, ethologists, or mouse breeders, would be doing so.↩︎

  15. Although asks if otherkin are in decline—hard as these things are to gauge, they do seem to come up less?↩︎

  16. Pg63–64:

    One morning in 1946 in Los Angeles, Stanislaw Ulam, a newly appointed professor at the University of Southern California, awoke to find himself unable to speak. A few hours later he underwent dangerous surgery after the diagnosis of encephalitis. His skull was sawed open and his brain tissue was sprayed with antibiotics. After a short convalescence he managed to recover apparently unscathed.

    In time, however, some changes in his personality became obvious to those who knew him. Paul Stein, one of his collaborators at the Los Alamos Laboratory (where Stan Ulam worked most of his life), remarked that while Stan had been a meticulous dresser before his operation, a dandy of sorts, afterwards he became visibly sloppy in the details of his attire even though he would still carefully and expensively select every item of clothing he wore.

    Soon after I met him in 1963, several years after the event, I could not help noticing that his trains of thought were not those of a normal person, even a mathematician. In his conversation he was livelier and wittier than anyone I had ever met; and his ideas, which he spouted out at odd intervals, were fascinating beyond anything I have witnessed before or since. However, he seemed to studiously avoid going into any details. He would dwell on any subject no longer than a few minutes, then impatiently move on to something entirely unrelated.

    Out of curiosity, I asked John Oxtoby, Stan’s collaborator in the thirties (and, like Stan, a former Junior Fellow at Harvard) about their working habits before his operation. Surprisingly, Oxtoby described how at Harvard they would sit for hours on end, day after day, in front of the blackboard. From the time I met him, Stan never did anything of the sort. He would perform a calculation (even the simplest) only when he had absolutely no other way out. I remember watching him at the blackboard, trying to solve a quadratic equation. He furrowed his brow in rapt absorption while scribbling formulas in his tiny handwriting. When he finally got the answer, he turned around and said with relief: “I feel I have done my work for the day.”

    The Germans have aptly called Sitzfleisch the ability to spend endless hours at a desk doing gruesome work. Sitzfleisch is considered by mathematicians to be a better gauge of success than any of the attractive definitions of talent with which psychologists regale us from time to time. Stan Ulam, however, was able to get by without any Sitzfleisch whatsoever. After his bout with encephalitis, he came to lean on his unimpaired imagination for his ideas, and on the Sitzfleisch of others for technical support. The beauty of his insights and the promise of his proposals kept him amply supplied with young collaborators, willing to lend (and risking the waste of) their time.

    ↩︎
  17. , Wigner 1992, pg109–110:

    Does it seem odd for a mathematician like Hilbert to take a young physicist for an assistant? Well, Hilbert needed no help in mathematics. But his work embraced physics, too, and I hoped to help Hilbert somewhat with physics.

    So I was quite excited to reach Göttingen in 1927. I was quickly and deeply disappointed. I found Hilbert painfully withdrawn. He had contracted pernicious anemia in 1925 and was no longer an active thinker. The worst symptoms of pernicious anemia are not immediately obvious, and Hilbert’s case had not yet been diagnosed. But we knew already that something was quite wrong. Hilbert was only living halfway. His enormous fatigue was plain. And the correct diagnosis was not encouraging when it came. Pernicious anemia was then not considered curable.

    So Hilbert suddenly seemed quite old. He was only about 65, which seems rather young to me now. But life no longer much interested him. I knew very well that old age comes eventually to everyone who survives his stay on this earth. For some people, it is a time of ripe reflection, and I had often envied old men their position. But Hilbert had aged with awful speed, and the prematurity of his decline took the glow from it. His breadth of interest was nearly gone and with it the engaging manner that had earned him so many disciples.

    Hilbert eventually got medical treatment for his anemia and managed to live until 1943. But he was hardly a scientist after 1925, and certainly not a Hilbert. I once explained some new theorem to him. As soon as he saw that its use was limited, he said, “Ah, then one doesn’t really have to learn this one.” It was painfully clear that he did not want to learn it.

    …I had come to Göttingen to be Hilbert’s assistant, but he wanted no assistance. We can all get old by ourselves.

    ↩︎
  18. Although having become much more cynical about psychology and education in particular since I first heard of Bloom’s result back in the 2000s, I would suggest renaming it “Bloom’s 0.5 Sigma problem”…↩︎

  19. Chemical pheromones have been suggested for many things in humans— as a possible mob mechanism—but as far as I know, the evidence they do anything in humans is quite weak (the relevant genes are broken and it’s unclear if we even have a VNO), and the evidence for some of the relevant-seeming hormones is even weaker (like the oxytocin literature turns out to be badly afflicted by ). Given the Reproducibility Crisis, can we really take seriously any of these n=40 studies where “we had some female undergraduates sniff underwear and fill out a survey”? In animals, it’s impossible to mistake that scents/pheromones are an important thing, in a way that they are not in humans—any cat owner will have noticed the ‘’ or ‘gape’, even if they don’t know the name for it (and you don’t have to spend too long around horses to notice it there either).

    And what are the testable implications? For example, meetings held in well-ventilated areas should be disastrous because any pheromone concentration would be diluted far below other meetings. Meetings where you notice body odor, indicating potent bodily output and little ventilation, should go great. Leaders would be well-advised to avoid using deodorant, as that reduces the direct route for pheromone emission. Direct interaction should be weaker than expected as a predictor of bonding/success, because the pheromones are omnidirectional. ‘Mere exposure’ effects should be substantial. People with lower smell acuity should be less affected by meetings as broken olfactory capabilities may break any downstream pheromone sensitivity; anosmics presumably would be entirely indifferent between virtual and real meetings. A (very clear) glass pane should eliminate meeting effects, while incremental improvements in latency, screen resolution, or audio quality would produce small or no gains over the baseline. Gas chromatography could probably identify pheromones and should be able to predict meeting success—while it’s true the hypothetical pheromones may be unknown, hormones/pheromones are frequently in the steroid family, and so my understanding is that it should be possible to measure a “total steroids” concentration in samples which would pick up on any social pheromone and be used in a regression.

    Many of those have not been conducted, but some of them don’t tally with my own experiences. For example, one 2018 conference I attended was what prompted me to ask this—in several cases, I’d known people I met there for years online before, and yet, meeting them in person seemed to make a large difference in how much I trusted or liked them. Good—except most of it was held outside because the weather was so nice and there was a pleasant breeze; everyone got along despite the conditions being awful for any pheromone effects.↩︎

  20. Nobody looks more prematurely aged than a subsistence agriculture peasant.↩︎

  21. The Public Domain Review notes in that missing teeth was so common that it was hardly a point of shame:

    It remains a commonly held belief that for hundreds of years people didn’t smile in pictures because their teeth were generally awful. This is not really true—bad teeth were so common that this was not seen as necessarily taking away from someone’s attractiveness. , Queen Victoria’s Whig prime minister, was often described as being devastatingly good-looking, and having a despite the fact that he had a number of prominent teeth missing as a result of hunting accidents. It was only in later life, when he acquired a set of flapping false teeth, that his image was compromised. His fear of them falling out when he spoke led to a stop-start delivery of his speeches, causing to openly poke fun at him in parliament.

    ↩︎
  22. Which one might expect to hurt, but manual labor is not as effective as regular exercise as it is highly repetitive, can be harmful, does not spread the work over the body evenly and cannot be calibrated to one’s fitness level, and must often be done at rates, times, places, and conditions minimally of one’s choosing. So increasing gender equity, permitting—even expecting—women to participate more in sports and use public gyms etc, could well offset this reduction. Certainly an Afghanistani woman confined to her house by is not better off for it.↩︎

  23. The Discovery of France, Robb 2008; ch5:

    At the end of the eighteenth century, doctors from urban Alsace to rural Brittany found that high death rates were not caused primarily by famine and disease. The problem was that, as soon as they became ill, people took to their beds and hoped to die. In 1750, the Marquis d’Argenson noticed that the peasants who farmed his land in the Touraine were ‘trying not to multiply’: ‘They wish only for death’. Even in times of plenty, old people who could no longer wield a spade or hold a needle were keen to die as soon as possible. ‘Lasting too long’ was one of the great fears of life. Invalids were habitually hated by their carers. It took a special government grant, instituted in 1850 in the Seine and Loiret départements, to persuade poor families to keep their ailing relatives at home instead of sending them to that bare waiting room of the graveyard, the municipal hospice.

    When there was just enough food for the living, the mouth of a dying person was an obscenity. In the relatively harmonious household of the 1840s described by the peasant novelist Émile Guillaumin, the family members speculate openly in front of Émile’s bed-ridden grandmother (who has not lost her hearing): ‘“I wish we knew how long it’s going to last.” And another would reply, “Not long, I hope.”’ As soon as the burden expired, any water kept in pans or basins was thrown out (since the soul might have washed itself—or, if bound for Hell, tried to extinguish itself—as it left the house), and then life went on as before.

    ‘Happy as a corpse’ was a saying in the Alps. Visitors to villages in the Savoy Alps, the central Pyrenees, Alsace and Lorraine, and parts of the Massif Central were often horrified to find silent populations of cretins with hideous thyroid deformities. (The link between goitre and lack of iodine in the water was not widely recognized until the early nineteenth century.) The Alpine explorer Saussure, who asked in vain for directions in a village in the Aosta Valley when most of the villagers were out in the fields, imagined that ‘an evil spirit had turned the inhabitants of the unhappy village into dumb animals, leaving them with just enough human face to show that they had once been men’.

    The infirmity that seemed a curse to Saussure was a blessing to the natives. The birth of a cretinous baby was believed to bring good luck to the family. The idiot child would never have to work and would never have to leave home to earn money to pay the tax-collector. These hideous creatures were already half-cured of life. Even the death of a normal child could be a consolation. If the baby had lived long enough to be baptized, or if a clever witch revived the corpse for an instant to sprinkle it with holy water, its soul would pray for the family in heaven.

    ↩︎
  24. The Discovery of France, Robb 2008; ch6:

    In the mid-nineteenth century, over a quarter of the young men who stood naked in front of military recruitment boards were found to be unfit for service because of ‘infirmity’, which included ‘weak constitution’, a useless or missing limb, partial blindness and eye disease, hernias and genital complaints, deafness, goitre, scrofula and respiratory and chest complaints. In a typical contingent of two hundred and thirty thousand, about one thousand were found to be mentally defective or insane, two thousand were hunchbacks and almost three thousand had bow legs or club feet. A further 5 per cent were too short (under five feet), and about 4 per cent suffered from unspecified complaints which probably included dysentery and virulent infestations of lice. For obvious reasons, people suffering from infectious diseases were not examined and do not appear in the figures.

    This was the healthiest section of the population—young men in their early twenties. The physical condition of everyone else might give the traveller serious doubts about information culled from books, museums and paintings—even if the painters belonged to the Realist school…If one of the living figures turned around, the traveller might find himself looking at what Lieutenant-Colonel Pinkney unkindly called ‘a Venus with the face of an old monkey’. [More precisely: “The peasant women of France work so hard, as to lose every appearance of youth in the face, whilst they retain it in the person; and it is therefore no uncommon thing to see the person of a Venus, and the face of an old monkey.”] To judge by the reactions of contemporary travellers, the biggest surprise would be the preponderance of women in the fields. Until the mid- to late-nineteenth century, almost everywhere in France, apart from the Provençal coast (but not the hinterland), the northeast and a narrow region from Poitou to Burgundy, at least half the people working in the open air were women. In many parts, women appeared to do the lion’s share of the work…The report on southern Normandy cruelly suggested that women were treated as beasts of burden because hard work had robbed them of their beauty: a sun-baked, arthritic creature was hardly an ornament and might as well be put to work. In parts like the southern Auvergne, where society was patriarchal, women seemed to belong to a different caste…Her face confirms the truth of what she says in all but one respect. That evening, at Mars-la-Tour, the traveller remembers her face when he writes his account: ‘It speaks, at the first sight, hard and severe labour. I am inclined to think that they work harder than the men.’ ‘This woman, at no great distance, might have been taken for sixty or seventy, her figure was so bent and her face so furrowed and hardened by labour,—but she said she was only twenty-eight.’

    ↩︎