January 2020 gwern.net newsletter with 5 writeups, and links on AI scaling, videos-for-cats, and art; 1 book and 1 opera review.
2019-12-26–2021-01-04
finished
certainty: log
importance: 0
January 2020’s Gwern.net newsletter is now out; previous, December 2019.
Writings
- “Danbooru2019: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”
- Preference Learning GPT-2 Music: Null Result
- This Waifu Does Not Exist v3: 100k StyleGAN 2 anime portrait samples
- Subreddit Simulator: GPT-2-1.5b upgrade
- 14 Internet Search Case Studies
- Gwern.net: margin notes are now inlined on mobile
Media
Links
Genetics:
Everything Is Heritable:
Recent Evolution:
- “The Exposome in Human Evolution: From Dust to Diesel”, Trumble & Finch 2019 (media; previously, Hubbard et al 2016)
Engineering:
- “Utility and First Clinical Application of Screening Embryos for Polygenic Disease Risk Reduction”, Treff et al 2019 (Genomic Prediction)
- “CC, world’s first cloned cat, turns 18 years old”, Katz
AI:
- “2019 AI Alignment Literature Review and Charity Comparison”, Larks
Matters Of Scale:
- “Scaling Laws for Neural Language Models”, Kaplan et al 2020 (NN LMs appear to be nowhere near saturation nor training infeasibility as larger models show predictable gains and are both more compute-efficient & data-efficient (!)—onwards to GPT-3?)
- “Big Transfer (BiT): Large Scale Learning of General Visual Representations for Transfer”, Kolesnikov et al 2019 (blog; JFT-300M showing larger=better: large benchmark gains, transfer everywhere, much more robust representations, more plausible errors, etc)
- “DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames”, Wijmans et al 2019 (blog; “Mishkin et al 2019 benchmarked classical (mapping + planning) and learning-based methods…and showed that classical methods outperform learning-based. However, they trained for ‘only’ 5 million steps…Savva et al 2019 then scaled this training to 75 million steps and found that this trend reverses…Fig. 1 shows an agent does not saturate before 1 billion steps, suggesting that previous studies were incomplete by 1–2 orders of magnitude.”)
- “Meena: Towards a Human-like Open-Domain Chatbot”, Adiwardana et al 2020 (blog; 2.6b parameters trained on 341GB text, although Kaplan et al 2020 suggests they’d’ve done better to go much bigger & trained much less than 164 epochs; likelihood loss near identical with human-rated performance? Likelihood loss seems ultimately a flawed metric, but maybe we’re still far from hitting its limits…)
- “A Very Unlikely Chess Game” (GPT-2-1.5b shenanigans; rival implementation)
Statistics/Meta-Science:
- “Compliance with legal requirement to report clinical trial results on ClinicalTrials.gov: a cohort study”, DeVito et al 2020; “FDA and NIH let clinical trial sponsors keep results secret and break the law”, Science
- “A national experiment reveals where a growth mindset improves achievement”, Yeager et al 2019 (the incredible shrinking ‘growth mindset’ effect & the Stainless Steel Law)
- “Backlash Over Meat Dietary Recommendations Raises Questions About Corporate Ties to Nutrition Scientists”, Rubin 2019 (criticizing the critics of Carroll & Doherty 2019 / Zeraatkar et al 2019a / Han et al 2019 / Vernooij et al 2019 / Zeraatkar et al 2019b / Valli et al 2019 / Johnston et al 2019—“what’s sauce for the goose is sauce for the gander”)
- “Follow-up: I found two identical packs of Skittles, among 468 packs with a total of 27,740 Skittles” (empirically verifying the birthday paradox in Skittles bags)
Politics/Religion:
- “The rape of men: the darkest secret of war” (case study: Libya)
- “Under the Weather: As Psychiatrists And Philosophers Begin To Define A Pervasive Mental Health Crisis Triggered By Climate Change, They Ask Who Is Really Sick: The Individual Or Society?”
- “Statistical Reliability Analysis For A Most Dangerous Occupation: Roman Emperor”, Saleh 2019
- “Parachuting For Charity: Is It Worth The Money? A 5-Year Audit Of Parachute Injuries In Tayside And The Cost To The NHS”, Lee et al 1999
- “What’s in a Font?: Ideological Perceptions of Typography”, Haenschen & Tamul 2019 (‘Sunrise is the future liberals want’)
Psychology/Biology:
- “A Meta-Analysis of Procedures to Change Implicit Measures”, Forscher et al 2019
- “Attention And Awareness In Stage Magic: Turning Tricks Into Research”, Macknik et al 2008
- “A World Without Pain: Does hurting make us human?” (cf my essay on pain)
- “What Intellectual Progress Did I Make In The 2010s?”, Scott Alexander (a look back on how his ideas/beliefs evolved over the past decade of psychiatry blogging)
- “Three cases of giant panda attack on humans at Beijing Zoo”, Zhang et al 2014 (graphic images)
Technology:
- “Cats, Once YouTube Stars, Are Now an ‘Emerging Audience’: They’re addicted to channels like Little Kitty & Family, Handsome Nature, and Videos for Your Cat—provided their owners switch on the iPad first” (After reading this, I gave videos-for-cats another try with my cat, since he gets stir-crazy in winter. I full-screened it, in landscape mode; he continued resolutely ignoring the screen as always, until I left my earphones out and he heard the birds chirping—he went nuts, convinced a bird had gotten into the apartment, until he finally noticed the screen, and I could see the instant the lightbulb went on and he was hooked. He’ll watch videos like Paul Dinning’s “8 Hour Bird Bonanza” for as many hours as I’ll leave it on, hunting behind the monitor for the birds, and makes a nuisance of himself sitting in front of it, waiting for the birds to come back—which he is doing as I try to write this! These are truly superstimuli: cats would never see this many birds, this close up and this unsuspecting, in the wild.)
Economics:
- “Clustering of health, crime and social-welfare inequality in 4 million citizens from two nations”, Richmond-Rakerd et al 2020 (everything is correlated: “Figure 4: Aggregation of poor health, crime and social-welfare dependency”; previously, Belsky et al 2016 & Caspi et al 2016)
- “The Exquisitely English (and Amazingly Lucrative) World of London Clerks”
Philosophy:
Fiction:
- “Behind the Sensationalism: Images of a Decaying Corpse in Japanese Buddhist Art”, Kanda 2005 (on kusozu, cf Maraṇasati; graphic images; famous examples: Body of a Courtesan in Nine Stages, Kobayashi Eitaku c. 1870s, and The Death Of A Noble Lady And The Decay Of Her Body, c. 1700s)
- “Having Had No Predecessor to Imitate, He Had No Successor Capable of Imitating Him”, Alvaro de Menard (summary of the Homeric Question, resolved by Parry 1933; also worth reading: Borges on the literary merits of different translations of Homer & The Thousand and One Nights)
- “Choose Your Own Adventure: One Book, Many Readings”, Christian Swinehart 2009 (visualizing paths through the classic CYOA gamebooks; see also: “These Maps Reveal the Hidden Structures of Choose Your Own Adventure Books: If you decide to see more, click on this story”, Atlas Obscura 2017)
- “Master of Orion”, Jimmy Maher (review of the seminal 4X strategy game)
Misc:
Books
Nonfiction:
- An Introduction to Japanese Court Poetry, Miner 1968 (review)
Film/TV
Live-action:
Link Bibliography
Bibliography of page links in reading order (with annotations when available):
“January 2020 News”, (2019-12-26):
January 2020 gwern.net newsletter with 5 writeups, and links on AI scaling, videos-for-cats, and art; 1 book and 1 opera review.
“December 2019 News”, (2019-11-21):
December 2019 gwern.net newsletter with links on gene editing, the Replication Crisis, computer latency, and suffering; 4 book reviews, 2 opera/movie reviews, and 2 anime reviews.
“2019 News”, (2019-11-21):
Annual summary of 2019 gwern.net newsletters, selecting my best writings, the best 2019 links by topic, and the best books/movies/anime I saw in 2019, with some general discussion of the year and the 2010s, and an intellectual autobiography of the past decade.
“Gwern.net newsletter archives”, (2013-12-01):
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews.)
“Changelog”, (2013-09-15):
This page is a changelog for Gwern.net: a monthly reverse-chronological list of recent major writings/changes/additions. Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/gwern subreddit submissions, I write up reasonably-interesting changes and send it out to the mailing list in addition to a compilation of links & reviews (archives).
“/r/gwern subreddit”, (2018-10-01):
A subreddit for posting links of interest and also for announcing updates to gwern.net (which can be used as an RSS feed). Submissions are categorized similarly to the monthly newsletter and typically will be collated there.
“Danbooru2019: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, (2015-12-15):
Deep learning for computer vision relies on large annotated datasets. Classification/categorization has benefited from the creation of ImageNet, which classifies 1m photos into 1000 categories. But classification/categorization is a coarse description of an image which limits application of classifiers, and there is no comparably large dataset of images with many tags or labels which would allow learning and detecting much richer information about images. Such a dataset would ideally be >1m images with at least 10 descriptive tags each which can be publicly distributed to all interested researchers, hobbyists, and organizations. There are currently no such public datasets, as ImageNet, Birds, Flowers, and MS COCO fall short either on image or tag count or restricted distribution. I suggest that the “image-boorus” be used. The image boorus are longstanding web databases which host large numbers of images which can be ‘tagged’ or labeled with an arbitrary number of textual descriptions; they were developed for and are most popular among fans of anime, who provide detailed annotations. The best known booru, with a focus on quality, is Danbooru. We provide a torrent/rsync mirror which contains ~3tb of 3.69m images with 108m tag instances (of 392k defined tags, ~29/image) covering Danbooru from 2005-05-24–2019-12-31 (final ID: #3,734,659), providing the image files & a JSON export of the metadata. We also provide a smaller torrent of SFW images downscaled to 512×512px JPGs (295GB; 2,828,400 images) for convenience. Our hope is that a Danbooru2019 dataset can be used for rich large-scale classification/tagging & learned embeddings, test out the transferability of existing computer vision techniques (primarily developed using photographs) to illustration/anime-style images, provide an archival backup for the Danbooru community, feed back metadata improvements & corrections, and serve as a testbed for advanced techniques such as conditional image generation or style transfer.
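[The JSON metadata export is the piece most users touch first; a minimal loading sketch in Python—the path, and the assumption of one JSON object per line with a `tags` list of `name` fields, are hypothetical, so check the dataset page for the actual schema:]

```python
import json
from collections import Counter

# Hypothetical path/schema for one shard of the Danbooru2019 JSON metadata export.
METADATA_PATH = "danbooru2019/metadata.json"

tag_counts = Counter()
n_images = 0
with open(METADATA_PATH, encoding="utf-8") as f:
    for line in f:                       # assumed: one JSON object per image per line
        record = json.loads(line)
        n_images += 1
        # assumed field names; adjust if the real export differs
        tag_counts.update(tag["name"] for tag in record.get("tags", []))

print(f"{n_images:,} images, {len(tag_counts):,} distinct tags")
print("most common:", tag_counts.most_common(10))
```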
“GPT-2 Preference Learning for Music Generation”, (2019-12-16):
Standard language generation neural network models, like GPT-2, are trained via likelihood training to imitate human text corpuses. Generated text suffers from persistent flaws like repetition, due to myopic generation word-by-word, and cannot improve on the training data because they are trained to predict ‘realistic’ completions of the training data.
A proposed alternative is to use reinforcement learning to train the NNs, to encourage global properties like coherence & lack of repetition, and potentially improve over the original corpus’s average quality. Preference learning trains a reward function on human ratings, and uses that as the ‘environment’ for a blackbox DRL algorithm like PPO.
OpenAI released a codebase implementing this dual-model preference learning approach for textual generation, based on GPT-2. Having previously used GPT-2 for poetry & music generation, I experimented with GPT-2 preference learning for unconditional music and poetry generation.
I found that preference learning seemed to work better for music than poetry, and seemed to reduce the presence of repetition artifacts, but the results, at n≅7,400 ratings compiled over 23 iterations of training+sampling November 2019–January 2020, are not dramatically better than alternative improvements like scaling up models or more thorough data-cleaning or more stringent sample curation. My blind ratings using n≅200 comparisons showed no large advantage for the RL-tuned samples (winning only 93 of 210 comparisons, or 44%).
This may be due to insufficient ratings, bad hyperparameters, or not using samples generated with common prefixes, but I suspect it’s the former, as some NLP tasks in Ziegler et al 2019 required up to 60k ratings for good performance, and the reward model appeared to achieve poor performance & succumb to adversarial examples easily.
Working with it, I suspect that preference learning is unnecessarily sample-inefficient & data-inefficient, and that the blackbox reinforcement learning approach is inferior to directly using the reward model to optimize text samples, and propose two major architectural overhauls: have the reward model directly model the implied ranking of every datapoint, and drop the agent model entirely in favor of backprop-powered gradient ascent which optimizes sequences to maximize the reward model’s output.
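[How weak is 93 wins out of 210? A quick check with an exact binomial test (SciPy ≥ 1.7), using only the numbers quoted above:]

```python
from scipy.stats import binomtest

# 93 wins for the RL-tuned samples out of 210 blind comparisons
result = binomtest(k=93, n=210, p=0.5, alternative="two-sided")
print(f"win rate: {93/210:.1%}")                   # ~44.3%
print(f"two-sided p-value: {result.pvalue:.2f}")   # ~0.1: no clear difference either way
```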
“ThisWaifuDoesNotExist.net”, (2019-02-19):
ThisWaifuDoesNotExist.net (TWDNE) is a static website which uses JS to display random anime faces generated by StyleGAN neural networks, along with GPT-3-generated anime plot summaries. [A screenshot of “This Waifu Does Not Exist” (TWDNE) showing a random StyleGAN-generated anime face and a random GPT-3 text sample conditioned on anime keywords/phrases.]
“ThisWaifuDoesNotExist, version 3”, (2020-01-20):
Discussion of TWDNEv3, launched January 2020. TWDNEv3 upgrades TWDNEv2 to use 100k anime portraits from an anime portrait StyleGAN2, an improvement to StyleGAN released in December 2019, which removes the blob artifacts and is generally of somewhat higher visual quality. TWDNEv3 provides images in 3 ranges of diversity, showing off both narrow but high quality samples and more wild samples. It replaces the StyleGAN 1 faces and portrait samples.
“Anime Portraits with StyleGAN 2”, (2020-01-20):
How to use StyleGAN2, an improvement to StyleGAN released in December 2019, which removes the blob artifacts and is generally of somewhat higher visual quality. StyleGAN 2 is tricky to use because it requires custom local compilation of optimized code. Aaron Gokaslan provided tips on getting StyleGAN 2 running and trained a StyleGAN 2 on my anime portraits, which is available for download and which I use to create TWDNEv3.
“Update: Upgrading to 1.5B GPT-2, and adding 22 new subreddit-bots”, (2020-01-12):
When I originally trained the models in May 2019, I’d used the 345M version of GPT-2, which at the time was the largest one that OpenAI had publicly released. Last November, however, OpenAI finally released the full 1.5 billion parameter model.
The 1.5B model requires much more memory to fine-tune than the 345M, so I was initially having a lot of difficulty getting it to work on Colab. Thankfully, I was contacted by /u/gwern (here’s his Patreon) and Shawn Presser (/u/shawwwn), who very generously offered to do the fine-tuning themselves if I provided them with the dataset. This training took about 2 weeks, and apparently required around $70K worth of TPU credits, so in hindsight this upgrade definitely wouldn’t have been possible for me to do myself, without their assistance.
Based on my tests of the new model so far, I’m pretty happy with the quality, and IMO it is noticeably more coherent than the 345M version.
One thing that I should point out about the upgrade is that the original 345M models had been separately fine-tuned for each subreddit individually (i.e. there were 108 separate models), whereas the upgraded one is just a single 1.5B model that has been fine-tuned using a combined dataset containing the comments/submissions from all the subreddits that I scraped. The main reason for this decision is simply that it would not have been feasible to train ~100 separate 1.5B models. Also, there may have been benefits from transfer learning across subreddits, which wouldn’t occur with separate models.
…Here is the full list of new bots to be added: /r/capitalismvsocialism · /r/chess · /r/conlangs · /r/dota2 · /r/etymology · /r/fiftyfifty · /r/hobbydrama · /r/markmywords · /r/moviedetails · /r/neoliberal · /r/obscuremedia · /r/recipes · /r/riddles · /r/stonerphilosophy · /r/subsimulatorgpt2 · /r/subsimulatorgpt2meta · /r/tellmeafact · /r/twosentencehorror · /r/ukpolitics · /r/wordavalanches · /r/wouldyourather · /r/zen
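[One plausible way to get per-subreddit behavior out of a single combined model—an illustrative assumption, since the post doesn’t specify the actual dataset format—is to prefix every training example with a control token naming its subreddit, then steer generation with the same prefix:]

```python
# Hypothetical preprocessing sketch: tag each example with its subreddit so one
# fine-tuned model can imitate any of them on demand.
def make_example(subreddit: str, title: str, body: str) -> str:
    return f"<|sub:{subreddit}|>\nTITLE: {title}\n{body}\n<|endoftext|>"

corpus = [
    make_example("chess", "Best response to 1.e4?", "1...c5, fighting for the center."),
    make_example("recipes", "Weeknight dal", "Simmer lentils 20 min with turmeric."),
]
# Concatenated into one text file for fine-tuning; at sampling time, prompting
# with "<|sub:chess|>" selects the subreddit persona.
print("\n\n".join(corpus))
```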
“Internet Search Tips: 14 Case Studies”, (2020-01-21):
Followup section to the article covering how to search the Internet effectively: 14 case studies of challenging Internet searches drawn from the past 10 years. I present the problem, and step through the process of finding it, and describe my tacit knowledge and implicit strategies. These case studies hopefully make the prior tips more understandable by showing them off in practice.
“Genomic Relationships, Novel Loci, and Pleiotropic Mechanisms across Eight Psychiatric Disorders”, (2019-12-12):
- Three groups of highly genetically-related disorders among 8 psychiatric disorders
- Identified 109 pleiotropic loci affecting more than one disorder
- Pleiotropic genes show heightened expression beginning in 2nd prenatal trimester
- Pleiotropic genes play prominent roles in neurodevelopmental processes
Genetic influences on psychiatric disorders transcend diagnostic boundaries, suggesting substantial pleiotropy of contributing loci. However, the nature and mechanisms of these pleiotropic effects remain unclear. We performed analyses of 232,964 cases and 494,162 controls from genome-wide studies of anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, schizophrenia, and Tourette syndrome. Genetic correlation analyses revealed a meaningful structure within the eight disorders, identifying three groups of inter-related disorders. Meta-analysis across these eight disorders detected 109 loci associated with at least two psychiatric disorders, including 23 loci with pleiotropic effects on four or more disorders and 11 loci with antagonistic effects on multiple disorders. The pleiotropic loci are located within genes that show heightened expression in the brain throughout the lifespan, beginning prenatally in the second trimester, and play prominent roles in neurodevelopmental processes. These findings have important implications for psychiatric nosology, drug development, and risk prediction.
“Everything Is Correlated”, (2014-09-12):
Statistical folklore asserts that “everything is correlated”: in any real-world dataset, most or all measured variables will have non-zero correlations, even between variables which appear to be completely independent of each other, and that these correlations are not merely sampling error flukes but will appear in large-scale datasets to arbitrarily designated levels of statistical-significance or posterior probability.
This raises serious questions for null-hypothesis statistical-significance testing, as it implies the null hypothesis of 0 will always be rejected with sufficient data, meaning that a failure to reject only implies insufficient data, and provides no actual test or confirmation of a theory. Even a directional prediction is minimally confirmatory since there is a 50% chance of picking the right direction at random.
It also has implications for conceptualizations of theories & causal models, interpretations of structural models, and other statistical principles such as the “sparsity principle”.
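[A toy simulation of the core point (the ‘true’ correlation of 0.02 is arbitrary): once a real but negligible correlation exists, the point-null of exactly 0 is guaranteed to be rejected given enough data:]

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
true_r = 0.02                      # negligible but non-zero "crud factor"

for n in [1_000, 10_000, 100_000, 1_000_000]:
    x = rng.standard_normal(n)
    y = true_r * x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)
    r, p = pearsonr(x, y)
    print(f"n={n:>9,}  r={r:+.3f}  p={p:.2g}")
# r hovers around 0.02 throughout, but the p-value collapses toward 0 as n grows:
# "statistically-significant" eventually says nothing about effect size.
```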
“Investigating the Genetic Architecture of Non-Cognitive Skills Using GWAS-by-Subtraction”, (2020-01-15):
Educational attainment (EA) is influenced by cognitive abilities and by other characteristics and traits. However little is known about the genetic architecture of these “non-cognitive” contributions to EA. Here, we use Genomic Structural Equation Modelling and results of prior genome-wide association studies (GWASs) of EA (N = 1,131,881) and cognitive test performance (N = 257,841) to estimate SNP associations with variation in EA that is independent of cognitive ability. We identified 157 genome-wide significant loci and a polygenic architecture accounting for 57% of genetic variance in EA. Phenotypic and biological annotation revealed that (1) both cognitive and non-cognitive contributions to EA were genetically correlated with socioeconomic success and longevity; and (2) non-cognitive contributions to EA were related to personality, decision making, risk-behavior, and increased risk for psychiatric disorders; (3) non-cognitive and cognitive contributions to EA were enriched in the same tissues and cell types, but (4) showed different associations with gray-matter neuroimaging phenotypes.
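[A deliberately crude toy version of the ‘subtraction’ idea—not the paper’s Genomic SEM, just an illustration under an assumed additive model: if EA betas mix cognitive and non-cognitive effects while cognitive-performance (CP) betas tap only the cognitive part, residualizing the EA betas on the CP betas recovers something tracking the non-cognitive component:]

```python
import numpy as np

rng = np.random.default_rng(1)
m = 50_000  # SNPs

# Assumed generative model: independent per-SNP effects on latent Cog & NonCog;
# EA loads on both, CP on Cog only; both GWASs measured with noise.
beta_cog    = rng.normal(0, 0.010, m)
beta_noncog = rng.normal(0, 0.010, m)
beta_CP = beta_cog               + rng.normal(0, 0.005, m)
beta_EA = beta_cog + beta_noncog + rng.normal(0, 0.005, m)

# "Subtract" the cognitive component: residualize EA betas on CP betas.
b = beta_EA @ beta_CP / (beta_CP @ beta_CP)
beta_noncog_hat = beta_EA - b * beta_CP

print("corr(estimate, true NonCog):",
      round(np.corrcoef(beta_noncog_hat, beta_noncog)[0, 1], 3))  # ~0.8 here
```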
“The Exposome in Human Evolution: From Dust to Diesel”, (2019-12):
Global exposures to air pollution and cigarette smoke are novel in human evolutionary history and are associated with at least 12 million premature deaths per year. We investigate the history of the human exposome for relationships between novel environmental toxins and genetic changes during human evolution in six phases. Phase I: With increased walking on savannas, early human ancestors inhaled crustal dust, fecal aerosols, and spores; carrion scavenging introduced new infectious pathogens. Phase II: Domestic fire exposed early Homo to novel toxins from smoke and cooking. Phases III and IV: Neolithic to preindustrial Homo sapiens incurred infectious pathogens from domestic animals and dense communities with limited sanitation. Phase V: Industrialization introduced novel toxins from fossil fuels, industrial chemicals, and tobacco at the same time infectious pathogens were diminishing. Thereby, pathogen-driven causes of mortality were replaced by chronic diseases driven by sterile inflammogens, exogenous and endogenous. Phase VI: Considers future health during global warming with increased air pollution and infections. We hypothesize that adaptation to some ancient toxins persists in genetic variations associated with inflammation and longevity. [Keywords: exposome, human evolution, genes, toxins, infections.]
Scientists are still figuring out how air pollution causes these ailments. They are also puzzling over the apparent resilience that some people have to this modern onslaught. Some researchers now argue that the answers to these questions lie in our distant evolutionary past, millions of years before the first cigarette was lit and the first car hit the road.
Our ancestors were bedeviled by airborne toxins even as bipedal apes walking the African savanna, argued Benjamin Trumble, a biologist at Arizona State University, and Caleb Finch of the University of Southern California, in the December issue of the Quarterly Review of Biology. Our forebears evolved defenses against these pollutants, the scientists propose. Today, those adaptations may provide protection, albeit limited, against tobacco smoke and other airborne threats. But our evolutionary legacy may also be a burden, Dr. Trumble and Dr. Finch speculated. Some genetic adaptations may have increased our vulnerability to diseases linked to air pollution. It is “a really creative, interesting contribution to evolutionary medicine,” said Molly Fox, an anthropologist at the University of California, Los Angeles, who was not involved in the new study.

The story begins about seven million years ago. Africa at the time was gradually growing more arid. The Sahara emerged in northern Africa, while grasslands opened up in eastern and southern Africa. The ancestors of chimpanzees and gorillas remained in the retreating forests, but our ancient relatives adapted to the new environments. They evolved into a tall, slender frame well suited to walking and running long distances.

Dr. Finch and Dr. Trumble believe that early humans faced another challenge that has gone largely overlooked: the air. Periodically, the savanna would have experienced heavy dust storms from the Sahara, and our distant ancestors may have risked harm to their lungs from breathing in the silica-rich particles. “When the dust is up, we’re going to see more pulmonary problems,” Dr. Finch said. Even today, Greek researchers have found that when Sahara winds reach their country, patients surge into hospitals with respiratory complaints. The dense foliage of tropical forests gave chimpanzees and gorillas a refuge from dust. But the earliest humans, wandering the open grasslands, had nowhere to hide.

Dust was not the only hazard. The lungs of early humans also may have been irritated by the high levels of pollen and particles of fecal matter produced by the savanna’s vast herds of grazing animals. Dr. Finch and Dr. Trumble maintain that scientists should consider whether these new challenges altered our biology through natural selection. Is it possible, for instance, that people who are resilient to cigarette smoke have inherited genetic variants that protected their distant ancestors from cave fires?
…“Most traditional people live in a highly smoky environment,” Dr. Finch said. “I think it has been a fact of human living for us even before our species.” Smoke created a new evolutionary pressure, he and Dr. Trumble believe. Humans evolved powerful liver enzymes, for example, to break down toxins passing into the bloodstream from the lungs.

Gary Perdew, a molecular toxicologist at Penn State University, and his colleagues have found evidence of smoke-driven evolution in another gene, AHR. This gene makes a protein found on cells in the gut, lungs and skin. When toxins get snagged on the protein, cells release enzymes that break down the poisons. Other mammals use AHR to detoxify their food. But the protein is also effective against some of the compounds in wood smoke. Compared to other species, the human version produces a weaker response to toxins, perhaps because AHR protein is not the perfect protector—the fragments it leaves behind can cause tissue damage.

Before fire, our ancestors did not need to use AHR very often; in theory, their bodies could tolerate the limited damage the protein caused. But when we began breathing smoke regularly and needing the AHR protein constantly, the gene might have become dangerous to our health. Dr. Perdew believes that humans evolved a weaker AHR response as a way to find “a sweet spot,” a compromise that minimized the damage of airborne pollutants without causing too many side effects. These adaptations were never perfect, as evidenced by the fact that millions of people still die today from indoor air pollution. But evolution doesn’t seek perfect health.
“Divergent Ah Receptor Ligand Selectivity during Hominin Evolution”, (2016):
We have identified a fixed nonsynonymous sequence difference between humans (Val381; derived variant) and Neandertals (Ala381; ancestral variant) in the ligand-binding domain of the aryl hydrocarbon receptor (AHR) gene. In an exome sequence analysis of four Neandertal and Denisovan individuals compared with nine modern humans, there are only 90 total nucleotide sites genome-wide for which archaic hominins are fixed for the ancestral nonsynonymous variant and the modern humans are fixed for the derived variant. Of those sites, only 27, including Val381 in the AHR, also have no reported variability in the human dbSNP database, further suggesting that this highly conserved functional variant is a rare event. Functional analysis of the amino acid variant Ala381 within the AHR carried by Neandertals and nonhuman primates indicate enhanced polycyclic aromatic hydrocarbon (PAH) binding, DNA binding capacity, and AHR mediated transcriptional activity compared with the human AHR. Also relative to human AHR, the Neandertal AHR exhibited 150-1000 times greater sensitivity to induction of Cyp1a1 and Cyp1b1 expression by PAHs (e.g., benzo(a)pyrene). The resulting CYP1A1/CYP1B1 enzymes are responsible for PAH first pass metabolism, which can result in the generation of toxic intermediates and perhaps AHR-associated toxicities. In contrast, the human AHR retains the ancestral sensitivity observed in primates to nontoxic endogenous AHR ligands (e.g., indole, indoxyl sulfate). Our findings reveal that a functionally significant change in the AHR occurred uniquely in humans, relative to other primates, that would attenuate the response to many environmental pollutants, including chemicals present in smoke from fire use during cooking.
“Utility and First Clinical Application of Screening Embryos for Polygenic Disease Risk Reduction”, (2019-12-04):
For over 2 decades preimplantation genetic testing (PGT) has been in clinical use to reduce the risk of miscarriage and genetic disease in patients with advanced maternal age and risk of transmitting disease. Recently developed methods of genome-wide genotyping and machine learning algorithms now offer the ability to genotype embryos for polygenic disease risk with accuracy equivalent to adults. In addition, contemporary studies on adults indicate the ability to predict polygenic disorders with risk equivalent to monogenic disorders. Existing biobanks provide opportunities to model the clinical utility of polygenic disease risk reduction among sibling adults. Here, we provide a mathematical model for the use of embryo screening to reduce the risk of type 1 diabetes. Results indicate a 45–72% reduced risk with blinded genetic selection of one sibling. The first clinical case of polygenic risk scoring in human preimplantation embryos from patients with a family history of complex disease is reported. In addition to these data, several common and accepted practices place PGT for polygenic disease risk in the applicable context of contemporary reproductive medicine. In addition, prediction of risk for PCOS, endometriosis, and aneuploidy are of particular interest and relevance to patients with infertility and represent an important focus of future research on polygenic risk scoring in embryos.
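[The flavor of such a model is easy to reproduce (a sketch with illustrative parameters, not the paper’s: a liability-threshold model where a polygenic score captures part of the liability variance and the lowest-scoring of n sibling embryos is implanted):]

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
K  = 0.004   # illustrative population prevalence (assumption)
r2 = 0.10    # illustrative liability variance captured by the PRS (assumption)
n_embryos, n_families = 5, 500_000
thresh = norm.isf(K)             # liability threshold yielding prevalence K

# Sibling embryos share a mid-parent genetic component (half the PRS variance)
# and differ by Mendelian segregation (the other half).
shared = rng.normal(0, np.sqrt(r2 / 2), (n_families, 1))
seg    = rng.normal(0, np.sqrt(r2 / 2), (n_families, n_embryos))
prs = shared + seg

def disease_rate(selected_prs):
    env = rng.normal(0, np.sqrt(1 - r2), selected_prs.shape)
    return np.mean(selected_prs + env > thresh)

rr = 1 - disease_rate(prs.min(axis=1)) / disease_rate(prs[:, 0])
print(f"relative risk reduction, selecting 1 of {n_embryos}: {rr:.0%}")
# With these made-up numbers the reduction lands around ~50-60%,
# the same ballpark as the paper's reported 45-72%.
```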
“CC, world’s first cloned cat, turns 18 years old”, (2020-01-02):
The first of her kind, CC the cloned cat is breaking more boundaries as she turns 18 years old. There are no big plans locally to mark the day, but CC—Carbon Copy or Copy Cat—will be the focus of a Dutch cartoon set for release today to celebrate her birthday, researcher and owner Duane Kraemer said.
...CC is not only enjoying life as the Kraemers’ pet, but she has her own condo called the “kitty house” behind the Kraemers’ house where she lives with her three offspring, sired by a cat named Smokey. Those offspring, just by existing, helped CC make headlines in the scientific community. There had not been much research done on the reproductive success of clones—and none had been done with a cat. Tim, Zip and Tess were born Sept. 1, 2006, along with a fourth kitten that was stillborn. Not knowing what CC’s reaction to her kittens would be, Kraemer said, they found CC was “the perfect mother” and had the innate maternal instincts they were hoping she would exhibit. Besides proving clones can successfully reproduce, CC also proved not all clones die young. “Dolly the sheep, that was the first of the mammals to be cloned by nuclear transfer, had died at, I think, at 6 years of age,” Kraemer said. “So the fact that CC didn’t die young was news.” About 20% of cloned animals have developmental abnormalities of some kind, he said, with some being serious enough to result in the animal’s death at a young age or at birth. However, the other 80% born without those conditions “would probably live to a normal variation of ages.”
“2019 AI Alignment Literature Review and Charity Comparison”, (2019-12-18):
As in 2016, 2017 and 2018, I have attempted to review the research that has been produced by various organisations working on AI safety, to help potential donors gain a better understanding of the landscape. This is a similar role to that which GiveWell performs for global health charities, and somewhat similar to a securities analyst with regards to possible investments. My aim is basically to judge the output of each organisation in 2019 and compare it to their budget. This should give a sense of the organisations’ average cost-effectiveness. We can also compare their financial reserves to their 2019 budgets to get a sense of urgency.
…Here are the un-scientifically-chosen hashtags: Agent Foundations · AI Theory · Amplification · Careers · CIRL · Decision Theory · Ethical Theory · Forecasting · Introduction · Misc · ML safety · Other Xrisk · Overview · Philosophy · Politics · RL · Security · Short-term · Strategy.
- Research organisations reviewed: FHI (The Future of Humanity Institute) · CHAI (The Center for Human-Aligned AI) · MIRI (The Machine Intelligence Research Institute) · GCRI (The Global Catastrophic Risks Institute) · CSER (The Center for the Study of Existential Risk) · Ought · OpenAI · Google DeepMind · AI Safety camp · FLI (The Future of Life Institute) · AI Impacts · GPI (The Global Priorities Institute) · FRI (The Foundational Research Institute) · Median Group · CSET (The Center for Security and Emerging Technology) · Leverhulme Center for the Future of Intelligence · BERI (The Berkeley Existential Risk Initiative) · AI Pulse
- Capital Allocators reviewed: LTFF (Long-term future fund) · OpenPhil (The Open Philanthropy Project)
…The size of the field continues to grow, both in terms of funding and researchers. Both make it increasingly hard for individual donors. I’ve attempted to subjectively weigh the productivity of the different organisations against the resources they used to generate that output, and donate accordingly.
“Scaling Laws for Neural Language Models”, (2020-01-23):
We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence of overfitting on model/dataset size and the dependence of training speed on model size. These relationships allow us to determine the optimal allocation of a fixed compute budget. Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
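[The headline result is compact enough to state as code (a sketch; the constants are the approximate fitted values reported in the paper for their WebText2 setup and would differ elsewhere):]

```python
# Approximate power-law fits from Kaplan et al 2020: test loss (nats/token)
# as a function of non-embedding parameters N or dataset size D (tokens),
# each with the other resource unconstrained.
def loss_from_params(N, N_c=8.8e13, alpha_N=0.076):
    return (N_c / N) ** alpha_N

def loss_from_data(D, D_c=5.4e13, alpha_D=0.095):
    return (D_c / D) ** alpha_D

for N in [1e8, 1e9, 1e10, 1e11]:
    print(f"N = {N:.0e} params -> predicted loss {loss_from_params(N):.2f} nats/token")
print(f"D = 1e10 tokens -> predicted loss {loss_from_data(1e10):.2f} nats/token")
# Each 10x in parameters multiplies the predicted loss by 10**-0.076 ~= 0.84,
# with no saturation in sight over the ~7 orders of magnitude measured.
```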
“Big Transfer (BiT): General Visual Representation Learning”, (2019-12-24):
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes – from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19 task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.
https://ai.googleblog.com/2020/05/open-sourcing-bit-exploring-large-scale.html
“DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames”, (2019-11-01):
We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling – achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) – over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs.
This massive-scale training not only sets the state of the art on the Habitat Autonomous Navigation Challenge 2019, but essentially solves the task – near-perfect autonomous navigation in an unseen environment without access to a map, directly from an RGB-D camera and a GPS+Compass sensor. Fortuitously, error vs computation exhibits a power-law-like distribution; thus, 90% of peak performance is obtained relatively early (at 100 million steps) and relatively cheaply (under 1 day with 8 GPUs). Finally, we show that the scene understanding and navigation policies learned can be transferred to other navigation tasks – the analog of ImageNet pre-training + task-specific fine-tuning for embodied AI. Our model outperforms ImageNet pre-trained CNNs on these transfer tasks and can serve as a universal resource (all models and code are publicly available).
“Near-perfect point-goal navigation from 2.5 billion frames of experience”, (2020-01-21):
The AI community has a long-term goal of building intelligent machines that interact effectively with the physical world, and a key challenge is teaching these systems to navigate through complex, unfamiliar real-world environments to reach a specified destination—without a preprovided map. We are announcing today that Facebook AI has created a new large-scale distributed reinforcement learning (RL) algorithm called DD-PPO, which has effectively solved the task of point-goal navigation using only an RGB-D camera, GPS, and compass data. Agents trained with DD-PPO (which stands for decentralized distributed proximal policy optimization) achieve nearly 100% success in a variety of virtual environments, such as houses and office buildings. We have also successfully tested our model with tasks in real-world physical settings using a LoCoBot and Facebook AI’s PyRobot platform. An unfortunate fact about maps is that they become outdated the moment they are created. Most real-world environments evolve—buildings and structures change, objects are moved around, and people and pets are in constant flux. By learning to navigate without a map, DD-PPO-trained agents will accelerate the creation of new AI applications for the physical world.
Previous systems reached a 92% success rate on these tasks, but even failing 1 out of 100 times is not acceptable in the physical world, where a robot agent might damage itself or its surroundings by making an error. DD-PPO-trained agents reach their goal 99.9% of the time. Perhaps even more impressive, they do so with near-maximal efficiency, choosing a path that comes within 3% (on average) of matching the shortest possible route from the starting point to the goal. It is worth stressing how uncompromising this task is. There is no scope for mistakes of any kind—no wrong turn at a crossroads, no backtracking from a dead end, no exploration or deviation of any kind from the most direct path. We believe that the agent learns to exploit the statistical regularities in the floor plans of real indoor environments (apartments, houses, and offices) that are also present in our data sets. This improved performance is powered by a new, more effective system for distributed training (DD-PPO), along with the state-of-the-art speed and fidelity of Facebook AI’s open source AI Habitat platform.
…We propose a simple, synchronous, distributed RL method that scales well. We call this method decentralized distributed proximal policy optimization, as it is decentralized (has no parameter server) and distributed (runs across many machines), and we use it to scale proximal policy optimization, a previously developed technique (Schulman et al., 2017). In DD-PPO, each worker alternates between collecting experience in a resource-intensive, GPU-accelerated simulated environment and then optimizing the model. This distribution is synchronous—there is an explicit communication stage in which workers synchronize their updates to the model.
The variability in experience collection runtime presents a challenge to using this method in RL. In supervised learning, all gradient computations take approximately the same time. In RL, some resource-intensive environments can take significantly longer to simulate. This introduces significant synchronization overhead, as every worker must wait for the slowest to finish collecting experience. To address this, we introduced a preemption threshold: the rollout collection stage of these stragglers is forced to end early once some percentage p of the other workers are finished collecting their rollout (we find p = 60% to work well), thereby dramatically improving scaling. Our system weighs all workers’ contributions to the loss equally and limits the minimum number of steps before preemption to one-fourth the maximum to ensure that all environments contribute to learning.
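[The preemption rule itself is simple; a toy single-process simulation of the logic (not Facebook’s implementation—the straggler distribution is invented):]

```python
import numpy as np

rng = np.random.default_rng(3)
n_workers, rollout_len, preempt_frac = 64, 128, 0.6

# Invented per-step simulation costs: heavy-tailed across environments,
# which is exactly the straggler problem described above.
step_time = rng.lognormal(mean=0.0, sigma=1.0, size=n_workers)
full_finish = step_time * rollout_len        # when each worker would finish alone

cutoff = np.quantile(full_finish, preempt_frac)   # 60% of workers are done
# Stragglers stop at the cutoff, floored at 1/4 of the full rollout so that
# every environment still contributes to the update.
steps = np.minimum(rollout_len,
                   np.maximum(rollout_len // 4, (cutoff / step_time).astype(int)))

print(f"synchronous wait without preemption: {full_finish.max():.0f} time units")
print(f"with preemption: {cutoff:.0f} time units, "
      f"keeping {steps.sum() / (n_workers * rollout_len):.0%} of the experience")
```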
“Benchmarking Classic and Learned Navigation in Complex 3D Environments”, (2019-01-30):
Navigation research is attracting renewed interest with the advent of learning-based methods. However, this new line of work is largely disconnected from well-established classic navigation approaches. In this paper, we take a step towards coordinating these two directions of research. We set up classic and learning-based navigation systems in common simulated environments and thoroughly evaluate them in indoor spaces of varying complexity, with access to different sensory modalities. Additionally, we measure human performance in the same environments. We find that a classic pipeline, when properly tuned, can perform very well in complex cluttered environments. On the other hand, learned systems can operate more robustly with a limited sensor suite. Overall, both approaches are still far from human-level performance.
“Habitat: A Platform for Embodied AI Research”, (2019-04-02):
We present Habitat, a platform for research in embodied artificial intelligence (AI). Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast – when rendering a scene from Matterport3D, it achieves several thousand frames per second (fps) running single-threaded, and can reach over 10,000 fps multi-process on a single GPU. (ii) Habitat-API: a modular high-level library for end-to-end development of embodied AI algorithms – defining tasks (e.g., navigation, instruction following, question answering), configuring, training, and benchmarking embodied agents.
These large-scale engineering contributions enable us to answer scientific questions requiring experiments that were till now impracticable or ’merely’ impractical. Specifically, in the context of point-goal navigation: (1) we revisit the comparison between learning and SLAM approaches from two recent works and find evidence for the opposite conclusion – that learning outperforms SLAM if scaled to an order of magnitude more experience than previous investigations, and (2) we conduct the first cross-dataset generalization experiments {train, test} × {Matterport3D, Gibson} for multiple sensors {blind, RGB, RGBD, D} and find that only agents with depth (D) sensors generalize across datasets. We hope that our open-source platform and these findings will advance research in embodied AI.
“Towards a Human-like Open-Domain Chatbot”, (2020-01-27):
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize perplexity of the next token. We also propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of a human-like multi-turn conversation. Our experiments show strong correlation between perplexity and SSA. The fact that the best perplexity end-to-end trained Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher in absolute SSA than the existing chatbots we evaluated.
“Towards a Conversational Agent that Can Chat About…Anything”, (2020-01-28):
Modern conversational agents (chatbots) tend to be highly specialized—they perform well as long as users don’t stray too far from their expected usage. To better handle a wide variety of conversational topics, open-domain dialog research explores a complementary approach attempting to develop a chatbot that is not specialized but can still chat about virtually anything a user wants. Besides being a fascinating research problem, such a conversational agent could lead to many interesting applications, such as further humanizing computer interactions, improving foreign language practice, and making relatable interactive movie and videogame characters.
However, current open-domain chatbots have a critical flaw—they often don’t make sense. They sometimes say things that are inconsistent with what has been said so far, or lack common sense and basic knowledge about the world. Moreover, chatbots often give responses that are not specific to the current context. For example, “I don’t know,” is a sensible response to any question, but it’s not specific. Current chatbots do this much more often than people because it covers many possible user inputs.
In “Towards a Human-like Open-Domain Chatbot”, we present Meena, a 2.6 billion parameter end-to-end trained neural conversational model. We show that Meena can conduct conversations that are more sensible and specific than existing state-of-the-art chatbots. Such improvements are reflected through a new human evaluation metric that we propose for open-domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic, but important attributes for human conversation. Remarkably, we demonstrate that perplexity, an automatic metric that is readily available to any neural conversational models, highly correlates with SSA.
…The Meena model has 2.6 billion parameters and is trained on 341 GB of text, filtered from public domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7× greater model capacity and was trained on 8.5× more data.
…For each chatbot, we collect between 1600 and 2400 individual conversation turns through about 100 conversations. Each model response is labeled by crowdworkers to indicate if it is sensible and specific. The sensibleness of a chatbot is the fraction of responses labeled “sensible”, and specificity is the fraction of responses that are marked “specific”. The average of these two is the SSA score. The results below demonstrate that Meena does much better than existing state-of-the-art chatbots by large margins in terms of SSA scores, and is closing the gap with human performance.
Automatic Metric: Perplexity
Researchers have long sought for an automatic evaluation metric that correlates with more accurate, human evaluation. Doing so would enable faster development of dialogue models, but to date, finding such an automatic metric has been challenging. Surprisingly, in our work, we discover that perplexity, an automatic metric that is readily available to any neural seq2seq model, exhibits a strong correlation with human evaluation, such as the SSA value. Perplexity measures the uncertainty of a language model. The lower the perplexity, the more confident the model is in generating the next token (character, subword, or word). Conceptually, perplexity represents the number of choices the model is trying to choose from when producing the next token.
During development, we benchmarked eight different model versions with varying hyperparameters and architectures, such as the number of layers, attention heads, total training steps, whether we use Evolved Transformer or regular Transformer, and whether we train with hard labels or with distillation. As illustrated in the figure below, the lower the perplexity, the better the SSA score for the model, with a strong correlation coefficient (R² = 0.93)…As advocated previously, we will continue our goal of lowering the perplexity of neural conversational models through improvements in algorithms, architectures, data, and compute.
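[Both halves of the claim fit in a few lines (the perplexity definition is standard; the (perplexity, SSA) points below are invented stand-ins for the paper’s ~8 model variants):]

```python
import numpy as np

# Perplexity is the exponentiated mean per-token cross-entropy (in nats):
mean_nll = 2.3
print(f"perplexity = exp({mean_nll}) = {np.exp(mean_nll):.1f}")  # ~10 'choices' per token

# Hypothetical (perplexity, SSA%) pairs for several model variants:
ppl = np.array([17.5, 14.9, 12.4, 11.5, 10.7, 10.2])
ssa = np.array([48.0, 55.0, 64.0, 67.0, 70.0, 72.0])

slope, intercept = np.polyfit(ppl, ssa, 1)
resid = ssa - (slope * ppl + intercept)
r2 = 1 - (resid ** 2).sum() / ((ssa - ssa.mean()) ** 2).sum()
print(f"SSA ~= {slope:.1f} * perplexity + {intercept:.1f}, R^2 = {r2:.2f}")
```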
“A Very Unlikely Chess Game”, (2020-01-06):
…Black is GPT-2. Its excuse [for this chess blunder] is that it’s a text prediction program with no concept of chess. As far as it knows, it’s trying to predict short alphanumeric strings like “e2e4” or “Nb7”. Nobody told it this represents a board game. It doesn’t even have a concept of 2D space that it could use to understand such a claim. But it still captured my rook! Embarrassing!…Last month, I asked him if he thought GPT-2 could play chess. I wondered if he could train it on a corpus of chess games written in standard notation (where, for example, e2e4 means “move the pawn at square e2 to square e4”). There are literally millions of games written up like this. GPT-2 would learn to predict the next string of text, which would correspond to the next move in the chess game. Then you would prompt it with a chessboard up to a certain point, and it would predict how the chess masters who had produced its training data would continue the game – ie make its next move using the same heuristics they would. Gwern handed the idea to his collaborator Shawn Presser, who had a working GPT-2 chess engine running within a week:…You can play against GPT-2 yourself by following the directions in the last tweet, though it won’t be much of a challenge for anyone better than I am.
…What does this imply? I’m not sure (and maybe it will imply more if someone manages to make it actually good). It was already weird to see something with no auditory qualia learn passable poetic meter. It’s even weirder to see something with no concept of space learn to play chess. Is any of this meaningful? How impressed should we be that the same AI can write poems, compose music, and play chess, without having been designed for any of those tasks? I still don’t know.
[See also the much later Noever et al 2020a/Noever et al 2020b who do the exact same thing in applying GPT-2 to Go SGF/chess PGN games.]
“Transformers Play Chess”, (2020-01-10):
The Shannon entropy of natural English language is roughly one byte per word, depending on the dataset used. Shannon estimated the number of possible chess games to be 10^120. I’ve also seen an estimate of 3 reasonable moves per ply (so 10^40 possible 40-move games). This begs the question: just how much information is there in a chess move?…I treated this as a sequence modeling problem. An alternative (and possibly better) approach would be to explicitly make use of the board state. However as I was lazy, I did not do this. I was also motivated by the idea of recreating blindfold chess, which is challenging for humans, but unclear for computers (how would you blindfold a computer?—(also see Tom Murphy’s Elo World)). Also as the “Markovian” approach of simply predicting the move given the current board state has been done many, many times before, I decided this was more interesting.
…The lichess.org game database contains at the time of writing roughly 1 billion games…I chose to use long algebraic notation, which specifies the start and end coordinate of every piece moved (for example, e2e4). “Special” moves also include castling and promotion. There are slightly less than 2000 valid unique tokens in this notation…I used the transformer_big_single_gpu (henceforth known as T78) model from the tensor2tensor repository which has roughly 78 million parameters. I used the default hyperparameters and did not tune anything. I trained on a single 1080ti for almost 4 days (~2 million steps). This turns out to be roughly 50 million games, which is to say, the model only saw 25% of the dataset.…Results:
- A games: 2.15 bits per ply, 4.43 perplexity
- B games: 2.26 bits per ply, 4.80 perplexity
I “preregistered” a guess of 2.5 bits per ply before running any experiments. After seeing the results, I believe a better designed model could probably reach between 1.6 and 2.0 BPP. I also believe a larger model would perform better, as I was probably close to saturating the capacity of T78.
[Response to “A Very Unlikely Chess Game”, see Reddit.]
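[The two reported numbers are the same measurement in different units—perplexity is just 2 raised to the bits-per-ply—so they can be checked against each other:]

```python
# bits per ply <-> perplexity, using the pairs reported above
for bits, reported_ppl in [(2.15, 4.43), (2.26, 4.80)]:
    print(f"{bits} bits/ply -> 2**{bits} = {2**bits:.2f} (reported: {reported_ppl})")
# 4.44 and 4.79: matches to rounding. For comparison, '3 reasonable moves per ply'
# would be log2(3) = 1.58 bits, so the model is still more uncertain than that.
```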
“Compliance with legal requirement to report clinical trial results on ClinicalTrials.gov: a cohort study”, (2020-01-17):
Background: Failure to report the results of a clinical trial can distort the evidence base for clinical practice, breaches researchers’ ethical obligations to participants, and represents an important source of research waste. The Food and Drug Administration Amendments Act (FDAAA) of 2007 now requires sponsors of applicable trials to report their results directly onto ClinicalTrials.gov within 1 year of completion. The first trials covered by the Final Rule of this act became due to report results in January, 2018. In this cohort study, we set out to assess compliance.
Methods: We downloaded data for all registered trials on ClinicalTrials.gov each month from March, 2018, to September, 2019. All cross-sectional analyses in this manuscript were performed on data extracted from ClinicalTrials.gov on Sept 16, 2019; monthly trends analysis used archived data closest to the 15th day of each month from March, 2018, to September, 2019. Our study cohort included all applicable trials due to report results under FDAAA. We excluded all non-applicable trials, those not yet due to report, and those given a certificate allowing for delayed reporting. A trial was considered reported if results had been submitted and were either publicly available, or undergoing quality control review at ClinicalTrials.gov. A trial was considered compliant if these results were submitted within 1 year of the primary completion date, as required by the legislation. We described compliance with the FDAAA 2007 Final Rule, assessed trial characteristics associated with results reporting using logistic regression models, described sponsor-level reporting, examined trends in reporting, and described time-to-report using the Kaplan-Meier method.
Findings: 4209 trials were due to report results; 1722 (40·9%; 95% CI 39·4–42·2) did so within the 1-year deadline. 2686 (63·8%; 62·4–65·3) trials had results submitted at any time. Compliance has not improved since July, 2018. Industry sponsors were significantly more likely to be compliant than non-industry, non-US Government sponsors (odds ratio [OR] 3·08 [95% CI 2·52–3·77]), and sponsors running large numbers of trials were significantly more likely to be compliant than smaller sponsors (OR 11·84 [9·36–14·99]). The median delay from primary completion date to submission date was 424 days (95% CI 412–435), 59 days higher than the legal reporting requirement of 1 year.
Interpretation: Compliance with the FDAAA 2007 is poor, and not improving. To our knowledge, this is the first study to fully assess compliance with the Final Rule of the FDAAA 2007. Poor compliance is likely to reflect lack of enforcement by regulators. Effective enforcement and action from sponsors is needed; until then, open public audit of compliance for each individual sponsor may help. We will maintain updated compliance data for each individual sponsor and trial at fdaaa.trialstracker.net.
Funding: Laura and John Arnold Foundation.
“FDA and NIH let clinical trial sponsors keep results secret and break the law”, (2020-01-13):
The rule took full effect 2 years ago, on 2018-01-18, giving trial sponsors ample time to comply. But a Science investigation shows that many still ignore the requirement, while federal officials do little or nothing to enforce the law.
Science examined more than 4700 trials whose results should have been posted on the NIH website ClinicalTrials.gov under the 2017 rule. Reporting rates by most large pharmaceutical companies and some universities have improved sharply, but performance by many other trial sponsors—including, ironically, NIH itself—was lackluster. Those sponsors, typically either the institution conducting a trial or its funder, must deposit results and other data within 1 year of completing a trial. But of 184 sponsor organizations with at least five trials due as of 2019-09-25, 30 companies, universities, or medical centers never met a single deadline. As of that date, those habitual violators had failed to report any results for 67% of their trials and averaged 268 days late for those and all trials that missed their deadlines. They included such eminent institutions as the Harvard University-affiliated Boston Children’s Hospital, the University of Minnesota, and Baylor College of Medicine—all among the top 50 recipients of NIH grants in 2019. The violations cover trials in virtually all fields of medicine, and the missing or late results offer potentially vital information for the most desperate patients. For example, in one long-overdue trial, researchers compared the efficacy of different chemotherapy regimens in 200 patients with advanced lymphoma; another—nearly 2 years late—tests immunotherapy against conventional chemotherapy in about 600 people with late-stage lung cancer.
…Contacted for comment, none of the institutions disputed the findings of this investigation. In all 4768 trials Science checked, sponsors violated the reporting law more than 55% of the time. And in hundreds of cases where the sponsors got credit for reporting trial results, they have yet to be publicly posted because of quality lapses flagged by ClinicalTrials.gov staff (see sidebar).
Although the 2017 rule, and officials’ statements at the time, promised aggressive enforcement and stiff penalties, neither NIH nor FDA has cracked down. FDA now says it won’t brandish its big stick—penalties of up to $12,103 a day for failing to report a trial’s results—until after the agency issues further “guidance” on how it will exercise that power. It has not set a date. NIH said at a 2016 briefing on the final rule that it would cut off grants to those who ignore the trial reporting requirements, as authorized in the 2007 law, but so far has not done so…NIH and FDA officials do not seem inclined to apply that pressure. Lyric Jorgenson, NIH deputy director for science policy, says her agency has been “trying to change the culture of how clinical trial results are reported and disseminated; not so much on the ‘aha, we caught you’, as much as getting people to understand the value, and making it as easy as possible to share and disseminate results.” To that end, she says, ClinicalTrials.gov staff have educated researchers about the website and improved its usability. As for FDA, Patrick McNeilly, an official at the agency who handles trial enforcement matters, recently told an industry conference session on ClinicalTrials.gov that “FDA has limited resources, and we encourage voluntary compliance.” He said the agency also reviews reporting of information on ClinicalTrials.gov as part of inspections of trial sites, or when it receives complaints. McNeilly declined an interview request, but at the conference he discounted violations of ClinicalTrials.gov reporting requirements found by journalists and watchdog groups. “We’re not going to blanketly accept an entire list of trials that people say are noncompliant,” he said.
…It also highlights that pharma’s record has been markedly better than that of academia and the federal government.
…But such good performance shouldn’t be an exception, Harvard’s Zarin says. “Further public accountability of the trialists, but also our government organizations, has to happen. One possibility is that FDA and NIH will be shamed into enforcing the law. Another possibility is that sponsors will be shamed into doing a better job. A third possibility is that ClinicalTrials.gov will never fully achieve its vital aspirations.”
“A national experiment reveals where a growth mindset improves achievement”, (2019-08-07):
A global priority for the behavioural sciences is to develop cost-effective, scalable interventions that could improve the academic outcomes of adolescents at a population level, but no such interventions have so far been evaluated in a population-generalizable sample. Here we show that a short (less than one hour), online growth mindset intervention—which teaches that intellectual abilities can be developed—improved grades among lower-achieving students and increased overall enrolment to advanced mathematics courses in a nationally representative sample of students in secondary education in the United States. Notably, the study identified school contexts that sustained the effects of the growth mindset intervention: the intervention changed grades when peer norms aligned with the messages of the intervention. Confidence in the conclusions of this study comes from independent data collection and processing, pre-registration of analyses, and corroboration of results by a blinded Bayesian analysis.
“The Iron Law Of Evaluation And Other Metallic Rules”, (2012-09-18):
Problems with social experiments and their evaluation: loopholes, causes, and suggestions; non-experimental methods systematically deliver false results, as most interventions fail or have small effects.
“Backlash Over Meat Dietary Recommendations Raises Questions About Corporate Ties to Nutrition Scientists”, (2020-01-15):
[Summary of vegetarian activist/researcher reactions to recent reviews & meta-analyses indicating that the correlation of meat-eating with bad health often does not appear in epidemiological datasets, that the randomized experiments do not support the strong claims, and that the overall evidence that eating meat = bad health is low-quality & weak:
- “Meat Consumption and Health: Food for Thought”, Carroll & Doherty 2019 (editorial)
- “Red and Processed Meat Consumption and Risk for All-Cause Mortality and Cardiometabolic Outcomes: A Systematic Review and Meta-analysis of Cohort Studies”, Zeraatkar et al 2019a
- “Reduction of Red and Processed Meat Intake and Cancer Mortality and Incidence: A Systematic Review and Meta-analysis of Cohort Studies”, Han et al 2019
- “Patterns of Red and Processed Meat Consumption and Risk for Cardiometabolic and Cancer Outcomes: A Systematic Review and Meta-analysis of Cohort Studies”, Vernooij et al 2019
- “Effect of Lower Versus Higher Red Meat Intake on Cardiometabolic and Cancer Outcomes: A Systematic Review of Randomized Trials”, Zeraatkar et al 2019b
- “Health-Related Values and Preferences Regarding Meat Consumption: A Mixed-Methods Systematic Review”, Valli et al 2019
- “Unprocessed Red Meat and Processed Meat Consumption: Dietary Guideline Recommendations From the Nutritional Recommendations (NutriRECS) Consortium”, Johnston et al 2019
After breaking the embargo, they began lobbying against it: spamming the journal editor, demanding the papers be retracted before publication, denouncing them in talks, and contacting the Federal Trade Commission & district attorneys to demand investigations; they justify these activities by arguing that since high-quality evidence cannot easily be obtained in nutrition, there is no need for it, by accusing the authors of financial conflicts of interest, and by comparing them to global-warming deniers.
However, the conflicts of interest represent very small percentages of funding, and the vegetarian activist/researchers themselves are heavily funded by anti-meat interests, such as olive research institutions, walnut industry bodies, the egg industry, snack companies, and alternative diet groups, with the list of funders of one member including but far from limited to “the Pulse Research Network, the Almond Board of California, the International Nut and Dried Fruit Council; Soy Foods Association of North America; the Peanut Institute; Kellogg’s Canada; and Quaker Oats Canada.”]
“Meat Consumption and Health: Food for Thought”, (2019-11-19):
For some time, medical and science organizations have been beating the drum that red and processed meat are bad for you. For almost as long, they have lamented that their efforts to inform the public have not convinced enough people to change their consumption. This month’s issue offers us food for thought on why. The field of nutritional epidemiology is plagued by observational studies that have conducted inappropriate analyses, accompanied by likely erroneous conclusions (1). Many studies selectively report results, and many lack an a priori hypothesis. Many use notoriously unreliable self-reports of food consumption while failing to collect or appropriately control for data on numerous potential confounders.
…Four more studies join the evidence base this month, and because they review all of the evidence that came before, they cannot be accused of cherry-picking. The first was a meta-analysis of cohort studies that focused on how dietary patterns, including differing amounts of red or processed meat, affected all-cause mortality, cardiometabolic outcomes, and cancer incidence and mortality (6). More than 100 studies including more than 6 million participants were analyzed. The overall conclusions were that dietary patterns, including differences in meat consumption, may result in only small differences in risk outcomes over long periods.
The next study was a meta-analysis that homed in specifically on cohort studies examining how reductions in red and processed meat might affect cancer incidence and mortality (7). It included 118 studies with more than 6 million participants, and it, too, found that the possible impact of reduced meat intake was very small. The third study was a meta-analysis of cohort studies that looked specifically at meat consumption and its relationship to all-cause mortality and cardiometabolic outcomes (8), and—once again—it found that any link was very small.
…In a fourth analysis in this issue (9), researchers examined randomized controlled trials that compared diets with differing amounts of red meat consumption for at least 6 months. They found 12 eligible studies, but one of them—the Women’s Health Initiative—was so large (almost 49 000 women) that it dominated the analysis. We can wish for more studies, and we could hope that they had more homogenous outcomes and better fidelity to assigned diets, but the overall conclusions from what they had were that “red meat may have little or no effect on major cardiometabolic outcomes and cancer mortality and incidence.”
…it may be time to stop producing observational research in this area. These meta-analyses include millions of participants. Further research involving much smaller cohorts has limited value. High-quality randomized controlled trials are welcome, but only if they’re designed to tell us things we don’t already know.
- “Red and Processed Meat Consumption and Risk for All-Cause Mortality and Cardiometabolic Outcomes: A Systematic Review and Meta-analysis of Cohort Studies”, Zeraatkar et al 2019a:
Background: Dietary guidelines generally recommend limiting intake of red and processed meat. However, the quality of evidence implicating red and processed meat in adverse health outcomes remains unclear.
Purpose: To evaluate the association between red and processed meat consumption and all-cause mortality, cardiometabolic outcomes, quality of life, and satisfaction with diet among adults.
Data Sources: EMBASE (Elsevier), Cochrane Central Register of Controlled Trials (Wiley), Web of Science (Clarivate Analytics), CINAHL (EBSCO), and ProQuest from inception until July 2018 and MEDLINE from inception until April 2019, without language restrictions, as well as bibliographies of relevant articles.
Study Selection: Cohort studies with at least 1000 participants that reported an association between unprocessed red or processed meat intake and outcomes of interest.
Data Extraction: Teams of 2 reviewers independently extracted data and assessed risk of bias. One investigator assessed certainty of evidence, and the senior investigator confirmed the assessments.
Data Synthesis: Of 61 articles reporting on 55 cohorts with more than 4 million participants, none addressed quality of life or satisfaction with diet. Low-certainty evidence was found that a reduction in unprocessed red meat intake of 3 servings per week is associated with a very small reduction in risk for cardiovascular mortality, stroke, myocardial infarction (MI), and type 2 diabetes. Likewise, low-certainty evidence was found that a reduction in processed meat intake of 3 servings per week is associated with a very small decrease in risk for all-cause mortality, cardiovascular mortality, stroke, MI, and type 2 diabetes.
Limitation: Inadequate adjustment for known confounders, residual confounding due to observational design, and recall bias associated with dietary measurement.
Conclusion: The magnitude of association between red and processed meat consumption and all-cause mortality and adverse cardiometabolic outcomes is very small, and the evidence is of low certainty.
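To see why “very small” is the operative phrase: with a fixed baseline risk, even a genuine relative risk reduction translates into a tiny absolute difference. A worked illustration (the baseline risk and relative risk below are invented for exposition, not taken from the paper):

```python
# Hypothetical conversion of a relative risk to an absolute risk reduction.
baseline_risk = 0.10   # assumed lifetime baseline risk of the outcome (illustrative)
relative_risk = 0.93   # assumed RR from eating 3 fewer servings/week (illustrative)

arr = baseline_risk * (1 - relative_risk)   # absolute risk reduction
print(f"{arr:.3f}")                         # 0.007 -> ~7 fewer events per 1,000 people
```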
- “Reduction of Red and Processed Meat Intake and Cancer Mortality and Incidence: A Systematic Review and Meta-analysis of Cohort Studies”, Han et al 2019:
Background: Cancer incidence has continuously increased over the past few centuries and represents a major health burden worldwide.
Purpose: To evaluate the possible causal relationship between intake of red and processed meat and cancer mortality and incidence.
Data Sources: Embase, Cochrane Central Register of Controlled Trials, Web of Science, CINAHL, and ProQuest from inception until July 2018 and MEDLINE from inception until April 2019 without language restrictions.
Study Selection: Cohort studies that included more than 1000 adults and reported the association between consumption of unprocessed red and processed meat and cancer mortality and incidence.
Data Extraction: Teams of 2 reviewers independently extracted data and assessed risk of bias; 1 reviewer evaluated the certainty of evidence, which was confirmed or revised by the senior reviewer.
Data Synthesis: Of 118 articles (56 cohorts) with more than 6 million participants, 73 articles were eligible for the dose-response meta-analyses, 30 addressed cancer mortality, and 80 reported cancer incidence. Low-certainty evidence suggested that an intake reduction of 3 servings of unprocessed meat per week was associated with a very small reduction in overall cancer mortality over a lifetime. Evidence of low to very low certainty suggested that each intake reduction of 3 servings of processed meat per week was associated with very small decreases in overall cancer mortality over a lifetime; prostate cancer mortality; and incidence of esophageal, colorectal, and breast cancer.
Limitation: Limited causal inferences due to residual confounding in observational studies, risk of bias due to limitations in diet assessment and adjustment for confounders, recall bias in dietary assessment, and insufficient data for planned subgroup analyses.
Conclusion: The possible absolute effects of red and processed meat consumption on cancer mortality and incidence are very small, and the certainty of evidence is low to very low.
- “Patterns of Red and Processed Meat Consumption and Risk for Cardiometabolic and Cancer Outcomes: A Systematic Review and Meta-analysis of Cohort Studies”, Vernooij et al 2019:
Background: Studying dietary patterns may provide insights into the potential effects of red and processed meat on health outcomes.
Purpose: To evaluate the effect of dietary patterns, including different amounts of red or processed meat, on all-cause mortality, cardiometabolic outcomes, and cancer incidence and mortality.
Data Sources: Systematic search of MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials, CINAHL, Web of Science, and ProQuest Dissertations & Theses Global from inception to April 2019 with no restrictions on year or language.
Study Selection: Teams of 2 reviewers independently screened search results and included prospective cohort studies with 1000 or more participants that reported on the association between dietary patterns and health outcomes.
Data Extraction: Two reviewers independently extracted data, assessed risk of bias, and evaluated the certainty of evidence using GRADE (Grading of Recommendations Assessment, Development and Evaluation) criteria.
Data Synthesis: Eligible studies that followed patients for 2 to 34 years revealed low-certainty to very-low-certainty evidence that dietary patterns lower in red and processed meat intake result in very small or possibly small decreases in all-cause mortality, cancer mortality and incidence, cardiovascular mortality, nonfatal coronary heart disease, fatal and nonfatal myocardial infarction, and type 2 diabetes. For all-cause, cancer, and cardiovascular mortality and incidence of some types of cancer, the total sample included more than 400 000 patients; for other outcomes, total samples included 4000 to more than 300 000 patients.
Limitation: Observational studies are prone to residual confounding, and these studies provide low-certainty or very-low-certainty evidence according to the GRADE criteria.
Conclusion: Low-certainty or very-low-certainty evidence suggests that dietary patterns with less red and processed meat intake may result in very small reductions in adverse cardiometabolic and cancer outcomes.
“Effect of Lower Versus Higher Red Meat Intake on Cardiometabolic and Cancer Outcomes: A Systematic Review of Randomized Trials”, (2019-10-01):
Background: Few randomized trials have evaluated the effect of reducing red meat intake on clinically important outcomes.
Purpose: To summarize the effect of lower versus higher red meat intake on the incidence of cardiometabolic and cancer outcomes in adults.
Data Sources: EMBASE, CENTRAL, CINAHL, Web of Science, and ProQuest from inception to July 2018 and MEDLINE from inception to April 2019, without language restrictions.
Study Selection: Randomized trials (published in any language) comparing diets lower in red meat with diets higher in red meat that differed by a gradient of at least 1 serving per week for 6 months or more.
Data Extraction: Teams of 2 reviewers independently extracted data and assessed the risk of bias and the certainty of the evidence.
Data Synthesis: Of 12 eligible trials, a single trial enrolling 48 835 women provided the most credible, though still low-certainty, evidence that diets lower in red meat may have little or no effect on all-cause mortality (hazard ratio [HR], 0.99 [95% CI, 0.95 to 1.03]), cardiovascular mortality (HR, 0.98 [CI, 0.91 to 1.06]), and cardiovascular disease (HR, 0.99 [CI, 0.94 to 1.05]). That trial also provided low-certainty to very-low-certainty evidence that diets lower in red meat may have little or no effect on total cancer mortality (HR, 0.95 [CI, 0.89 to 1.01]) and the incidence of cancer, including colorectal cancer (HR, 1.04 [CI, 0.90 to 1.20]) and breast cancer (HR, 0.97 [CI, 0.90 to 1.04]).
Limitations: There were few trials, most addressing only surrogate outcomes, with heterogeneous comparators and small gradients in red meat consumption between lower versus higher intake groups.
Conclusion: Low-certainty to very-low-certainty evidence suggests that diets restricted in red meat may have little or no effect on major cardiometabolic outcomes and cancer mortality and incidence.
“Health-Related Values and Preferences Regarding Meat Consumption: A Mixed-Methods Systematic Review”, (2019-11-19):
Background: A person’s meat consumption is often determined by their values and preferences.
Purpose: To identify and evaluate evidence addressing health-related values and preferences regarding meat consumption.
Data Sources: MEDLINE, EMBASE, Web of Science, Centre for Agriculture and Biosciences Abstracts, International System for Agricultural Science and Technology, and Food Science and Technology Abstracts were searched from inception to July 2018 without language restrictions.
Study Selection: Pairs of reviewers independently screened search results and included quantitative and qualitative studies reporting adults’ health-related values and preferences regarding meat consumption.
Data Extraction: Pairs of reviewers independently extracted data and assessed risk of bias.
Data Synthesis: Data were synthesized into narrative form, and summaries were tabulated and certainty of evidence was assessed using the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach. Of 19 172 initial citations, 41 quantitative studies (38 addressed reasons for meat consumption and 5 addressed willingness to reduce meat consumption) and 13 qualitative studies (10 addressed reasons for meat consumption and 4 addressed willingness to reduce meat consumption) were eligible for inclusion. Thirteen studies reported that omnivores enjoy eating meat, 18 reported that these persons consider meat an essential component of a healthy diet, and 7 reported that they believe they lack the skills needed to prepare satisfactory meals without meat. Omnivores are generally unwilling to change their meat consumption. The certainty of evidence was low for both “reasons for meat consumption” and “willingness to reduce meat consumption in the face of undesirable health effects.”
Limitation: Limited generalizability of findings to lower-income countries, low-certainty evidence for willingness to reduce meat consumption, and limited applicability to specific types of meat (red and processed meat).
Conclusion: Low-certainty evidence suggests that omnivores are attached to meat and are unwilling to change this behavior when faced with potentially undesirable health effects.
- “Unprocessed Red Meat and Processed Meat Consumption: Dietary Guideline Recommendations From the Nutritional Recommendations (NutriRECS) Consortium”, Johnston et al 2019:
Description: Dietary guideline recommendations require consideration of the certainty in the evidence, the magnitude of potential benefits and harms, and explicit consideration of people’s values and preferences. A set of recommendations on red meat and processed meat consumption was developed on the basis of 5 de novo systematic reviews that considered all of these issues.
Methods: The recommendations were developed by using the Nutritional Recommendations (NutriRECS) guideline development process, which includes rigorous systematic review methodology, and GRADE methods to rate the certainty of evidence for each outcome and to move from evidence to recommendations. A panel of 14 members, including 3 community members, from 7 countries voted on the final recommendations. Strict criteria limited the conflicts of interest among panel members. Considerations of environmental impact or animal welfare did not bear on the recommendations. Four systematic reviews addressed the health effects associated with red meat and processed meat consumption, and 1 systematic review addressed people’s health-related values and preferences regarding meat consumption.
Recommendations: The panel suggests that adults continue current unprocessed red meat consumption (weak recommendation, low-certainty evidence). Similarly, the panel suggests adults continue current processed meat consumption (weak recommendation, low-certainty evidence).
“Follow-up: I found two identical packs of Skittles, among 468 packs with a total of 27,740 Skittles”, (2019-04-06):
This is a follow-up to a post from earlier this year discussing the likelihood of encountering two identical packs of Skittles, that is, two packs having exactly the same number of candies of each flavor. Under some reasonable assumptions, it was estimated that we should expect to have to inspect “only about 400-500 packs” on average until encountering a first duplicate. This is interesting, because as described in that earlier post, there are millions of different possible packs, or even if we discount those that are much less likely to occur (like, say, a pack of nothing but red Skittles), then there are still hundreds of thousands of different “likely” packs that we might expect to encounter.
So, on 12 January of this year, I started buying boxes of packs of Skittles. This past week, “only” 82 days, 13 boxes, 468 packs, and 27,740 individual Skittles later, I found the following identical 2.17-ounce packs.
…this seemed like a great opportunity to demonstrate the predictive power of mathematics. A few months ago, we did some calculations on a cocktail napkin, so to speak, predicting that we should be able to find a pair of identical packs of Skittles with a reasonably—and perhaps surprisingly—small amount of effort. Actually seeing that effort through to the finish line can be a vivid demonstration for students of this predictive power of what might otherwise be viewed as “merely abstract” and not concretely useful mathematics.
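The “400-500 packs” prediction is easy to sanity-check by simulation. A rough Monte-Carlo sketch, simplifying to a fixed 59 candies per pack (27,740/468 ≈ 59) and 5 equally likely flavors, where the original analysis modeled varying pack sizes:

```python
import random
from statistics import mean

def random_pack(n=59, flavors=5):
    """One pack's composition: a tuple of per-flavor counts."""
    counts = [0] * flavors
    for _ in range(n):
        counts[random.randrange(flavors)] += 1
    return tuple(counts)

def packs_until_duplicate():
    """Buy packs until one exactly matches a previously seen composition."""
    seen = set()
    while True:
        pack = random_pack()
        if pack in seen:
            return len(seen) + 1
        seen.add(pack)

print(mean(packs_until_duplicate() for _ in range(100)))  # roughly 400-500 under these assumptions
```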
“Birthday problem”, (2020-12-28):
In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367. However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. These conclusions are based on the assumption that each day of the year is equally probable for a birthday.
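The quoted probabilities follow directly from the product form of the “all birthdays distinct” probability; a one-line check:

```python
from math import prod

def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), assuming uniform birthdays."""
    if n > days:
        return 1.0  # pigeonhole: certain once n exceeds the number of days
    return 1 - prod((days - k) / days for k in range(n))

print(round(p_shared_birthday(23), 3))   # 0.507 -> ~50%
print(round(p_shared_birthday(70), 4))   # 0.9992 -> ~99.9%
```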
“Skittles (confectionery)”, (2020-12-28):
Skittles is a brand of fruit-flavored candy, currently produced and marketed by the Wrigley Company, a division of Mars, Inc.
- “The rape of men: the darkest secret of war”, (2011):
Of all the secrets of war, there is one that is so well kept that it exists mostly as a rumour. It is usually denied by the perpetrator and his victim. Governments, aid agencies and human rights defenders at the UN barely acknowledge its possibility. Yet every now and then someone gathers the courage to tell of it…“That was hard for me to take,” Owiny tells me today. “There are certain things you just don’t believe can happen to a man, you get me? But I know now that sexual violence against men is a huge problem. Everybody has heard the women’s stories. But nobody has heard the men’s.”
It’s not just in East Africa that these stories remain unheard. One of the few academics to have looked into the issue in any detail is Lara Stemple, of the University of California’s Health and Human Rights Law Project. Her study “Male Rape and Human Rights” notes incidents of male sexual violence as a weapon of wartime or political aggression in countries such as Chile, Greece, Croatia, Iran, Kuwait, the former Soviet Union and the former Yugoslavia. Twenty-one per cent of Sri Lankan males who were seen at a London torture treatment centre reported sexual abuse while in detention. In El Salvador, 76% of male political prisoners surveyed in the 1980s described at least one incidence of sexual torture. A study of 6,000 concentration-camp inmates in Sarajevo found that 80% of men reported having been raped…Dolan first heard of wartime sexual violence against men in the late 1990s while researching his PhD in northern Uganda, and he sensed that the problem might be dramatically underestimated. Keen to gain a fuller grasp of its depth and nature, he put up posters throughout Kampala in June 2009 announcing a “workshop” on the issue in a local school. On the day, 150 men arrived. In a burst of candour, one attendee admitted: “It’s happened to all of us here.”…a rare 2010 survey, published in the Journal of the American Medical Association, found that 22% of men and 30% of women in Eastern Congo reported conflict-related sexual violence.
…Back at RLP I’m told about the other ways in which their clients have been made to suffer. Men aren’t simply raped, they are forced to penetrate holes in banana trees that run with acidic sap, to sit with their genitals over a fire, to drag rocks tied to their penis, to give oral sex to queues of soldiers, to be penetrated with screwdrivers and sticks. Atim has now seen so many male survivors that, frequently, she can spot them the moment they sit down. “They tend to lean forward and will often sit on one buttock,” she tells me. “When they cough, they grab their lower regions. At times, they will stand up and there’s blood on the chair. And they often have some kind of smell.”
- “Revealed: male rape used systematically in Libya as instrument of war”, (2017):
Male rape is being used systematically in Libya as an instrument of war and political domination by rival factions, according to multiple testimonies gathered by investigators. Years of work by a Tunis-based group and witnessed by a journalist from Le Monde have produced harrowing reports from victims, and video footage showing men being sodomised by various objects, including rockets and broom handles. In several instances, witnesses say a victim was thrown into a room with other prisoners, who were ordered to rape him or be killed.
The atrocity is being perpetrated to humiliate and neutralise opponents in the lawless, militia-dominated country. Male rape is such a taboo in Arab societies that the abused generally feel too damaged to rejoin political, military or civic life. One man, Ahmed, told investigators he was detained for four years in a prison in Tomina, on the outskirts of Misrata. “They separate you to subjugate you,” he said. “‘Subjugate the men’, that’s the expression that they use. So that you never hold your head up again. And they were filming everything with their phones. “They take a broom and fix it on the wall. If you want to eat, you have to take off your pants, back on to the broom and not move off until the jailer sees blood flowing. Nobody can escape it.”
…In one camp, south of Tripoli, a man called Ali recounted his experience. He was 39 but looked 65 and walked with a cane. “Some of us were locked in a room, naked, for a whole night with groups of migrants,” he said. “The guards did not release them until they had all raped each other. Fortunately, I didn’t go through that, I only got the stick and the wheel.” The “wheel” involved being put naked and folded double, through a tyre suspended from the ceiling, making it easier for torturers to penetrate him with weaponry. Ali said he now had physical problems, “leaks” as he called them.
In another camp in southern Tripoli, Fathia said women were not immune. She said her entire family was violated by a militia from Misrata, with the men being deliberately targeted. “They dragged me in the street, in front of everyone, saying: ‘You raped our girls. We’ll do the same thing to you.’ “The worst thing they did to me,” she whispered, “is to rape me in front of my eldest son. Since then, he won’t speak to me.” Asked about other inmates who suffered a similar ordeal, Fathia said: “I only heard men’s voices. They were screaming, day and night.”
-
Eating fallen fruit and sleeping outside, however, didn’t provide him relief from his feelings of guilt and foreboding. He began to feel a dread that was inescapable and all-consuming. A devastating depression that he had suffered a few years before that fall semester returned. Normally a math phenom, Chris started failing his tests. In his apartment, he would sit in the dark—he didn’t want to waste electricity—listen to records, and cry. “I felt like I was slowly dying,” he said. A few months later, Chris left Davis to pursue a PhD in philosophy at the University of Kansas. But his condition didn’t improve. After having subsisted on scavenged persimmons and radishes for the entire fall term, he’d lost a dangerous amount of weight. His mother paid a visit to campus and, horrified by his appearance, immediately drove him to the grocery store to buy food. At home, Chris’s family had a hard time understanding the intensity of the self-denial that governed his life. His father and sister blamed his breakdown on abuse that Chris had suffered as a child; they believed his desire to escape society was a projection, an act of taking responsibility for something that wasn’t his fault. But Chris had a different explanation. When he was fifteen, his father had taken him and his sister on a trip to Mount St. Helens. Halfway up the mountain, they had passed clear-cut land. As Chris recalls, one moment there was only evergreen forest and the next moment there was nothing—just bare ground and stumps as far as he could see. A word came to his mind: evil…“They made it sound like I had a psychosis or a mental breakdown and that this is just the form it took, when really, shouldn’t anyone who is ethical and compassionate also choose to opt out of this society?”
…I was working fifty-hour weeks, mostly unpaid. My mother, concerned, suggested that I take a break. But I refused. There was no pause button on climate change, so why should I get a break? On some days, Salt Lake City, where I lived, had exceptionally bad air quality, a thick soup of pollution settling between the mountains and the valley. The corridor between Salt Lake and Provo, where I’d gone to college, had been completely converted from farmland to strip malls in just ten years. To the south lay one of the biggest open-pit copper mines in the world, to the north was an industrial warren of refineries, and to the west was nuclear waste buried in clay-sealed chambers, reeking of death. That was just the local stuff. Coral reefs were collapsing, ocean ecosystems were overfished, and people in island nations were trapped between salted well water and the swallowing sea. Meanwhile, everyone around me was fine…Sometimes I could do it. Other times I got combative, desperate, contrary. Meanwhile, Chris got married and had two children. When we hung out, he was happier. But he was different too. In his purist days, he’d let his lawn go to seed, refusing to use scarce water resources to keep it green. Now he was living in the suburbs, putting in Kentucky bluegrass. “Why don’t you just keep your lawn the way it was?” I said, too urgently. “Because I’ve been sad my whole life,” Chris said, “and sometimes I just want to sit on my green lawn with my wife and feel love.” I knew it was just a lawn, but it upset me anyway.
…I quit climate activism for a time, but I’ve kept going to therapy, and I keep confusing my therapists by talking about the end of the world. As it turns out, I’m not alone. A report released in 2012 by the National Wildlife Federation warned that climate change is creating a mental health crisis. The climate scientists, psychologists, and policy experts who authored the study estimated that two hundred million Americans will suffer from mental illness as a result of natural disasters, droughts, heat waves, and economic downturn. Recent disasters bear this out. In the wake of Hurricane Maria, Puerto Rico’s worst natural disaster on record, there was a 7% spike in PTSD among the children who survived. In the year after Hurricane Katrina, the suicide rate in New Orleans tripled, and the number of instances of depression and PTSD grew to what health experts described as near-epidemic levels. Even people who aren’t directly impacted by climate disasters can be affected. According to a 2017 report by the American Psychological Association, merely acknowledging the reality of climate change and its consequences can trigger chronic fear, fatalism, anger, and exhaustion—a condition that psychologists are increasingly referring to as eco-anxiety. Eco-anxiety can manifest in other serious ways. In 2008, in the midst of a severe drought in Australia, a seventeen-year-old boy refused to drink water because he was afraid that doing so would lead to the deaths of millions of people. Doctors diagnosed him with “climate delusion” and prescribed antidepressants. When they asked him why he took such drastic action, he said he felt guilty….Greta Thunberg, a sixteen-year-old Swedish girl who inspired the growing student climate strike movement, says that learning about climate change—and seeing adults’ inaction—contributed to a severe depression during which she stopped eating and drinking…other activists are turning the violence of climate change on themselves—like David Buckel, a human rights lawyer who in 2018 lit himself on fire in Prospect Park, in Brooklyn, to call attention to the scale of the climate plight…Quante told me that one of her earliest memories was learning that so many things around her were alive—the trees, the grass, the frogs. It terrified her to realize the harm she was capable of. One day, after it had rained, her mother made her walk along a worm-strewn sidewalk, and she screamed as she was dragged along. “We’re killing them!” she said. “We’re killing them!”…Van Susteren started having trouble sleeping. After getting into bed and closing her eyes, she would be ambushed by intrusive images. She would see refugees surrounded by barbed wire, animals trapped in the path of a hurricane, people stranded in floodwaters. The worst image was of a child. It wasn’t any child she knew, but a sort of representative for all children. The child looked at Van Susteren and asked the same question again and again: “Why didn’t you do anything?” As a psychiatrist, Van Susteren recognized her symptoms. The stress, the insomnia, the intrusive thoughts—they read like PTSD. And yet the trauma she was imagining hadn’t happened yet, or at least it hadn’t happened to her…Van Susteren coined a new term for her condition: pre-traumatic stress disorder…In the back of the class, a student started crying. “If I didn’t have hope, how could I live?” she asked.
…Robert Salo, the doctor who diagnosed the Australian boy with climate psychosis, was careful to note the boy’s other symptoms (long-term depression, suicidal thoughts, and hearing voices) and the disproportionate sense of importance he placed on his own actions (believing that his own small water usage would lead to widespread deaths). Other critics have pointed out that climate delusion usually afflicts people who already suffer from other mental health maladies, and that the triggers for psychotic episodes generally take the form of the dominant political or cultural issues of the time, from nuclear holocaust to Cold War-era fears about the spread of communism.
“Statistical reliability analysis for a most dangerous occupation: Roman emperor”, (2019-12-23):
Popular culture associates the lives of Roman emperors with luxury, cruelty, and debauchery, sometimes rightfully so. One missing attribute in this list is, surprisingly, that this mighty office was most dangerous for its holder. Of the 69 rulers of the unified Roman Empire, from Augustus (d. 14 CE) to Theodosius (d. 395 CE), 62% suffered violent death. This has been known for a while, if not quantitatively at least qualitatively. What is not known, however, and has never been examined is the time-to-violent-death of Roman emperors. This work adopts the statistical tools of survival data analysis to an unlikely population, Roman emperors, and it examines a particular event in their rule, not unlike the focus of reliability engineering, but instead of their time-to-failure, their time-to-violent-death. We investigate the temporal signature of this seemingly haphazard stochastic process that is the violent death of a Roman emperor, and we examine whether there is some structure underlying the randomness in this process or not. Nonparametric and parametric results show that: (1) emperors faced a significantly high risk of violent death in the first year of their rule, which is reminiscent of infant mortality in reliability engineering; (2) their risk of violent death further increased after 12 years, which is reminiscent of wear-out period in reliability engineering; (3) their failure rate displayed a bathtub-like curve, similar to that of a host of mechanical engineering items and electronic components. Results also showed that the stochastic process underlying the violent deaths of emperors is remarkably well captured by a (mixture) Weibull distribution. We discuss the interpretation and possible reasons for this uncanny result, and we propose a number of fruitful venues for future work to help better understand the deeper etiology of the spectacle of regicide of Roman emperors.
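For readers wanting to replicate the survival-analysis approach, a minimal sketch using the lifelines library. The reign lengths below are made up for illustration; natural deaths are treated as right-censored; and note that the paper’s bathtub-shaped hazard required a Weibull mixture, not the single Weibull fitted here:

```python
# pip install lifelines
from lifelines import WeibullFitter

# Hypothetical data: years from accession to death for ten emperors;
# event=1 means violent death, event=0 means natural death (censored).
reign_years = [0.2, 1.0, 3.5, 13.0, 0.5, 17.0, 2.0, 22.0, 8.0, 12.5]
violent     = [1,   1,   1,   1,    1,   0,    1,   0,    0,   1]

wf = WeibullFitter()
wf.fit(reign_years, event_observed=violent)
print(wf.lambda_, wf.rho_)  # scale & shape; rho < 1 indicates a decreasing hazard ('infant mortality')
wf.print_summary()
```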
- “Parachuting for charity: is it worth the money? A 5-year audit of parachute injuries in Tayside and the cost to the NHS”, (1999):
All parachute injuries from two local parachute centres over a 5-year period were analysed. Of 174 patients with injuries of varying severity, 94% were first-time charity-parachutists. The injury rate in charity-parachutists was 11% at an average cost of £3751 per casualty. 63% of casualties who were charity-parachutists required hospital admission, representing a serious injury rate of 7%, at an average cost of £5781 per patient. The amount raised per person for charity was £30. Each pound raised for charity cost the NHS £13.75 in return. Parachuting for charity costs more money than it raises, carries a high risk of serious personal injury and places a significant burden on health resources.
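The £13.75 figure is simply the expected NHS cost per charity jumper divided by the funds each raises, and the reported numbers check out:

```python
injury_rate       = 0.11   # injuries per charity-parachutist
cost_per_casualty = 3751   # average NHS cost per injured jumper (GBP)
raised_per_jumper = 30     # average charity funds raised per jumper (GBP)

expected_nhs_cost = injury_rate * cost_per_casualty   # ~GBP 412.60 per jumper
print(expected_nhs_cost / raised_per_jumper)          # ~13.75: NHS cost per GBP raised
```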
“What’s in a Font?: Ideological Perceptions of Typography”, (2019-12-20):
Although extensive political communication research considers the content of candidate messages, scholars have largely ignored how those words are rendered—specifically, the typefaces in which they are set. If typefaces are found to have political attributes, that may impact how voters receive campaign messages. Our paper reports the results of two survey experiments demonstrating that individuals perceive typefaces, type families, and type styles to have ideological qualities. Furthermore, partisanship moderates subjects’ perceptions of typefaces: Republicans generally view typefaces as more conservative than Independents and Democrats. We also find evidence of affective polarization, in that individuals rate typefaces more favorably when perceived as sharing their ideological orientation. Results broaden our understanding of how meaning is conveyed in political communication, laying the groundwork for future research into the functions of typography and graphic design in contemporary political campaigns. Implications for political practitioners are also discussed. Keywords: Political communication, ideology, partisanship, typeface, graphic design. [Ranking: Blackletter, Times New Roman, Jubilat, Gill Sans, Birds of Paradise, Century Gothic, Sunrise.]
“A Meta-Analysis of Procedures to Change Implicit Measures”, (2019-08-19):
Using a novel technique known as network meta-analysis, we synthesized evidence from 492 studies (87,418 participants) to investigate the effectiveness of procedures in changing implicit measures, which we define as response biases on implicit tasks. We also evaluated these procedures’ effects on explicit and behavioral measures. We found that implicit measures can be changed, but effects are often relatively weak (|ds| < .30). Most studies focused on producing short-term changes with brief, single-session manipulations. Procedures that associate sets of concepts, invoke goals or motivations, or tax mental resources changed implicit measures the most, whereas procedures that induced threat, affirmation, or specific moods/emotions changed implicit measures the least. Bias tests suggested that implicit effects could be inflated relative to their true population values. Procedures changed explicit measures less consistently and to a smaller degree than implicit measures and generally produced trivial changes in behavior. Finally, changes in implicit measures did not mediate changes in explicit measures or behavior. Our findings suggest that changes in implicit measures are possible, but those changes do not necessarily translate into changes in explicit measures or behavior.
“Attention and awareness in stage magic: turning tricks into research”, (2008-07-30):
Just as vision scientists study visual art and illusions to elucidate the workings of the visual system, so too can cognitive scientists study cognitive illusions to elucidate the underpinnings of cognition. Magic shows are a manifestation of accomplished magic performers’ deep intuition for and understanding of human attention and awareness. By studying magicians and their techniques, neuroscientists can learn powerful methods to manipulate attention and awareness in the laboratory. Such methods could be exploited to directly study the behavioural and neural basis of consciousness itself, for instance through the use of brain imaging and other neural recording techniques. [See their “Table 1: Psychological Assumptions” for a taxonomy.]
“A World Without Pain: Does hurting make us human?”, (2020-01-06):
Cameron is entirely insensitive to physical pain. As a child, she fell and hurt her arm while roller-skating, but had no idea she’d broken it until her mother noticed that it was hanging strangely. Giving birth was no worse…Cameron was having a trapeziectomy, an operation to remove a small bone at the base of the thumb joint. Though her hands never hurt, they’d become so deformed by arthritis that she couldn’t hold a pen properly. She’d had a similar experience with her hip, which had recently been replaced; it didn’t hurt, but her family noticed that she wasn’t walking normally. She saw her local doctor about it several times, but the first question was always “How much pain are you in?” And the answer was always “None.” (“The third time I was there I think they figured, ‘We’ll just take an X-ray to shut this woman up’,” Cameron told me. “Then the X-ray came in and it was really bad. Everything was all distorted and mangled and crumbling. He said, ‘Wow. This has got to be done.’”)…Cameron is beguiled by the idea that she can help alleviate others’ suffering—she remembers the terrible migraines that tormented her mother. Her father, however, was pain-free. “I never saw him take an aspirin,” Cameron said. “I’m convinced he was the same as me, because I never heard my father complaining about any pain, ever. He died suddenly, of a brain hemorrhage—I think other people would have had a warning.” ·…People with severe congenital neuropathy tend to die young, because they injure themselves so frequently and severely. (Without pain, children are in constant danger. They swallow something burning hot, the esophagus ruptures, bacteria spill into the internal organs, and terminal sepsis sets in. They break their necks roughhousing. To protect some patients, doctors have removed all their teeth to prevent them from chewing off their tongues and bleeding to death.) ·…Cameron does not have neuropathy: she can feel all the sensations the rest of us do, except pain. The most striking difference between her and everyone else is the way she processes endocannabinoids—chemicals that exist naturally in every human brain. Endocannabinoids mitigate our stress response, and they bind to the same receptors as the THC in the kind of cannabis you smoke. Normally, they are broken down by an enzyme called fatty acid amide hydrolase, or FAAH. But Cameron has a mutation on her FAAH gene that makes the enzyme less effective—so her endocannabinoids build up. She has extraordinarily high levels of one in particular: anandamide, whose name is derived from the Sanskrit word for “bliss.” · About a third of the population has a mutation in the FAAH gene, which provides increased levels of anandamide. “That phenotype—low levels of anxiety, forgetfulness, a happy-go-lucky demeanor—isn’t representative of how everyone responds to cannabis, but you see a lot of the prototypical changes in them that occur when people consume cannabis,” said Matthew Hill, a biologist at the University of Calgary’s Hotchkiss Brain Institute, who was a co-author of the Cameron paper. The FAAH gene, like every gene, comes in a pair. People who have the mutation in one allele of the gene seem a little high; people who have it in both even more so. Jo Cameron is fully baked. “When I met Jo for the first time, I was just struck by her,” Cox, an affable forty-year-old with a scruffy beard, told me, one afternoon in his lab at U.C.L. “She was very chatty. Did you notice that?” (It’s hard to miss.) 
“I said to her, ‘Are you worried about what’s going to happen today?’ Because she was meeting our clinicians to have a skin biopsy and do quantitative sensory testing—pain-threshold tests. She said, ‘No. In fact, I’m never worried about anything.’” Cox told me that it was difficult to get through everything in the time they’d allotted, because Cameron was so friendly and loquacious with the scientists, even as they burned her, stuck her with pins, and pinched her with tweezers until she bled. This imperviousness to pain is what makes her distinct from everyone else with a FAAH mutation. They, like even the most committed stoners, can still get hurt. ·…I asked Matthew Hill—a renowned expert on cannabinoids and stress—if there was any downside to Cameron’s biology, and he laughed out loud. “Yes! From an evolutionary perspective, it would be tremendously destructive for a species to have that,” he said. Without fear, you drown in waves that you shouldn’t be swimming in; you take late-night strolls in cities that you don’t know; you go to work at a construction site and neglect to put on a hard hat. “Her phenotype is only beneficial in an environment where there is no danger,” Hill asserted. “If you can’t be concerned about a situation where you’d be at risk of something adverse happening to you, you are more likely to put yourself in one. Anxiety is a highly adaptive process: that’s why every mammalian species exhibits some form of it.” · Unlike other pain-insensitive people, Cameron has made it into her seventies without getting badly hurt. Sometimes she realizes that she’s burning her hand on the stove because she smells singeing; sometimes she cuts herself in the garden and sees that she’s bleeding. But none of that has been severe, and Cameron did raise two children safely into adulthood. “The human brain is very capable of learning, ‘This is what’s appropriate to do in this situation’,” Hill said. Cameron’s relative cautiousness may have developed imitatively. “And there may not have been that much threat presented to her—she’s lived in a rural community in Scotland,” he concluded. “Maybe she hasn’t had to deal with that much that would physically or emotionally harm her.” ·…One complicating question is how much of Cameron’s Cameronness is really a consequence of her FAAH mutation and FAAH OUT deletion. She has plenty of other genes, after all, and her upbringing and her early environment also played a role in making her who she is. Since the paper was published, Matthew Hill has heard from half a dozen people with pain insensitivity, and he told me that many of them seemed nuts. “If you had this phenotype and weren’t a generally pleasant person like Jo—maybe you’re, like, a douche-y frat boy—the way that you would process this might be entirely different. Our whole perception of this phenotype is explicitly based on the fact that it was Jo who presented it.”
“Evolution as Backstop for Reinforcement Learning”, (2018-12-06):
One defense of free markets notes the inability of non-market mechanisms to solve planning & optimization problems. This has difficulty with Coase’s paradox of the firm, and I note that the difficulty is increased by the fact that with improvements in computers, algorithms, and data, ever larger planning problems are solved. Expanding on some Cosma Shalizi comments, I suggest interpreting this phenomenon as a multi-level nested optimization paradigm: many systems can be usefully described as having two (or more) levels where a slow sample-inefficient but ground-truth ‘outer’ loss such as death, bankruptcy, or reproductive fitness, trains & constrains a fast sample-efficient but possibly misguided ‘inner’ loss which is used by learned mechanisms such as neural networks or linear programming (a group-selection perspective). So, one reason for free-market or evolutionary or Bayesian methods in general is that while poorer at planning/optimization in the short run, they have the advantage of simplicity and operating on ground-truth values, and serve as a constraint on the more sophisticated non-market mechanisms. I illustrate by discussing corporations, multicellular life, reinforcement learning & meta-learning in AI, and pain in humans. This view suggests that there are inherent balances between market/non-market mechanisms which reflect the relative advantages between a slow unbiased method and faster but potentially arbitrarily biased methods.
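A toy sketch of the two-level paradigm (entirely illustrative, not from the essay): an outer evolutionary loop selects on a slow ground-truth loss, while each agent’s inner loop rapidly optimizes a heritable, possibly-biased proxy loss; over generations, selection de-biases the proxy:

```python
import random

def ground_truth_loss(x):                 # slow 'outer' loss (death/bankruptcy/fitness)
    return (x - 3.0) ** 2

def inner_optimize(bias, steps=50, lr=0.1):
    """Fast 'inner' loop: gradient descent on the proxy target 3.0 + bias."""
    x = 0.0
    for _ in range(steps):
        x -= lr * 2 * (x - (3.0 + bias))  # gradient of (x - (3 + bias))^2
    return x

# Heritable parameter: how far each agent's proxy deviates from the ground truth.
population = [random.uniform(-2, 2) for _ in range(20)]
for generation in range(30):
    # Outer loop: rank agents by the ground truth achieved via their inner loop,
    # keep the best half, and let each survivor produce two mutated offspring.
    ranked = sorted(population, key=lambda b: ground_truth_loss(inner_optimize(b)))
    population = [b + random.gauss(0, 0.1) for b in ranked[:10] for _ in range(2)]

print(min(abs(b) for b in population))    # proxy biases shrink toward 0 over generations
```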
“What Intellectual Progress Did I Make In The 2010s?”, (2020-01-08):
[Scott Alexander looks back on how his ideas/beliefs evolved over the past decade of blogging at Jackdaws/LessWrong/SlateStarCodex. Primary topics:
- Bayesian predictive coding as a unified theory of brain perception, control, behavior, and psychiatric disorders as bad priors/updates
- Psychedelic use as modifying brain priors, explaining how psychedelics affect and sometimes benefit their users
- Trauma/attachment disorder
- Philosophy of mental disease
- Efficacy of SSRIs
- Genetics of psychiatric disorders, especially autism/transsexuals: ???
- Willpower: also predictive coding???
- Diet/weight loss: setpoints, somehow
- Existential risk: dissolving the Great Filter, raising AI risk awareness
- Secular stagnation: progress is slowing, perhaps because human populations aren’t growing exponentially
- Baumol’s cost disease as core cause of economic stagnation and political backlash
- The Replication Crisis: even worse than he thought
- Psychological effects:
  - Placebo effect: much less powerful than he thought
  - Birth order effects: much more powerful than he thought
- Utilitarianism: still confused, but leaning more towards rule-utilitarianism
- Politics: social media turbocharging tribalism/outgroup-bias
- Ideology of liberalism and SJWism
- Coordination problems as core problem of politics
- Enlightenment: not actually that great, possibly wireheading]
“Three cases giant panda attack on human at Beijing Zoo.”, (2014):
The panda is regarded as a Chinese national treasure. Most people think that pandas are cute, just eat bamboo, and could never be vicious; giant panda attacks on humans are rare. Here, we present three cases of giant panda attacks on humans at the Panda House at Beijing Zoo from September 2006 to June 2009 to warn people of the giant panda’s potentially dangerous behavior.
-
Whenever Courtney Cirone grabs her iPad, her cat Cooper runs over as though a bag of treats had just been shaken. He wants to watch YouTube, specifically videos of squirrels and tiny birds scurrying about. “His eyes get super big, and he moves his head back and forth following the animals,” Cirone says. “He ducks his head down low like he’s hiding. One time he looked at me, meowing, like, ‘HELP ME CATCH THIS BASTARD.’” Cooper paws relentlessly at the screen, sometimes lunging at it head-first in an attempt to catch his digital prey. He loves these videos (along with clips of Dr. Phil). He’s so obsessed that Cirone limits his viewing to three times per week, because he sits very close and she’s cautious about protecting his eyes. When she turns her iPad off, he even sulks. If this sounds strange, it is and it’s not: Cats, famously the subjects of online videos, now sit on the other side, watching…Now she puts cat-targeted YouTube videos on for Jasper a few times weekly. He loves them so much that he’ll sit in front of the TV or in between Gall and her laptop to signal that he wants to watch.
Beyond all the content for humans, there’s a growing world on YouTube specifically for our feline friends. Loved by certain cat owners and occasionally championed by veterinarians and animal scientists, these videos tap into cats’ instincts to stalk, chase, and hunt. Cat-targeted footage of small animals is particularly popular on the platform, posted by channels like Little Kitty & Family, Handsome Nature, and Videos for Your Cat. One of the most prolific creators, Paul Dinning, has posted hundreds of videos for cats, including an eight-hour “Bird Bonanza” that’s amassed almost 7 million views. According to YouTube’s Trends and Insights team, Dinning created eight of the 10 most-viewed videos for cats in 2019…In 2019, videos containing the phrase “videos for cats” were viewed over 55 million times on the platform, up 41% from 2018. “We now have this world where cats are an emerging audience,” Pettie says, “and movies for cats are an emerging trend.”…According to YouTube, videos targeted at dogs garnered only 6 million views last year.
…Cat Games creator Max Gomboev, a motion designer from Russia, first started making these videos as a tribute to his late cat. After seeing how much other cat owners liked them and the experience they provided over cat-targeted mobile apps, like Cat Fishing 2, which offer much less variety, he started making videos more regularly. “It’s easier than installing an app, and you can show my videos on a TV,” Gomboev says. “Usually, I create a new video every 10 days. Cats like to watch something new.”
“Clustering of health, crime and social-welfare inequality in 4 million citizens from two nations”, (2020-01-20):
Health and social scientists have documented the hospital revolving-door problem, the concentration of crime, and long-term welfare dependence. Have these distinct fields identified the same citizens? Using administrative databases linked to 1.7 million New Zealanders, we quantified and monetized inequality in distributions of health and social problems and tested whether they aggregate within individuals. Marked inequality was observed: Gini coefficients equalled 0.96 for criminal convictions, 0.91 for public-hospital nights, 0.86 for welfare benefits, 0.74 for prescription-drug fills and 0.54 for injury-insurance claims. Marked aggregation was uncovered: a small population segment accounted for a disproportionate share of use-events and costs across multiple sectors. These findings were replicated in 2.3 million Danes. We then integrated the New Zealand databases with the four-decade-long Dunedin Study. The high-need/high-cost population segment experienced early-life factors that reduce workforce readiness, including low education and poor mental health. In midlife they reported low life satisfaction. Investing in young people’s education and training potential could reduce health and social inequalities and enhance population wellbeing.
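For concreteness, the Gini coefficients reported here measure how concentrated events (convictions, hospital nights, benefit payments, etc.) are across individuals; a minimal implementation on made-up data:

```python
import numpy as np

def gini(x):
    """Gini coefficient of non-negative values:
    0 = perfectly even; near 1 = one person accounts for nearly everything."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    lorenz = np.cumsum(x) / x.sum()        # cumulative share, smallest to largest
    return (n + 1 - 2 * lorenz.sum()) / n

# Illustrative: 95 people with no convictions, 5 accounting for all of them.
print(round(gini([0] * 95 + [1, 2, 3, 10, 40]), 2))   # ~0.98, cf. the reported 0.96
```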
https://www.gwern.net/images/genetics/correlation/2020-richmondrakerd-figure4-correlations.png
“The Genetics of Success: How Single-Nucleotide Polymorphisms Associated With Educational Attainment Relate to Life-Course Development”, (2016-06-01):
A previous genome-wide association study (GWAS) of more than 100,000 individuals identified molecular-genetic predictors of educational attainment. We undertook in-depth life-course investigation of the polygenic score derived from this GWAS using the four-decade Dunedin Study (N = 918). There were five main findings. First, polygenic scores predicted adult economic outcomes even after accounting for educational attainments. Second, genes and environments were correlated: Children with higher polygenic scores were born into better-off homes. Third, children’s polygenic scores predicted their adult outcomes even when analyses accounted for their social-class origins; social-mobility analysis showed that children with higher polygenic scores were more upwardly mobile than children with lower scores. Fourth, polygenic scores predicted behavior across the life course, from early acquisition of speech and reading skills through geographic mobility and mate choice and on to financial planning for retirement. Fifth, polygenic-score associations were mediated by psychological characteristics, including intelligence, self-control, and interpersonal skill. Effect sizes were small. Factors connecting GWAS sequence with life outcomes may provide targets for interventions to promote population-wide positive development. [Keywords: genetics, behavior genetics, intelligence, personality, adult development]
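Mechanically, the polygenic score in such studies is just a weighted sum of allele dosages, with per-SNP weights taken from the earlier GWAS; a schematic sketch on simulated data (the SNP count, effect sizes, and allele frequencies below are all invented):

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_snps = 918, 10_000                       # Dunedin-sized cohort; toy SNP count
betas   = rng.normal(0, 0.01, n_snps)                # per-SNP GWAS effect sizes (simulated)
dosages = rng.binomial(2, 0.5, (n_people, n_snps))   # 0/1/2 copies of each effect allele

pgs = dosages @ betas                                # polygenic score = weighted allele sum
pgs = (pgs - pgs.mean()) / pgs.std()                 # standardized for downstream regression
print(pgs[:5])
```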
“Childhood forecasting of a small segment of the population with large economic burden.”, (2016):
Policy-makers are interested in early-years interventions to ameliorate childhood risks. They hope for improved adult outcomes in the long run, bringing return on investment. How much return can be expected depends, partly, on how strongly childhood risks forecast adult outcomes. But there is disagreement about whether childhood determines adulthood. We integrated multiple nationwide administrative databases and electronic medical records with the four-decade Dunedin birth-cohort study to test child-to-adult prediction in a different way, by using a population-segmentation approach. A segment comprising one-fifth of the cohort accounted for 36% of the cohort's injury insurance-claims; 40% of excess obese-kilograms; 54% of cigarettes smoked; 57% of hospital nights; 66% of welfare benefits; 77% of fatherless childrearing; 78% of prescription fills; and 81% of criminal convictions. Childhood risks, including poor age-three brain health, predicted this segment with large effect sizes. Early-years interventions effective with this population segment could yield very large returns on investment.
Alex/John/Mark Taylor belongs to one of the last surviving professions of Dickensian London. Clerks have co-existed with chimney sweeps and gene splicers. It’s a trade that one can enter as a teenager, with no formal qualifications, and that’s astonishingly well-paid. A senior clerk can earn a half-million pounds per year, or more than $650,000, and some who are especially entrenched make far more.
Clerks—pronounced “clarks”—have no equivalent in the U.S. legal system, and have nothing in common with the Ivy League-trained Supreme Court aides of the same spelling. They exist because in England and Wales, to simplify a bit, the role of lawyer is divided in two: There are solicitors, who provide legal advice from their offices, and there are barristers, who argue in court. Barristers get the majority of their business via solicitors, and clerks act as the crucial middlemen between the tribes—they work for and sell the services of their barristers, steering inquiring solicitors to the right man or woman. Clerks are by their own cheerful admission “wheeler-dealers,” what Americans might call hustlers. They take a certain pride in managing the careers of their bosses, the barristers—a breed that often combines academic brilliance with emotional fragility. Many barristers regard clerks as their pimps. Some, particularly at the junior end of the profession, live in terror of clerks. The power dynamic is baroque and deeply English, with a naked class divide seen in few other places on the planet. Barristers employ clerks, but a bad relationship can strangle their supply of cases. In his 1861 novel Orley Farm, Anthony Trollope described a barrister’s clerk as a man who “looked down from a considerable altitude on some men who from their professional rank might have been considered as his superiors.”…One of the most peculiar aspects of the clerk-barrister relationship is that clerks handle money negotiations with clients. Barristers argue that avoiding fee discussions keeps their own interactions with clients clean and uncomplicated, but as a consequence, they’re sometimes unaware of how much they actually charge. The practice also insulates and coddles them. Clerks become enablers of all sorts of curious, and in some cases self-destructive, behavior.
…John Flood, a legal sociologist who in 1983 published the only book-length study of barristers’ clerks, subtitled The Law’s Middlemen, uses an anthropological lens to explain the relationship. He suggests that barristers, as the de facto priests of English law—with special clothes and beautiful workplaces—require a separate tribe to keep the temple flames alight and press money from their congregation. Clerks keep barristers’ hands clean; in so doing they accrue power, and they’re paid accordingly. I asked more than a dozen clerks and barristers, as well as a professional recruiter, what the field pays. Junior clerks, traditionally recruited straight after leaving school at 16 and potentially with no formal academic qualifications, start at £15,000 to £22,000 ($19,500 to $28,600); after 10 years they can make £85,000. Pay for senior clerks ranges from £120,000 to £500,000, and a distinct subset can earn £750,000. The Institute of Barristers’ Clerks disputed these figures, saying the lows were too low and the highs too high. But there’s no doubt that the best clerks are well-rewarded. David Grief, 63, a senior clerk at the esteemed Essex Court Chambers, spoke to me enthusiastically about his personal light airplane, a TB20 Trinidad.
…Before the U.K. decimalized its currency in 1971, clerks received “shillings on the guinea” for each case fee. Under the new money system, the senior clerks’ take was standardized at 10% of their chambers’ gross revenue. Sometimes, but not always, they paid their junior staff and expenses out of this tithe. Chambers at the time were typically small, four to six barristers strong, but in the 1980s, they grew. As they added barristers and collected more money, each chambers maintained just one chief clerk, whose income soared. The system was opaque: The self-employed barristers didn’t know what their peers within their own chambers were paid, and in a precomputer age, with all transactions recorded in a byzantine paper system, barristers sometimes didn’t know what their clerks earned, either. Jason Housden, a longtime clerk who now works at Matrix Chambers, told me that, when he started out in the 1980s at another office, his senior clerk routinely earned as much as the top barristers and on occasion was the best-paid man in the building. · One anecdote from around the same time, possibly apocryphal, is widely shared. At a chambers that had expanded and was bringing in more money, three silks decided their chief clerk’s compensation, at 10%, had gotten out of hand. They summoned him for a meeting and told him so. In a tactical response that highlights all the class baggage of the clerk-barrister relationship, as well as the acute British phobia of discussing money, the clerk surprised the barristers by agreeing with them. “I’m not going to take a penny more from you,” he concluded. The barristers, gobsmacked and paralyzed by manners, never raised the pay issue again, and the clerk remained on at 10% until retirement. · Since the 1980s, fee structures have often been renegotiated when a senior clerk retires. Purely commission-based arrangements are now rare—combinations of salary and incentive are the rule, though some holdouts remain. Goddard told me last summer that he receives 3% of the entire take of the barristers at 4 Stone; later he said this was inaccurate, and that his pay was determined by a “complicated formula.” (Pupil barristers, as trainees are known, start there at £65,000 per year, and the top silks each make several million pounds.) · The huge sums that clerks earn, at least relative to their formal qualifications, both sit at odds with the feudal nature of their employment and underpin it. In some chambers, clerks still refer to even junior barristers as “sir” or “miss.” Housden remembers discussing this issue early in his career with a senior clerk. He asked the man whether he found calling people half his age “sir” demeaning. The reply was straightforward: “For three-quarters of a million pounds per year, I’ll call anyone sir.”
“Subscripts For Citations”, (2020-01-08):
I propose reviving an old General Semantics notation: borrow from scientific notation and use subscripts like ‘Gwern2020’ for denoting sources (like citation, timing, or medium). Using subscript indices is flexible, compact, universally technically supported, and intuitive. This convention can go beyond formal academic citation and be extended further to ‘evidentials’ in general, indicating the source & date of statements. While (currently) unusual, subscripting might be a useful trick for clearer writing, compared to omitting such information or using standard cumbersome circumlocutions.
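For concreteness, here is how such an evidential subscript might be marked up in a few common systems (an illustration of the notation only; the essay argues for the convention, not for any particular toolchain):

```
Pandoc Markdown:  As Gwern~2020~ argues, subscripts are compact.
HTML:             As Gwern<sub>2020</sub> argues, subscripts are compact.
LaTeX:            As Gwern\textsubscript{2020} argues, subscripts are compact.
```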
“Behind the Sensationalism: Images of a Decaying Corpse in Japanese Buddhist Art”, (2005):
The kusözu, “painting of the nine stages of a decaying corpse,” portrays the sequential decay of a female cadaver in graphic detail. The shocking subject, rooted in Buddhist devotional practices, was regularly painted and reinterpreted during half a millennium of Japanese art. The images of a decaying corpse were charged with contextualized functionalities that have gone unrecognized in current scholarship. Through an examination of four major exemplars of the genre, this study shows how new meanings of the image were catalyzed by religious and social transformations.
The kusōzu, “painting of the nine stages of a decaying corpse” (hereafter, painting of the nine stages), was executed in Japan from approximately the thirteenth through the nineteenth centuries in various formats, including handscrolls, hanging scrolls, and printed books. The subject itself is derived from a traditional Buddhist doctrine that urges contemplation on the nine stages of a decaying corpse (kusōkan, hereafter, contemplation on the nine stages). The teaching dates to the early fifth century and promotes a systematic meditation on the impurity of a decaying corpse as an aid to ardent devotees who wish to liberate themselves from sensual desires and affections.
This paper explores unrecognized features of the paintings of the nine stages as they appear through almost half a millennium of Japanese art. We will see that these narrative paintings functioned as distinct visual agents for audiences in different eras. The functionality of the image shifted from a meditative focus for pietistic catharsis, to a didactic incentive for the pursuit of paradise, to an intercessory offering for the dead at merit transferal rites, to a popularized platform for politically manipulated precepts on feminine morality. After giving the textual and theological background for the nine stages of a decaying corpse, I will examine four images of the nine stages from different centuries, which I term the Nakamura, Raigoji, Dainenbutsuji, and Akagi versions. Finally, some remarks are offered on the enduring vitality of this sensational subject.
“Maraṇasati”, (2020-12-28):
Maraṇasati is a Buddhist meditation practice that uses various visualization and contemplation techniques to meditate on the nature of death. The cultivation of Maranassati is said to be conducive to right effort and also helps in developing a sense of spiritual urgency (Saṃvega) and renunciation (Nekkhamma).
“"Body of a courtesan in nine stages": A 19th century study of decomposition”, (2014-06-24):
“Body of a Courtesan in Nine Stages” was painted on a handscroll by Japanese artist Kobayashi Eitaku in the 1870s. It’s not unusual for artists to study corpses and body parts because of their need to learn about the human form, and because of the historical connection between the science of anatomy and artistic illustration. What makes this style unique is that it’s part of a Japanese artistic tradition devoted specifically to the study of human postmortem changes that stretches back hundreds of years.
“Body of a Courtesan in Nine Stages” is an example of kusozu, the illustration of a decomposing corpse, that was popular in Japanese art from about the 13th to 19th centuries…Though the painting may be religious and/or scientific in nature, according to the British Museum it also has erotic themes. Because the subject matter is a courtesan, the curator notes for this piece at the British Museum say that this handscroll also falls into the genre of erotic art, or shunga. The word shunga means ‘picture of spring’ in Japanese. The word “spring” is a common synonym for sex. Below are all 9 panels. All images come from The British Museum.
“Kobayashi Eitaku”, (2020-12-28):
Kobayashi Eitaku was a Japanese artist and illustrator specializing in ukiyo-e and nihonga.
“The beauty of human decomposition in Japanese watercolor”, (2015-03-06):
I think I might be obsessed with kusozu, Japanese watercolor paintings that graphically depict human decomposition, which were popular between the 13th and 19th centuries; “Body of a Courtesan in Nine Stages” is another series in this genre featured previously on this site. Kusozu works of art were inspired by Buddhist beliefs and these paintings were meant to encourage people to ponder the temporary nature of the physical world. Kusozu watercolors also happen to be fantastic early studies of human decay and taphonomy, which is why one series, titled Kusozu: the death of a noble lady and the decay of her body, is currently on display as part of the “Forensics: The Anatomy of Crime” exhibit in London.
According to the Wellcome Collection, Kusozu: the death of a noble lady and the decay of her body was painted some time in the 18th century. The scenes below include: (1) the woman’s impending death and her preparation for it; (2) the noble woman has just passed away and her loved ones are seated around her; (3) slight skin discoloration (maybe some livor mortis) and a bit of bloating during early decomposition; (4) the onset of putrefaction with bloating and marbling; (5) advanced decomposition as seen by pervasive marbling, leakage of purge fluid from the mouth, and the bursting open of the abdominal cavity; (6) caving of the abdominal cavity and scavenging animals; (7) start of skeletonization and the disappearance of soft tissue; (8) complete skeletonization and scattering of remains; (9) finally, human remains have been completely scattered or consumed by unseen animals, so all that remains is a memorial for the deceased woman.
“Having Had No Predecessor to Imitate, He Had No Successor Capable of Imitating Him”, (2020-01-17):
[Summary of the Homeric Question that gripped Western classical literary scholarship for centuries: who wrote the Iliad/Odyssey, when, and how? They appear in Greek history out of nowhere: 2 enormously lengthy, sophisticated, beautiful, canonical, unified works that would dominate Western literature for millennia, and yet, appeared to draw on no earlier tradition nor did Homer have any earlier (non-spurious) works. How was this possible?
The iconoclastic Analysts proposed it was a fraud, and the works were pieced together later out of scraps from many earlier poets. The Unitarians pointed to the overall quality; the complex (apparently planned) structure; the disagreements of Analysts on what parts were what pieces; and the Analysts’ inability to explain many anomalies in Homer: there are passages splicing together Greek dialects, passages which were metrical only given long-obsolete Greek letters/pronunciations, and even individual words which mixed up Greek dialects! (Not that these anomalies were all that much easier to explain by the Unitarian hypothesis of a single author).
The eventual resolution relied on an old hypothesis: that Homer was in fact the product of a lost oral tradition. There was, unfortunately, no particular evidence for it, and so it never made any headway against the Analysts or Unitarians—until Milman Parry found a living oral tradition of epic poetry in the Balkans, and discovered in it all the signs of the Homeric poems, from repetitive epithets to a patchwork of dialects, and thus empirical examples of how long oral traditions could produce a work like Homer if one of them happened to get written down at some point.]
“Homeric Question”, (2020-12-27):
The Homeric Question concerns the doubts and consequent debate over the identity of Homer, the authorship of the Iliad and Odyssey, and their historicity. The subject has its roots in classical antiquity and the scholarship of the Hellenistic period, but has flourished among Homeric scholars of the 19th and 20th centuries.
“Whole Formulaic Verses in Greek and Southslavic Heroic Song”, (1933):
In this essay on the method to be used in the comparative study of early poetries the view is set forth that the essential feature of such poetry is its oral form, and not such cultural likenesses as have been called “popular,” “primitive,” “natural,” or “heroic.” As an example of method those numerous cases are considered where we find both in Homer and in Southslavic heroic song a verse which expresses the same idea. The explanation is as follows. Oral poetry is largely composed out of fixed verses. Especially will ideas which recur with any frequency be expressed by a fixed verse. Thus where the two poetries express the same frequent idea they both tend to do it in just the length of a verse. Knowing this common feature in the oral form of the two poetries we can conclude that the extraordinary hold which heroic poetry has on the thought and conduct of the Southern Slavs provides us with an example of what heroic poetry must have been for the early Greeks.
“The Homeric Versions”, (1932):
[6pg Borges essay on the literary merits of different translations of Homer and the problems of translation: the Newman-Arnold debate encapsulates the basic problem of literality vs literary. Borges gives translations of one passage by Buckley, Butcher & Lang, Cowper, Pope, Chapman, and Butler. Which is best? See also Borges 1936, “The Translators of the Thousand and One Nights”, a much more extended discussion of different translations of a work.]
“The Translators of The Thousand and One Nights”, (1936):
[18pg Borges essay on translations of the collection of Arab fairytales The Thousand and One Nights: each translator—Galland, Lane, Burton, Littmann, Mardrus—criticized the previous translator by creation.]
At Trieste, in 1872, in a palace with damp statues and deficient hygienic facilities, a gentleman on whose face an African scar told its tale-Captain Richard Francis Burton, the English consul-embarked on a famous translation of the Quitab alif laila ua laila, which the roumis know by the title The Thousand and One Nights. One of the secret aims of his work was the annihilation of another gentleman (also weather-beaten, and with a dark and Moorish beard) who was compiling a vast dictionary in England and who died long before he was annihilated by Burton. That gentleman was Edward Lane, the Orientalist, author of a highly scrupulous version of The Thousand and One Nights that had supplanted a version by Galland. Lane translated against Galland, Burton against Lane; to understand Burton we must understand this hostile dynasty.
“Choose Your Own Adventure: One Book, Many Readings”, (2009):
[Visualizing CYOA by generating graphs and coloring events by desirability; Swinehart observes distinct patterns in network types, harshness, linearity, and highlights various curious anomalies and tricks CYOA books could play on the reader.]
...To get a sense for the distribution of pages within the actual CYOA books, I’ve prepared a dataset of 12 books. The earliest dates from 1979 and at the later edge are a handful from 1986. They are laid out chronologically (or according to series order for books released in the same year) with the oldest at the top left and more recent books below. Each book has been arranged into rows of ten pages apiece. In scanning over the distribution of colors in this plot, one clear pattern is a gradual decline in the number of endings. The earliest books (in the top row) are awash in reds and oranges, with a healthy number of ‘winning’ endings mixed in. Later CYOA books tended to favor a single ‘best’ ending (see CYOA 44 & 53). The most extreme case of this was actually not a Choose Your Own Adventure book at all but a gamebook offshoot of the Zork text adventure series. The Cavern of Doom (labeled WDIDN 3 above) has a virtually linear progression where endings later in the book are increasingly better than those on earlier pages. This is reflected in the nearly unbroken spectrum from red to blue when scanning down the rows. The one outlier is the catastrophic ending seen in the third row from the bottom. This was a punishment page that could only be reached by cheating. Unlike most other endings in the book it does not offer to let you continue the story from a few pages back but instead calls you a cheater and leaves you with no choice but to start over from the beginning. Another surprising change over time is the decline in the number of choices in the books. The mess of light grey boxes in the top row gives way to books like A New Hope (CYOASW 1) which have more pages devoted to linear narrative than to decisions and endings combined. But to address this apparent pattern with more rigor it would be best to look at the numbers of pages in each category independent of their placement in the book...I’d be very curious to know the reason for this progression toward linearity. Presumably the invisible hand was guiding this development, but whether the hunger was for less difficulty in the books or simply for something with more in the way of traditional storytelling is harder to unravel. I could also imagine that this balance between interaction and exposition was peculiar to the individual writers, so this could merely reflect a changing set of practitioners.
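The underlying data is just a directed graph over page numbers, which is what makes such visualizations easy to generate. A toy sketch (hypothetical pages and labels, not Swinehart’s dataset or code):

```python
# Each page maps to the pages its choices lead to; an empty list is an ending.
book = {1: [3, 7], 3: [5, 9], 7: [9, 11], 5: [], 9: [], 11: []}
endings = {5: "catastrophic", 9: "mediocre", 11: "great"}

def reachable_endings(page, seen=None):
    """Depth-first search collecting every ending reachable from a page."""
    seen = set() if seen is None else seen
    if page in seen:                  # pages can be reached by several paths
        return set()
    seen.add(page)
    if not book[page]:                # no outgoing choices: this is an ending
        return {page}
    return set().union(*(reachable_endings(n, seen) for n in book[page]))

for start in (1, 3, 7):
    print(start, sorted(endings[e] for e in reachable_endings(start)))
```

Coloring each page by the best or worst ending reachable from it then yields exactly the red-to-blue gradients described above.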
“Me”:
Christian Swinehart is a graphic designer, software developer, and data artist. His practice focuses on interaction and user interface design with a specialty in data visualization. He is the founder and principal of Samizdat Drafting Co. and is an active participant in the open-source world as the author of the PlotDevice and Arbor.js visualization tools.
Christian’s work is informed by a background in biology and computational modeling. His projects frequently employ simulation and numerical analysis as a means to communicate the structure within complex systems. Recent clients include The New York Times, Bloomberg, Gallup, Pentagram, Diller Scofidio + Renfro, and Allied Works Architects.
Degrees Held:
- MFA | Graphic Design (RISD, 2008)
- Ph.D. | Computational Neuroscience (Brandeis University, 2005)
- BS | Cognitive Science (Dickinson College, 1998)
“Choose Your Own Adventure”, (2020-12-27):
Choose Your Own Adventure, or Secret Path Books, is a series of children's gamebooks where each story is written from a second-person point of view, with the reader assuming the role of the protagonist and making choices that determine the main character's actions and the plot's outcome. The series was based upon a concept created by Edward Packard and originally published by Constance Cappel's and R. A. Montgomery's Vermont Crossroads Press as the "Adventures of You" series, starting with Packard's Sugarcane Island in 1976.
“Gamebook”, (2020-12-22):
A gamebook is a work of printed fiction that allows the reader to participate in the story by making choices. The narrative branches along various paths, typically through the use of numbered paragraphs or pages, and the reader typically does not proceed through those paragraphs in linear order. Gamebooks are sometimes called choose your own adventure books or CYOA after the influential Choose Your Own Adventure series originally published by US company Bantam Books. Gamebooks influenced hypertext fiction.
“These Maps Reveal the Hidden Structures of Choose Your Own Adventure Books: If you decide to see more, click on this story”, (2017-06-13):
The last installment of the original “Choose Your Own Adventure” series came out in 1998, but since 2004, Chooseco, founded by one of the series’ original authors, R.A. Montgomery, has been republishing classic volumes, as well as new riffs on the form of interactive fiction that seemed ubiquitous in the 1980s and ’90s. The new editions also carry an additional feature—maps of the hidden structure of each book.
[Map: Tattoo of Death, Choose Your Own Adventure #22 (all maps courtesy of Chooseco)]
For years, fans have been creating visualizations of the forking structures of “Choose Your Own Adventure” books. Often, they’re interested in the types of outcomes at the end of each path. One map labels each ending as “new life, return home, or death,” and another separates them into “cliffhanger, solution, or death.” Christian Swinehart’s extensive graphical analysis of the books labels the endings as “great, favorable, mediocre, disappointing, or catastrophic.”
…Mapping the bones of the books can have other purposes, too. Nick Montfort, a poet and professor at the Massachusetts Institute of Technology who studies interactive fiction, has a habit of asking people what they know about “Choose Your Own Adventure” books. “They often say, ‘You have two choices after every page’,” he says. “That’s not true. Sometimes you have one choice. Sometimes you have more than two. When you show the maps, you can see that these books don’t look exactly the same.” The older volumes, for instance, tend to have more endings than the later ones, and three of the oldest—Journey Under the Sea, Space and Beyond, and By Balloon to the Sahara—have 42 endings each, more than any other book in the series….In just about every case, it can be surprising how a simple choice leads you down a complex path. In By Balloon to the Sahara, you’re in a balloon and are presented with a choice on the very first page. Storm clouds are on the horizon. Choice 1: “If you act now, you can release gas from the balloon and land before the storm overtakes you.” Choice 2: “Perhaps the storm will pass quickly. Maybe you can ride it out.” That’s just the beginning, since this book has the most decision points—48—of the series.
…There is yet another possibility in these nonlinear books: hidden endings. Inside UFO 54-40 has a hidden ending that’s only available to a reader who ignores the decisions and flips to it without prompting. But it’s there. “It’s a two-page, big illustration of this city,” says Montfort, the MIT professor. “The land of Ultima. As you flip through the book, even if you’re being very obedient, you can’t help but wonder what this text is.”
…Maps like the ones Chooseco created can reveal the structure of a book that gives readers choices, but though the multiple story lines are part of what makes the series so fun, they’re not the only thing that defines it. The meat of “Choose Your Own Adventure” stories is gender-neutral romps in worlds where there are no obviously right or wrong moral choices. There’s danger around the bend, usually in the form of something like space monkeys, malicious ghosts, or conniving grown-ups. Even with a map, there’s no way to find out what really comes next without making a choice and flipping to another page.
“Master of Orion”, (2020-01-24):
A typical game of Master of Orion plays out over three broad stages. The first stage is the land grab, the wide-open exploration and colonization phase that happens before you meet your rival aliens. Here your challenge is to balance the economic development of your existing planets against your need to settle as many new ones as possible to put yourself in a good position for the mid-game. (When exactly do I stop spending my home planet’s resources on improving its own infrastructure and start using them to build more colony ships?) The mid-game begins when you start to bump into your rivals, and comes to entail much jockeying for influence, as the various races begin to sort themselves into rival factions. (The Alkaris, bird-like creatures, loathe the Mrrshans, the aforementioned race of frenzied pussycats, and their loathing is returned in kind. I don’t have strong feelings about either one—but whose side would it most behoove me to choose from a purely strategic perspective?) The end-game is nigh when there is no more room for anyone to expand, apart from taking planets from a rival by force, and the once-expansive galaxy suddenly seems claustrophobic. It often, although by no means always, is marked by a massive war that finally secures somebody that elusive two-thirds majority in the Galactic Council.
…Yet the core genius of Master of Orion actually lies in how resistant it is to generalization. It’s no exaggeration to say that there really is no “typical” game; I’ve enjoyed plenty which played out in nothing like the pattern I’ve just described for you. I’ve played games in which I never fired a single shot in anger, even ones where I’ve never built a single armed ship of war, just as I’ve played others where I was in a constant war for survival from beginning to end…Master of Orion can easily be read as the work of a designer who looked at Civilization and was unimpressed with its touchy-feely side, then set out to make a game that fixed all the other failings which that side obscured.
…Master of Orion, on the other hand, works hard at every turn to make such one-size-fits-all strategies impossible—and nowhere more so than in its tech tree. When a new game begins, each race is given a randomized selection of technologies that are possible for it to research, constituting only about half of the total number of technologies in the game. Thus, while a technology roughly equivalent to Civilization’s Railroads does exist in Master of Orion—Star Gates—you don’t know if this or any other technology is actually available to you until you advance far enough up the tree to reach the spot where it ought to be. You can’t base your entire strategy around a predictable technology progression. While you can acquire technologies that didn’t make it into your tree by trading with other empires, bullying them into giving them to you, or attacking their planets and taking them, that’s a much more fraught, uncertain path to go down than doing the research yourself, one that requires a fair amount of seat-of-your-pants strategy in its own right. Any way you slice it, in other words, you have to improvise. This one clever design choice has repercussions for every other aspect of the game. Take, for instance, the endlessly fascinating game-within-a-game of designing your fleet of starships. If the tech tree was static, players would inevitably settle upon a small set of go-to designs that worked for their style of play. As it is, though, every new ship is a fresh balancing act, its equipment calibrated to maximize your side’s technological strengths and mitigate its weaknesses, while also taking into careful account the strengths and weaknesses of the foe you expect to use it against, about which you’ve hopefully been compiling information through your espionage network. Do you build a huge number of tiny, fast, maneuverable fighters, or do you build just a few lumbering galactic dreadnoughts? Or do you build something in between? There are no universally correct answers, just sets of changing circumstances.
…in Master of Orion, each race’s unique affordances force you to play it differently. Likewise, each opposing race’s affordances in combination with those of your own force you to respond differently to that race when you encounter it, whether on the other side of a diplomats’ table or on a battlefield in space. Further, most races have one technology they’re unusually good at researching and one they’re unusually bad at. Throw in varying degrees of affinity and prejudice toward the other races, and, again, you’ve got an enormous amount of variation which defies cookie-cutter strategizing.
…Sometimes a status such as that enjoyed by Master of Orion arrives thanks to an historical accident or a mere flashy technical innovation, but that is definitively not the case here. Master of Orion remains as rewarding as ever in all its near-infinite variation. Personally, I like to embrace its dynamic spirit for everything it’s worth by throwing a (virtual) die to set up a new game, letting the Universe decide what size galaxy I play in, how many rivals I play with, and which race I play myself. The end result never fails to be enjoyable, whether it winds up a desperate free-for-all between six alien civilizations compressed into a tiny galaxy with just 24 stars, or a wide-open, stately game of peaceful exploration in a galaxy with over 100 of them. In short, Master of Orion is the most inexhaustible well of entertainment I’ve ever found in the form of a single computer game—a timeless classic that never fails to punish you for playing lazy, but never fails to reward you for playing well. I’ve been pulling it out to try to conquer another random galaxy at least once every year or two for half my life already. I suspect I’ll still be doing so until the day I die.
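A minimal sketch of the randomized tech tree described above (illustrative technology names and a flat list; the real game deals techs within tiered research fields):

```python
import random

ALL_TECHS = ["Hand Lasers", "Nuclear Engines", "Planetary Shield",
             "Star Gates", "Robotic Controls", "Terraforming",
             "Cloaking Device", "Ion Cannon"]

def deal_tech_tree(rng):
    """Each race may research only a random ~half of all technologies."""
    return set(rng.sample(ALL_TECHS, k=len(ALL_TECHS) // 2))

tree = deal_tech_tree(random.Random())
# No opening-book strategy can assume Star Gates exist in this game:
print("Star Gates researchable?", "Star Gates" in tree)
```

Because the other half of the tree is simply absent, any fixed build order eventually dead-ends, which is what forces the improvisation the review describes.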
“Master of Orion”, (2020-12-28):
Master of Orion is a turn-based, 4X science fiction strategy game released in 1993 by MicroProse on the MS-DOS operating system. It was ported to the Mac OS in 1995 by Take-Two Interactive and distributed by GameTek. The game is the first in its franchise, and the rights are held by Wargaming. The player leads one of ten races to dominate the galaxy through a combination of diplomacy and conquest while developing technology, exploring and colonizing star systems.
[Photo essay on making shiny balls of mud.]
Hi there, this is Bruce Gardner. I am out of Albuquerque, New Mexico and my strange superpower is: I am very good at making mud balls, aka hikaru dorodango. I’m taking over the Laurence King blog today to introduce my new book, Dorodango: The Japanese Art of Making Mud Balls…Coming from the words doro, meaning “mud” and dango, a type of Japanese flour cake, hikaru dorodango consists of forming a mud ball by hand. Layers of increasingly fine dirt are added to the surface over the space of days to a point at which the dorodango can be polished to a high sheen (hikaru means “shining”)…I was introduced to hikaru dorodango by a William Gibson essay in Tate Magazine, way back in 2002. I was immediately bowled over by the idea of creating art from such a humble material; I have been creating mud balls ever since.
…Here is an image of a few of my pieces that illustrate the scope of colour and texture that is possible with soil gathered from different locations (various parts of New Mexico, in this case).
…The process of creating hikaru dorodango is very conducive to flow: There is a repetitive quality to the work but it is still challenging as the dorodango changes, one minute to the next. Your mind remains engaged but you’re disconnected from everything else. Hours can easily slip by this way…How sturdy are they? That varies by soil. Some would shatter like glass if you dropped them. This one would dent your hardwood floor and roll away.
“Dorodango”, (2020-12-28):
Dorodango is a Japanese art form in which earth and water are molded, then carefully polished to create a delicate shiny sphere, resembling a billiard ball.
“Shiny balls of Mud: William Gibson Looks at Japanese Pursuits of Perfection”, (2012-04-20):
Essay on minimalism, otaku, and hikikomori as esthetic choices reflecting an obsessive focus on perfection of a single activity, exemplified by the unusual sculpture form dorodango (hand-rolling mud into colorful spheres).
“It Had to Be Her: Review of Passionate Spirit: The Life of Alma Mahler, Haste 2019”, (2020-01-16):
[The Browser summary: “The amazing life of Alma Mahler. She married and/or romanced Gustav Mahler, Oskar Kokoschka, Walter Gropius and Franz Werfel. She was “anti-Semitic, narcissistic, boastful, and untruthful”. Was she also an “ambitious young woman who longed to be a great composer but became instead a great muse to great men?”. Was she an “artist stunted by society’s restrictions on women?”. Was she a “grandiose groupie, expropriating the fame of her husbands and lovers?” Perhaps uniquely, she was all three.” Mahler’s life dramatized the Viennese milieu, with absurd melodrama.]
The Alma Schindler of her early diaries, which she began in 1898, is, indeed, appealing. They reveal an ebullient teenager full of serious opinions and enthusiasms, a flirtatious young woman giddy with the attentions of the cultural elite in culturally elite fin-de-siècle Vienna. Alma writes about crushes and kisses and assignations on the Ringstrasse, about vigorously practicing the piano and earnestly studying composition, about attending the opera, about buying dresses and fighting with her mama. She is a girl—a splendid girl in a splendid city at a splendid time. She is vain and unsure of herself, self-aggrandizing as only a serious, determined, sensitive young person can be. The early diaries, published in English in 1998, end in 1902, just before she married Gustav Mahler. Alma lived for another sixty-two years, years of vainglorious strutting, scheming, and disloyalty, years chronicled by her own memoirs and by her later diaries (which have not been translated into English). Mahler scholars have a name for the challenge that arises from her unreliable tendencies: the Alma Problem. “She is routinely accused of massaging the facts to serve her own legacy,” Haste writes, “of suppressing or editing her husband Gustav Mahler’s published letters to remove critical references to her, for instance—acts seen, particularly by Mahler scholars (for whom she was for some time their principal source), as tampering with the archive.”…Touched by her husband’s new devotion and convinced that he would die if she left him, Alma sent Gropius away. Gustav wrote her daily love poems, smothered her slippers in kisses, and listened again to her music, pronouncing it good and begging her to resume composing. Alma was undeniably talented, and her songs are admired today, but this episode points as much to her extraordinary power as a muse as to her gifts as an artist. Her daughter Anna said that when Alma
just stopped in the doorway, you could immediately feel an electric charge…. She was an incredibly passionate woman…. And she really paid attention to everyone she spoke to. And encouraged them…. She was able to enchant people in a matter of seconds.
Albrecht Joseph, eventually Anna’s fifth husband, who was shocked by Alma’s dowdiness when he first met the legendary seductress in 1931, nevertheless noted that her “unique gift” was “a profound, uncanny understanding of what it was that [creative] men tried to achieve, an enthusiastic, orgiastic persuasion that they could do what they aimed at, and that she, Alma, fully understood what it was.” The intensity of her belief in art and genius had the effect of creating an almost violent sympathy. Gustav, like the other men she loved, did not think he could survive artistically without her. ·…And then there was Kokoschka. Alma later described her three-year affair with Oskar Kokoschka as “one violent struggle of love. Never before had I experienced so much strain, so much hell and so much paradise.” Jealous and controlling, the artist stalked her, patrolling her street after he left her house to make sure no other man visited. She refused to marry him, so while she was in Paris he stole her documents and posted the banns in the Döbling parish hall. “Oskar Kokoschka could only make love to me with the most peculiar game playing,” she later wrote. “As I refused to hit him during our hours of love, he began conjuring up the most appalling images of murder” of his supposed rivals “while whispering murkily to himself.” One night when she sang Parsifal at the piano, he whispered “a new, eerie text” into her ear, which caused her to scream and cry, then to swallow a toxic dose of bromine. (Kokoschka called the doctor.) · And through it all, he painted her. When she had an abortion (she wrote that she was afraid of “what might grow in me”), Kokoschka took a blood-stained cotton pad from her and kept it with him, saying, “That is, and will always be, my only child.” He painted bloody, murdered children. He drew “Alma Mahler Spinning with Kokoschka’s Intestine.” He insisted that she cover her arms with long sleeves. Kokoschka painted Alma entwined with him in a boat on a stormy sea, he painted Alma rising to the heavens while he stood in hell surrounded by snakes. Anna watched him work and asked, “Can’t you paint anything else except Mommy?” · When war came, Alma’s reaction was, as even the temperate Haste must admit, “an astonishing flourish of self-aggrandizement.” “I sometimes imagine,” Alma wrote, “that I was the one who ignited this whole world conflagration in order to experience some kind of development or enrichment—even if it be only death.” By now, she wanted to purify herself of the “evil fascination” of Kokoschka. She taunted him until he joined the cavalry, then broke off their relationship in unkind letters. In despair, Kokoschka insisted on being sent to the front, where he was wounded so badly he was reported dead in the Viennese papers. Though she later defiantly published a facsimile of Mahler’s manuscript of his Tenth Symphony, revealing (for a good price) his intimate, despairing notes, she was less keen on allowing her own letters to reach the public. After rushing to Kokoschka’s studio with her set of keys, she removed and burned her notes to him. · Though Kokoschka had not in fact died, her interest in him had. She was back to writing letters to Gropius. When she saw him while he was on leave, Haste writes, “their passion was rekindled,” and they got married. 
Kokoschka dealt with this rejection by commissioning a life-sized Alma doll, with instructions to “please make it possible that my sense of touch will be able to take pleasure in those parts where the layers of fat and muscle suddenly give way to a sinuous covering of skin.” The doll, covered in fluffy swan skin, suffered an ignominious end, beheaded and bedraggled in a courtyard the morning after Kokoschka threw a raucous farewell party for it.
“Alma Mahler”, (2020-12-27):
Alma Maria Mahler Gropius Werfel was a Viennese-born composer, author, editor and socialite. At fifteen, she was mentored by Max Burckhard. Musically active from her early years, she was the composer of nearly fifty songs for voice and piano, and works in other genres as well. Only seventeen songs are known to survive.
https://www.gwern.net/Book-reviews#an-introduction-to-japanese-court-poetry-miner-1968
“They Live”, (2020-12-28):
They Live is a 1988 American science-fiction action horror film written and directed by John Carpenter, based on the 1963 short story "Eight O'Clock in the Morning" by Ray Nelson. Starring Roddy Piper, Keith David, and Meg Foster, the film follows an unnamed drifter who discovers through special sunglasses that the ruling class are aliens concealing their appearance and manipulating people to consume, breed, and conform to the status quo via subliminal messages in mass media.
“Wozzeck”, (2020-12-28):
Wozzeck is the first opera by the Austrian composer Alban Berg. It was composed between 1914 and 1922 and first performed in 1925. The opera is based on the drama Woyzeck, which the German playwright Georg Büchner left incomplete at his death. Berg attended the first production in Vienna of Büchner's play on 5 May 1914, and knew at once that he wanted to base an opera on it. From the fragments of unordered scenes left by Büchner, Berg selected 15 to form a compact structure of three acts with five scenes each. He adapted the libretto himself, retaining "the essential character of the play, with its many short scenes, its abrupt and sometimes brutal language, and its stark, if haunted, realism..."
“Gwern.net newsletter (Substack subscription page)”, (2013-12-01):
Subscription page for the monthly gwern.net newsletter. There are monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives since December 2013.
“GPT-2 Neural Network Poetry”, (2019-03-03):
In February 2019, following up on my 2015–2016 text-generation experiments with char-RNNs, I experiment with the cutting-edge Transformer NN architecture for language modeling & text generation. Using OpenAI’s GPT-2-117M model pre-trained on a large Internet corpus and nshepperd’s finetuning code, I retrain GPT-2-117M on a large (117MB) Project Gutenberg poetry corpus. I demonstrate how to train 2 variants: “GPT-2-poetry”, trained on the poems as a continuous stream of text, and “GPT-2-poetry-prefix”, with each line prefixed with the metadata of the PG book it came from. In May 2019, I trained the next-largest GPT-2, GPT-2-345M, similarly, for a further quality boost in generated poems. In October 2019, I retrained GPT-2-117M on a Project Gutenberg corpus with improved formatting, and combined it with a contemporary poem dataset based on Poetry Foundation’s website.

With just a few GPU-days on 1080ti GPUs, GPT-2-117M finetuning can produce high-quality poetry which is more thematically consistent than my char-RNN poems, capable of modeling subtle features like rhyming, and sometimes even a pleasure to read. I list the many possible ways to improve poem generation and further approach human-level poems. For the highest-quality AI poetry to date, see my followup page, “GPT-3 Creative Writing”.
For anime plot summaries, see TWDNE; for generating ABC-formatted folk music, see “GPT-2 Folk Music” & “GPT-2 Preference Learning for Music and Poetry Generation”; for playing chess, see “A Very Unlikely Chess Game”; for the Reddit comment generator, see SubSimulatorGPT-2; for fanfiction, the Ao3; and for video games, the walkthrough model. For OpenAI’s GPT-3 followup, see “GPT-3: Language Models are Few-Shot Learners”.
“GPT-2 Folk Music”, (2019-11-01):
In November 2019, I experimented with training a GPT-2 neural net model to generate folk music in the high-level ABC music text format, following previous work in 2016 which used a char-RNN trained on a ‘The Session’ dataset. A GPT-2 hypothetically can improve on an RNN by better global coherence & copying of patterns, without problems with the hidden-state bottleneck.
I encountered problems with the standard GPT-2 model’s encoding of text which damaged results, but after fixing that, I successfully trained it on n = 205,304 ABC music pieces taken from The Session & ABCnotation.com. The resulting music samples are in my opinion quite pleasant. (A similar model was later retrained by Geerlings & Meroño-Peñuela 2020.)
The ABC folk model & dataset are available for download, and I provide for listening selected music samples as well as medleys of random samples from throughout training.
We followed the ABC folk model with an ABC-MIDI model: a dataset of 453k ABC pieces decompiled from MIDI pieces, which fit into GPT-2-117M with an expanded context window when trained on TPUs. The MIDI pieces are far more diverse and challenging, and GPT-2 underfits and struggles to produce valid samples, but when sampling succeeds, it can generate even better musical samples.
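For readers who have never seen it, ABC is a terse plain-text format, which is what makes it a natural target for a text-generating model. A minimal hand-written jig for illustration (a toy example, not a model sample):

```
X: 1
T: Example Jig
M: 6/8
L: 1/8
K: G
|: GAB c2c | BAG A2A | GAB c2e | dBG A3 :|
```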
https://www.gwern.net/GPT-2-preference-learning#bradley-terry-preference-learning
https://www.gwern.net/GPT-2-preference-learning#optimization-by-backprop-not-blackbox
“This Waifu Does Not Exist”, (2019-02-19):
Generating high-quality anime faces has long been a task neural networks struggled with. The invention of StyleGAN in 2018 has effectively solved this task and I have trained a StyleGAN model which can generate high-quality anime faces at 512px resolution. To show off the recent progress, I made a website, “This Waifu Does Not Exist” for displaying random StyleGAN 2 faces. TWDNE displays a different neural-net-generated face & plot summary every 15s. The site was popular and went viral online, especially in China. The model can also be used interactively for exploration & editing in the Artbreeder online service.
TWDNE faces have been used as screensavers, user avatars, character art for game packs or online games, uploaded to Pixiv, given away in streams, and used in a research paper (Noguchi & Harada 2019). TWDNE results also helped inspire Sizigi Studio’s online interactive waifu GAN, Waifu Labs, which generates even better anime faces than my StyleGAN results.
“Making Anime Faces With StyleGAN”, (2019-02-04):
Generative neural networks, such as GANs, have struggled for years to generate decent-quality anime faces, despite their great success with photographic imagery such as real human faces. The task has now been effectively solved, for anime faces as well as many other domains, by the development of a new generative adversarial network, StyleGAN, whose source code was released in February 2019.
I show off my StyleGAN 1/2 CC-0-licensed anime faces & videos, provide downloads for the final models & anime portrait face dataset, provide the ‘missing manual’ & explain how I trained them based on Danbooru2017/2018 with source code for the data preprocessing, document installation & configuration & training tricks. For application, I document various scripts for generating images & videos, briefly describe the website “This Waifu Does Not Exist” I set up as a public demo (see also Artbreeder), discuss how the trained models can be used for transfer learning such as generating high-quality faces of anime characters with small datasets (eg Holo or Asuka Souryuu Langley), and touch on more advanced StyleGAN applications like encoders & controllable generation.
The appendix gives samples of my failures with earlier GANs for anime face generation, and I provide samples & model from a relatively large-scale BigGAN training run suggesting that BigGAN may be the next step forward to generating full-scale anime images.
“GPT-3 Creative Fiction”, (2020-06-19):
I continue my AI poetry generation experiments with OpenAI’s 2020 GPT-3, which is 116× larger, and much more powerful, than the 2019 GPT-2. GPT-3, however, is not merely a quantitative tweak yielding “GPT-2 but better”—it is qualitatively different, exhibiting eerie runtime learning capabilities allowing even the raw model, with zero finetuning, to “meta-learn” many textual tasks purely by example or instruction. One does not train or program GPT-3 in a normal way, but one engages in dialogue and writes prompts to teach GPT-3 what one wants.
Experimenting through the OpenAI Beta API in June 2020, I find that GPT-3 does not just match my finetuned GPT-2-1.5b-poetry for poem-writing quality, but exceeds it, while being versatile in handling poetry, Tom Swifty puns, science fiction, dialogue like Turing’s Turing-test dialogue, literary style parodies… As the pièce de résistance, I recreate Stanislaw Lem’s Cyberiad’s “Trurl’s Electronic Bard” poetry using GPT-3. (Along the way, I document instances of how the BPE text encoding unnecessarily damages GPT-3’s performance on a variety of tasks, how to best elicit the highest-quality responses, common errors people make in using GPT-3, and test out GPT-3’s improvements in NN weak points like logic or commonsense knowledge.)
GPT-3’s samples are not just close to human level: they are creative, witty, deep, meta, and often beautiful. They demonstrate an ability to handle abstractions, like style parodies, I have not seen in GPT-2 at all. Chatting with GPT-3 feels uncannily like chatting with a human. I was impressed by the results reported in the GPT-3 paper, and after spending a week trying it out, I remain impressed.
This page records GPT-3 samples I generated in my explorations, and thoughts on how to use GPT-3 and its remaining weaknesses. I hope you enjoy them even a tenth as much as I enjoyed testing GPT-3 and watching the completions scroll across my screen.
“A Style-Based Generator Architecture for Generative Adversarial Networks”, (2018-12-12):
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.
“Language Models are Few-Shot Learners”, (2020-05-28):
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions—something which current NLP systems still largely struggle to do.
Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10× more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
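“Specified purely via text interaction” means the entire task setup is packed into the prompt; the paper’s translation demonstrations look roughly like this, with the model simply asked to continue the text (here, with “fromage”):

```
Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
cheese =>
```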
“Analyzing and Improving the Image Quality of StyleGAN”, (2019-12-03):
The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
“Language Models are Unsupervised Multitask Learners”, (2019-02-14):
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset—matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples.
The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text.
These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
“GPT-2: 1.5B Release”, (2019-11-05):
As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process. We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication.
Our findings:
- Humans find GPT-2 outputs convincing.
- GPT-2 can be fine-tuned for misuse.
- Detection is challenging.
- We’ve seen no strong evidence of misuse so far.
- We need standards for studying bias.
…Next steps: Our experience with GPT-2 over the past 9 months has given us valuable insight into the challenges and opportunities for creating responsible publication norms in AI. We’re continuing our work on this issue via participation in the Partnership on AI’s “Responsible Publication Norms for Machine Learning” project and discussions with our colleagues in the research community.
“The Go Transformer: Natural Language Modeling for Game Play”, (2020-07-07):
This work applies natural language modeling to generate plausible strategic moves in the ancient game of Go. We train the Generative Pretrained Transformer (GPT-2) to mimic the style of Go champions as archived in Smart Game Format (SGF), which offers a text description of move sequences. The trained model further generates valid but previously unseen strategies for Go. Because GPT-2 preserves punctuation and spacing, the raw output of the text generator provides inputs to game visualization and creative patterns, such as the Sabaki project’s game engine using auto-replays. Results demonstrate that language modeling can capture both the sequencing format of championship Go games and their strategic formations. Compared to random game boards, the GPT-2 fine-tuning shows efficient opening move sequences favoring corner play over less advantageous center and side play. Game generation as a language modeling task offers novel approaches to more than 40 other board games where historical text annotation provides training data (e.g., Amazons & Connect 4/6).
“The Chess Transformer: Mastering Play using Generative Language Models”, (2020-08-02):
This work demonstrates that natural language transformers can support more generic strategic modeling, particularly for text-archived games. In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard. With further fine-tuning, the transformer learns complex gameplay by training on 2.8 million chess games in Portable Game Notation. After 30,000 training steps, OpenAI’s Generative Pre-trained Transformer (GPT-2) optimizes weights for 774 million parameters. This fine-tuned Chess Transformer generates plausible strategies and displays game formations identifiable as classic openings, such as English or the Slav Exchange. Finally, in live play, the novel model demonstrates a human-to-transformer interface that correctly filters illegal moves and provides a novel method to challenge the transformer’s chess strategies. We anticipate future work will build on this transformer’s promise, particularly in other strategy games where features can capture the underlying complex rule syntax from simple but expressive player annotations.
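The "correctly filters illegal moves" step can be sketched in a few lines. The following is a hedged illustration rather than the authors' interface, assuming the python-chess library and a hypothetical `generate_move()` wrapper around the fine-tuned language model:

```python
# Sketch of legal-move filtering for an LM chess player (not the paper's code).
# Assumes python-chess; `generate_move` is a hypothetical wrapper that asks the
# LM for a candidate move in SAN notation given the game so far.
import chess

def filtered_move(board: chess.Board, generate_move, max_tries: int = 10):
    """Return the first LM-proposed move that parses as legal, else None."""
    for _ in range(max_tries):
        candidate = generate_move(board)       # e.g. "Nf3" (hypothetical LM call)
        try:
            return board.parse_san(candidate)  # raises ValueError if illegal/unparseable
        except ValueError:
            continue                           # resample from the LM
    return None
```

In live play, one would push the returned move with `board.push(move)` and re-prompt the model with the updated game record.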
https://old.reddit.com/r/slatestarcodex/comments/el87vo/a_very_unlikely_chess_game/fdh0vqd/
“Legume”, (2020-12-27):
A legume is a plant in the family Fabaceae, or the fruit or seed of such a plant. The seed is also called a pulse. Legumes are grown agriculturally, primarily for human consumption, for livestock forage and silage, and as soil-enhancing green manure. Well-known legumes include alfalfa, clover, beans, peas, chickpeas, lentils, lupins, mesquite, carob, soybeans, peanuts, and tamarind. Legumes produce a botanically unique type of fruit – a simple dry fruit that develops from a simple carpel and usually dehisces on two sides.
https://possiblywrong.wordpress.com/2019/01/09/identical-packs-of-skittles/
/Turing-complete#macknik-et-al-2008-table-1-psychological-assumptions
https://www.youtube.com/channel/UCJLIwYrmwgwbTzgmB5yVc7Q/featured
“Oral-formulaic composition”, (2020-12-22):
Oral-formulaic composition is a theory that originated in the scholarly study of epic poetry and was developed in the second quarter of the twentieth century. It seeks to explain two related issues:
- The process by which oral poets improvise poetry.
- The reasons for orally improvised poetry having the characteristics that it does.
“The Book of the Thousand Nights and a Night”, (2020-12-27):
The Book of the Thousand Nights and a Night (1885), subtitled A Plain and Literal Translation of the Arabian Nights Entertainments, is an English language translation of One Thousand and One Nights – a collection of Middle Eastern and South Asian stories and folk tales compiled in Arabic during the Islamic Golden Age – by the British explorer and Arabist Richard Francis Burton (1821–1890). It stood as the only complete translation of the Macnaghten or Calcutta II edition of the "Arabian Nights" until the Malcolm C. and Ursula Lyons translation in 2008.
“Antoine Galland”, (2020-12-27):
Antoine Galland was a French orientalist and archaeologist, most famous as the first European translator of One Thousand and One Nights, which he called Les mille et une nuits. His version of the tales appeared in twelve volumes between 1704 and 1717 and exerted a significant influence on subsequent European literature and attitudes to the Islamic world. Jorge Luis Borges has suggested that Romanticism began when his translation was first read.
“Edward William Lane”, (2020-12-27):
Edward William Lane was a British orientalist, translator and lexicographer. He is known for his Manners and Customs of the Modern Egyptians and the Arabic-English Lexicon, as well as his translations of One Thousand and One Nights and Selections from the Kur-án.
“Richard Francis Burton”, (2020-12-27):
Sir Richard Francis Burton was a British explorer, geographer, translator, writer, soldier, orientalist, cartographer, ethnographer, ethnologist, spy, linguist, poet, fencer, Freemason, and diplomat. He was famed for his travels and explorations in Asia, Africa, and the Americas, as well as his extraordinary knowledge of languages and cultures. According to one count, he spoke 29 European, Asian, and African languages.
“Enno Littmann”, (2020-12-27):
Ludwig Richard Enno Littmann was a German orientalist.
“J. C. Mardrus”, (2020-12-22):
Joseph Charles Mardrus, otherwise known as "Jean-Charles Mardrus" (1868–1949), was a French physician, poet, and a noted translator. Today he is best known for his translation of the Thousand and One Nights from Arabic into French, which was published from 1898 to 1904, and was in turn rendered into English by Edward Powys Mathers. A newer edition, Le livre des mille nuits et une nuit, was published in 1926–1932.
“Chooseco”, (2020-12-27):
Chooseco LLC is an American publishing company based in Waitsfield, Vermont. Founded in 2003 by author R. A. Montgomery and publisher Shannon Gilligan, the company primarily releases reissues of Montgomery's Choose Your Own Adventure series of gamebooks.
“Gustav Mahler”, (2020-12-27):
Gustav Mahler was an Austro-Bohemian Romantic composer, and one of the leading conductors of his generation. As a composer he acted as a bridge between the 19th century Austro-German tradition and the modernism of the early 20th century. While in his lifetime his status as a conductor was established beyond question, his own music gained wide popularity only after periods of relative neglect, which included a ban on its performance in much of Europe during the Nazi era. After 1945 his compositions were rediscovered by a new generation of listeners; Mahler then became one of the most frequently performed and recorded of all composers, a position he has sustained into the 21st century. In 2016, a BBC Music Magazine survey of 151 conductors ranked three of his symphonies in the top ten symphonies of all time.
“Oskar Kokoschka”, (2020-12-27):
Oskar Kokoschka was an Austrian artist, poet, playwright, and teacher best known for his intense expressionistic portraits and landscapes, as well as his theories on vision that influenced the Viennese Expressionist movement.
“Walter Gropius”, (2020-12-27):
Walter Adolph Georg Gropius was a German architect and founder of the Bauhaus School, who, along with Alvar Aalto, Ludwig Mies van der Rohe, Le Corbusier and Frank Lloyd Wright, is widely regarded as one of the pioneering masters of modernist architecture. He founded the Bauhaus in Weimar in 1919 and was also a leading architect of the International Style.
“Franz Werfel”, (2020-12-27):
Franz Viktor Werfel was an Austrian-Bohemian novelist, playwright, and poet whose career spanned World War I, the Interwar period, and World War II. He is primarily known as the author of The Forty Days of Musa Dagh, a novel based on events that took place during the Armenian Genocide of 1915, and The Song of Bernadette (1941), a novel about the life and visions of the French Catholic saint Bernadette Soubirous, which was made into a Hollywood film of the same name.
“RNN metadata for mimicking individual author style”, (2015-09-12):
Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata such as genre or author prefixed to each line of input, allowing for better & more efficient use of metadata, and more controllable sampling of generated output by feeding in desired metadata. A 2015 experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn associated prose styles, and often generate text visibly similar to that of a specified author. I further try & fail to train a char-RNN on Geocities HTML for unclear reasons.
More successfully, I experiment in 2019 with a recently-developed alternative to char-RNNs, the Transformer NN architecture, by finetuning OpenAI’s GPT-2-117M Transformer model on a much larger (117MB) Project Gutenberg poetry corpus using both unlabeled lines & lines with inline metadata (the source book). The generated poetry is much better. And GPT-3 is better still.
“Poetry Foundation”, (2020-12-27):
The Poetry Foundation is a Chicago-based American foundation created to promote poetry in the wider culture. It was formed from Poetry magazine, which it continues to publish, with a 2003 gift of $200 million from philanthropist Ruth Lilly.
/docs/www/old.reddit.com/7eaaa81a26404ef60df4279ee1f1b0c829d73be5.html
“GPT-2 Folk Music: Training a Spaceless Model”, (2019-12-12):
While training a GPT-2-117M on a folk music corpus written in ABC format, an otherwise-high-quality model kept producing persistent syntax errors: random spaces that rendered a piece either erroneous or lower-quality. Why? It appears to be an issue with the GPT BPE encoder’s handling of spaces, which makes it difficult to emit the right space-separated characters. We found that ABC does not actually require spaces, and we simply removed all spaces from the corpus—noticeably improving the quality of generated pieces.
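The fix itself is a one-line preprocessing pass. A minimal sketch (the writeup says all spaces were simply removed, though a more careful version might spare free-text header fields like titles; file names here are hypothetical):

```python
# Minimal sketch of the space-stripping preprocessing described above: since
# ABC ignores whitespace in the tune body, deleting spaces keeps the BPE
# encoder from having to guess where they go.
def despace_abc(corpus: str) -> str:
    return "\n".join(line.replace(" ", "") for line in corpus.splitlines())

with open("folk.abc") as f:            # hypothetical corpus file
    cleaned = despace_abc(f.read())
with open("folk-spaceless.abc", "w") as f:
    f.write(cleaned)
```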
“Interacting with GPT–2 to Generate Controlled and Believable Musical Sequences in ABC Notation”, (2020-10-16):
Generating symbolic music with language models is a promising research area, with potential applications in automated music composition. Recent work shows that Transformer architectures can learn to generate compelling four-instrument scores from large MIDI datasets. In this paper, we re-train the small (117M) GPT-2 model with a large dataset in ABC notation, and generate samples of single-instrument folk music. Our BLEU and ROUGE based quantitative, and survey based qualitative, evaluations suggest that ABC notation is learned with syntactical and semantic correctness, and that samples contain robust and believable n-grams.
“Generating MIDI Music With GPT-2: Generating MIDI by converting to ABC and expanding the GPT-2 context window—works, if only just”, (2020-04-25):
To expand the ABC GPT-2 model to cover a wider variety of musical genres, I turn to the next-most compact widespread music encoding format: MIDI. There are hundreds of thousands of MIDIs which can be decompiled to ABC format, averaging ~10k BPEs—within GPT-2-117M’s feasible context window when trained on TPUs (which permit training of context windows up to 30k wide).
We compile the ABC from before and 2 large MIDI datasets, and convert to ABC, yielding ~453k usable ABC-MIDI musical files (~5.1GB of text). We trained January–April 2020 on our TPU swarm (with many interruptions), achieving a final loss of ~0.2 (underfit).
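The MIDI→ABC decompilation step could be done with the abcMIDI package’s midi2abc tool; a hedged sketch of such a batch conversion (the exact converter and directory layout are assumptions here, not a statement of the actual pipeline):

```python
# Batch-decompile MIDI files to ABC text with the abcMIDI `midi2abc` CLI,
# keeping only files that convert cleanly. Directory names are hypothetical.
import subprocess
from pathlib import Path

usable = 0
for midi in Path("midi-corpus").rglob("*.mid"):
    abc = midi.with_suffix(".abc")
    result = subprocess.run(["midi2abc", str(midi), "-o", str(abc)],
                            capture_output=True, text=True)
    if result.returncode == 0 and abc.exists():
        usable += 1                    # count files that decompiled successfully
print(f"usable ABC files: {usable}")
```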
Sampling from the final model is hit-or-miss, as it is prone to the likelihood repetition trap, and it generates instruments one-by-one, so it is common for instruments to be cut off or otherwise broken during sampling (indicating that sampling is increasingly a bigger problem than training for long-range sequence modeling). However, successful pieces are possible, and are musically far more diverse than the folk ABC corpus, with many pleasingly complex samples.
“Artbreeder”, (2019-09-09):
[Artbreeder is an interactive GAN generator website. Originally named “Ganbreeder” and providing only the 256px BigGAN generator, it now provides a variety of BigGAN & StyleGAN models, including the anime portrait StyleGAN model. (It is more general than the similar Waifu Labs, but my anime model is not as good.) Users can generate random samples and explore slight variants of them to gradually explore the “latent space” and find interesting images, but they can also edit images more directly, upload existing images to find the most similar image produced by the model, etc. A popular website, it has generated >56m images from September 2019 to January 2020.]
“Image Generation From Small Datasets via Batch Statistics Adaptation”, (2019-04-03):
Thanks to the recent development of deep generative models, it is becoming possible to generate high-quality images with both fidelity and diversity. However, the training of such generative models requires a large dataset. To reduce the amount of data required, we propose a new method for transferring prior knowledge of the pre-trained generator, which is trained with a large dataset, to a small dataset in a different domain. Using such prior knowledge, the model can generate images leveraging some common sense that cannot be acquired from a small dataset. In this work, we propose a novel method focusing on the parameters for batch statistics, scale and shift, of the hidden layers in the generator. By training only these parameters in a supervised manner, we achieved stable training of the generator, and our method can generate higher quality images compared to previous methods without collapsing, even when the dataset is small (~100). Our results show that the diversity of the filters acquired in the pre-trained generator is important for the performance on the target domain. Our method makes it possible to add a new class or domain to a pre-trained generator without disturbing the performance on the original domain.
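In PyTorch terms, the method amounts to freezing the pretrained generator except for the batch-norm affine parameters; a minimal sketch of that idea (not the paper’s code):

```python
# Freeze a pretrained generator except the batch-statistics (scale & shift)
# parameters of its normalization layers, then train only those.
import torch
import torch.nn as nn

def batch_stat_params(generator: nn.Module):
    for p in generator.parameters():
        p.requires_grad = False
    trainable = []
    for m in generator.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.weight.requires_grad = True      # scale (gamma)
            m.bias.requires_grad = True        # shift (beta)
            trainable += [m.weight, m.bias]
    return trainable

# optimizer = torch.optim.Adam(batch_stat_params(G), lr=1e-4)
```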
“Waifu Labs”, (2019-07-23):
[Waifu Labs is an interactive website for generating (1024px?) anime faces using a customized StyleGAN trained on Danbooru2018. Similar to Artbreeder, it supports face exploration and face editing, and at the end, a user can purchase prints of a particular face.]
We taught a world-class artificial intelligence how to draw anime. All the drawings you see were made by a non-human artist! Wild, right? It turns out machines love waifus almost as much as humans do. We proudly present the next chapter of human history: lit waifu commissions from the world's smartest AI artist. In less than 5 minutes, the artist learns your preferences to make the perfect waifu just for you.
“A Style-Based Generator Architecture for Generative Adversarial Networks”, (2018-12-12):
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.
“Danbooru2019 Portraits”, (2019):
Danbooru2019 Portraits is a dataset of n = 302,652 (16GB) 512px anime faces cropped from ‘solo’ SFW Danbooru2019 images in a relatively broad ‘portrait’ style encompassing necklines/ears/hats/etc rather than tightly focused on the face, upscaled to 512px as necessary, and low-quality images deleted by manual review using Discriminator ranking. This dataset has been used for creating TWDNE.
“Danbooru2019: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, (2015-12-15):
Deep learning for computer vision relies on large annotated datasets. Classification/categorization has benefited from the creation of ImageNet, which classifies 1m photos into 1000 categories. But classification/categorization is a coarse description of an image which limits application of classifiers, and there is no comparably large dataset of images with many tags or labels which would allow learning and detecting much richer information about images. Such a dataset would ideally be >1m images with at least 10 descriptive tags each which can be publicly distributed to all interested researchers, hobbyists, and organizations. There are currently no such public datasets, as ImageNet, Birds, Flowers, and MS COCO fall short either on image or tag count or restricted distribution. I suggest that the “image-boorus” be used. The image boorus are longstanding web databases which host large numbers of images which can be ‘tagged’ or labeled with an arbitrary number of textual descriptions; they were developed for and are most popular among fans of anime, who provide detailed annotations. The best known booru, with a focus on quality, is Danbooru. We provide a torrent/rsync mirror which contains ~3tb of 3.69m images with 108m tag instances (of 392k defined tags, ~29/image) covering Danbooru from 2005-05-24–2019-12-31 (final ID: #3,734,659), providing the image files & a JSON export of the metadata. We also provide a smaller torrent of SFW images downscaled to 512×512px JPGs (295GB; 2,828,400 images) for convenience. Our hope is that a Danbooru2019 dataset can be used for rich large-scale classification/tagging & learned embeddings, test out the transferability of existing computer vision techniques (primarily developed using photographs) to illustration/anime-style images, provide an archival backup for the Danbooru community, feed back metadata improvements & corrections, and serve as a testbed for advanced techniques such as conditional image generation or style transfer.
“Making Anime Faces With StyleGAN: Reversing StyleGAN To Control & Modify Images”, (2019-03-24):
Discussion of how to modify existing images with GANs. There are several possibilities: train another NN to turn an image back into the original encoding; run blackbox search on encodings, repeatedly tweaking it to approximate a target face; or the whitebox approach, directly backpropagating through the model from the image to the encoding while holding the model fixed. All of these have been implemented for StyleGAN, and a combination works best. There are even GUIs for editing StyleGAN anime faces!
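A minimal sketch of the whitebox approach, assuming a frozen generator G mapping a 512-dimensional latent to an image (names and dimensions are illustrative, not StyleGAN’s actual API):

```python
# Whitebox GAN inversion sketch: hold G fixed and optimize the latent z by
# backpropagating an image-reconstruction loss through the generator.
import torch
import torch.nn.functional as F

def invert(G, target, steps=500, lr=0.05, z_dim=512):
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(z), target)  # perceptual losses like LPIPS work better
        loss.backward()                  # gradients flow through frozen G into z
        opt.step()
    return z.detach()
```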
“Generating Anime Faces with BigGAN”, (2019-06-04):
I explore BigGAN, another recent GAN with SOTA results on the most complex image domain tackled by GANs so far, ImageNet. BigGAN’s capabilities come at a steep compute cost, however. I experiment with 128px ImageNet transfer learning (successful) with ~6 GPU-days, and from-scratch 256px anime portraits of 1000 characters on an 8×2080ti machine for a month (mixed results). My BigGAN results are good but compromised by practical problems with the released BigGAN code base. While BigGAN is not yet superior to StyleGAN for many purposes, BigGAN-like approaches may turn out to be necessary to scale to whole anime images.
“GPT-3 Weaknesses: Byte-Pair Encodings (BPEs)”, (2020-06-23):
Compared to GPT-2, GPT-3 improves performance on character-level tasks like rhyming, alliteration, punning, anagrams or permutations, acrostic poems, and arithmetic less than expected, despite being very good at many other closely-related kinds of writing like satire.
Why? A plausible explanation is an obscure technical detail: as a performance optimization, GPT does not see characters but sub-word-chunks called “byte-pair encodings” (BPEs). Because GPTs never see characters but opaque partial-words, which vary chaotically based on the specific word and even the surrounding context, they are unable to easily learn about character-level aspects of language, like similar spellings or sounds, and are forced to learn relationships much more indirectly, like by brute-force memorizing of pairs of words.
Some experiments with reformatting GPT-3’s poorest-performing tasks to avoid inconsistent BPE encodings of strings shows small to large performance gains, consistent with this theory.
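The inconsistency is easy to see directly with the GPT-2 tokenizer from the `transformers` library (the exact splits are implementation-dependent; the point is that they differ):

```python
# Demonstrate BPE's context-sensitivity: the same letters tokenize differently
# depending on casing and leading whitespace, hiding character-level structure
# like spelling or sound from the model.
from transformers import GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
for s in ["Bytes", " Bytes", "BYTES", "bytes"]:
    print(repr(s), tok.tokenize(s))
```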
“GPT-3: Prompts As Programming”, (2020-06-23):
The GPT-3 neural network is so large a model in terms of power and dataset that it exhibits qualitatively different behavior: you do not apply it to a fixed set of tasks which were in the training dataset, requiring retraining on additional data if one wants to handle a new task (as one would have to retrain GPT-2); instead, you interact with it, expressing any task in terms of natural language descriptions, requests, and examples, tweaking the prompt until it “understands” & meta-learns the new task based on the high-level abstractions it learned from the pretraining.
This is a rather different way of using a DL model, and it’s better to think of it as a new kind of programming, where the prompt is now a “program” which programs GPT-3 to do new things.
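A hedged illustration of what such a prompt-“program” looks like, using the standard few-shot translation idiom (the prompt text and `gpt3()` call are hypothetical):

```python
# A few-shot prompt as a "program": the examples themselves specify the task
# (English -> French), and the model is expected to continue the pattern.
prompt = """English: cheese
French: fromage

English: dog
French: chien

English: book
French:"""
# completion = gpt3(prompt)  # hypothetical API call; expected continuation: " livre"
```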
“Decompiler”, (2020-12-22):
A decompiler is a computer program that takes an executable file as input, and attempts to create a high level source file which can be recompiled successfully. It is therefore the opposite of a compiler, which takes a source file and makes an executable. Decompilers are usually unable to perfectly reconstruct the original source code, and as such, will frequently produce obfuscated code. Nonetheless, decompilers remain an important tool in the reverse engineering of computer software.
“Large Scale GAN Training for High Fidelity Natural Image Synthesis”, (2018-09-28):
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator’s input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128×128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Frechet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.
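The truncation trick itself is simple to sketch: draw the latent from a truncated normal by resampling out-of-range components, trading variety for fidelity as the threshold shrinks (illustrative, not the reference implementation):

```python
# Truncation-trick sampling sketch: resample any latent component whose
# magnitude exceeds `threshold`; smaller thresholds give higher-fidelity but
# less diverse samples.
import torch

def truncated_z(batch: int, dim: int, threshold: float = 0.5) -> torch.Tensor:
    z = torch.randn(batch, dim)
    while True:
        mask = z.abs() > threshold
        if not mask.any():
            return z
        z[mask] = torch.randn(int(mask.sum()))  # redraw out-of-range entries
```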
“Generating Anime Faces with StyleGAN: Using a trained Discriminator to Rank and Clean Data”, (2019-04-22):
The Discriminator of a GAN is trained to detect outliers or bad datapoints. So it can be used for cleaning the original dataset of aberrant samples. This works reasonably well and I obtained BigGAN/StyleGAN quality improvements by manually deleting the worst samples (typically badly-cropped or low-quality faces), but has peculiar behavior which indicates that the Discriminator is not learning anything equivalent to a “quality” score but may be doing some form of memorization of specific real datapoints. What does this mean for how GANs work?
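A minimal sketch of Discriminator ranking as a cleaning tool, assuming a trained D that maps an image batch to realness scores and a DataLoader yielding (images, paths) pairs (both assumptions, not the original code):

```python
# Score every image with the trained Discriminator and surface the
# lowest-scoring tail for manual review or deletion.
import torch

@torch.no_grad()
def worst_images(D, loader, k=100):
    scores, paths = [], []
    for images, batch_paths in loader:       # assumed (tensor, list-of-paths) batches
        scores.append(D(images).flatten().cpu())
        paths.extend(batch_paths)
    scores = torch.cat(scores)
    order = scores.argsort()                 # ascending: least "real"-looking first
    return [(paths[int(i)], float(scores[i])) for i in order[:k]]
```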