2018 news

Annual summary of 2018 gwern.net newsletters, selecting my best writings, the best 2018 links by topic, and the best books/movies/anime I saw in 2018, with some general discussion of the year. (newsletter)
created: 8 Dec 2018; modified: 13 Jan 2019; status: finished; confidence: log; importance: 0

This is the 2018 summary edition of the gwern.net newsletter (archives), summarizing the best of the monthly 2018 newsletters:

  1. January
  2. February
  3. March
  4. April
  5. May
  6. June
  7. July
  8. August
  9. September
  10. October
  11. November
  12. December

Previous annual newsletters: 2017, 2016, 2015.



  1. Danbooru2017: a new dataset of 2.94m anime images (1.9tb) with 77.5m descriptive tags
  2. Embryo selection work: Overview of major current approaches for complex-trait genetic engineering, FAQ, multi-stage selection, chromosome/gamete selection, optimal search of batches, & robustness to error in utility weights
  3. Laws of Tech: Commoditize Your Complement
  4. SMPY bibliography
  5. reviews:

  6. How many computers are in your computer?
  7. On the Existence of Powerful Natural Languages

TODO update: Site traffic (July 2017-January 2018) was up: 326,852 page-views by 155,532 unique users.



Overall, 2018 was much like 2017 was, but more so. In all of AI, genetics, VR, Bitcoin, and general culture/politics, the trends of 2017 continued through 2018 or even accelerated.


AI: as I hoped in 2016, 2017 saw a re-emergence of model-based RL, with various deep approaches to learning reasoning, meta-RL, and environment models. Using relational logics, planning over internal models, and zero/few-shot learning are no longer things deep learning can’t do. My selection for the single biggest breakthrough of the year was AlphaGo racking up a second major intellectual victory with the demonstration by Zero that a simple expert-iteration algorithm (with MCTS as the expert) not only solves the long-standing problem of NN self-play being wildly unstable (dating back to failed attempts to extend TD-Gammon to non-backgammon domains in the 1990s), but also learns better than the complicated human-initialized AlphaGos - in both wallclock time & final strength - which is deeply humbling. 2000 years of study and tens of millions of active players, and that’s all it takes to surpass the best human Go players ever in the supposedly uniquely human domain of subtle global pattern recognition. Not to mention chess. (Silver et al 2017a, Silver et al 2017b.) Expert iteration is an intriguingly general and underused design pattern which I think may prove useful, especially if people remember that it is not limited to two-player games but is a general method for solving any MDP. The second most notable development was GAN work: Wasserstein GAN losses (Arjovsky et al 2017) considerably ameliorated the instability of GANs across architectures, and although WGANs can still diverge or fail to learn, they are not as much of a black art as the original DCGANs tended to be. This probably helped with later GAN work in 2017, such as the invention of the CycleGAN architecture (Zhu et al 2017), which accomplishes magical & bizarre kinds of learning - such as learning, from unpaired horse & zebra images, to turn an arbitrary horse image into a zebra & vice-versa, or your face into a car or a bowl of ramen soup. Who ordered that?
I didn’t, but it’s delicious & hilarious anyway, and suggests that GANs really will be important in unsupervised learning because they appear to be learning a great deal about their domains. Additional demonstrations, like translating between human languages given only monolingual corpora, merely emphasize that lurking power - I still feel that CycleGAN should not work, much less high-quality neural translation without any translation pairs, but it does. The path to larger-scale photorealistic GANs was discovered by Nvidia’s ProGAN paper (Karras et al 2017): essentially, StackGAN’s approach of layering several incrementally-trained GANs as upscalers does work (as I expected), but you need much more GPU-compute to reach 1024x1024px photos, and it helps if each new upscaling GAN is only gradually blended in, to avoid the random initialization destroying everything previously learned (analogous to transfer learning needing low learning rates or frozen layers). Time will tell whether the ProGAN approach is a one-trick pony limited to photos. Finally, GANs started turning up as useful components in semi-supervised learning in the GAIL paradigm (Ho & Ermon 2016) for deep RL robotics. I expect GANs are still a while off from being productized or truly critical for anything - they remain a solution in search of a problem, but less so than I commented last year. Indeed, from AlphaGo to GANs, 2017 was the year of deep RL (subreddit traffic octupled). Papers tumbled out constantly, accompanied by ambitious commercial moves: Jeff Dean laid out a vision for using NNs/deep RL essentially everywhere inside Google’s software stack, Google began full self-driving services in Phoenix, and noted researchers like Pieter Abbeel founded robotics startups betting that deep RL has finally cracked imitation & few-shot learning.
I can only briefly highlight, in deep RL, continued work on meta-RL & neural-net architecture search with fast weights, relational reasoning & logic modules, zero/few-shot learning, deep environment models (critical for planning), and robotics progress in sample efficiency/imitation learning/model-based & off-policy learning, in addition to the integration of GANs à la GAIL. What will happen if every year from now on sees as much progress in deep reinforcement learning as we saw in 2017? (Suppose deep learning ultimately does lead to a Singularity; how would it look any different than it does now?) One thing missing from 2017 for me was the use of very large NNs with expert mixtures, synthetic gradients, or other techniques; in retrospect, this may reflect hardware limitations, as non-Googlers increasingly hit the limits of what can be iterated on reasonably quickly using just 1080 Tis or P100s. So I am intrigued by the increasing availability of Google’s second-generation TPUs (which can do training) and by discussions of multiple maturing NN-accelerator startups which might break Nvidia’s costly monopoly and offer hundreds of teraflops or petaflops at non-AmaGoogBookSoft researcher/hobbyist budgets.
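To make the expert-iteration design pattern concrete, here is a toy sketch (not AlphaZero: the chain MDP, the rollout-based "expert" standing in for MCTS, and all parameters are invented for illustration). An "expert" improves on the current apprentice policy by one-step lookahead plus rollouts, and the apprentice then imitates the expert - repeated until the policy is optimal:

```python
import random

# Toy expert iteration on a chain MDP: states 0..N, actions -1/+1,
# reward 1 on reaching state N. The "expert" is a one-step lookahead
# plus rollouts of the current apprentice policy (a cheap stand-in for
# MCTS); the apprentice then imitates the expert's choices.

N = 10

def step(s, a):
    """Deterministic transition along the chain; reward only at the goal."""
    s2 = max(0, min(N, s + a))
    return s2, (1.0 if s2 == N else 0.0), s2 == N

def q_value(s, a, policy, gamma=0.9, depth=30):
    """Discounted return of taking action `a` at `s`, then following `policy`."""
    s, r, done = step(s, a)
    total, disc = r, 1.0
    while not done and depth > 0:
        disc *= gamma
        s, r, done = step(s, policy[s])
        total += disc * r
        depth -= 1
    return total

policy = {s: random.choice([-1, 1]) for s in range(N + 1)}  # random apprentice

for _ in range(N):  # expert-iteration rounds
    expert = {s: max([1, -1], key=lambda a: q_value(s, a, policy))
              for s in range(N)}
    policy.update(expert)  # imitation step: apprentice copies the expert

assert all(policy[s] == 1 for s in range(N))  # policy now heads for the goal
```

Each round, the lookahead expert is strictly stronger than the raw policy near the goal, so the region of correct behavior propagates backward one state per round - the same bootstrapping that lets MCTS-as-expert train a Go policy from scratch.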

The rise of DL has been the fall of Moravec’s paradox. No one is the least bit surprised these days when a computer masters some complex symbolic task like chess or Go; we are surprised only by the details, like it happening about 10 years before many would’ve predicted, or the Go player being trainable overnight in wallclock time, or the same architecture applying with minimal modification to yield a top chess engine. For all the fuss over AlphaGo, no one paying attention was really surprised. If you went back 10 years and told someone that, by the way, by 2030 both Go and Arimaa would be played at a human level by an AI, they’d shrug. People are much more surprised to see Google Waymo cars driving around fairly creditably, or photorealistic faces being generated, or totally realistic voices being synthesized. The progress in robotics has also been exciting to anyone paying attention to the space: the DRL approaches are getting ever more capable, sample-efficient, and good at imitation. I don’t know how many blue-collar workers they will put out of work - even if the software is solved, the robotic hardware is still extremely expensive! But factories will be salivating over them, I’m sure. (The future of self-driving cars is in considerably more doubt.) A standard-issue minimum-wage Homo sapiens worker-unit has a lot of advantages. I expect there will be a lot of blue-collar jobs for a long time to come, for those who want them. But they’ll be increasingly crummy jobs. This will make a lot of people very unhappy. I think of Turchin’s elite-overproduction concept - how much of political strife now is simply that we’ve overeducated so many people, with degrees that were almost entirely signaling-based and not of intrinsic value in the real world, for whom no slots were available, and now their expectations are colliding with reality?
In political science, they say revolutions happen not when things are going badly, but when things are going not as well as everyone expected.

We’re at an interesting point - as LeCun put it, I think, anything a human can do with <1s of thought, deep learning can now do, while older symbolic methods can outperform humans in a number of domains requiring >>1s of thought. As NNs get bigger and the training methods, architectures, and datasets are refined, the <1s will gradually expand. So there’s a pincer movement going on, and sometimes hybrid approaches can crack a human redoubt (eg AlphaGo combined hoary tree search, for long-term >>1s thought, with CNNs for the intuitive, instantaneous <1s gut-reaction evaluation of a board, and together they could learn to be superhuman).

As long as what humans do with <1s of thought was out of reach, as long as the simple primitives of vision and movement couldn’t be handled, the symbol grounding and frame problems were hopeless. How does your design turn a photo of a cat into the symbol CAT which is useful for inference/planning/learning? But now we have a way to reliably go from chaotic real-world data to rich semantic - even linear! - numeric encodings like vector embeddings.

That’s why people are so excited about the future of DL and are daring to talk about AGI looking possible.

The biggest disappointment in AI was by far self-driving cars. 2018 was going to be the year of self-driving cars, as Waymo promised all & sundry a full public launch and the start of scaling out, and every report of expensive deals & investments seemed to presage launch; but the launch kept not happening, and then the Uber pedestrian fatality happened. This fatality was the result of a cascade of internal decisions & pressure to put an unstable, erratic, known-dangerous self-driving car on the road, deliberately disable its emergency braking, provide no alerts to the safety drivers, and then halve the number of safety drivers, resulting in a death under what should have been near-ideal circumstances - indeed, the software detected the pedestrian long in advance and would have braked had it been allowed (Preliminary NTSB Report: Highway HWY18MH010); particularly egregious given Uber’s past incidents like covering up running a red light. Comparisons to Challenger come to mind. The incident should not have affected perception of self-driving cars: the fact that a far-from-SOTA system can be unsafe when its brakes are deliberately disabled & it cannot avoid foreseen accidents tells us nothing about the safety of the best self-driving cars. That self-driving cars are dangerous when done very badly should not come as news to anyone or change any beliefs, but it blackened perceptions of self-driving cars nevertheless. Perhaps because of it, the promised Waymo launch was delayed all the way to December, and then it was purely a paper launch, with no discernible difference from its previous small-scale operations.
(Which leads me to question the credible buildup of vehicles & personnel & deals beforehand: if a paper launch was what was always intended, why bother? Or did the Uber incident trigger an internal review, a major re-evaluation of how capable & safe their system really is, and a resort to a paper launch to save face?)

TODO: subreddit traffic stats update https://www.reddit.com/r/reinforcementlearning/comments/7noder/meta_rrl_subreddit_traffic_substantial_increase/

Genetics in 2017 was a straight-line continuation of 2016: the UKBB dataset came online and is fully armed & operational, with exomes now following (and whole-genomes soon), resulting in the typical flurries of papers on everything which is heritable (which is everything). Genetic engineering had a banner year between CRISPR and older methods in the pipeline - it seemed like every week there was a new mouse or human trial curing something or other, to the point where I lost track, and the NYT began reporting on clinical trials being delayed by a lack of virus-manufacturing capacity. (A good problem to have!) Genome synthesis continues to greatly concern me, but nothing newsworthy happened in 2017 other than, presumably, continuing to get cheaper on schedule. Intelligence research did not deliver any particularly amazing results, as the SSGAC paper has apparently been delayed to 2018 (with a glimpse in Plomin & von Stumm 2018), but we saw two critical methodological improvements which I expect to yield fruit in 2017-2018: first, as genetic-correlation researchers have noted for years, genetic correlations should be able to boost power considerably by correcting for measurement error & increasing effective sample size through appropriate combination of polygenic scores, and MTAG demonstrates this works well for intelligence (Hill et al 2017b increases the PGS to ~7% & Hill et al 2018 to ~10%); second, Hsu’s lasso predictions were proven true by Lello et al 2017, demonstrating the creation of a polygenic score explaining most SNP heritability/predicting 40% of height variance. Using these two simultaneously with SSGAC & other datasets ought to boost IQ PGSes to >10% and possibly much more.
Perhaps the most notable single development was the resolution of the long-standing dysgenics question using molecular genetics: has the demographic transition in at least some Western countries led to decreases in the genetic potential for intelligence (mean polygenic score), as suggested by most but not all phenotypic analyses of intelligence/education/fertility? Yes: in Iceland/the USA/the UK, dysgenics has indeed done so on a meaningful scale, as shown by straightforward calculations of mean polygenic score by birth decade & genetic correlations. More interestingly, the increasing availability of ancient DNA allows preliminary analyses of how polygenic scores change over time: over tens of thousands of years, human intelligence & disease traits appear to have been slowly selected against (consistent with most genetic variants being harmful & under purifying selection), but that trend reversed at some point relatively recently.
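The intuition behind the genetic-correlation power boost can be illustrated with a tiny simulation (a toy sketch, not the MTAG estimator itself; all effect sizes & noise levels are made up): two noisy polygenic scores tapping the same underlying genetic value, even naively averaged, predict the phenotype better than either score alone, because the independent measurement errors partially cancel.

```python
import random

# Toy sketch of why genetically-correlated traits boost PGS power:
# two noisy scores of the same underlying genetic value, averaged,
# beat either alone. Numbers are invented for illustration.

rng = random.Random(0)
n = 50_000
g = [rng.gauss(0, 1) for _ in range(n)]             # true genetic values
pheno = [gi + rng.gauss(0, 1) for gi in g]          # phenotype = genetics + environment
pgs_a = [gi + 2 * rng.gauss(0, 1) for gi in g]      # noisy PGS from trait A's GWAS
pgs_b = [gi + 2 * rng.gauss(0, 1) for gi in g]      # noisy PGS from correlated trait B
combo = [(a + b) / 2 for a, b in zip(pgs_a, pgs_b)] # naive equal-weight combination

def r2(x, y):
    """Squared Pearson correlation, computed by hand (stdlib only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov * cov / (vx * vy)

print(r2(pgs_a, pheno), r2(combo, pheno))  # the combined score explains more variance
```

MTAG does the weighting properly, using the estimated genetic correlations & heritabilities rather than a naive average, but the effective-sample-size gain is the same mechanism.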

Clinical predictions are reaching a point where they justify further investment all on their own. It’s striking how the 3 biggest intellectual trends in my life to date - cryptography for cryptocurrency, behavioral genetics, & AI - have all been due in considerable part to changes in incentives & business models (currencies/ICOs, database licensing like 23andMe+universal healthcare like UKBB+NHS, & online advertising respectively).

The oncoming datasets keep getting more enormous; you’ve seen Razib’s projections about consumer DTC, extrapolating from announced sales numbers, and there are various announcements like the UKBB aiming for 5 million whole genomes, which would’ve been bonkers even a few years ago. (Why now? Prices have fallen enough. Perhaps an enterprising journalist could dig into why Illumina could keep WGS prices so high for so long. :) The PGSes improved a good deal in predictive power, just from accumulating data and some improvements in methodology, like MTAG and other genetic-correlation-based methods. I’ve been saying that for a while, but it’s nice to see it happening. (The SSGAC paper wasn’t quite as good as I expected, but then Allegrini fixed that.) I liked https://www.gwern.net/docs/genetics/selection/2018-torkamani.pdf and https://www.nejm.org/doi/full/10.1056/NEJMoa1804710 on the topic of the increasing clinical utility of various kinds of genetic predictors. The promised land is finally being reached, you might say; GenPred officially launching their embryo-selection service underscores that. I’m glad I was able to visit them and talk a little about things. Simple multi-trait embryo selection is pretty much done intellectually, IMO. (The short-term question now is what can be done with oocyte harvesting & maturation; medium-term, gametogenesis for IES; and long-term, genome synthesis.)

The He Jiankui affair is quite interesting, of course, and I’m particularly intrigued by the apparent Chinese crackdown/backlash - what’s up with that? Still, it is more or less a sideshow: bad for medical geneticists, of course, but CRISPR is largely irrelevant to complex traits for reasons I’ve explained before, and crackdowns on it won’t affect PGD/IES/synthesis.

Probably the most interesting area in terms of fundamental work, for me, was IES: considerable progress on both mouse & human gametogenesis and stem-cell control.

VR continued steady gradual growth; with no major new hardware releases (Oculus Go doesn’t count), there was not much to tell beyond the Steam statistics or Sony announcing PSVR sales >3m. (I did have an opportunity to play the very popular Beat Saber with my mother & sister; all of us enjoyed it.) More interesting will be the 2019 launch of Oculus Quest which comes close to the hypothetical mass-consumer breakthrough VR headset: mobile/no wires, with a resolution boost and full hand/position tracking, in a reasonably priced package, with a promised large library of established VR games ported to it. It lacks foveated rendering or retina resolution, but otherwise seems like a major upgrade in terms of mass appeal; if it continues to eke out modest sales, that will be consistent with the narrative that VR is on the long slow slog adoption path similar to early PCs or the Internet (instantly appealing and clearly the future to the early adopters who try it, but still taking decades to achieve any mass penetration) rather than post-iPhone smartphones.

Bitcoin: the long slide from the bubble continued, to my considerable schadenfreude (2017, but more so…). The most interesting story of the year for me was the reasonably successful launch of the long-awaited Augur prediction markets. Otherwise, not much to remark on.

A short note on politics: I maintain my 2017 comments (but more so…). For all the emotion invested in it and the continued Girardian scapegoating/backlash, Donald Trump’s presidency remains overrated in importance, despite his ability to do substantial damage like launching trade wars; blunders like his North Korea policy merely continue a long history of ineffective policy & were largely inevitable once the South Korean population chose to elect Moon Jae-in (they have made their bed & must lie in it). Let’s try to focus more on long-term issues such as global economic growth or genetic engineering.



  1. McNamara’s Folly: The Use of Low-IQ Troops in the Vietnam War, Gregory 2015 (review; see also Low Aptitude Men in the Military: Who Profits, Who Pays?, Laurence & Ramsberger 1991.)
  2. Bad Blood: Secrets and Lies in a Silicon Valley Startup, Carreyrou 2018 (review)
  3. The Vaccinators: Smallpox, Medical Knowledge, and the Opening of Japan, Jannetta 2007 (TODO: reread for note-taking, expand review, and put on GoodReads)
  4. Like Engendr’ing Like: Heredity and Animal Breeding in Early Modern England, Russell 1986 (review)
  5. Cat Sense: How the New Feline Science Can Make You a Better Friend to Your Pet, Bradshaw 2013 (review)
  6. Strategic Computing: DARPA and the Quest for Machine Intelligence, 1983-1993, Roland & Shiman 2002 (review)
  7. The Operations Evaluation Group: A History of Naval Operations Analysis, Tidman 1984 (review)


  1. Fujiwara Teika’s Hundred-Poem Sequence of the Shōji Era, 1200: A Complete Translation, with Introduction and Commentary, Brower 1978 (review)


Nonfiction movies:

  1. Kedi, 2016 (review)



  1. A Quiet Place (review)
  2. Shadow of the Vampire (2000)
  3. Conan the Barbarian (review)
  4. Ant-Man and the Wasp (review)


  1. My Little Pony: Friendship is Magic, seasons 1-8 (review)
  2. Kurozuka (review)