Annual summary of 2019 gwern.net newsletters, selecting my best writings, the best 2019 links by topic, and the best books/movies/anime I saw in 2019, with some general discussion of the year.
source; created: 21 Nov 2019; modified: 26 Feb 2020; status: notes; confidence: log; importance: 0
- end of year summary
2019 went well, with much interesting news and several stimulating trips. My 2019 writings included:
- “How To Generate Faces With StyleGAN”
- “Finetuning the GPT-2-small Transformer for English Poetry Generation”
- Danbooru2018 released: a dataset of 3.33m anime images (2.5tb) with 92.7m descriptive tags
- “How Should We Critique Research?”
- “One Man’s Modus Ponens…”
- “Timing Technology: Lessons from the Media Lab”
- “Everything Is Correlated”
- “On Seeing Through ‘On Seeing Through: A Unified Theory’: A Unified Theory”
- “Dog Cloning For Special Forces: Breed All You Can Breed”/NBA recruiting using height polygenic scores
- Rubrication Design Examples
I’m particularly proud of the technical improvements to the gwern.net site design this year: along with a host of minor typographic improvements & performance optimizations, Inflation.hs enables automatic inflation adjustment of currency amounts (a feature I’ve long felt would make documents far less misleading), the link annotations/popups (popups.js) are a major usability enhancement few sites have, sidenotes.js eliminates the frustration of footnotes, collapsible sections help tame long writings by avoiding the need for hiding code or relegating material to appendices, and link icons & drop caps & epigraphs are just pretty. While changes are never unanimously received, we have received many compliments on the overall design, and are quite pleased with it now.
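As a rough illustration of what automatic inflation adjustment involves, here is a minimal Python sketch. It is not the site's Inflation.hs (which is Haskell and whose internals are not described here); the function names and the price-index table are hypothetical, and the index values are made-up placeholders rather than real CPI figures.

```python
# Hypothetical price-index table: placeholder values, NOT real CPI data.
PRICE_INDEX = {2000: 100.0, 2010: 150.0, 2019: 200.0}


def adjust_for_inflation(amount, from_year, to_year, index=PRICE_INDEX):
    """Rescale a nominal amount by the ratio of the two years' index levels."""
    if from_year not in index or to_year not in index:
        raise KeyError("no index value for one of the requested years")
    return amount * index[to_year] / index[from_year]


def annotate(amount, from_year, to_year, index=PRICE_INDEX):
    """Render a nominal amount together with its adjusted equivalent."""
    adjusted = adjust_for_inflation(amount, from_year, to_year, index)
    return (f"${amount:,.2f} in {from_year} "
            f"(about ${adjusted:,.2f} in {to_year} dollars)")


print(annotate(1000, 2010, 2019))
```

The point of such a feature is that a reader sees the current-dollar equivalent next to every historical figure, so the adjustment stays up to date whenever the index table is refreshed.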
Site traffic (more detailed breakdown) was again up as compared with the year before: 2019 saw 1,361,195 pageviews by 671,774 unique visitors (lifetime totals: 7,988,362 pageviews by 3,808,776 users). I benefited primarily from TWDNE, although the numbers are somewhat inflated by hosting a number of popular archived pages from DeepDotWeb/OKCupid/Rotten.com, which I put Google Analytics on to keep track of referrals.
2019 was a fun year.
AI: 2019 was a great year for hobbyists and fun generative projects like mine, thanks to spinoffs and especially pretrained models. How much more boring it would have been without the GPT-2 or StyleGAN models! (There was irritatingly little meaningful news about self-driving cars.) More seriously, the theme of 2019 was scaling. Whether GPT-2 or StyleGAN 1/2, or the scaling papers, or AlphaStar, or MuZero, 2019 demonstrated the power of scaling up models, compute, data, and tasks; it is no accident that the most extensively discussed editorial on DL/DRL was Rich Sutton’s “The Bitter Lesson”. For all the critics’ carping and goalpost-moving, scaling is working, especially as we go far past the regimes where they assured us years ago that mere size and compute would break down and we would have to use more elegant and intelligent methods like Bayesian program synthesis. Instead, every year it looks increasingly like the strong connectionist thesis is correct: much like humans & evolution, AGI can be reached by training an extremely large number of relatively simple units end-to-end for a long time on a wide variety of multimodal tasks, and it will recursively self-improve meta-learning efficient internal structures & algorithms optimal for the real world which learns how to generalize, reason, self-modify with internal learned reward proxies & optimization algorithms, and do zero/few-shot learning bootstrapped purely from the ultimate reward signals—without requiring extensive hand-engineering, hardwired specialized modules designed to support symbolic reasoning, completely new paradigms of computing hardware etc. (eg Clune 2019).
2019 for genetics saw more progress on genetic-engineering topics than GWASes; the GWASes that did come out were largely confirmatory—no one really needed more SES GWASes from Hill et al, or confirmation that the IQ GWASes work and that brain size is in fact causal for intelligence, and while the recovery of full height/BMI trait heritability from WGS is a strong endorsement of the long-term value of WGS, the transition from limited SNP data to WGS is foreordained (especially since WGS costs appear to finally be dropping again after their long stagnation). Even embryo selection saw greater mainstream acceptance, with a paper in Cell concluding (for the crudest possible simple embryo selection methods) that, fortunately, the glass was half-empty and need not be feared overmuch. More interesting were the notable events along all axes of post-simple-embryo-selection strategies: Genomic Prediction claimed to have done the first embryo selection on multiple PGSes, genome synthesis saw E. coli achieved, multiple promising post-CRISPR or mass CRISPR editing methods were announced, gene drive progressed to mammals, gametogenesis saw progress (including at least two human fertility startups I know of), serious proposals for human germline CRISPR editing are being made by a Russian (among others), and while He Jiankui was imprisoned by a secret court there otherwise do not appear to have been serious repercussions such as reports of the 3 CRISPR babies being harmed or an ‘indefinite moratorium’ (ie. ban). Thus, we saw good progress towards the enabling technologies for massive embryo selection (breaking the egg bottleneck by allowing generation of hundreds or thousands of embryos and thus multiple-SD gains from selection), IES (Iterated Embryo Selection), massive embryo editing (CRISPR or derivatives), and genome synthesis.
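The claim that hundreds or thousands of embryos would allow "multiple-SD gains" can be checked with a small order-statistics calculation. The sketch below is an idealized assumption, not a model from the text: it treats each embryo's score as an independent standard normal draw and computes the expected value of the best of n. The result is in SDs of the score among the candidate embryos, and it ignores predictor accuracy, reduced variance among siblings, and embryo loss, all of which shrink real-world gains. The function names are illustrative.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_max(n, lo=-10.0, hi=10.0, steps=20000):
    """E[max of n iid standard normals], by trapezoid integration of
    x * n * pdf(x) * cdf(x)**(n - 1), the density of the maximum times x."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
    return total * h


for n in (10, 100, 1000):
    print(n, round(expected_max(n), 2))
```

Under these assumptions the expected best-of-n is about 1.5 SD for 10 candidates, 2.5 SD for 100, and 3.2 SD for 1,000: gains grow only slowly with n, which is why the bottleneck on embryo count matters so much.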
VR’s 2019 launch of Oculus Quest proved quite successful, selling out occasionally well after launch, and appears to appeal to normal people, with even hardcore VR fans acknowledging how much they appreciate the convenience of a single integrated unit. Unfortunately… it is not successful enough. There is no VR wave. Selling out may have as much to do with Facebook not investing much in manufacturing as with strong demand. Worse, there is still no killer app beyond Beat Saber. The hardware is adequate to the job, the price is nugatory, the experience unparalleled, but there is no stampede into VR. So it seems VR is doomed to the long slow multi-decade adoption slog like that of PCs: it’s too new, too different, and we’re still not sure what to do with it. One day, it would not be surprising if most people have a VR headset, but that day is a long way away.
Bitcoin: little of note. Darknet markets proved unusually interesting: Dream Market, the longest-lived DNM ever, finally expired; Reddit betrayed its users by wholesale purging of subreddits, including /r/DarkNetMarkets, causing me a great deal of grief; and most shockingly, DeepDotWeb was raided by the FBI over affiliate commissions it received from DNMs (apparently into the tens of millions of dollars—you’d’ve thought they’d have taken down those hideous ads all over DDW if the affiliate links were so profitable…).
As the end of a decade is a traditional time to look back, I thought I’d try my own version of Scott Alexander’s essay “What Intellectual Progress Did I Make In The 2010s?”, where he considered how his ideas/beliefs evolved over the past decade of blogging.
I’m not given to introspection, so I was surprised to think back to 2010 and realize how far I’ve come in every way—even listing them objectively would sound insufferably conceited, so I won’t. To thank some of the people who helped me, directly or indirectly, risks (to paraphrase Borges) repudiating my debts to the others; but nevertheless, I should at least thank the following: Satoshi Nakamoto, kiba, Ross Ulbricht, Seth Roberts, Luke Muehlhauser, Nava Whiteford, Patrick McKenzie, SDr, ModafinilCat, Steve Hsu, Jack Conte, Said Achmiz, Patrick & John Collison, and Shawn Presser.
2010 was perhaps the worst of times but also best of times, because it was the year the future rebooted.
My personal circumstances were less than ideal. Wikipedia’s deletionist involution had intensified to the point where everyone could see it both on the ground and from the global statistics, and it was becoming clear the cultural shift was irreversible. Genetic engineering continued its grindingly-slow progress towards some day doing something, while in complex trait/behavioral genetics research, the early GWASes provided no useful polygenic scores but did reveal the full measure of the candidate-gene debacle: it wasn’t a minor methodological issue, but in some fields the false-positive rate approached 100%, and tens of thousands of papers were, or were based on, absolutely worthless research. AI/machine learning were exhausted, with state-of-the-art typically some sort of hand-engineered solution or complicated incremental tweaks to something crude like an SVM or random forest, with no even slightly viable path to interestingly powerful systems (much less AGI). At best, one could say that the stagnant backwater of AI, neural network research, showed a few interesting results, and the Schmidhuber lab had won a few obscure contests, and might finally prove useful for something over the coming years beyond playing backgammon or reading ZIP codes, assuming anyone could avoid eye-rolling at Schmidhuber’s website(s) long enough to read as far as his projections that computing power and NN performance trends would both continue (“what NN trends‽”); I noted to my surprise that Shane Legg had left an apparently successful academic career to launch a new startup called ‘DeepMind’, to do something with NNs. (Good for him, I thought, maybe they’ll get acquihired for a few million bucks by some corporation needing a small improvement from exotic ML variants, and then with a financial cushion, he can get back to his real work. After all, it’s not like connectionism ever worked before…) And reinforcement learning, as far as anyone need be concerned, didn’t work. 
Cryptography, as far as anyone need be concerned, consisted of the art of (not) protecting your Internet connection. Mainstream technology was obsessed with the mobile shift, of little interest or value to me, and a shift that came with severe drawbacks, such as a massive migration away from FLOSS back to proprietary technologies & walled gardens & ‘services’. The future had arrived, and had brought little with it besides 140 characters.
But also in 2010, disillusioned with writing on Wikipedia, I registered gwern.net. Some geneticists had begun buzzing over GCTA and the early GWASes’ polygenic scores, which indicated that, candidate-genes notwithstanding, the genes were there to be found, and simple power analysis implied there were simply so many of them that one would need samples of tens of thousands (no, hundreds of thousands) of people to start finding them, which sounded daunting, but fortunately the super-exponential curve in sequencing costs ensured that those samples would become available in mere years, through things like something called ‘The UK BioBank’. (Told of this early on, I was skeptical: that seemed like an awful lot of faith in extrapolating a trend, and when did mega-projects like that ever work out?) Even more obscurely, some microbial geneticists noted that an odd protein associated with ‘CRISPR’ regions seemed to be part of a sort of bacterial immune system and could cut DNA effectively. Connectionism suddenly started working, and the nascent compute and NN trends did continue for the next decade, with Alexnet overnight changing computer vision, after which NNs began rapidly expanding and colonizing adjacent fields, with Transformers+pretraining recently claiming the scalp of the great holdout, natural language processing—nous sommes tous connexionnistes. At the same time, Wikileaks reached its high-water mark, helping inspire Edward Snowden, Bitcoin was gaining traction (I would hear of it and start looking into it in late 2010), and Ross Ulbricht was making Silk Road 1 (which he would launch in January 2011). In VR, Valve had returned to tinkering with it, and a young Palmer Luckey had begun to play with using smartphone screens as cheap high-speed high-res small displays for headsets. As dominant as mobile was, 2010 was also close to the peak (eg the launch of Instagram): a mobile strategy could now be taken for granted, and infrastructure & practices had begun to catch up, so the gold rush was over and attention could refocus elsewhere.
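The "simple power analysis" mentioned above can be sketched in a few lines. This is a textbook normal approximation, not a calculation taken from the text: for a variant explaining a fraction r² of trait variance, the test statistic is roughly normal with mean r√n, so the required sample size is about (z_α + z_β)² / r². The threshold 5e-8 is the conventional genome-wide significance level; the effect sizes passed in below are hypothetical, chosen only to show the order of magnitude.

```python
from statistics import NormalDist


def required_sample_size(variance_explained, alpha=5e-8, power=0.80):
    """Approximate n needed to detect a variant explaining
    `variance_explained` of trait variance, two-sided test at `alpha`."""
    z = NormalDist()                      # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for significance
    z_power = z.inv_cdf(power)            # quantile for the desired power
    return (z_alpha + z_power) ** 2 / variance_explained


# Hypothetical per-variant effect sizes (fraction of variance explained).
for r2 in (1e-3, 1e-4):
    print(f"r^2 = {r2:g}: n of roughly {required_sample_size(r2):,.0f}")
```

With a variant explaining 0.01% of variance (an assumed figure), the approximation gives a required sample of roughly 400,000, which is how a highly polygenic trait pushes the requirement from tens of thousands into hundreds of thousands.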
Progression of interests: DNB -> QS + IQ -> statistics + meta-analysis + Replication Crisis -> Bitcoin + darknet markets -> behavioral genetics -> decision theory -> DL+DRL
- link rot/fulltext/leprechauns
- stainless steel laws
- correlation ≠ causation
- everything is correlated
- methods > datasets > results
- Self-blinding: we don’t know how to measure what we want
- Replication Crisis
- Decision theory
- Users are lazy
- NNs are lazy
- NNs are overparameterized
- We don’t know how to do NNs
- Methods need to scale and be simple
- NNs are dirt cheap
- image/status: weight/appearance, gwern.net & Said Achmiz, AI risk & Bostrom/Musk
- Carmen (review)
- Akhnaten (review)
- Stalker (review)
- Freaks, 1932 (review)
- Die Walküre (review)
- Manon (review)
- Madama Butterfly (review)
- Invasion of the Body Snatchers, 1978 (review)
- Rurouni Kenshin 2012/Rurouni Kenshin: Kyoto Inferno 2014/Rurouni Kenshin: The Legend Ends 2014 (review)