February 2018 news

February 2018 gwern.net newsletter with 3 new essays and links on genetics/AI/psychology/economics, 1 book review, 1 movie review, and 7 pieces of music.
28 Jan 2018–20 May 2020 · finished · certainty: log · importance: 0

This is the February 2018 edition of the gwern.net newsletter; previous, January 2018. This is a summary of the revision-history RSS feed, overlapping with my Changelog & /r/gwern; brought to you by my donors on Patreon. (The March issue may be delayed as I will be traveling in San Francisco 1–14 March.)

  • AlphaGo (2017 documentary; overall, OK. Glossy and light on technical detail, it instead focuses on following the principal figures around, starting roughly from when Fan Hui was invited in to play the AG1 prototype & lost. Having read the AG papers repeatedly and watched some of the matches & commentary live, there wasn’t much new for me, but it was somewhat interesting to see behind the scenes. The screenshots of DM workstations are accidentally a bit revealing: AG1 was indeed Torch-based, and enough of the code is shown that a DRL expert could probably deduce the entire AG1 architecture—the variables, directories, and NN layers clearly point at an imitation-trained CNN with some sort of policy-gradient finetuning. Perhaps the most interesting behind-the-scenes aspect is the worry about “delusions”, as Silver calls them in the documentary and then in the Zero AmA. As badly as AG1 crushed Sedol, the delusions made it a closer-run thing than simply comparing move strength implies. The discussion is also revealing: at one point they debate whether to use version 18 or version 19, which was still training; 19 is vetoed, because the training and test suite would take dangerously long. This clearly implies training from scratch, and, keeping in mind that a single AG1 is estimated at 3+ GPU-years, demonstrates just how much computing power DeepMind can pour into a project, as well as the “hardware overhang” of NNs—Zero may run on only 4 TPUs and train in a day of wallclock, and could feasibly have been trained on 2010 or earlier GPUs, but how do you learn what exact architecture to train without extremely costly iteration? And that estimate of 19 AG1s trained before Lee Sedol may not include the many failed attempts at pure self-play AGs Silver alludes to in the AmA. 
With NNs, the typical pattern appears to be extremely costly R&D iterations eventually producing a slow sub-human proof-of-concept, followed by massive finetuning & optimization increasing ability and reducing size/compute requirements by OOMs. Image classification, style transfer, Go, chess… I wish the Zero papers would go into far more detail about how the expert iteration solves delusions & fixes the infamous instability of deep self-play. In any case, the core of the movie is the interviews & closeups of Sedol losing the match; one cannot help but sympathize with him, and his lone victory is much more moving through the humanizing lens of the documentarian than it was on the YouTube livestream. It does predictably end by trying to extract a moral of “AIs will empower humans, not replace them”; unfortunately, chess centaurs have already been sent to the knacker’s to be turned into glue, and Go players won’t enjoy even that short after-life, judging by the Master tournament’s various formats & Zero’s margin of superiority. Not that it will matter to the Go players: neither chess nor Go is about optimal play of chess or Go, but about viewer entertainment. Other things, however, actually are about those things…)
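The compute comparison above can be made concrete with some back-of-the-envelope arithmetic. This is only a sketch using the figures quoted in the review (3+ GPU-years per AG1 training run, 19 versions trained before the Sedol match); the variable names and the framing of the totals are my own assumptions, not figures from DeepMind:

```python
# Back-of-the-envelope tally of AG1 training compute, using the
# review's quoted figures (assumptions, not official numbers).
gpu_years_per_run = 3    # "a single AG1 is estimated at 3+ GPU-years" (lower bound)
versions_trained = 19    # "version 19" existed before the Lee Sedol match

total_gpu_years = gpu_years_per_run * versions_trained
print(total_gpu_years)   # 57 GPU-years as a conservative floor,
                         # excluding the failed pure self-play attempts

# Contrast with Zero, which the review notes runs on only 4 TPUs and
# trains in roughly a day of wallclock: the "hardware overhang" is that
# the cost lies in discovering the architecture, not in the final run.
```

Even this floor understates the R&D bill, since it omits the abandoned self-play prototypes Silver mentions in the AmA.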