“Agent57: Outperforming the Atari Human Benchmark”, Badia et al 2020 (blog; Agent57 reaches the median human level across ALE—including Pitfall!/Montezuma’s Revenge. It is impressive but still sample-inefficient & uncomfortably baroque in combining what seems like every DM model-free DRL technique in one place: DDQN, Impala, R2D2, Memory Networks, Transformers, Neural Episodic Control, RND, NGU, PBT, MABs… Is model-free DRL a dead end if this is what it takes? I would have preferred to see ALE solved by better exploration in the enormously simpler MuZero.)
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews.)
This page is a changelog for Gwern.net: a monthly reverse chronological list of recent major writings/changes/additions.
Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/Gwern subreddit submissions, I write up reasonably-interesting changes and send it out to the mailing list in addition to a compilation of links & reviews (archives).
A subreddit for posting links of interest and also for announcing updates to gwern.net (which can be used as an RSS feed). Submissions are categorized similarly to the monthly newsletter and typically will be collated there.
Given a sample of n pairs of 2 normal variables A & B which are correlated r, what is the probability pmax that the maximum on the first variable A is also the maximum on the second variable B? This is analogous to many testing or screening situations, such as employee hiring (“what is the probability the top-scoring applicant on the first exam is the top-scorer on the second as well?”) or athletic contests (“what is the probability the current world champ will win the next championship?”).
Order statistics has long proven that asymptotically, pmax approaches 1/n. Exact answers are hard to find, but confirm the asymptotics; the closest that exists is an approximation & special-case of the Ali-Mikhail-Haq copula, which roughly indicates that r functions as a constant-factor boost in pmax, and the boost from r fades out as n increases.
As long as r≠1, “the tails will come apart”: n increases the difficulty too fast for any fixed r to overcome. This has implications for interpreting extremes and test metrics.
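The tails-come-apart behavior is easy to check by simulation. A minimal Monte Carlo sketch (the sample sizes and r values below are illustrative choices, not values from the analysis):

```python
import numpy as np

def p_max_agree(n, r, trials=10_000, seed=0):
    """Monte Carlo estimate of pmax: the probability that the pair with
    the largest A also has the largest B, for n pairs of standard
    bivariate normals with correlation r."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((trials, n))
    # B = r*A + sqrt(1-r^2)*noise gives Corr(A, B) = r exactly.
    b = r * a + np.sqrt(1 - r**2) * rng.standard_normal((trials, n))
    return float(np.mean(a.argmax(axis=1) == b.argmax(axis=1)))
```

Even a high fixed r such as 0.8 yields a pmax that shrinks steadily as n grows, while staying well above the chance level of 1/n.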
Because many users do not have access to a browser/OS which explicitly supports dark mode, cannot modify the browser/OS setting without undesired side-effects, wish to opt in only for specific websites, or simply forget that they turned on dark mode & dislike it, we make dark mode controllable by providing a widget at the top of the page.
“The genetic architecture of the human cerebral cortex”, Katrina L. Grasby, Neda Jahanshad, Jodie N. Painter, Lucía Colodro-Conde, Janita Bralten, Derrek P. Hibar, Penelope A. Lind, Fabrizio Pizzagalli, Christopher R. K. Ching, Mary Agnes B. McMahon, Natalia Shatokhina, Leo C. P. Zsembik, Sophia I. Thomopoulos, Alyssa H. Zhu, Lachlan T. Strike, Ingrid Agartz, Saud Alhusaini, Marcio A. A. Almeida, Dag Alnæs, Inge K. Amlien, Micael Andersson, Tyler Ard, Nicola J. Armstrong, Allison Ashley-Koch, Joshua R. Atkins, Manon Bernard, Rachel M. Brouwer, Elizabeth E. L. Buimer, Robin Bülow, Christian Bürger, Dara M. Cannon, Mallar Chakravarty, Qiang Chen, Joshua W. Cheung, Baptiste Couvy-Duchesne, Anders M. Dale, Shareefa Dalvie, Tânia K. de Araujo, Greig I. de Zubicaray, Sonja M. C. de Zwarte, Anouk den Braber, Nhat Trung Doan, Katharina Dohm, Stefan Ehrlich, Hannah-Ruth Engelbrecht, Susanne Erk, Chun Chieh Fan, Iryna O. Fedko, Sonya F. Foley, Judith M. Ford, Masaki Fukunaga, Melanie E. Garrett, Tian Ge, Sudheer Giddaluru, Aaron L. Goldman, Melissa J. Green, Nynke A. Groenewold, Dominik Grotegerd, Tiril P. Gurholt, Boris A. Gutman, Narelle K. Hansell, Mathew A. Harris, Marc B. Harrison, Courtney C. Haswell, Michael Hauser, Stefan Herms, Dirk J. Heslenfeld, New Fei Ho, David Hoehn, Per Hoffmann, Laurena Holleran, Martine Hoogman, Jouke-Jan Hottenga, Masashi Ikeda, Deborah Janowitz, Iris E. Jansen, Tianye Jia, Christiane Jockwitz, Ryota Kanai, Sherif Karama, Dalia Kasperaviciute, Tobias Kaufmann, Sinead Kelly, Masataka Kikuchi, Marieke Klein, Michael Knapp, Annchen R. Knodt, Bernd Krämer, Max Lam, Thomas M. Lancaster, Phil H. Lee, Tristram A. Lett, Lindsay B. Lewis, Iscia Lopes-Cendes, Michelle Luciano, Fabio Macciardi, Andre F. Marquand, Samuel R. Mathias, Tracy R. Melzer, Yuri Milaneschi, Nazanin Mirza-Schreiber, Jose C. V. Moreira, Thomas W. Mühleisen, Bertram Müller-Myhsok, Pablo Najt, Soichiro Nakahara, Kwangsik Nho, Loes M. 
Olde Loohuis, Dimitri Papadopoulos Orfanos, John F. Pearson, Toni L. Pitcher, Benno Pütz, Yann Quidé, Anjanibhargavi Ragothaman, Faisal M. Rashid, William R. Reay, Ronny Redlich, Céline S. Reinbold, Jonathan Repple, Geneviève Richard, Brandalyn C. Riedel, Shannon L. Risacher, Cristiane S. Rocha, Nina Roth Mota, Lauren Salminen, Arvin Saremi, Andrew J. Saykin, Fenja Schlag, Lianne Schmaal, Peter R. Schofield, Rodrigo Secolin, Chin Yang Shapland, Li Shen, Jean Shin, Elena Shumskaya, Ida E. Sønderby, Emma Sprooten, Katherine E. Tansey, Alexander Teumer, Anbupalam Thalamuthu, Diana Tordesillas-Gutiérrez, Jessica A. Turner, Anne Uhlmann, Costanza Ludovica Vallerga, Dennis van der Meer, Marjolein M. J. van Donkelaar, Liza van Eijk, Theo G. M. van Erp, Neeltje E. M. van Haren, Daan van Rooij, Marie-José van Tol, Jan H. Veldink, Ellen Verhoef, Esther Walton, Mingyuan Wang, Yunpeng Wang, Joanna M. Wardlaw, Wei Wen, Lars T. Westlye, Christopher D. Whelan, Stephanie H. Witt, Katharina Wittfeld, Christiane Wolf, Thomas Wolfers, Jing Qin Wu, Clarissa L. Yasuda, Dario Zaremba, Zuo Zhang, Marcel P. Zwiers, Eric Artiges, Amelia A. Assareh, Rosa Ayesa-Arriola, Aysenil Belger, Christine L. Brandt, Gregory G. Brown, Sven Cichon, Joanne E. Curran, Gareth E. Davies, Franziska Degenhardt, Michelle F. Dennis, Bruno Dietsche, Srdjan Djurovic, Colin P. Doherty, Ryan Espiritu, Daniel Garijo, Yolanda Gil, Penny A. Gowland, Robert C. Green, Alexander N. Häusler, Walter Heindel, Beng-Choon Ho, Wolfgang U. Hoffmann, Florian Holsboer, Georg Homuth, Norbert Hosten, Clifford R. Jack Jr., MiHyun Jang, Andreas Jansen, Nathan A. Kimbrel, Knut Kolskår, Sanne Koops, Axel Krug, Kelvin O. Lim, Jurjen J. Luykx, Daniel H. Mathalon, Karen A. Mather, Venkata S. Mattay, Sarah Matthews, Jaqueline Mayoral Van Son, Sarah C. McEwen, Ingrid Melle, Derek W. Morris, Bryon A. Mueller, Matthias Nauck, Jan E. Nordvik, Markus M. Nöthen, Daniel S. O’Leary, Nils Opel, Marie-Laure Paillère Martinot, G. 
Bruce Pike, Adrian Preda, Erin B. Quinlan, Paul E. Rasser, Varun Ratnakar, Simone Reppermund, Vidar M. Steen, Paul A. Tooney, Fábio R. Torres, Dick J. Veltman, James T. Voyvodic, Robert Whelan, Tonya White, Hidenaga Yamamori, Hieab H. H. Adams, Joshua C. Bis, Stephanie Debette, Charles Decarli, Myriam Fornage, Vilmundur Gudnason, Edith Hofer, M. Arfan Ikram, Lenore Launer, W. T. Longstreth, Oscar L. Lopez, Bernard Mazoyer, Thomas H. Mosley, Gennady V. Roshchupkin, Claudia L. Satizabal, Reinhold Schmidt, Sudha Seshadri, Qiong Yang, Alzheimer’s Disease Neuroimaging Initiative, CHARGE Consortium, EPIGEN Consortium, IMAGEN Consortium, SYS Consortium, Parkinson’s Progression Markers Initiative, Marina K. M. Alvim, David Ames, Tim J. Anderson, Ole A. Andreassen, Alejandro Arias-Vasquez, Mark E. Bastin, Bernhard T. Baune, Jean C. Beckham, John Blangero, Dorret I. Boomsma, Henry Brodaty, Han G. Brunner, Randy L. Buckner, Jan K. Buitelaar, Juan R. Bustillo, Wiepke Cahn, Murray J. Cairns, Vince Calhoun, Vaughan J. Carr, Xavier Caseras, Svenja Caspers, Gianpiero L. Cavalleri, Fernando Cendes, Aiden Corvin, Benedicto Crespo-Facorro, John C. Dalrymple-Alford, Udo Dannlowski, Eco J. C. de Geus, Ian J. Deary, Norman Delanty, Chantal Depondt, Sylvane Desrivières, Gary Donohoe, Thomas Espeseth, Guillén Fernández, Simon E. Fisher, Herta Flor, Andreas J. Forstner, Clyde Francks, Barbara Franke, David C. Glahn, Randy L. Gollub, Hans J. Grabe, Oliver Gruber, Asta K. Håberg, Ahmad R. Hariri, Catharina A. Hartman, Ryota Hashimoto, Andreas Heinz, Frans A. Henskens, Manon H. J. Hillegers, Pieter J. Hoekstra, Avram J. Holmes, L. Elliot Hong, William D. Hopkins, Hilleke E. Hulshoff Pol, Terry L. Jernigan, Erik G. Jönsson, René S. Kahn, Martin A. Kennedy, Tilo T. J. Kircher, Peter Kochunov, John B. J. Kwok, Stephanie Le Hellard, Carmel M. Loughland, Nicholas G. Martin, Jean-Luc Martinot, Colm McDonald, Katie L. McMahon, Andreas Meyer-Lindenberg, Patricia T. Michie, Rajendra A. 
Morey, Bryan Mowry, Lars Nyberg, Jaap Oosterlaan, Roel A. Ophoff, Christos Pantelis, Tomas Paus, Zdenka Pausova, Brenda W. J. H. Penninx, Tinca J. C. Polderman, Danielle Posthuma, Marcella Rietschel, Joshua L. Roffman, Laura M. Rowland, Perminder S. Sachdev, Philipp G. Sämann, Ulrich Schall, Gunter Schumann, Rodney J. Scott, Kang Sim, Sanjay M. Sisodiya, Jordan W. Smoller, Iris E. Sommer, Beate St Pourcain, Dan J. Stein, Arthur W. Toga, Julian N. Trollor, Nic J. A. Van der Wee, Dennis van ’t Ent, Henry Völzke, Henrik Walter, Bernd Weber, Daniel R. Weinberger, Margaret J. Wright, Juan Zhou, Jason L. Stein, Paul M. Thompson, Sarah E. Medland, Enhancing NeuroImaging Genetics through Meta-Analysis Consortium (ENIGMA)—Genetics working group (2020-03-20):
The human cerebral cortex is important for cognition, and it is of interest to see how genetic variants affect its structure. Grasby et al. combined genetic data with brain magnetic resonance imaging from more than 50,000 people to generate a genome-wide analysis of how human genetic variation influences human cortical surface area and thickness. From this analysis, they identified variants associated with cortical structure, some of which affect signaling and gene expression. They observed overlap between genetic loci affecting cortical structure, brain development, and neuropsychiatric disease, and the correlation between these phenotypes is of interest for further study.
Introduction: The cerebral cortex underlies our complex cognitive capabilities. Variations in human cortical surface area and thickness are associated with neurological, psychological, and behavioral traits and can be measured in vivo by magnetic resonance imaging (MRI). Studies in model organisms have identified genes that influence cortical structure, but little is known about common genetic variants that affect human cortical structure.
Rationale: To identify genetic variants associated with human cortical structure at both global and regional levels, we conducted a genome-wide association meta-analysis of brain MRI data from 51,665 individuals across 60 cohorts. We analyzed the surface area and average thickness of the whole cortex and 34 cortical regions with known functional specializations.
Results: We identified 306 nominally genome-wide significant loci (p < 5 × 10⁻⁸) associated with cortical structure in a discovery sample of 33,992 participants of European ancestry. Of the 299 loci for which replication data were available, 241 loci influencing surface area and 14 influencing thickness remained significant after replication, with 199 loci passing multiple testing correction (p < 8.3 × 10⁻¹⁰; 187 influencing surface area and 12 influencing thickness).
Common genetic variants explained 34% (SE = 3%) of the variation in total surface area and 26% (SE = 2%) in average thickness; surface area and thickness showed a negative genetic correlation (rg = −0.32, SE = 0.05, p = 6.5 × 10⁻¹²), which suggests that genetic influences have opposing effects on surface area and thickness. Bioinformatic analyses showed that total surface area is influenced by genetic variants that alter gene regulatory activity in neural progenitor cells during fetal development. By contrast, average thickness is influenced by active regulatory elements in adult brain samples, which may reflect processes that occur after mid-fetal development, such as myelination, branching, or pruning. When considered together, these results support the radial unit hypothesis that different developmental mechanisms promote surface area expansion and increases in thickness.
To identify specific genetic influences on individual cortical regions, we controlled for global measures (total surface area or average thickness) in the regional analyses. After multiple testing correction, we identified 175 loci that influence regional surface area and 10 that influence regional thickness. Loci that affect regional surface area cluster near genes involved in the Wnt signaling pathway, which is known to influence areal identity.
We observed significant positive genetic correlations and evidence of bidirectional causation of total surface area with both general cognitive functioning and educational attainment. We found additional positive genetic correlations between total surface area and Parkinson’s disease but did not find evidence of causation. Negative genetic correlations were evident between total surface area and insomnia, attention deficit hyperactivity disorder, depressive symptoms, major depressive disorder, and neuroticism.
Conclusion: This large-scale collaborative work enhances our understanding of the genetic architecture of the human cerebral cortex and its regional patterning. The highly polygenic architecture of the cortex suggests that distinct genes are involved in the development of specific cortical areas. Moreover, we find evidence that brain structure is a key phenotype along the causal pathway that leads from genetic variation to differences in general cognitive function.
Genomic prediction of complex human traits (e.g., height, cognitive ability, bone density) and disease risks (e.g., breast cancer, diabetes, heart disease, atrial fibrillation) has advanced considerably in recent years. Predictors have been constructed using penalized algorithms that favor sparsity: i.e., which use as few genetic variants as possible. We analyze the specific genetic variants (SNPs) utilized in these predictors, which can vary from dozens to as many as thirty thousand. We find that the fraction of SNPs in or near genic regions varies widely by phenotype. For the majority of disease conditions studied, a large amount of the variance is accounted for by SNPs outside of coding regions. The state of these SNPs cannot be determined from exome-sequencing data. This suggests that exome data alone will miss much of the heritability for these traits – i.e., existing PRS cannot be computed from exome data alone. We also study the fraction of SNPs and of variance that is in common between pairs of predictors. The DNA regions used in disease risk predictors so far constructed seem to be largely disjoint (with a few interesting exceptions), suggesting that individual genetic disease risks are largely uncorrelated. It seems possible in theory for an individual to be a low-risk outlier in all conditions simultaneously.
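The pairwise-overlap analysis reduces to set arithmetic on SNP identifiers. A toy sketch of the comparison, using made-up rsIDs rather than the predictors’ actual SNP lists:

```python
def snp_overlap(pred_a, pred_b):
    """Jaccard overlap between the SNP sets of two sparse predictors:
    |shared SNPs| / |union of SNPs|. Near-zero overlap suggests the
    predictors draw on largely disjoint DNA regions."""
    a, b = set(pred_a), set(pred_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

A value near 0 for most phenotype pairs is what “largely disjoint” means operationally; the interesting exceptions show up as pairs with a distinctly nonzero overlap.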
Atari games have been a long-standing benchmark in the reinforcement learning (RL) community for the past decade. This benchmark was proposed to test general competency of RL algorithms. Previous work has achieved good average performance by doing outstandingly well on many games of the set, but very poorly in several of the most challenging games. We propose Agent57, the first deep RL agent that outperforms the standard human benchmark on all 57 Atari games. To achieve this result, we train a neural network which parameterizes a family of policies ranging from very exploratory to purely exploitative. We propose an adaptive mechanism to choose which policy to prioritize throughout the training process. Additionally, we utilize a novel parameterization of the architecture that allows for more consistent and stable learning.
The Atari57 suite of games is a long-standing benchmark to gauge agent performance across a wide range of tasks. We’ve developed Agent57, the first deep reinforcement learning agent to obtain a score that is above the human baseline on all 57 Atari 2600 games. Agent57 combines an algorithm for efficient exploration with a meta-controller that adapts the exploration and long vs. short-term behaviour of the agent.
…In 2012, the Arcade Learning Environment—a suite of 57 Atari 2600 games (dubbed Atari57)—was proposed as a benchmark set of tasks: these canonical Atari games pose a broad range of challenges for an agent to master….Unfortunately, the average performance can fail to capture how many tasks an agent is doing well on, and so is not a good statistic for determining how general an agent is: it captures that an agent is doing sufficiently well, but not that it is doing sufficiently well on a sufficiently wide set of tasks. So although average scores have increased, until now, the number of above-human games has not.
…Back in 2012, DeepMind developed the Deep Q-network agent (DQN) to tackle the Atari57 suite. Since then, the research community has developed many extensions and alternatives to DQN. Despite these advancements, however, all deep reinforcement learning agents have consistently failed to score in four games: Montezuma’s Revenge, Pitfall, Solaris and Skiing. For Agent57 to tackle these four challenging games in addition to the other Atari57 games, several changes to DQN were necessary.
Intrinsic motivation methods to encourage directed exploration
Seeking novelty over long time scales
Seeking novelty over short time scales
Meta-controller: learning to balance exploration with exploitation
Agent57: putting it all together
…With Agent57, we have succeeded in building a more generally intelligent agent that has above-human performance on all tasks in the Atari57 benchmark. It builds on our previous agent Never Give Up, and instantiates an adaptive meta-controller that helps the agent to know when to explore and when to exploit, as well as what time-horizon it would be useful to learn with. A wide range of tasks will naturally require different choices of both of these trade-offs, therefore the meta-controller provides a way to dynamically adapt such choices.
Agent57 was able to scale with increasing amounts of computation: the longer it trained, the higher its score got. While this enabled Agent57 to achieve strong general performance, it takes a lot of computation and time; the data efficiency can certainly be improved. Additionally, this agent shows better 5th percentile performance on the set of Atari57 games. This by no means marks the end of Atari research, not only in terms of data efficiency, but also in terms of general performance. We offer two views on this: firstly, analyzing the performance among percentiles gives us new insights on how general algorithms are. While Agent57 achieves strong results on the first percentiles of the 57 games and holds better mean and median performance than NGU or R2D2, as illustrated by MuZero, it could still obtain a higher average performance. Secondly, all current algorithms are far from achieving optimal performance in some games. To that end, key improvements to use might be enhancements in the representations that Agent57 uses for exploration, planning, and credit assignment.
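The meta-controller is essentially a nonstationary multi-armed bandit choosing which member of the policy family (which exploration/discount trade-off) each actor runs next. A minimal sliding-window UCB sketch; the window size and bonus scale here are illustrative placeholders, not the paper’s tuned values:

```python
import math

class SlidingWindowUCB:
    """Sliding-window UCB bandit: arms are policies, rewards are
    episodic returns; the window keeps the bandit responsive to
    nonstationarity as the underlying policies keep training."""
    def __init__(self, n_arms, window=90, beta=1.0):
        self.n_arms, self.window, self.beta = n_arms, window, beta
        self.history = []  # (arm, episodic return), most recent last

    def select(self):
        recent = self.history[-self.window:]
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, ret in recent:
            counts[arm] += 1
            sums[arm] += ret
        # Play any arm unseen within the window before using UCB.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        t = len(recent)
        # Mean return plus an exploration bonus shrinking with visits.
        ucb = [sums[a] / counts[a]
               + self.beta * math.sqrt(math.log(t) / counts[a])
               for a in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda a: ucb[a])

    def update(self, arm, episodic_return):
        self.history.append((arm, episodic_return))
```

After each episode the actor reports its return for the chosen arm; over time the bandit concentrates on whichever exploration setting is currently paying off, while still periodically re-checking the others.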
We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies. We construct an episodic memory-based intrinsic reward using k-nearest neighbors over the agent’s recent experience to train the directed exploratory policies, thereby encouraging the agent to repeatedly revisit all states in its environment. A self-supervised inverse dynamics model is used to train the embeddings of the nearest neighbour lookup, biasing the novelty signal towards what the agent can control. We employ the framework of Universal Value Function Approximators (UVFA) to simultaneously learn many directed exploration policies with the same neural network, with different trade-offs between exploration and exploitation. By using the same neural network for different degrees of exploration/exploitation, transfer is demonstrated from predominantly exploratory policies yielding effective exploitative policies. The proposed method can be incorporated to run with modern distributed RL agents that collect large amounts of experience from many actors running in parallel on separate environment instances. Our method doubles the performance of the base agent in all hard exploration games in the Atari-57 suite while maintaining a very high score across the remaining games, obtaining a median human normalised score of 1344.0. Notably, it is the first algorithm to achieve non-zero rewards (with a mean score of 8,400) in the game of Pitfall! without using demonstrations or hand-crafted features.
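The episodic novelty bonus can be sketched as a k-nearest-neighbour lookup over the current episode’s state embeddings; the inverse-kernel form below is a simplification of the paper’s normalized kernel, not its exact formula:

```python
import numpy as np

def episodic_novelty(embedding, memory, k=3, eps=1e-3):
    """Simplified NGU-style episodic intrinsic reward: the bonus is
    large when the current embedding is far (Euclidean distance) from
    its k nearest neighbours in the episode's memory, so revisiting a
    state quickly drives its bonus toward zero within the episode."""
    if len(memory) == 0:
        return 1.0  # first state of the episode is maximally novel
    dists = np.linalg.norm(np.asarray(memory) - embedding, axis=1)
    nearest = np.sort(dists)[:k]
    # Kernel similarity decays with squared distance: novel states
    # have low total similarity and hence a high bonus.
    sims = eps / (nearest**2 + eps)
    return float(1.0 / np.sqrt(sims.sum() + 1e-8))
```

In the full agent this embedding comes from the inverse-dynamics model, so only controllable aspects of the state contribute to the distance.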
Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games—the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled—our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
Machine learning research has advanced in multiple aspects, including model structures and learning methods. The effort to automate such research, known as AutoML, has also made significant progress. However, this progress has largely focused on the architecture of neural networks, where it has relied on sophisticated expert-designed layers as building blocks—or similarly restrictive search spaces. Our goal is to show that AutoML can go further: it is possible today to automatically discover complete machine learning algorithms just using basic mathematical operations as building blocks. We demonstrate this by introducing a novel framework that significantly reduces human bias through a generic search space. Despite the vastness of this space, evolutionary search can still discover two-layer neural networks trained by backpropagation. These simple neural networks can then be surpassed by evolving directly on tasks of interest, e.g. CIFAR-10 variants, where modern techniques emerge in the top algorithms, such as bilinear interactions, normalized gradients, and weight averaging. Moreover, evolution adapts algorithms to different task types: e.g., dropout-like techniques appear when little data is available. We believe these preliminary successes in discovering machine learning algorithms from scratch indicate a promising new direction for the field.
AutoML-Zero aims to automatically discover computer programs that can solve machine learning tasks, starting from empty or random programs and using only basic math operations. The goal is to simultaneously search for all aspects of an ML algorithm—including the model structure and the learning strategy—while employing minimal human bias.
Despite AutoML-Zero’s challenging search space, evolutionary search shows promising results by discovering linear regression with gradient descent, 2-layer neural networks with backpropagation, and even algorithms that surpass hand-designed baselines of comparable complexity. The figure above shows an example sequence of discoveries from one of our experiments, evolving algorithms to solve binary classification tasks. Notably, the evolved algorithms can be interpreted. Below is an analysis of the best evolved algorithm: the search process “invented” techniques like bilinear interactions, weight averaging, normalized gradients, and data augmentation (by adding noise to the inputs).
More examples, analysis, and details can be found in the paper.
One defense of free markets notes the inability of non-market mechanisms to solve planning & optimization problems. This has difficulty with Coase’s paradox of the firm, and I note that the difficulty is increased by the fact that with improvements in computers, algorithms, and data, ever larger planning problems are solved. Expanding on some Cosma Shalizi comments, I suggest interpreting this phenomenon as a multi-level nested optimization paradigm: many systems can be usefully described as having two (or more) levels, where a slow sample-inefficient but ground-truth ‘outer’ loss, such as death, bankruptcy, or reproductive fitness, trains & constrains a fast sample-efficient but possibly misguided ‘inner’ loss used by learned mechanisms such as neural networks or linear programming. So, one reason for free-market or evolutionary or Bayesian methods in general is that while poorer at planning/optimization in the short run, they have the advantage of simplicity and of operating on ground-truth values, and serve as a constraint on the more sophisticated non-market mechanisms. I illustrate by discussing corporations, multicellular life, reinforcement learning & meta-learning in AI, and pain in humans. This view suggests that there are inherent balances between market/non-market mechanisms which reflect the relative advantages of a slow unbiased method and faster but potentially arbitrarily biased methods.
“15.ai”, 15 & the Pony Preservation Project (2020-03-06):
[NN TTS service demonstrating results from custom DL research project by 15 for generating natural high-quality voices of characters with minimal data/few-shot learning; available voices include GLaDOS from Portal and especially high-quality My Little Pony: Friendship Is Magic voices (currently: Fluttershy & Twilight Sparkle); demos: 1/2.
The MLP:FiM voices are trained on a large dataset constructed by the 4chan crowdsourced project “Pony Preservation Project”, begun ~2019. PPP has crowdsourced parsed audio and hand-written transcriptions of all dialogue for all characters from all 9 MLP:FiM seasons, the movie, the spinoffs, and various other material voiced by the same voice actresses in case that might help, while processing the audio to remove noise or using ‘leaked’ original data from Hasbro for higher quality still.]
This is a text-to-speech tool that you can use to generate 44.1 kHz voices of various characters. The voices are generated in real time using multiple audio synthesis algorithms and customized deep neural networks trained on very little available data (between 30 and 120 minutes of clean dialogue for each character). This project demonstrates a significant reduction in the amount of audio required to realistically clone voices while retaining their affective prosodies.
I plan to keep this tool up gratis and ad-free indefinitely. This website is intended for strictly non-commercial use.
Thanks to the MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) for providing the initial funding that kickstarted this project two years ago. Further thanks to the Julia Lab, Lincoln Lab, and the Media Lab.
Special shoutouts go to 4chan’s /mlp/ and the anons who have collectively spent hundreds of hours collecting, cleaning, and organizing clips of dialogue taken from the show My Little Pony: Friendship Is Magic. Honorable mention to /g/ for some entertaining speculations.
And of course, nothing but the utmost respect to the voice actors who originally voiced the characters.
[Compilation of 29 videos & ~25 audio files created using a new neural network service for voice synthesis of various characters, particularly My Little Pony characters; scripts include everything from every Star Wars opening to F1 car racing commentary to the “Who’s on First?” Abbott & Costello comedy dialogue to 1 hour recitation of π to the Dune Litany Against Fear & Blade Runner Tears in the Rain monologue.]
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations.
This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence of overfitting on model/dataset size and the dependence of training speed on model size. These relationships allow us to determine the optimal allocation of a fixed compute budget. Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
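Fitting such a power law is just linear regression in log-log space. A sketch, using the paper’s approximate reported model-size values (α_N ≈ 0.076, N_c ≈ 8.8×10¹³) only as synthetic ground truth for the demonstration:

```python
import numpy as np

def fit_power_law(model_sizes, losses):
    """Fit L(N) = (Nc / N)**alpha by linear regression in log-log
    space: log L = alpha*log Nc - alpha*log N. Returns (alpha, Nc)."""
    logN, logL = np.log(model_sizes), np.log(losses)
    slope, intercept = np.polyfit(logN, logL, 1)
    alpha = -slope
    Nc = np.exp(intercept / alpha)
    return alpha, Nc
```

On real loss curves the same fit is done over the compute-efficient frontier rather than raw (N, L) points, but the log-log-linear form is the whole content of the scaling law.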
Neural networks appear to have mysterious generalization properties when using parameter counting as a proxy for complexity. Indeed, neural networks often have many more parameters than there are data points, yet still provide good generalization performance. Moreover, when we measure generalization as a function of parameters, we see double descent behaviour, where the test error decreases, increases, and then again decreases. We show that many of these properties become understandable when viewed through the lens of effective dimensionality, which measures the dimensionality of the parameter space determined by the data. We relate effective dimensionality to posterior contraction in Bayesian deep learning, model selection, width-depth tradeoffs, double descent, and functional diversity in loss surfaces, leading to a richer understanding of the interplay between parameters and functions in deep models. We also show that effective dimensionality compares favourably to alternative norm- and flatness- based generalization measures.
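The effective dimensionality measure itself is a one-liner over the Hessian eigenspectrum; the sketch below assumes the eigenvalues are already available (in practice they are estimated, e.g. by Lanczos iteration, rather than computed exactly):

```python
import numpy as np

def effective_dimensionality(hessian_eigenvalues, z=1.0):
    """Effective dimensionality: sum of lam/(lam + z) over Hessian
    eigenvalues lam, for regularization constant z. Eigenvalues much
    larger than z each contribute ~1 (a direction determined by the
    data); eigenvalues near zero contribute ~0 (a flat direction)."""
    lam = np.asarray(hessian_eigenvalues, dtype=float)
    return float(np.sum(lam / (lam + z)))
```

This is why parameter counting misleads: a network with millions of parameters but only a handful of large Hessian eigenvalues is, by this measure, a low-dimensional model.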
The key distinguishing property of a Bayesian approach is marginalization, rather than using a single setting of weights. Bayesian marginalization can particularly improve the accuracy and calibration of modern deep neural networks, which are typically underspecified by the data, and can represent many compelling but different solutions. We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead. We also investigate the prior over functions implied by a vague distribution over neural network weights, explaining the generalization properties of such models from a probabilistic perspective. From this perspective, we explain results that have been presented as mysterious and distinct to neural network generalization, such as the ability to fit images with random labels, and show that these results can be reproduced with Gaussian processes. We also show that Bayesian model averaging alleviates double descent, resulting in monotonic performance improvements with increased flexibility. Finally, we provide a Bayesian perspective on tempering for calibrating predictive distributions.
Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data. We study identifiability in the context of representation learning: discovering nonlinear data representations that are optimal with respect to some downstream task. When parameterized as deep neural networks, such representation functions typically lack identifiability in parameter space, because they are overparameterized by design. In this paper, building on recent advances in nonlinear ICA, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. Many models for representation learning in a wide variety of domains, including models for text, images and audio that were state-of-the-art at time of publication, are identifiable in this sense. We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.
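Linear identifiability as stated above makes a testable prediction: two representations of the same data should agree up to a linear map. A sketch of such a check via least squares (the data and the map here are synthetic, purely for illustration):

```python
import numpy as np

def linear_alignment_residual(Z1, Z2):
    """Identifiability up to a linear indeterminacy predicts Z2 ~ Z1 @ A
    for some matrix A. Fit A by least squares and return the relative
    residual, which is near 0 when the representations are linearly
    related and large when they are not."""
    A, *_ = np.linalg.lstsq(Z1, Z2, rcond=None)
    return float(np.linalg.norm(Z1 @ A - Z2) / np.linalg.norm(Z2))

rng = np.random.default_rng(0)
Z1 = rng.normal(size=(200, 8))      # representation from model 1
M = rng.normal(size=(8, 8))         # unknown linear indeterminacy
Z2_linear = Z1 @ M                  # linearly identifiable case
Z2_nonlinear = np.tanh(Z1 @ M)      # a nonlinear warp breaks linear alignment
```

On the linear case the residual is numerically zero; on the warped case it stays substantial, which is the kind of empirical support for linear identifiability the paper reports.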
In 2016, the US Defense Advanced Research Projects Agency (DARPA) told eight research groups that their proposals had made it through the review gauntlet and would soon get a few million dollars from its Biological Technologies Office (BTO). Along with congratulations, the teams received a reminder that their award came with an unusual requirement—an independent shadow team of scientists tasked with reproducing their results. Thus began an intense, multi-year controlled trial in reproducibility. Each shadow team consists of three to five researchers, who visit the ‘performer’ team’s laboratory and often host visits themselves. Between 3% and 8% of the programme’s total funds go to this independent validation and verification (IV&V) work…Awardees were told from the outset that they would be paired with an IV&V team consisting of unbiased, third-party scientists hired by and accountable to DARPA. In this programme, we relied on US Department of Defense laboratories, with specific teams selected for their technical competence and ability to solve problems creatively.
…Results so far show a high degree of experimental reproducibility. The technologies investigated include using chemical triggers to control how cells migrate [1]; introducing synthetic circuits that control other cell functions [2]; intricate protein switches that can be programmed to respond to various cellular conditions [3]; and timed bacterial expression that works even in the variable environment of the mammalian gut [4]…getting to this point was more difficult than we expected. It demanded intense coordination, communication and attention to detail…Our effort needed capable research groups that could dedicate much more time (in one case, 20 months) and that could flexibly follow evolving research…A key component of the IV&V teams’ effort has been to spend a day or more working with the performer teams in their laboratories. Often, members of a performer laboratory travel to the IV&V laboratory as well. These interactions lead to a better grasp of methodology than reading a paper, frequently revealing person-to-person differences that can affect results…Still, our IV&V efforts have been derailed for weeks at a time for trivial reasons (see ‘Hard lessons’), such as a typo that meant an ingredient in cell media was off by an order of magnitude. We lost more than a year after discovering that commonly used biochemicals that were thought to be interchangeable are not.
Document Reagents:…We lost weeks of work and performed useless experiments when we assumed that identically named reagents (for example, polyethylene glycol or fetal bovine serum) from different vendors could be used interchangeably. · See It Live:…In our hands, washing cells too vigorously or using the wrong-size pipette tip changed results unpredictably. · State a range: …Knowing whether 21 °C means 20.5–21.5 °C or 20–22 °C can tell you whether cells will thrive or wither, and whether you’ll need to buy an incubator to make an experiment work. · Test, then ship: …Incorrect, outdated or otherwise diminished products were sent to the IV&V team for verification many times. · Double check: ….A typo in one protocol cost us four weeks of failed experiments, and in general, vague descriptions of formulation protocols (for example, for expressing genes and making proteins without cells) caused months of delay and cost thousands of dollars in wasted reagents. · Pick a person: …The projects that lacked a dedicated and stable point of contact were the same ones that took the longest to reproduce. That is not coincidence. · Keep in silico analysis up to date: …Teams had to visit each others’ labs more than once to understand and fully implement computational-analysis pipelines for large microscopy data sets.
…We have learnt to note the flow rates used when washing cells from culture dishes, to optimize salt concentration in each batch of medium and to describe temperature and other conditions with a range rather than a single number. This last practice came about after we realized that diminished slime-mould viability in our Washington DC facility was due to lab temperatures that could fluctuate by 2 °C on warm summer days, versus the more tightly controlled temperature of the performer lab in Baltimore 63 kilometres away. Such observations can be written up in a protocol paper…As one of our scientists said, “IV&V forces performers to think more critically about what qualifies as a successful system, and facilitates candid discussion about system performance and limitations.”
The diverse migratory modes displayed by different cell types are generally believed to be idiosyncratic. Here we show that the migratory behaviour of Dictyostelium was switched from amoeboid to keratocyte-like and oscillatory modes by synthetically decreasing phosphatidylinositol-4,5-bisphosphate levels or increasing Ras/Rap-related activities. The perturbations at these key nodes of an excitable signal transduction network initiated a causal chain of events: the threshold for network activation was lowered, the speed and range of propagating waves of signal transduction activity increased, actin-driven cellular protrusions expanded and, consequently, the cell migratory mode transitions ensued. Conversely, innately keratocyte-like and oscillatory cells were promptly converted to amoeboid by inhibition of Ras effectors with restoration of directed migration. We use computational analysis to explain how thresholds control cell migration and discuss the architecture of the signal transduction network that gives rise to excitability.
De novo-designed proteins [1–3] hold great promise as building blocks for synthetic circuits, and can complement the use of engineered variants of natural proteins [4–7]. One such designer protein—degronLOCKR, which is based on ‘latching orthogonal cage-key proteins’ (LOCKR) technology [8]—is a switch that degrades a protein of interest in vivo upon induction by a genetically encoded small peptide. Here we leverage the plug-and-play nature of degronLOCKR to implement feedback control of endogenous signalling pathways and synthetic gene circuits. We first generate synthetic negative and positive feedback in the yeast mating pathway by fusing degronLOCKR to endogenous signalling molecules, illustrating the ease with which this strategy can be used to rewire complex endogenous pathways. We next evaluate feedback control mediated by degronLOCKR on a synthetic gene circuit [9], to quantify the feedback capabilities and operational range of the feedback control circuit. The designed nature of degronLOCKR proteins enables simple and rational modifications to tune feedback behaviour in both the synthetic circuit and the mating pathway. The ability to engineer feedback control into living cells represents an important milestone in achieving the full potential of synthetic biology [10–12]. More broadly, this work demonstrates the large and untapped potential of de novo design of proteins for generating tools that implement complex synthetic functionalities in cells for biotechnological and therapeutic applications.
“De novo design of bioactive protein switches”, Robert A. Langan, Scott E. Boyken, Andrew H. Ng, Jennifer A. Samson, Galen Dods, Alexandra M. Westbrook, Taylor H. Nguyen, Marc J. Lajoie, Zibo Chen, Stephanie Berger, Vikram Khipple Mulligan, John E. Dueber, Walter R. P. Novak, Hana El-Samad, David Baker (2019-07-24):
Allosteric regulation of protein function is widespread in biology, but is challenging for de novo protein design as it requires the explicit design of multiple states with comparable free energies. Here we explore the possibility of designing switchable protein systems de novo, through the modulation of competing intermolecular and intramolecular interactions. We design a static, five-helix ‘cage’ with a single interface that can interact either intramolecularly with a terminal ‘latch’ helix or intermolecularly with a peptide ‘key’. Encoded on the latch are functional motifs for binding, degradation or nuclear export that function only when the key displaces the latch from the cage. We describe orthogonal cage-key systems that function in vitro, in yeast and in mammalian cells with up to 40-fold activation of function by key. The ability to design switchable protein functions that are controlled by induced conformational change is a milestone for de novo protein design, and opens up new avenues for synthetic biology and cell engineering.
Synthetic gene oscillators have the potential to control timed functions and periodic gene expression in engineered cells. Such oscillators have been refined in bacteria in vitro; however, these systems have lacked the robustness and precision necessary for applications in complex in vivo environments, such as the mammalian gut. Here, we demonstrate the implementation of a synthetic oscillator capable of keeping robust time in the mouse gut over periods of days. The oscillations provide a marker of bacterial growth at a single-cell level enabling quantification of bacterial dynamics in response to inflammation and underlying variations in the gut microbiota. Our work directly detects increased bacterial growth heterogeneity during disease and differences between spatial niches in the gut, demonstrating the deployment of a precise engineered genetic oscillator in real-life settings.
Snorkelers in mangrove forest waters inhabited by the upside-down jellyfish Cassiopea xamachana report discomfort due to a sensation known as stinging water, the cause of which is unknown. Using a combination of histology, microscopy, microfluidics, videography, molecular biology, and mass spectrometry-based proteomics, we describe C. xamachana stinging-cell structures that we term cassiosomes. These structures are released within C. xamachana mucus and are capable of killing prey. Cassiosomes consist of an outer epithelial layer mainly composed of nematocytes surrounding a core filled by endosymbiotic dinoflagellates hosted within amoebocytes and presumptive mesoglea. Furthermore, we report cassiosome structures in four additional jellyfish species in the same taxonomic group as C. xamachana (Class Scyphozoa; Order Rhizostomeae), categorized as either motile (ciliated) or nonmotile types. This inaugural study provides a qualitative assessment of the stinging contents of C. xamachana mucus and implicates mucus containing cassiosomes and free intact nematocytes as the cause of stinging water.
About 15 years ago, one of us (G.J.L.) got an uncomfortable phone call from a colleague and collaborator. After nearly a year of frustrating experiments, this colleague was about to publish a paper [1] chronicling his team’s inability to reproduce the results of our high-profile paper [2] in a mainstream journal. Our study was the first to show clearly that a drug-like molecule could extend an animal’s lifespan. We had found over and over again that the treatment lengthened the life of a roundworm by as much as 67%. Numerous phone calls and e-mails failed to identify why this apparently simple experiment produced different results between the labs. Then another lab failed to replicate our study. Despite more experiments and additional publications, we couldn’t work out why the labs were getting different lifespan results. To this day, we still don’t know. A few years later, the same scenario played out with different compounds in other labs….In another, now-famous example, two cancer labs spent more than a year trying to understand inconsistencies [6]. It took scientists working side by side on the same tumour biopsy to reveal that small differences in how they isolated cells—vigorous stirring versus prolonged gentle rocking—produced different results. Subtle tinkering has long been important in getting biology experiments to work. Before researchers purchased kits of reagents for common experiments, it wasn’t unheard of for a team to cart distilled water from one institution when it moved to another. Lab members would spend months tweaking conditions until experiments with the new institution’s water worked as well as before. Sources of variation include the quality and purity of reagents, daily fluctuations in microenvironment and the idiosyncratic techniques of investigators [7]. With so many ways of getting it wrong, perhaps we should be surprised at how often experimental findings are reproducible.
…Nonetheless, scores of publications continued to appear with claims about compounds that slow ageing. There was little effort at replication. In 2013, the three of us were charged with that unglamorous task…Our first task, to develop a protocol, seemed straightforward.
But subtle disparities were endless. In one particularly painful teleconference, we spent an hour debating the proper procedure for picking up worms and placing them on new agar plates. Some batches of worms lived a full day longer with gentler technicians. Because a worm’s lifespan is only about 20 days, this is a big deal. Hundreds of e-mails and many teleconferences later, we converged on a technique but still had a stupendous three-day difference in lifespan between labs. The problem, it turned out, was notation—one lab determined age on the basis of when an egg hatched, others on when it was laid. We decided to buy shared batches of reagents from the start. Coordination was a nightmare; we arranged with suppliers to give us the same lot numbers and elected to change lots at the same time. We grew worms and their food from a common stock and had strict rules for handling. We established protocols that included precise positions of flasks in autoclave runs. We purchased worm incubators at the same time, from the same vendor. We also needed to cope with a large amount of data going from each lab to a single database. We wrote an iPad app so that measurements were entered directly into the system and not jotted on paper to be entered later. The app prompted us to include full descriptors for each plate of worms, and ensured that data and metadata for each experiment were proofread (the strain names MY16 and my16 are not the same). This simple technology removed small recording errors that could disproportionately affect statistical analyses.
Once this system was in place, variability between labs decreased. After more than a year of pilot experiments and discussion of methods in excruciating detail, we almost completely eliminated systematic differences in worm survival across our labs [9] (see ‘Worm wonders’)…Even in a single lab performing apparently identical experiments, we could not eliminate run-to-run differences.
…We have found one compound that lengthens lifespan across all strains and species. Most do so in only two or three strains, and often show detrimental effects in others.
“Impact of genetic background and experimental reproducibility on identifying chemical compounds with robust longevity effects”, Mark Lucanic, W. Todd Plummer, Esteban Chen, Jailynn Harke, Anna C. Foulger, Brian Onken, Anna L. Coleman-Hulbert, Kathleen J. Dumas, Suzhen Guo, Erik Johnson, Dipa Bhaumik, Jian Xue, Anna B. Crist, Michael P. Presley, Girish Harinath, Christine A. Sedore, Manish Chamoli, Shaunak Kamat, Michelle K. Chen, Suzanne Angeli, Christina Chang, John H. Willis, Daniel Edgar, Mary Anne Royal, Elizabeth A. Chao, Shobhna Patel, Theo Garrett, Carolina Ibanez-Ventoso, June Hope, Jason L. Kish, Max Guo, Gordon J. Lithgow, Monica Driscoll, Patrick C. Phillips (2017-02-21):
Limiting the debilitating consequences of ageing is a major medical challenge of our time. Robust pharmacological interventions that promote healthy ageing across diverse genetic backgrounds may engage conserved longevity pathways. Here we report results from the Caenorhabditis Intervention Testing Program in assessing longevity variation across 22 Caenorhabditis strains spanning 3 species, using multiple replicates collected across three independent laboratories. Reproducibility between test sites is high, whereas individual trial reproducibility is relatively low. Of ten pro-longevity chemicals tested, six significantly extend lifespan in at least one strain. Three reported dietary restriction mimetics are mainly effective across C. elegans strains, indicating species and strain-specific responses. In contrast, the amyloid dye ThioflavinT is both potent and robust across the strains. Our results highlight promising pharmacological leads and demonstrate the importance of assessing lifespans of discrete cohorts across repeat studies to capture biological variation in the search for reproducible ageing interventions.
[Brutal, lengthy memoir of 6 years as a computer science/software engineering grad student at Stanford University. As positively as the author regards his experience, it comes off as a nightmarish publish-or-perish dystopia where professors burn through naive idealistic grad students doing grunt-work in an endless death-march towards conference deadlines and where marketing is far more important than merit (“sell, sell, sell”), peer reviewers are sadistic rolls of dice and reject papers for superficial problems like not using the exact jargon of a subfield; the software used is filled with endless bugs and takes months to be hacked into shape, never to be used in the real world, and even the original authors can’t get it to work a second time. Many students pursue a promising idea only for it to not work out, and wash out of the field—with so many people chasing so few academic positions, anything short of enormous success is a fatal failure. The notes added in 2015 as a followup, recounting the fate of various grad students or assistant professors, reinforce the daunting odds against an intellectually-satisfying career in academia. It is unsurprising that so many grad students appear to have minor mental breakdowns like him. Strikingly, his by far most successful year was the one spent outside academia, at Microsoft Research. Guo provides these lessons:
[Memoir of an ex-theoretical-physics grad student at the University of Rochester with Sarada Rajeev who gradually became disillusioned with physics research, burned out, and left to work in finance and is now a writer. Henderson was attracted by the life of the mind and the grandeur of uncovering the mysteries of the universe, only to discover that, after the endless triumphs of the 20th century and predicting enormous swathes of empirical experimental data, theoretical physics has drifted and become a branch of abstract mathematics, exploring ever more recondite, simplified, and implausible models in the hopes of obtaining any insight into physics’ intractable problems; one must be brilliant to even understand the questions being asked by the math and incredibly hardworking to make any progress which hasn’t already been tried by even more brilliant physicists of the past (while living in ignominious poverty and terror of not getting a grant or tenure), but one’s entire career may be spent chasing a useless dead end without one having any clue.]
The next thing I knew I was crouched in a chair in Rajeev’s little office, with a notebook on my knee and focused with everything I had on an impromptu lecture he was giving me on an esoteric aspect of some mathematical subject I’d never heard of before. Zeta functions, or elliptic functions, or something like that. I’d barely introduced myself when he’d started banging out equations on his board. Trying to follow was like learning a new game, with strangely shaped pieces and arbitrary rules. It was a challenge, but I was excited to be talking to a real physicist about his real research, even though there was one big question nagging me that I didn’t dare to ask: What does any of this have to do with physics?
…Even a Theory of Everything, I started to realize, might suffer the same fate of multiple interpretations. The Grail could just be a hall of mirrors, with no clear answer to the “What?” or the “How?”—let alone the “Why?” Plus physics had changed since Big Al bestrode it. Mathematical as opposed to physical intuition had become more central, partly because quantum mechanics was such a strange multi-headed beast that it diminished the role that everyday, or even Einstein-level, intuition could play. So much for my dreams of staring out windows and into the secrets of the universe.
…If I did lose my marbles for a while, this is how it started. With cutting my time outside of Bausch and Lomb down to nine hours a day—just enough to pedal my mountain bike back to my bat cave of an apartment each night, sleep, shower, and pedal back in. With filling my file cabinet with boxes and cans of food, and carting in a coffee maker, mini-fridge, and microwave so that I could maximize the time spent at my desk. With feeling guilty after any day that I didn’t make my 15-hour quota. And with exceeding that quota frequently enough that I regularly circumnavigated the clock: staying later and later each night until I was going home in the morning, then in the afternoon, and finally at night again.
…The longer and harder I worked, the more I realized I didn’t know. Papers that took days or weeks to work through cited dozens more that seemed just as essential to digest; the piles on my desk grew rather than shrunk. I discovered the stark difference between classes and research: With no syllabus to guide me I didn’t know how to keep on a path of profitable inquiry. Getting “wonderfully lost” sounded nice, but the reality of being lost, and of re-living, again and again, that first night in the old woman’s house, with all of its doubts and dead-ends and that horrible hissing voice was … something else. At some point, flipping the lights on in the library no longer filled me with excitement but with dread.
…My mental model building was hitting its limits. I’d sit there in Rajeev’s office with him and his other students, or in a seminar given by some visiting luminary, listening and putting each piece in place, and try to fix in memory what I’d built so far. But at some point I’d lose track of how the green stick connected to the red wheel, or whatever, and I’d realize my picture had diverged from reality. Then I’d try toggling between tracing my steps back in memory to repair my mistake and catching all the new pieces still flying in from the talk. Stray pieces would fall to the ground. My model would start falling down. And I would fall hopelessly behind. A year or so of research with Rajeev, and I found myself frustrated and in a fog, sinking deeper into the quicksand but not knowing why. Was it my lack of mathematical background? My grandiose goals? Was I just not intelligent enough?
…I turned 30 during this time and the milestone hit me hard. I was nearly four years into the Ph.D. program, and while my classmates seemed to be systematically marching toward their degrees, collecting data and writing papers, I had no thesis topic and no clear path to graduation. My engineering friends were becoming managers, getting married, buying houses. And there I was entering my fourth decade of life feeling like a pitiful and penniless mole, aimlessly wandering dark empty tunnels at night, coming home to a creepy crypt each morning with nothing to show for it, and checking my bed for bugs before turning out the lights…As I put the final touches on my thesis, I weighed my options. I was broke, burned out, and doubted my ability to go any further in theoretical physics. But mostly, with The Grail now gone and the physics landscape grown so immense, I thought back to Rajeev’s comment about knowing which problems to solve and realized that I still didn’t know what, for me, they were.
But lately, Gates has been obsessing over a dark question: what’s likeliest to kill more than 10 million human beings in the next 20 years? He ticks off the disaster movie stuff—“big volcanic explosion, gigantic earthquake, asteroid”—but says the more he learns about them, the more he realizes the probability is “very low.” Then there’s war, of course. But Gates isn’t that worried about war because the entire human race worries about war pretty much all the time, and the most dangerous kind of war, nuclear war, seems pretty contained, at least for now.
But there’s something out there that’s as bad as war, something that kills as many people as war, and Gates doesn’t think we’re ready for it. “Look at the death chart of the 20th century,” he says, because he’s the kind of guy that looks at death charts. “I think everybody would say there must be a spike for World War I. Sure enough, there it is, like 25 million. And there must be a big spike for World War II, and there it is, it’s like 65 million. But then you’ll see this other spike that is as large as World War II right after World War I, and most people would say, ‘What was that?’” “Well, that was the Spanish flu.”
No one can say we weren’t warned. And warned. And warned. A pandemic disease is the most predictable catastrophe in the history of the human race, if only because it has happened to the human race so many, many times before…“You can’t use the word lucky or fortunate about something like Ebola that killed 10,000 people,” Klain says. “But it was the most favorable scenario for the world to face one of these things. Ebola is very difficult to transmit. Everyone who is contagious has a visible symptom. It broke out in three relatively small countries that don’t send many travelers to the US. And those three countries have good relationships with America and were welcoming of Western aid.” “With a pandemic flu, the disease would be much more contagious than Ebola,” Klain continues. “The people who are contagious may not have visible symptoms. It could break out in a highly populous country that sends thousands of travelers a day to the US. It could be a country with megacities with tens of millions of people. And it could be a country where sending in the 101st Airborne isn’t possible.”
…Behind Gates’s fear of pandemic disease is an algorithmic model of how disease moves through the modern world. He funded that model to help with his foundation’s work eradicating polio. But then he used it to look into how a disease that acted like the Spanish flu of 1918 would work in today’s world. The results were shocking, even to Gates. “Within 60 days it’s basically in all urban centers around the entire globe,” he says. “That didn’t happen with the Spanish flu.”
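The article does not describe Gates's model; for intuition about why a flu-like pathogen reaches every urban center so fast, a minimal discrete-time SIR sketch (all parameters hypothetical) shows the early exponential growth:

```python
# Minimal discrete-time SIR (susceptible-infected-recovered) model.
# This is an illustrative toy, not the model funded by the Gates
# Foundation; beta and gamma below are assumed values giving
# R0 = beta/gamma = 2, in the range often quoted for 1918-type flu.
def sir(population, infected, beta=0.4, gamma=0.2, days=60):
    """beta: daily transmission rate; gamma: daily recovery rate."""
    s, i, r = population - infected, float(infected), 0.0
    history = [i]
    for _ in range(days):
        new_inf = beta * s * i / population   # new infections today
        new_rec = gamma * i                   # recoveries today
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append(i)
    return history

# One index case in a city of 10 million: within 60 days the daily
# infected count has grown by roughly four orders of magnitude.
h = sir(population=1e7, infected=1)
```

The point of such models is the doubling time: while susceptibles are plentiful, infections multiply by roughly (1 + β − γ) per day, which is what makes the 60-day global-spread result plausible.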
Dr. Helen Y. Chu, an infectious disease expert in Seattle, knew that the United States did not have much time…As luck would have it, Dr. Chu had a way to monitor the region. For months, as part of a research project into the flu, she and a team of researchers had been collecting nasal swabs from residents experiencing symptoms throughout the Puget Sound region. To repurpose the tests for monitoring the coronavirus, they would need the support of state and federal officials. But nearly everywhere Dr. Chu turned, officials repeatedly rejected the idea, interviews and emails show, even as weeks crawled by and outbreaks emerged in countries outside of China, where the infection began.
By Feb. 25, Dr. Chu and her colleagues could not bear to wait any longer. They began performing coronavirus tests, without government approval. What came back confirmed their worst fear…In fact, officials would later discover through testing, the virus had already contributed to the deaths of two people, and it would go on to kill 20 more in the Seattle region over the following days.
Federal and state officials said the flu study could not be repurposed because it did not have explicit permission from research subjects; the labs were also not certified for clinical work. While acknowledging the ethical questions, Dr. Chu and others argued there should be more flexibility in an emergency during which so many lives could be lost. On Monday night, state regulators told them to stop testing altogether…Later that day, the investigators and Seattle health officials gathered with representatives of the C.D.C. and the F.D.A. to discuss what happened. The message from the federal government was blunt. “What they said on that phone call very clearly was cease and desist to Helen Chu,” Dr. Lindquist remembered. “Stop testing.”
…Even now, after weeks of mounting frustration toward federal agencies over flawed test kits and burdensome rules, states with growing cases such as New York and California are struggling to test widely for the coronavirus. The continued delays have made it impossible for officials to get a true picture of the scale of the growing outbreak, which has now spread to at least 36 states and Washington, D.C…But the Seattle Flu Study illustrates how existing regulations and red tape—sometimes designed to protect privacy and health—have impeded the rapid rollout of testing nationally, while other countries ramped up much earlier and faster.
…The flu project primarily used research laboratories, not clinical ones, and its coronavirus test was not approved by the Food and Drug Administration. And so the group was not certified to provide test results to anyone outside of their own investigators. They began discussions with state, C.D.C. and F.D.A. officials to figure out a solution, according to emails and interviews…the F.D.A. could not offer the approval because the lab was not certified as a clinical laboratory under regulations established by the Centers for Medicare & Medicaid Services, a process that could take months. Dr. Chu and Dr. Lindquist tried repeatedly to wrangle approval to use the Seattle Flu Study. The answers were always no. “We felt like we were sitting, waiting for the pandemic to emerge,” Dr. Chu said. “We could help. We couldn’t do anything.”…“This virus is faster than the F.D.A.,” he said, adding that at one point the agency required him to submit materials through the mail in addition to over email.
…On a phone call the day after the C.D.C. and F.D.A. had told Dr. Chu to stop, officials relented, but only partially, the researchers recalled. They would allow the study’s laboratories to test cases and report the results only in future samples. They would need to use a new consent form that explicitly mentioned that results of the coronavirus tests might be shared with the local health department. They were not to test the thousands of samples that had already been collected.
In politics, a noble lie is a myth or untruth, often, but not invariably, of a religious nature, knowingly propagated by an elite to maintain social harmony or to advance an agenda. The noble lie is a concept originated by Plato as described in the Republic.
The tragedy of the anticommons is a type of coordination breakdown in which a single resource has numerous rightsholders who prevent others from using it, frustrating what would be a socially desirable outcome. It is a mirror-image of the older concept of the tragedy of the commons, in which numerous rightsholders' combined use exceeds the capacity of a resource and depletes or destroys it. The "tragedy of the anticommons" covers a range of coordination failures, including patent thickets and submarine patents. Overcoming these breakdowns can be difficult, but there are assorted means, including eminent domain, laches, patent pools, or other licensing organizations.
Extensive paraphrase summary of Herbert Hoover: while remembered solely as one of the worst American presidents because of the Great Depression, Hoover had a remarkable life: he rose from grinding poverty to become the first student at Stanford University (later a trustee) and then a mining magnate after revamping mining in Australia & China (the latter in the midst of the Boxer Rebellion) and penning a definitive mining textbook. Along the way, he invented a popular CrossFit medicine ball exercise, relieved the worst flood disaster in American history, organized the evacuation of Americans trapped by the outbreak of WWI and then reorganized American agriculture for WWI…
Hoover, in the service of the highest goods, ruthlessly crushes all opposition, shamelessly exploits PR tactics to the maximum extent, lies and deceives his negotiating partners, and bankrupts himself—and he succeeds, becoming arguably one of the greatest philanthropists in history by organizing repeated famine reliefs in Europe and Communist Russia afterward.
A shockingly competent technocrat, by then regarded as one of the greatest men in the world, he succeeds Coolidge and attempts to forestall the looming Great Depression, then takes unprecedented action to stop it; while he ultimately fails, he initially seems to be succeeding, and it may have been bad luck plus the deliberate sabotage of his efforts by President-elect Franklin Roosevelt that prolonged the Great Depression. Embittered, he spends the rest of his life inveighing against FDR and the New Deal, founding modern conservatism.
Alexander ponders why Hoover, who was so unarguably competent at everything he turned his hand to, achieving impossible feats of management and logistics, appears to have failed as President at stopping the Great Depression or winning re-election, and what we can learn about philanthropy from him.
Herbert Clark Hoover was an American politician, businessman, and engineer, who served as the 31st president of the United States from 1929 to 1933. A member of the Republican Party, he held office during the onset of the Great Depression. Before serving as president, Hoover led the Commission for Relief in Belgium, served as the director of the U.S. Food Administration, and served as the third U.S. Secretary of Commerce.
Do voters effectively hold elected officials accountable for policy decisions? Using data on natural disasters, government spending, and election returns, we show that voters reward the incumbent presidential party for delivering disaster relief spending, but not for investing in disaster preparedness spending. These inconsistencies distort the incentives of public officials, leading the government to underinvest in disaster preparedness, thereby causing substantial public welfare losses. We estimate that $1 spent on preparedness is worth about $15 in terms of the future damage it mitigates. By estimating both the determinants of policy decisions and the consequences of those policies, we provide more complete evidence about citizen competence and government accountability.
In the aftermath of many natural and man-made disasters, people often wonder why those affected were underprepared, especially when the disaster was the result of known or regularly occurring hazards (e.g., hurricanes). We study one contributing factor: prior near-miss experiences. Near misses are events that have some nontrivial expectation of ending in disaster but, by chance, do not. We demonstrate that when near misses are interpreted as disasters that did not occur, people illegitimately underestimate the danger of subsequent hazardous situations and make riskier decisions (e.g., choosing not to engage in mitigation activities for the potential hazard). On the other hand, if near misses can be recognized and interpreted as disasters that almost happened, this will counter the basic “near-miss” effect and encourage more mitigation. We illustrate the robustness of this pattern across populations with varying levels of real expertise with hazards and different hazard contexts (household evacuation for a hurricane, Caribbean cruises during hurricane season, and deep-water oil drilling). We conclude with ideas to help people manage and communicate about risk. [Keywords: near miss; risk; decision making; natural disasters; organizational hazards; hurricanes; oil spills.]
Although redditors didn’t yet know it, Huffman could edit any part of the site. He wrote a script that would automatically replace his username with those of The_Donald’s most prominent members, directing the insults back at the insulters in real time: in one comment, “Fuck u/Spez” became “Fuck u/Trumpshaker”; in another, “Fuck u/Spez” became “Fuck u/MAGAdocious.” The_Donald’s users saw what was happening, and they reacted by spinning a conspiracy theory that, in this case, turned out to be true. “Manipulating the words of your users is fucked,” a commenter wrote. “Even Facebook and Twitter haven’t stooped this low.” “Trust nothing.”
…In October, on the morning the new policy was rolled out, Ashooh sat at a long conference table with a dozen other employees. Before each of them was a laptop, a mug of coffee, and a few hours’ worth of snacks. “Welcome to the Policy Update War Room,” she said. “And, yes, I’m aware of the irony of calling it a war room when the point is to make Reddit less violent, but it’s too late to change the name.” The job of policing Reddit’s most pernicious content falls primarily to three groups of employees—the community team, the trust-and-safety team, and the anti-evil team—which are sometimes described, respectively, as good cop, bad cop, and RoboCop. Community stays in touch with a cross-section of redditors, asking them for feedback and encouraging them to be on their best behavior. When this fails and redditors break the rules, trust and safety punishes them. Anti-evil, a team of back-end engineers, makes software that flags dodgy-looking content and sends that content to humans, who decide what to do about it.
Ashooh went over the plan for the day. All at once, they would replace the old policy with the new policy, post an announcement explaining the new policy, warn a batch of subreddits that they were probably in violation of the new policy, and ban another batch of subreddits that were flagrantly, irredeemably in violation. I glanced at a spreadsheet with a list of the hundred and nine subreddits that were about to be banned (r/KKK, r/KillAllJews, r/KilltheJews, r/KilltheJoos), followed by the name of the employee who would carry out each deletion, and, if applicable, the reason for the ban (“mostly just swastikas?”). “Today we’re focusing on a lot of Nazi stuff and bestiality stuff,” Ashooh said. “Context matters, of course, and you shouldn’t get in trouble for posting a swastika if it’s a historical photo from the 1936 Olympics, or if you’re using it as a Hindu symbol. But, even so, there’s a lot that’s clear-cut.” I asked whether the same logic—that the Nazi flag was an inherently violent symbol—would apply to the Confederate flag, or the Soviet flag, or the flag under which King Richard fought the Crusades. “We can have those conversations in the future,” Ashooh said. “But we have to start somewhere.”
At 10 a.m., the trust-and-safety team posted the announcement and began the purge. “Thank you for letting me do DylannRoofInnocent,” one employee said. “That was one of the ones I really wanted.”
“What is ReallyWackyTicTacs?” another employee asked, looking down the list. “Trust me, you don’t want to know,” Ashooh said. “That was the most unpleasant shit I’ve ever seen, and I’ve spent a lot of time looking into Syrian war crimes.”
Some of the comments on the announcement were cynical. “They don’t actually want to change anything,” one redditor wrote, arguing that the bans were meant to appease advertisers. “It was, in fact, never about free speech, it was about money.” One trust-and-safety manager, a young woman wearing a leather jacket and a ship captain’s cap, was in charge of monitoring the comments and responding to the most relevant ones. “Everyone seems to be taking it pretty well so far,” she said. “There’s one guy, freespeechwarrior, who seems very pissed, but I guess that makes sense, given his username.” “People are making lists of all the Nazi subs getting banned, but nobody has noticed that we’re banning bestiality ones at the same time,” Ashooh said…“I’m going to get more cheese sticks,” the woman in the captain’s cap said, standing up. “How many cheese sticks is too many in one day? At what point am I encouraging or glorifying violence against my own body?” “It all depends on context,” Ashooh said.
I understood why other companies had been reluctant to let me see something like this. Never again would I be able to read a lofty phrase about a social-media company’s shift in policy—“open and connected,” or “encouraging meaningful interactions”—without imagining a group of people sitting around a conference room, eating free snacks and making fallible decisions. Social networks, no matter how big they get or how familiar they seem, are not ineluctable forces but experimental technologies built by human beings. We can tell ourselves that these human beings aren’t gatekeepers, or that they have cleansed themselves of all bias and emotion, but this would have no relation to reality. “I have biases, like everyone else,” Huffman told me once. “I just work really hard to make sure that they don’t prevent me from doing what’s right.”
When she inserts a key in the padlock, the door swings open to reveal thousands of books, paintings, engravings, photographs and films—all, in one way or another, connected to sex. It was the kinkiest secret in the Soviet Union: across from the Kremlin, the country’s main library held a pornographic treasure trove. Founded by the Bolsheviks as a repository for aristocrats’ erotica, the collection eventually grew to house 12,000 items from around the world, ranging from 18th-century Japanese engravings to Nixon-era romance novels. Off limits to the general public, the collection was always open to top party brass—some of whom are said to have enjoyed visiting. Today, the collection is still something of a secret: there is no complete compendium of its contents and many of them are still not listed in the catalogue.
…One of the most stunning items seized from an unknown owner is The Seven Deadly Sins, an oversized book of engravings self-published in 1918 by Vasily Masyutin, who also illustrated classics by Pushkin and Chekhov. Among its depictions of gluttony is a large woman masturbating with a ghoulish smile. Before the revolution, it was fashionable among the upper classes to assemble so-called knigi dlya dam (Ladies’ Books)—a kind of bawdy scrapbook. An ostentatious leather-bound album with Kniga Dlya Dam embossed in gold on the cover opens to reveal a Chinese silk drawing of an entwined couple. Further on, dozens of engravings show aristocratic duos fornicating in sumptuously upholstered settings…Among Skorodumov’s treasures was a portfolio of drawings and watercolours by the avant-garde titan Mikhail Larionov. Made in the 1910s, they are no less scandalous in today’s Russia. One pencil sketch features a happily panting dog standing in front of a human, who is engaged in much more than petting. A watercolor depicts two soldiers having an intimate encounter on a bench.
…How did Skorodumov amass such a collection when owning a foreign title could result in a Gulag sentence?…There is also a second theory. Stalin’s secret police chief Genrikh Yagoda, a pornography aficionado whose apartment reportedly held a dildo collection, is said to have enjoyed viewing Skorodumov’s holdings. Librarians believe that he personally ensured the latter’s safety….Safely ensconced in the spetskhran, the erotica collection became available for viewing by top Stalinist henchmen. According to legend, they included the mustachioed cavalry officer and civil war hero Semyon Budyonny and grandfatherly Mikhail Kalinin, the longtime figurehead of the Soviet state. “They were supposedly interested in the visual stuff—postcards, photos,” Chestnykh said. A Politburo member did not need a pass: “No one could refuse them.”
He and his wife live in an apartment not far from mine that was originally occupied by his grandfather, who was the Soviet Union’s chief literary censor under Stalin. The most striking thing about the building was, and is, its history. In the nineteen-thirties, during Stalin’s purges, the House of Government earned the ghoulish reputation of having the highest per-capita number of arrests and executions of any apartment building in Moscow. No other address in the city offers such a compelling portal into the world of Soviet-era bureaucratic privilege, and the horror and murder to which this privilege often led….“Why does this house have such a heavy, difficult aura?” he said. “This is why: on the one hand, its residents lived like a new class of nobility, and on the other they knew that at any second they could get their guts ripped out.”
…This is the opening argument of a magisterial new book by Yuri Slezkine, a Soviet-born historian who immigrated to the United States in 1983, and has been a professor at the University of California, Berkeley, for many years. His book, The House of Government, is a 1200-page epic that recounts the multigenerational story of the famed building and its inhabitants—and, at least as interesting, the rise and fall of Bolshevist faith. In Slezkine’s telling, the Bolsheviks were essentially a millenarian cult, a small tribe radically opposed to a corrupt world. With Lenin’s urging, they sought to bring about the promised revolution, or revelation, which would give rise to a more noble and just era. Of course, that didn’t happen. Slezkine’s book is a tale of “failed prophecy,” and the building itself—my home for the past several years—is “a place where revolutionaries came home and the revolution went to die.”…The Soviet Union had experienced two revolutions, Lenin’s and Stalin’s, and yet, in the lofty imagery of Slezkine, the “world does not end, the blue bird does not return, love does not reveal itself in all of its profound tenderness and charity, and death and mourning and crying and pain do not disappear.” What to do then? The answer was human sacrifice, “one of history’s oldest locomotives,” Slezkine writes. The “more intense the expectation, the more implacable the enemies; the more implacable the enemies, the greater the need for internal cohesion; the greater the need for internal cohesion, the more urgent the search for scapegoats.” Soon, in Stalin’s Soviet Union, the purges began.
…N.K.V.D. agents would sometimes use the garbage chutes that ran like large tubes through many apartments, popping out inside a suspect’s home without having to knock on the door. After a perfunctory trial, which could last all of three to five minutes, prisoners were taken to the left or to the right: imprisonment or execution. “Most House of Government leaseholders were taken to the right,” Slezkine writes…eight hundred residents of the House of Government were arrested or evicted during the purges, thirty per cent of the building’s population. Three hundred and forty-four were shot…Before long, the arrests spread from the tenants to their nannies, guards, laundresses, and stairwell cleaners. The commandant of the house was arrested as an enemy of the people, and so was the head of the Communist Party’s housekeeping department…“He felt a premonition,” she said. “He was always waiting, never sleeping at night.” One evening, Malyshev heard footsteps coming up the corridor—and dropped dead of a heart attack. In a way, his death saved the family: there was no arrest, and thus no reason to kick his relatives out of the apartment.
…One of Volin’s brothers was…called back, arrested, and shot. One of Volin’s sisters was married to an N.K.V.D. officer, and they lived in the House of Government, in a nearby apartment. When the husband’s colleagues came to arrest him, he jumped out of the apartment window to his death. Volin, I learned, kept a suitcase packed with warm clothes behind the couch, ready in case of arrest and sentence to the Gulag…They gave their daughter, Tolya’s mother, a peculiar set of instructions. Every day after school, she was to take the elevator to the ninth floor—not the eighth, where the family lived—and look down the stairwell. If she saw an N.K.V.D. agent outside the apartment, she was supposed to get back on the elevator, go downstairs, and run to a friend’s house.
Amidst the social and political turmoil of the 1970s, a handful of women—among them a onetime Barnard student, a Texas sorority sister, the daughter of a former communist journalist—joined and became leaders of the May 19th Communist Organization. Named to honor the shared birthday of civil rights icon Malcolm X and Vietnamese leader Ho Chi Minh, M19 took its belief in “revolutionary anti-imperialism” to violent extremes: It is “the first and only women-created and women-led terrorist group,” says national security expert and historian William Rosenau.
M19’s status as an “incredible outlier” from male-led terrorist organizations prompted Rosenau, an international security fellow at the think tank New America, to excavate the inner workings of the secretive and short-lived militant group. The resulting book, Tonight We Bombed the Capitol, pieces together the unfamiliar story of “a group of essentially middle-class, well educated, white people who made a journey essentially from anti-war and civil rights protest to terrorism,” he says.
…Eventually, M19 turned to building explosives themselves. Just before 11 p.m. on November 7, 1983, they called the U.S. Capitol switchboard and warned them to evacuate the building. Ten minutes later, a bomb detonated in the building’s north wing, harming no one but blasting a 15-foot gash in a wall and causing $1 million in damage. Over the course of a 20-month span in 1983 and 1984, M19 also bombed an FBI office, the Israel Aircraft Industries building, and the South African consulate in New York, D.C.’s Fort McNair and Navy Yard (which they hit twice). The attacks tended to follow a similar pattern: a warning call to clear the area, an explosion, a pre-recorded message to media railing against U.S. imperialism or the war machine under various organizational aliases (never using the name M19)…As M19’s spree turned more and more violent, M19’s members became ever more insular and paranoid, nearly cultish, living communally and rotating through aliases and disguises until, in 1985, law enforcement captured the group’s most devoted lieutenants. After that, Rosenau writes, “The far-left terrorist project that began with the Weathermen…and continued into the mid-1980s with May 19th ended in abject failure.”
…People talk about polarization now, but just look at the early 1970s where literally thousands of bombs were set off per year. The important thing is just to realize that there are some similarities, but these are very different periods in time and each period of time is unique.
The May 19th Communist Organization was a US-based terrorist organization formed by members of the Weather Underground Organization. The group was originally known as the New York chapter of the Prairie Fire Organizing Committee (PFOC), an organization devoted to legally promoting the causes of the Weather Underground. This was part of the Prairie Fire Manifesto's change in Weather Underground Organization strategy, which demanded both aboveground mass movements and clandestine organizations. The role of the clandestine organization would be to build the "consciousness of action" and prepare the way for the development of a people's militia. Concurrently, the role of the mass movement would include support for, and encouragement of, armed action. Such an alliance would, according to Weather, "help create the 'sea' for the guerrillas to swim in." The M19CO name was derived from the birthdays of Ho Chi Minh and Malcolm X. The May 19 Communist Organization was active from 1978 to 1985. M19CO was a combination of the Black Liberation Army and the Weather Underground. It also included members of the Black Panthers and the Republic of New Africa (RNA).
‘On the short life and violent death of French mathematical prodigy Évariste Galois, who, “when he wasn’t trying to overthrow the government, was reinventing algebra”. He mastered the entirety of contemporary mathematics while still at school, made fundamental advances in group theory at the age of 17—then took to drink, insulted his examiners, joined the National Guard, declared his desire to kill the king, spent eight months in jail, fell in love, lost a duel, and died in 1832 at the age of twenty.’
Évariste Galois was a French mathematician and political activist. While still in his teens, he was able to determine a necessary and sufficient condition for a polynomial to be solvable by radicals, thereby solving a problem standing for 350 years. His work laid the foundations for Galois theory and group theory, two major branches of abstract algebra, and the subfield of Galois connections. He died at age 20 from wounds suffered in a duel.
If you’ve spent any time thinking about complex systems, you surely understand the importance of networks. Networks rule our world. From the chemical reaction pathways inside a cell, to the web of relationships in an ecosystem, to the trade and political networks that shape the course of history. Or consider this very post you’re reading. You probably found it on a social network, downloaded it from a computer network, and are currently deciphering it with your neural network.
But as much as I’ve thought about networks over the years, I didn’t appreciate (until very recently) the importance of simple diffusion. This is our topic for today: the way things move and spread, somewhat chaotically, across a network. Some examples to whet the appetite:
Infectious diseases jumping from host to host within a population
Memes spreading across a follower graph on social media
A wildfire breaking out across a landscape
Ideas and practices diffusing through a culture
Neutrons cascading through a hunk of enriched uranium
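All of these processes share a common skeleton: something spreads stochastically along the edges of a graph. As a minimal sketch (not the essay's own interactive simulation — the graph size, degree, and transmission probability here are invented for illustration), a diffusion process on a random network can be written in a few lines of Python:

```python
import random

def diffuse(n=200, k=4, p_transmit=0.3, seed=0):
    """Simulate diffusion on a random graph: each node links to k
    random others, and an 'infection' starting at node 0 spreads
    along each edge with probability p_transmit."""
    rng = random.Random(seed)
    # build an undirected random graph
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in rng.sample(range(n), k):
            if j != i:
                neighbors[i].add(j)
                neighbors[j].add(i)
    infected = {0}
    frontier = [0]
    # breadth-first spread: each newly infected node gets one
    # chance to transmit to each of its neighbors
    while frontier:
        nxt = []
        for node in frontier:
            for nb in neighbors[node]:
                if nb not in infected and rng.random() < p_transmit:
                    infected.add(nb)
                    nxt.append(nb)
        frontier = nxt
    return len(infected)

# Low transmission → outbreaks fizzle; high transmission → they
# percolate through most of the network, with a sharp threshold
# in between (the 'critical' regime such essays explore).
small_outbreak = diffuse(p_transmit=0.05)
large_outbreak = diffuse(p_transmit=0.5)
```

Sweeping `p_transmit` with sliders is essentially what the interactive version of such an essay lets the reader do by hand.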
A quick note about form. Unlike all my previous work, this essay is interactive. There will be sliders to pull, buttons to push, and things that dance around on the screen. I’m pretty excited about this, and I hope you are too.
An explorable explanation is a form of informative media where an interactive computer simulation of a given concept is presented, along with some form of guidance that suggests ways that the audience can learn from the simulation. Explorable explanations encourage users to discover things about the concept for themselves, and test their expectations of its behaviour against its actual behaviour, promoting a more active form of learning than reading or listening.
Nonpharmaceutical interventions (NPIs) intended to reduce infectious contacts between persons form an integral part of plans to mitigate the impact of the next influenza pandemic. Although the potential benefits of NPIs are supported by mathematical models, the historical evidence for the impact of such interventions in past pandemics has not been systematically examined. We obtained data on the timing of 19 classes of NPI in 17 U.S. cities during the 1918 pandemic and tested the hypothesis that early implementation of multiple interventions was associated with reduced disease transmission. Consistent with this hypothesis, cities in which multiple interventions were implemented at an early phase of the epidemic had peak death rates ≈50% lower than those that did not and had less-steep epidemic curves. Cities in which multiple interventions were implemented at an early phase of the epidemic also showed a trend toward lower cumulative excess mortality, but the difference was smaller (≈20%) and less statistically significant than that for peak death rates. This finding was not unexpected, given that few cities maintained NPIs longer than 6 weeks in 1918. Early implementation of certain interventions, including closure of schools, churches, and theaters, was associated with lower peak death rates, but no single intervention showed an association with improved aggregate outcomes for the 1918 phase of the pandemic. These findings support the hypothesis that rapid implementation of multiple NPIs can significantly reduce influenza transmission, but that viral spread will be renewed upon relaxation of such measures.
…In comparisons across cities (Fig. 2 a, Table 2), we found that aggressive early intervention was significantly associated with a lower peak of excess mortality (Spearman ρ = −0.49 to −0.68, p = 0.002–0.047; see Table 2, Number of interventions before, for the number of NPIs before a given CEPID [cumulative excess pneumonia & influenza deaths] cutoff vs. peak mortality). Cities that implemented three or fewer NPIs before 20/100,000 CEPID had a median peak weekly death rate of 146/100,000, compared with 65/100,000 in those implementing four or more NPIs by that time (Fig. 2 a, p = 0.005). The relationship was similar for normalized peak death rates and for a range of possible cutoffs (see Table 2, CEPID at time of intervention), although the relationship became weaker as later interventions were included. Cities with more early NPIs also had fewer total excess deaths during the study period (Fig. 2 b, Table 2, 1918 total), but this association was weaker: cities with three or fewer NPIs before CEPID = 20/100,000 experienced a median total excess death rate of 551/100,000, compared with a median rate of 405/100,000 in cities with four or more NPIs (p = 0.03).
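The cross-city comparison above rests on a rank correlation: Spearman's ρ is just the Pearson correlation of the two variables' ranks, so it only asks whether more early NPIs go with monotonically lower peak mortality. A toy illustration (the city numbers below are invented for the example, not the paper's data) in plain Python:

```python
def ranks(xs):
    """1-based ranks, averaging over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# hypothetical cities: (early NPI count, peak weekly deaths /100k)
npis  = [1, 2, 3, 4, 5, 6]
peaks = [150, 140, 130, 80, 70, 60]
rho = spearman(npis, peaks)  # → -1.0: perfectly monotone decreasing
```

A negative ρ in the −0.49 to −0.68 range, as the paper reports, indicates a moderately strong (but not perfectly monotone) version of this relationship.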
When the Spanish flu reached the United States in the summer of 1918, it seemed to confine itself to military camps. But when it arrived in Philadelphia in September, it struck with a vengeance. By the time officials there grasped the threat of the virus, it was too late. The disease was rampaging through the population, partly because the city had allowed large public gatherings, including a citywide parade in support of a World War I loan drive, to go on as planned. In four months, more than 12,000 Philadelphians died, an excess death rate of 719 people for every 100,000 inhabitants.
The story was quite different in St. Louis. Two weeks before Philadelphia officials began to react, doctors in St. Louis persuaded the city to require that influenza cases be registered with the health department. And two days after the first civilian cases, police officers helped the department enforce a shutdown of schools, churches and other gathering places. Infected people were quarantined in their homes.
Excess deaths in St. Louis were 347 per 100,000 people, less than half the rate in Philadelphia. Early action appeared to have saved thousands of lives.
…Dr. Hatchett, who is a researcher at the National Institutes of Health, said the findings might hold lessons for the 21st century. “When multiple interventions were introduced early, they were very effective in 1918,” he said, “and that certainly offers hope that they would be similarly useful in an epidemic today if we didn’t have an effective vaccine.”
…What these results mean for a future epidemic is not clear. “If avian flu became a pandemic tomorrow,” Dr. Ferguson said, “we would start a crash program to make a vaccine.” But he added that rigid preventive measures like quarantines, mandated mask wearing and widespread business closings would still need to be put in place. “What our study shows,” he continued, “is that interventions even without a vaccine can be effective in blocking transmission. What’s much less certain is whether society is prepared to bear the costs of implementing such intrusive and costly measures for the months that would be required to manufacture a vaccine.”
This paper uses the 1918 influenza pandemic as a natural experiment for testing the fetal origins hypothesis. The pandemic arrived unexpectedly in the fall of 1918 and had largely subsided by January 1919, generating sharp predictions for long-term effects. Data from the 1960–80 decennial U.S. Census indicate that cohorts in utero during the pandemic displayed reduced educational attainment, increased rates of physical disability, lower income, lower socioeconomic status, and higher transfer payments compared with other birth cohorts. These results indicate that investments in fetal health can increase human capital.
The 1918 influenza pandemic struck the United States most ferociously in October of 1918 and then over the next four months killed more people than all the US combat deaths of the 20th century. The sudden nature of the pandemic meant that children born just months apart experienced very different conditions in utero. In particular, children born in 1919 were much more exposed to influenza in utero than children born in 1918 or 1920. The sudden differential exposure to the 1918 flu lets Douglas Almond test for long-term effects in Is the 1918 Influenza Pandemic Over?
Almond finds large effects many decades after exposure.
Fetal health is found to affect nearly every socioeconomic outcome recorded in the 1960, 1970, and 1980 Censuses. Men and women show large and discontinuous reductions in educational attainment if they had been in utero during the pandemic. The children of infected mothers were up to 15% less likely to graduate from high school. Wages of men were 5–9% lower because of infection. Socioeconomic status…was substantially reduced, and the likelihood of being poor rose as much as 15% compared with other cohorts. Public entitlement spending was also increased.
…male disability rates in 1980, i.e. for males around the age of 60, by year and quarter of birth. Cohorts born between January and September of 1919 “were in utero at the height of the pandemic and are estimated to have 20% higher disability rates at age 61…”.
Figure 3 shows average years of schooling in 1960; once again the decline is clear for those born in 1918. Note that not all pregnant women contracted influenza, so the actual effects of influenza exposure are larger: about a 5-month decline in education, mostly coming through lower graduation rates.
…a phenomenon recognized 2500 years ago by Hippocrates and Thucydides: Many infectious diseases are more common during specific seasons. “It’s a very old question, but it’s not very well studied,” Martinez says. It’s also a question that has suddenly become more pressing because of the emergence of COVID-19. With SARS-CoV-2, the virus that causes the disease, now infecting more than 135,000 people around the globe, some hope it might mimic influenza and abate as summer arrives in temperate regions of the Northern Hemisphere, where about half of the world’s population lives….Different diseases have different patterns. Some peak in early or late winter, others in spring, summer, or fall…At least 68 infectious diseases are seasonal, according to a 2018 paper by Micaela Martinez of Columbia University…Some diseases have different seasonal peaks depending on latitude. And many have no seasonal cycle at all. Even for well-known seasonal diseases, it’s not clear why they wax and wane during the calendar year. “It’s an absolute swine of a field,” says Andrew Loudon, a chronobiologist at the University of Manchester. Investigating a hypothesis over several seasons can take 2 or 3 years. “Postdocs can only get one experiment done and it can be a career killer,” Loudon says. The field is also plagued by confounding variables. “All kinds of things are seasonal, like Christmas shopping,” says epidemiologist Scott Dowell, who heads vaccine development and surveillance at the Bill and Melinda Gates Foundation and in 2001 wrote a widely cited perspective that inspired Martinez’s current study. And it’s easy to be misled by spurious correlations, Dowell says.
Despite the obstacles, researchers are testing a multitude of theories. Many focus on the relationships between the pathogen, the environment, and human behavior. Influenza, for example, might do better in winter because of factors such as humidity, temperature, people being closer together, or changes in diets and vitamin D levels. Martinez is studying another theory, which Dowell’s paper posited but didn’t test: The human immune system may change with the seasons, becoming more resistant or more susceptible to different infections based on how much light our bodies experience.
…Except in the equatorial regions, respiratory syncytial virus (RSV) is a winter disease, Martinez wrote, but chickenpox favors the spring. Rotavirus peaks in December or January in the U.S. Southwest, but in April and May in the Northeast. Genital herpes surges all over the country in the spring and summer, whereas tetanus favors midsummer; gonorrhea takes off in the summer and fall, and pertussis has a higher incidence from June through October. Syphilis does well in winter in China, but typhoid fever spikes there in July. Hepatitis C peaks in winter in India but in spring or summer in Egypt, China, and Mexico. Dry seasons are linked to Guinea worm disease and Lassa fever in Nigeria and hepatitis A in Brazil.
Seasonality is easiest to understand for diseases spread by insects that thrive during rainy seasons, such as African sleeping sickness, chikungunya, dengue, and river blindness. For most other infections, there’s little rhyme or reason to the timing. “What’s really amazing to me is that you can find a virus that peaks in almost every month of the year in the same environment in the same location,” says Neal Nathanson, an emeritus virologist at the University of Pennsylvania Perelman School of Medicine. “That’s really crazy if you think about it.” To Nathanson, this variation suggests human activity—such as children returning to school or people huddling indoors in cold weather—doesn’t drive seasonality. “Most viruses get transmitted between kids, and under those circumstances, you’d expect most of the viruses to be in sync,” he says.
…A 2018 study in Scientific Reports supports the idea. Virologist Sandeep Ramalingam at the University of Edinburgh and his colleagues analyzed the presence and seasonality of nine viruses—some enveloped, some not—in more than 36,000 respiratory samples taken over 6.5 years from people who sought medical care in their region. “Enveloped viruses have a very, very definite seasonality,” Ramalingam says.
RSV and human metapneumovirus both have an envelope, like the flu, and peak during the winter months. None of the three are present for more than one-third of the year. Rhinoviruses, the best-known cause of the common cold, lack an envelope and—ironically—have no particular affinity for cold weather: The study found them in respiratory samples on 84.7% of the days of the year and showed that they peak when children return to school from summer and spring holidays. Adenoviruses, another set of cold viruses, also lack an envelope and had a similar pattern, circulating over half the year. Ramalingam’s team also studied the relationship between viral abundance and daily weather changes. Influenza and RSV both did best when the change in relative humidity over a 24-hour period was lower than the average (a 25% difference). “There’s something about the lipid envelope that’s more fragile” when the humidity changes sharply, Ramalingam concludes.
Seasonal cyclicity is a ubiquitous feature of acute infectious diseases and may be a ubiquitous feature of human infectious diseases in general, as illustrated in Tables 1–4. Each acute infectious disease has its own seasonal window of occurrence, which, importantly, may vary among geographic locations and differ from other diseases within the same location. Here we explore the concept of an epidemic calendar, which is the idea that seasonality is a unifying feature of epidemic-prone diseases and, in the absence of control measures, the local calendar can be marked by epidemics (Fig 1). A well-known example of a calendar marked by epidemics is that of the Northern Hemisphere, where influenza outbreaks occur each winter [2, 3] (hence the colloquial reference to winter as “the flu season”). In contrast, chickenpox outbreaks peak each spring [4, 5], and polio transmission historically occurred each summer.
…In the broadest sense, seasonal drivers can be separated into four categories: (1) environmental factors, (2) host behavior, (3) host phenology, and (4) exogenous biotic factors. These seasonal drivers may enter into disease transmission dynamics by way of hosts, reservoirs, and/or vectors. In surveying the literature to gauge the breadth of seasonal drivers acting upon human infectious disease systems (Tables 1–4), specific seasonal drivers were found to include (a) vector seasonality, (b) seasonality in nonhuman animal hosts (i.e., livestock, other domestic animals, or wildlife), (c) seasonal climate (e.g., temperature, precipitation, etc.), (d) seasonal nonclimatic abiotic environment (e.g., water salinity), (e) seasonal co-infection, (f) seasonal exposure and/or behavior and/or contact rate, (g) seasonal biotic environment (e.g., algal density in waterbodies), and (h) seasonal flare-ups/symptoms and/or remission/latency.
Numerous viruses can cause upper respiratory tract infections. They often precede serious lower respiratory tract infections. Each virus has a seasonal pattern, with peaks in activity in different seasons. We examined the effects of daily local meteorological data (temperature, relative humidity, “humidity-range” and dew point) from Edinburgh, Scotland on the seasonal variations in viral transmission. We identified the seasonality of rhinovirus, adenovirus, influenza A and B viruses, human parainfluenza viruses 1–3 (HPIV), respiratory syncytial virus (RSV) and human metapneumovirus (HMPV) from the 52,060 respiratory samples tested between 2009 and 2015 and then confirmed the same by a generalised linear model. We also investigated the relationship between meteorological factors and viral seasonality. Non-enveloped viruses were present throughout the year. Following logistic regression, adenovirus, influenza viruses A and B, RSV and HMPV preferred low temperatures; RSV and influenza A virus preferred a narrow “humidity-range” and HPIV type 3 preferred the season with lower humidity. A change (i.e. increase or decrease) in specific meteorological factors is associated with an increase in activity of specific viruses at certain times of the year.
Rationale: Is it possible to have a psychedelic experience from a placebo alone? Most psychedelic studies find few effects in the placebo control group, yet these effects may have been obscured by the study design, setting, or analysis decisions.
Objective: We examined individual variation in placebo effects in a naturalistic environment resembling a typical psychedelic party.
Methods: 33 students completed a single-arm study ostensibly examining how a psychedelic drug affects creativity. The 4-h study took place in a group setting with music, paintings, coloured lights, and visual projections. Participants consumed a placebo that we described as a drug resembling psilocybin, which is found in psychedelic mushrooms. To boost expectations, confederates subtly acted out the stated effects of the drug and participants were led to believe that there was no placebo control group. The participants later completed the 5-Dimensional Altered States of Consciousness Rating Scale, which measures changes in conscious experience.
Results: There was considerable individual variation in the placebo effects; many participants reported no changes while others showed effects with magnitudes typically associated with moderate or high doses of psilocybin. In addition, the majority (61%) of participants verbally reported some effect of the drug. Several stated that they saw the paintings on the walls “move” or “reshape” themselves, others felt “heavy…as if gravity [had] a stronger hold”, and one had a “come down” before another “wave” hit her.
Conclusion: Understanding how context and expectations promote psychedelic-like effects, even without the drug, will help researchers to isolate drug effects and clinicians to maximise their therapeutic potential.
…In the second sample, before the debriefing, we asked participants to guess whether they had taken a psychedelic, a placebo, or whether they were uncertain. Overall, 35% reported being certain they had taken a placebo, 12% were certain that they had taken a psychedelic, and the rest (53%) were uncertain. In the first sample, we did not ask this question, but the same number of people spontaneously reported being certain that they had taken a psychedelic drug. During the debriefing, when we revealed the placebo nature of the study, many participants appeared shocked. Several gasped and started laughing. One stated, “It’s very funny!”, and another replied, “It’s sad!” One of the participants who had sat with a group near the paintings throughout the study asked, “So we were all sober and just watching these paintings for 45 minutes‽”
[“This is a remarkable study, and probably the most elaborate placebo ever reported. But how well did the trick work? The authors say that after they revealed the truth, some of the participants expressed shock. However, 35% of them said they were “certain” they had taken a placebo when quizzed just before the debriefing. Only 12% were “certain” that they’d taken a real psychedelic drug, which suggests that the deception was only partially successful.
Some of the participants did report very strong effects on a questionnaire of ‘psychedelic effects’. However, I noticed that the effects reported tended to be the more abstract kind, such as “insight” and “bliss”. In terms of actual hallucinogenic effects like ‘complex imagery’ and ‘elementary imagery’ (i.e. seeing things), no participants reported effects equal to even a low dose of LSD, let alone a stronger dose. See the rather confusing Figure 2 for details." —Neuroskeptic]
Research in educational psychology consistently finds a relationship between intelligence and academic performance. However, in recent decades, educational fields, including gifted education, have resisted intelligence research, and there are some experts who argue that intelligence tests should not be used in identifying giftedness. Hoping to better understand this resistance to intelligence research, we created a survey of beliefs about intelligence and administered it online to a sample of the general public and a sample of teachers. We found that there are conflicts between currently accepted intelligence theory and beliefs from the American public and teachers, which has important consequences for gifted education, educational policy, and the effectiveness of interventions.
[On the underappreciated cunning and escape artistry of orangutans. Despite seeming harmless and having less of a reputation for intelligence than chimpanzees, they are just as dangerous (often deceptively calm until the instant they attack) and baffle their zookeepers with their escapes.
Orangutans must be captured as infants because adults are too uncooperative. Captive orangutans nevertheless will unscrew bolts and nuts, throw rocks to break glass windows, trick people into waving so they can grab their hand and climb out, avoid any escape attempts when zookeepers are watching (even when the zookeepers are ‘undercover’ as visitors) unless they can take advantage of the zookeepers watching another orangutan, construct ladders out of branches or steal workers’ tools & hide them for later, and cooperate in using them to escape (eg a pair using a stolen mop handle, one steadying it). Skilled climbers, they can find the most invisible holds, climb up edges using purely finger pressure, and can even shimmy up parallel walls like a human climber; when bringing in expert climbers to find and remove possible routes, the orangutans must be kept out of sight, lest they learn new routes. If a nylon net bars them, they will spend months patiently unraveling it. If electrified wires are added, they will learn to test the wires regularly and wait for an opportunity. One orangutan learned to defeat the wires by grounding them with wooden sticks (others used rubber tires) and climbing over on the porcelain insulators. “Fu Manchu” hid a strip of metal in his mouth to pick open the lock on his door, while “Jonathan” used “a slab of cardboard in order to release himself through a complex guillotine door.”
The San Diego Zoo in 1989 spent $45k crafting an orangutan exhibit with all this in mind to make it inescapable. An orangutan escaped 4 years later.]
Orangutans are great apes native to Indonesia and Malaysia. They are found in the rainforests of Borneo and Sumatra, but during the Pleistocene they ranged throughout Southeast Asia and South China. Classified in the genus Pongo, orangutans were originally considered to be one species. From 1996, they were divided into two species: the Bornean orangutan and the Sumatran orangutan. In 2017, a third species, the Tapanuli orangutan, was identified. The orangutans are the only surviving species of the subfamily Ponginae, who split from humans, chimpanzees and gorillas 19.3 to 15.7 million years ago (mya).
Ken Allen was a Bornean orangutan at the San Diego Zoo. He became one of the most popular animals in the history of the zoo because of his many successful escapes from his enclosures. He was nicknamed "the Hairy Houdini".
[Tracing the history of Internet domain names and WWW URLs from ARPAnet’s need for emails to the present, and explaining how we got our confusing mishmash of Unix-style paths & URIs, and why URLs like google.com. are valid, with digressions into hacks like Punycode for representing non-English domains and query strings for turning a system for serving HTML documents into a system for arbitrary APIs/RPCs.]
[Why is web programming so screwed up? A highly-opinionated history of how worse-is-better played out online from 1995 to now, by a programmer who started writing HTML ~1996 and has seen the evolution of it all up close: HTML was never designed to support even 1% of the things it is expected to do, requiring gruesome workarounds like tables for positioning anything or using images for rounded corners, and has been constantly extended with ad hoc and poorly-thought-through capabilities, sabotaged further by the exigencies of history like the ‘browser wars’ between Netscape & Microsoft, and then Microsoft simply killing Internet Explorer (IE) development for several years after achieving a near-total global monopoly. With a vast amount of work, HTML/CSS can now support many desirable web pages, but the historical legacy continues to live on, in the use of now-obsolete workarounds, features which no one uses, strange inconsistencies & limitations, etc.]
I have been curious about data compression and the Zip file format in particular for a long time. At some point I decided to address that by learning how it works and writing my own Zip program. The implementation turned into an exciting programming exercise; there is great pleasure to be had from creating a well oiled machine that takes data apart, jumbles its bits into a more efficient representation, and puts it all back together again. Hopefully it is interesting to read about too.
This article explains how the Zip file format and its compression scheme work in great detail: LZ77 compression, Huffman coding, Deflate and all. It tells some of the history, and provides a reasonably efficient example implementation written from scratch in C…It is fascinating how the evolution of technology is both fast and slow. The Zip format was created 30 years ago based on technology from the fifties and seventies, and while much has changed since then, Zip files are essentially the same and more prevalent than ever. I think it is useful to have a good understanding of how they work.
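The Deflate scheme the article implements from scratch is the same one exposed by most standard libraries today; as a minimal sketch (using only Python’s stdlib `zlib`, not the article’s C code), here is a round trip through a raw Deflate stream — the header-less form that Zip archives embed, selected by a negative window-bits value:

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 20

# Zip local-file entries store a raw Deflate stream (no zlib header or
# Adler-32 trailer); wbits=-15 asks zlib for exactly that format.
comp = zlib.compressobj(level=9, wbits=-15)
deflated = comp.compress(data) + comp.flush()

decomp = zlib.decompressobj(wbits=-15)
restored = decomp.decompress(deflated)

assert restored == data
print(f"{len(data)} bytes -> {len(deflated)} bytes")
```

The repetitive input compresses well because LZ77 replaces each repeat of the phrase with a short back-reference, and Huffman coding then shrinks the literals and match codes further — the two halves of Deflate the article walks through.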
Google-Wide Profiling (GWP), a continuous profiling infrastructure for data centers, provides performance insights for cloud applications. With negligible overhead, GWP provides stable, accurate profiles and a datacenter-scale tool for traditional performance analyses. Furthermore, GWP introduces novel applications of its profiles, such as application-platform affinity measurements and identification of platform-specific, microarchitectural peculiarities.
We address the dual problems of novel view synthesis and environment reconstruction from hand-held RGBD sensors. Our contributions include 1) modeling highly specular objects, 2) modeling inter-reflections and Fresnel effects, and 3) enabling surface light field reconstruction with the same input needed to reconstruct shape alone. In cases where scene surface has a strong mirror-like material component, we generate highly detailed environment images, revealing room composition, objects, people, buildings, and trees visible through windows. Our approach yields state-of-the-art view synthesis techniques, operates on low dynamic range imagery, and is robust to geometric and calibration errors.
Technically speaking, the researchers didn’t actually use chips; they reconstructed a room using a Korean brand of chocolate-dipped corn puffs called Corn Cho. But whether it’s corn puffs or potato chips, the snack bag acts like a bad, warped mirror. A heavily-distorted reflection of the room is contained in the glint of light that bounces off the bag, and the team developed an algorithm that unwarps that glint into a blurry but recognizable image. In one instance, the researchers were able to resolve the silhouette of a man standing in front of a window. In another, the bag reflections allowed them to see through a window to the house across the street clearly enough to count how many stories it had. The algorithm works on a variety of glossy objects—the shinier, the better. Using the sheen of a porcelain cat, for example, they could also reconstruct the layout of the surrounding ceiling lights.
…To reconstruct the environment, the researchers used a handheld color video camera with a depth sensor that roughly detects the shape and distance of the shiny objects. They filmed these objects for about a minute, capturing their reflections from a variety of perspectives. Then, they used a machine learning algorithm to reconstruct the surroundings, which took on the order of two hours per object. Their reconstructions are remarkably accurate considering the relatively small amount of data that they used to train the algorithm, says computer scientist Abe Davis of Cornell University, who was not involved with the work.
The researchers could achieve this accuracy with so little training data, in part, because they incorporate some physics concepts in their reconstruction algorithm—the difference between how light bounces off shiny surfaces versus matte surfaces, for example. This differs from typical online image recognition tools in use today, which simply look for patterns in images without any extra scientific information. However, researchers have also found that too much physics in an algorithm can cause the machine to make more mistakes, as its processing strategies become too rigid. “They do a good job of balancing physical insights with modern machine learning tools,” says Davis.
…However, some experts caution that future versions of the technology are ripe for abuse. For example, it could enable stalkers or child abusers, says ethicist Jacob Metcalf of Data & Society, a nonprofit research center that focuses on the social implications of emerging technologies. A stalker could download images off of Instagram without the creators’ consent, and if those images contained shiny surfaces, they could deploy the algorithm to try to reconstruct their surroundings and infer private information about that person. “You better believe that there are a lot of people who will use a Python package to scrape photos off Instagram,” says Metcalf. “They could find a photo of a celebrity or of a kid that has a reflective surface and try to do something.”
These formulas have turned an obscure idea that Galanis and his college buddies had a few years ago about making more money for second-rate celebs into a thriving two-sided marketplace that has caught the attention of VCs, Hollywood, and professional sports. In June, Cameo raised $50 million in Series B funding, led by Kleiner Perkins (which recently began funding more early stage startups) to boost marketing, expand into international markets, and staff up to meet the growing demand. In the past 15 months, Cameo has gone from 20 to 125 employees, and moved from an 825-square-foot home base in the 1871 technology incubator into its current 6,000-square-foot digs in Chicago’s popping West Loop. Cameo customers have purchased more than 560,000 videos from some 20,000 celebs and counting, including ’80s star Steve Guttenberg and sports legend Kareem Abdul-Jabbar. And now, when the masses find themselves in quarantined isolation—looking for levity, distractions, and any semblance of the human touch—sending each other personalized videograms from the semi-famous has never seemed like a more pitch-perfect offering.
The product itself is as simple as it is improbable. For a price the celeb sets—anywhere from $5 to $2,500—famous people record video shout-outs, aka “Cameos,” that run for a couple of minutes, and then are delivered via text or email. Most Cameo videos are booked as private birthday or anniversary gifts, but a few have gone viral on social media. Even if you don’t know Cameo by name, there’s a good chance you caught Bam Margera of MTV’s Jackass delivering an “I quit” message on behalf of a disgruntled employee, or Sugar Ray’s Mark McGrath dumping some poor dude on behalf of the guy’s girlfriend. (Don’t feel too bad for the dumpee; the whole thing was a joke.)
…Back at the whiteboard, Galanis takes a marker and sketches out a graph of how fame works on his platform. “Imagine the grid represents all the celebrity talent in the world,” he says, “which by our definition, we peg at 5 million people.” The X-axis is willingness; the Y-axis is fame. “Say LeBron is at the top of the X-axis, and I’m at the bottom,” he says. On the willingness side, Galanis puts notoriously media-averse Seattle Seahawks running back Marshawn Lynch on the far left end. At the opposite end, he slots chatty celebrity blogger-turned-Cameo-workhorse Perez Hilton, of whom Galanis says, “I promise if you booked him right now, the video would be done before we leave this room.”
…“The contrarian bet we made was that it would be way better for us to have people with small, loyal followings, often unknown to the general population, but who were willing to charge $5 to $10,” Galanis says. Cameo would employ a revenue-sharing model, getting a 25% cut of each video, while the rest went to the celeb. They wanted people like Galanis’ co-founder (and former Duke classmate) Devon Townsend, who had built a small following making silly Vine videos of his travels with pal Cody Ko, a popular YouTuber. “Devon isn’t Justin Bieber, but he had 25,000 Instagram followers from his days as a goofy Vine star,” explains Galanis. “He originally charged a couple bucks, and the people who love him responded, ‘Best money I ever spent!’”
…After a customer books a Cameo, the celeb films the video via the startup’s app within four to seven days. Most videos typically come in at under a minute, though some talent indulges in extensive riffs. (Inexplicably, “plant-based activist and health coach” Courtney Anne Feldman, wife of Corey, once went on for more than 20 minutes in a video for a customer.) Cameo handles the setup, technical infrastructure, marketing, and support, with white-glove service for the biggest earners with “whatever they need”—details like help pronouncing a customer’s name or just making sure they aren’t getting burned-out doing so many video shout-outs.
…For famous people of any caliber—the washed-up, the obscure micro-celebrity, the actual rock star—becoming part of the supply side of the Cameo marketplace is as low a barrier as it gets. Set a price and go. The videos are short—Instagram comedian Evan Breen has been known to knock out more than 100 at $25 a pop in a single sitting—and they don’t typically require any special preparation. Hair, makeup, wardrobe, or even handlers aren’t necessary. In fact, part of the oddball authenticity of Cameo videos is that they have a take-me-as-I-am familiarity—filmed at breakfast tables, lying in bed, on the golf course, running errands, at a stoplight, wherever it fits into the schedule.
Cameo is an American video-sharing website headquartered in Chicago. Cameo was created in 2016 by Steven Galanis, Martin Blencowe, and Devon Spinnler Townsend. The site allows celebrities to send personalized video messages to fans. As of May 2020, more than 30,000 celebrities have joined the platform.
[Meditation on what drives social networks like Instagram: status and signaling. A social network provides a way for monkeys to create and ascend status hierarchies, and a new social network can bootstrap and succeed by offering a new way to do that.]
Let’s begin with two principles:
People are status-seeking monkeys
People seek out the most efficient path to maximizing social capital
…we can start to demystify social networks if we also think of them as SaaS businesses, but instead of software, they provide status.
Almost every social network of note had an early signature proof of work hurdle. For Facebook it was posting some witty text-based status update. For Instagram, it was posting an interesting square photo. For Vine, an entertaining 6-second video. For Twitter, it was writing an amusing bit of text of 140 characters or fewer. Pinterest? Pinning a compelling photo. You can likely derive the proof of work for other networks like Quora and Reddit and Twitch and so on. Successful social networks don’t pose trick questions at the start; it’s usually clear what they want from you.
…Thirst for status is potential energy. It is the lifeblood of a Status as a Service business. To succeed at carving out unique space in the market, social networks offer their own unique form of status token, earned through some distinctive proof of work.
…Most of these near clones have failed and will fail. The reason that matching the basic proof of work hurdle of a Status as a Service incumbent fails is that it generally duplicates the status game that already exists. By definition, if the proof of work is the same, you’re not really creating a new status ladder game, and so there isn’t a real compelling reason to switch when the new network really has no one in it.
…Why do social network effects reverse? Utility, the other axis by which I judge social networks, tends to be uncapped in value. It’s rare to describe a product or service as having become too useful. That is, it’s hard to over-serve on utility. The more people that accept a form of payment, the more useful it is, like Visa or Mastercard or Alipay. People don’t stop using a service because it’s too useful.
…Social network effects are different. If you’ve lived in New York City, you’ve likely seen, over and over, night clubs which are so hot for months suddenly go out of business just a short while later. Many types of social capital have qualities which render them fragile. Status relies on coordinated consensus to define the scarcity that determines its value. Consensus can shift in an instant. Recall the friend in Swingers, who, at every crowded LA party, quips, “This place is dead anyway.” Or recall the wise words of noted sociologist Groucho Marx: “I don’t care to belong to any club that will have me as a member.”
The Chicago Tylenol Murders were a series of poisoning deaths resulting from drug tampering in the Chicago metropolitan area in 1982. The victims had all taken Tylenol-branded acetaminophen capsules that had been laced with potassium cyanide. A total of seven people died in the original poisonings, with several more deaths in subsequent copycat crimes.
[Chapter 6 of the first book of The Book of the New Sun; it is famous for being an extended homage to Jorge Luis Borges in the figure of the blind librarian Ultan, who was gifted blindness right as he became librarian, and it also contains some of the most beautiful writing in the series.]
…“You are in close contact, then, with your opposite numbers in the city,” I said. The old man stroked his beard. “The closest, for we are they. This library is the city library, and the library of the House Absolute too, for that matter. And many others.” “Do you mean that the rabble of the city is permitted to enter the Citadel to use your library?” “No,” said Ultan. “I mean that the library itself extends beyond the walls of the Citadel. Nor, I think, is it the only institution here that does so. It is thus that the contents of our fortress are so much larger than their container.”
…His grip on my shoulder tightened. “We have books here bound in the hides of echidnes, krakens, and beasts so long extinct that those whose studies they are, are for the most part of the opinion that no trace of them survives unfossilized. We have books bound wholly in metals of unknown alloy, and books whose bindings are covered with thickset gems. We have books cased in perfumed woods shipped across the inconceivable gulf between creations—books doubly precious because no one on Urth can read them.”
“We have books whose papers are matted of plants from which spring curious alkaloids, so that the reader, in turning their pages, is taken unaware by bizarre fantasies and chimeric dreams. Books whose pages are not paper at all, but delicate wafers of white jade, ivory, and shell; books too whose leaves are the desiccated leaves of unknown plants. Books we have also that are not books at all to the eye: scrolls and tablets and recordings on a hundred different substances. There is a cube of crystal here—though I can no longer tell you where—no larger than the ball of your thumb that contains more books than the library itself does. Though a harlot might dangle it from one ear for an ornament, there are not volumes enough in the world to counterweight the other. All these I came to know and made safeguarding them my life’s devotion. For seven years I busied myself with that; and then, just when the pressing and superficial problems of preservation were disposed of, and we were on the point of beginning the first general survey of the library since its foundation, my eyes began to gutter in their sockets. He who had given all books into my keeping made me blind so that I should know in whose keeping the keepers stand.”
…“In every library, by ancient precept, is a room reserved for children. In it are kept bright picture books such as children delight in, and a few simple tales of wonder and adventure. Many children come to these rooms, and so long as they remain within their confines, no interest is taken in them.” He hesitated, and though I could discern no expression on his face, I received the impression that he feared what he was about to say might cause Cyby pain.
“From time to time, however, a librarian remarks a solitary child, still of tender years, who wanders from the children’s room and at last deserts it entirely. Such a child eventually discovers, on some low but obscure shelf, The Book of Gold. You have never seen this book, and you will never see it, being past the age at which it is met.”
“It must be very beautiful,” I said. “It is indeed. Unless my memory betrays me, the cover is of black buckram, considerably faded at the spine. Several of the signatures are coming out, and certain of the plates have been taken. But it is a remarkably lovely book. I wish that I might find it again, though all books are shut to me now. The child, as I said, in time discovers The Book of Gold. Then the librarians come—like vampires, some say, but others say like the fairy godparents at a christening. They speak to the child, and the child joins them. Henceforth he is in the library wherever he may be, and soon his parents know him no more.”
The Book of the New Sun (1980–1983) is a series of four science fantasy novels, a tetralogy or single four-volume novel written by the American author Gene Wolfe. It inaugurated the so-called "Solar Cycle" that Wolfe continued by setting other works in the same universe.
The Shadow of the Torturer is a science fiction novel by American writer Gene Wolfe, published by Simon & Schuster in May 1980. It is the first of four volumes in The Book of the New Sun which Wolfe had completed in draft before The Shadow of the Torturer was published. It relates the story of Severian, an apprentice Seeker for Truth and Penitence, from his youth through his expulsion from the guild and subsequent journey out of his home city of Nessus.
Gene Rodman Wolfe was an American science fiction and fantasy writer. He was noted for his dense, allusive prose as well as the strong influence of his Catholic faith. He was a prolific short story writer and novelist and won many science fiction and fantasy literary awards.
Jorge Francisco Isidoro Luis Borges Acevedo was an Argentine short-story writer, essayist, poet and translator, and a key figure in Spanish-language and universal literature. His best-known books, Ficciones (Fictions) and El Aleph, published in the 1940s, are compilations of short stories interconnected by common themes, including dreams, labyrinths, philosophers, libraries, mirrors, fictional writers, and mythology. Borges' works have contributed to philosophical literature and the fantasy genre, and have been considered by some critics to mark the beginning of the magic realist movement in 20th century Latin American literature. His late poems converse with such cultural figures as Spinoza, Camões, and Virgil.
Golden Kamuy is a Japanese manga series written and illustrated by Satoru Noda. The story follows a young Ainu girl named Asirpa and her quest to find a huge fortune of gold of the Ainu people, helped by Saichi Sugimoto, a veteran of the early twentieth century Russo-Japanese War. The Ainu language in the story is supervised by Hiroshi Nakagawa, an Ainu language linguist from Chiba University. The manga won the ninth Manga Taishō award in 2016.
Subscription page for the monthly gwern.net newsletter. There are monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives since December 2013.
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
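The paper's variant of Q-learning replaces a lookup table with a convolutional network, but the underlying update rule is the same. A minimal tabular sketch (the two-state toy environment and all parameters here are illustrative assumptions, not the paper's Atari setup):

```python
# Tabular Q-learning: move Q(s, a) toward the bootstrapped target
# r + gamma * max_a' Q(s', a'). DQN fits this same target with a
# convolutional network over raw pixels instead of a table.
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Toy problem: two states, two actions; action 1 in state 0 pays reward 1.
Q = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(100):
    q_learning_update(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0][1] > Q[0][0])  # the rewarded action's value has grown
```

The repeated update converges geometrically toward the target value of 1 for the rewarded action, while the never-taken action stays at its initial estimate.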
"Who's on First?" is a comedy routine made famous by American comedy duo Abbott and Costello. The premise of the sketch is that Abbott is identifying the players on a baseball team for Costello, but their names and nicknames can be interpreted as non-responsive answers to Costello's questions. For example, the first baseman is named "Who"; thus, the utterance "Who's on first" is ambiguous between the question and the answer.
Microsoft Research (MSR) is the research subsidiary of Microsoft. It was formed in 1991, with the intent to advance state-of-the-art computing and solve difficult world problems through technological innovation in collaboration with academic, government, and industry researchers. The Microsoft Research team employs more than 1,000 computer scientists, physicists, engineers, and mathematicians, including Turing Award winners, Fields Medal winners, MacArthur Fellows, and Dijkstra Prize winners.
Long-standing problems in standard scientific methodology have exploded as the “Replication Crisis”: the discovery that many results in fields as diverse as psychology, economics, medicine, biology, and sociology are in fact false or measured with large quantitative inaccuracy. I cover here a handful of the issues and publications on this large, important, and rapidly developing topic up to about 2013, at which point the Replication Crisis became too large a topic to cover more than cursorily.
The crisis is caused by methods & publishing procedures which interpret random noise as important results, far too small datasets, selective analysis by an analyst trying to reach expected/desired results, publication bias, poor implementation of existing best-practices, nontrivial levels of research fraud, software errors, philosophical beliefs among researchers that false positives are acceptable, neglect of known confounding like genetics, and skewed incentives (financial & professional) to publish ‘hot’ results.
Thus, any individual piece of research typically establishes little. Scientific validation comes not from small p-values, but from discovering a regular feature of the world which disinterested third parties can discover with straightforward research done independently on new data with new procedures—replication.
A Uniform Resource Identifier (URI) is a string that provides a unique address where a resource can be found. It looks and is similar to the more commonly known Uniform Resource Locator (URL) that is used to locate web sites and pages within those sites. A URL is a kind of URI—all URLs are URIs, but not all URIs are valid URLs. URLs typically identify web pages that are meant to be viewed in a browser. URIs identify any type of resource, often meant to be used by a computer but not always to be viewed in a browser. The term "URL" most often refers to URIs that name webpages accessed with HTTP, while the term "URI" is used to indicate more general resources like email addresses and local files. URIs often identify resources that are part of an ontology defined using the Web Ontology Language (OWL). For example, the class Agent or the property email in the Friend of a Friend vocabulary would each have an individual URI. URIs have mostly been replaced by Internationalized Resource Identifiers (IRI) which are identical to URIs except IRIs can handle character sets such as Kanji rather than being restricted to ASCII as URIs are.
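The URL/URI distinction can be seen with Python's standard-library `urllib.parse`, which parses generic URIs, not only browser-viewable URLs (the example addresses are placeholders):

```python
from urllib.parse import urlparse

# An http URL: a URI with a network location to "locate".
page = urlparse("https://example.com/faq?lang=en")
print(page.scheme, page.netloc, page.path)   # https example.com /faq

# A mailto: URI identifies a resource (an email address) but has
# no netloc component at all.
addr = urlparse("mailto:alice@example.org")
print(addr.scheme, addr.path)                # mailto alice@example.org
```

Both strings are valid URIs, but only the first is a URL in the everyday sense of something a browser navigates to.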
Punycode is a representation of Unicode with the limited ASCII character subset used for Internet hostnames. Using Punycode, host names containing Unicode characters are transcoded to a subset of ASCII consisting of letters, digits, and hyphens, which is called the Letter-Digit-Hyphen (LDH) subset. For example, München is encoded as Mnchen-3ya.
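Python's standard library ships a `punycode` codec implementing the RFC 3492 algorithm, which reproduces the München example, and an `idna` codec that adds the `xn--` prefix used in actual hostnames:

```python
# Raw Punycode: basic (ASCII) code points first, then a delimiter,
# then an encoding of the positions/values of the non-ASCII ones.
encoded = "München".encode("punycode")
print(encoded)                     # b'Mnchen-3ya'
print(encoded.decode("punycode"))  # München

# In hostnames, IDNA lowercases each label and prepends "xn--".
print("münchen.de".encode("idna"))  # b'xn--mnchen-3ya.de'
```

Note that the encoded form drops the non-ASCII characters from the visible part entirely; they are reconstructed from the suffix after the final hyphen.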
A query string is a part of a uniform resource locator (URL) that assigns values to specified parameters. A query string commonly includes fields added to a base URL by a Web browser or other client application, for example as part of an HTML form.
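Building and parsing a query string with the standard library, as a browser would for an HTML form submitted via GET (the field names are illustrative):

```python
from urllib.parse import urlencode, parse_qs

# Encode form fields into a query string...
query = urlencode({"q": "gene wolfe", "page": "2"})
print(query)     # q=gene+wolfe&page=2

# ...and parse it back. Values come back as lists because the same
# field name may legally appear more than once.
fields = parse_qs(query)
print(fields)    # {'q': ['gene wolfe'], 'page': ['2']}
```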
LZ77 and LZ78 are the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. They are also known as LZ1 and LZ2 respectively. These two algorithms form the basis for many variations including LZW, LZSS, LZMA and others. Besides their academic influence, these algorithms formed the basis of several ubiquitous compression schemes, including GIF and the DEFLATE algorithm used in PNG and ZIP.
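DEFLATE (LZ77 back-references followed by Huffman coding) is available directly via Python's `zlib` module; highly repetitive input compresses well because LZ77 replaces each repeat with a short reference into the already-seen data:

```python
import zlib

# A deliberately repetitive input to give LZ77 something to exploit.
data = b"to be or not to be, " * 20
compressed = zlib.compress(data, level=9)
print(len(data), len(compressed))  # compressed form is far smaller

# Lossless: decompression recovers the input exactly.
assert zlib.decompress(compressed) == data
```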
In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code proceeds by means of Huffman coding, an algorithm developed by David A. Huffman while he was a Sc.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes".
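A compact sketch of Huffman's algorithm using a heap: repeatedly merge the two least-frequent subtrees, prepending a '0' or '1' to the codes on each side (this is an illustrative implementation, not Huffman's original presentation, and it assumes at least two distinct symbols):

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Return a prefix-free symbol -> bitstring mapping for `text`."""
    # Each heap entry: [subtree frequency, tiebreaker, {symbol: code}].
    heap = [[freq, i, {sym: ""}]
            for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        # Merging two subtrees pushes every code in them one bit deeper.
        for codes, bit in ((lo[2], "0"), (hi[2], "1")):
            for sym in codes:
                codes[sym] = bit + codes[sym]
        heapq.heappush(heap, [lo[0] + hi[0], tiebreak, {**lo[2], **hi[2]}])
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)  # the most frequent symbol receives the shortest code
```

On `"aaaabbc"` the dominant symbol `a` gets a 1-bit code while `b` and `c` get 2-bit codes, and no code is a prefix of another — the optimality property Huffman proved.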
The replication crisis is, as of 2020, an ongoing methodological crisis in which it has been found that many scientific studies are difficult or impossible to replicate or reproduce. The replication crisis affects the social sciences and medicine most severely. The crisis has long-standing roots; the phrase was coined in the early 2010s as part of a growing awareness of the problem. The replication crisis represents an important body of research in the field of metascience.