N/A
2016-09-25–2021-01-04
finished
certainty: log
importance: 0
This is the October 2016 edition of the Gwern.net newsletter; previous, September 2016. This is a summary of the revision-history RSS feed, overlapping with Changelog & /
If you have not been receiving issues, please check your email account’s spam folder. (Gmail in particular has been flagging them as spam.)
Writings
Media
Links
Genetics:
Everything Is Heritable:
- “Association between polygenic risk scores for attention-deficit hyperactivity disorder and educational and cognitive outcomes in the general population”, Stergiakouli et al 2016
- “Educational attainment and personality are genetically intertwined”, Mottus et al 2016
- “Ultra-rare disruptive and damaging mutations influence educational attainment in the general population”, Ganna et al 2016
- “Genome-wide analyses of empathy and systemizing: heritability and correlates with sex, education, and psychiatric risk”, Warrier et al 2016a; “Genome-wide meta-analysis of cognitive empathy: heritability, and correlates with sex, neuropsychiatric conditions and brain anatomy”, Warrier et al 2016b
- “Personality Polygenes, Positive Affect, and Life Satisfaction”, Weiss et al 2016
Recent Evolution:
Politics/
- “The Long-Term Effects of Cash Assistance”, Price & Song 2016 (An unusually long-term followup to one of the old American basic income experiments. No large harmful effects… but no large benefits either, nothing remotely like we observe in the Third World BI/transfer experiments. And if money helps with social outcomes, it should’ve helped more in the 1970s than it does now, so I feel more pessimistic about Give Directly/YC’s USA BI experiment.)
- “Men’s status and reproductive success in 33 nonindustrial societies: Effects of subsistence, marriage system, and reproductive strategy”, von Rueden & Jaeggi 2016
- “The illusion of the perfect alibi: Establishing the base rate of non-offenders’ alibis”, Nieuwkamp et al 2016
AI:
- “Hybrid computing using a neural network with dynamic external memory”, Graves et al 2016 (blog); scaling to extremely large external memories: “Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes”, Rae et al 2016
- “Achieving Human Parity in Conversational Speech Recognition”, Xiong et al 2016
- “Video Pixel Networks”, Kalchbrenner et al 2016
- “Asynchronous Methods for Deep Reinforcement Learning (A3C)”, Mnih et al 2016
- “Deep Reinforcement Learning for Robotic Manipulation”, Gu et al 2016 (video; blog)
- “Uncertainty in Deep Learning”, Gal 2016 (using dropout to turn NNs into ensembles of Bayesian NNs, allowing extraction of posterior distributions and thus uncertainty of outputs, which helps active learning & reinforcement learning)
- “Sim-to-Real Robot Learning from Pixels with Progressive Nets”, Rusu et al 2016
- Training A3C to solve Atari Pong in <4 minutes on a supercomputer through brute parallelism
- “Image Synthesis from Yahoo’s open_nsfw” (hilarious)
- Active learning demo: interactively drag and drop photos to train a CNN+random forest to binary classify along some trait
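The dropout idea summarized in the Gal 2016 entry above can be illustrated with a minimal sketch. This is not Gal's code: the network is a toy with random, untrained weights (`w1`, `w2` are hypothetical), and the only point is the mechanism, which is to leave dropout switched on at prediction time, run many stochastic forward passes, and read the spread of the outputs as an uncertainty estimate.

```python
import math
import random
import statistics


def forward(x, w1, w2, rng, drop_p=0.5):
    """One stochastic forward pass of a tiny 1-hidden-layer net, dropout left ON."""
    hidden = []
    for w, b in w1:
        h = math.tanh(w * x + b)
        # Drop each hidden unit with probability drop_p; rescale the survivors
        # (inverted dropout) so the expected activation is unchanged.
        keep = rng.random() >= drop_p
        hidden.append(h / (1 - drop_p) if keep else 0.0)
    return sum(h * v for h, v in zip(hidden, w2))


def mc_dropout_predict(x, w1, w2, rng, n_samples=200):
    """Average many dropout passes; the standard deviation is the uncertainty proxy."""
    samples = [forward(x, w1, w2, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


rng = random.Random(0)
# Hypothetical untrained weights, for illustration only.
w1 = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(32)]
w2 = [rng.gauss(0, 1) / 32 for _ in range(32)]
mean, std = mc_dropout_predict(0.3, w1, w2, rng)
```

Each pass samples a different thinned sub-network, so the collection of passes behaves like an ensemble; an active learner or RL agent could then prefer inputs where `std` is large.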
Statistics/
- “Active Learning Literature Survey”, Settles 2010

  You can see this as a way to most economically increase your dataset size by only labeling the most valuable instances; it can also be used to improve dataset quality by targeting instead the errors a model makes on a noisy corpus for examination by the oracle; and finally, it can be seen as a demonstration of the advantages of reinforcement learning over simple supervised learning over heaps of data: given the curse of dimensionality, most data is useless for training a model because it is so redundant and already a solved problem, and the data the model needs to improve its performance is a needle in a haystack. So by giving a model RL capabilities, you improve its supervised/inference performance! The way I put this: “tool AIs want to be agent AIs”.
- “We Gave Four Good Pollsters the Same Raw Data. They Had Four Different Results” (random error vs systematic error)
- “Close but no Nobel: the scientists who never won; Archives reveal the most-nominated researchers who missed out on a Nobel Prize” (Nobels have considerable measurement error)
- “Do scholars follow Betteridge’s Law? The use of questions in journal article titles”, Cook & Plourde 2016
- Scatterplots work much better than several other forms of visualizing data for understanding correlations, Kay & Heer 2015 (also a nice demonstration of fitting progressively more complex & realistic Bayesian models)
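The active-learning point in the Settles 2010 note above (label only the most valuable instances) can be sketched with a toy example. Everything here is an assumption made for illustration: a 1-D pool, a hidden threshold of 0.62 standing in for the oracle, and a learner that estimates the boundary as the midpoint between the largest labeled 0 and the smallest labeled 1. Uncertainty sampling queries the pool point closest to the current boundary estimate; the baseline picks points at random under the same label budget.

```python
import random

TRUE_THRESHOLD = 0.62  # hidden from the learner; hypothetical


def oracle(x):
    """Hypothetical labeler: 1 if x lies above the hidden threshold, else 0."""
    return int(x > TRUE_THRESHOLD)


def fit_threshold(labeled):
    """Boundary estimate: midpoint between the largest labeled 0 and smallest labeled 1."""
    lo = max((x for x, y in labeled if y == 0), default=0.0)
    hi = min((x for x, y in labeled if y == 1), default=1.0)
    return (lo + hi) / 2


def run(pool, budget, strategy, rng):
    """Spend `budget` labels using the given query strategy; return the final estimate."""
    unlabeled = list(pool)
    labeled = []
    for _ in range(budget):
        if strategy == "uncertainty":
            # Query the point the current model is least sure about.
            est = fit_threshold(labeled)
            pick = min(unlabeled, key=lambda x: abs(x - est))
        else:
            pick = rng.choice(unlabeled)
        unlabeled.remove(pick)
        labeled.append((pick, oracle(pick)))
    return fit_threshold(labeled)


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]
active_estimate = run(pool, 10, "uncertainty", rng)
random_estimate = run(pool, 10, "random", rng)
active_error = abs(active_estimate - TRUE_THRESHOLD)
random_error = abs(random_estimate - TRUE_THRESHOLD)
```

With this setup the uncertainty-driven learner is effectively doing a binary search over the pool, so each label roughly halves the region where the boundary could be, while most randomly chosen labels land far from the boundary and tell the model nothing new.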
Psychology/
- “The negative Flynn Effect: A systematic literature review”, Dutton et al 2016
- “Safe landing strategies during a fall: systematic review and meta-analysis”, Moon & Sosnoff 2016
- “From Terman to Today: A Century of Findings on Intellectual Precocity”, Lubinski 2016
- Some early rapamycin anti-aging results from the dog people: improvement in dog heart health after 10 weeks of use.
- “Stereotype (In)Accuracy in Perceptions of Groups and Individuals”, Jussim et al 2015
- “CMV Is a Greater Threat to Infants Than Zika, but Far Less Often Discussed” (it would not surprise me if CMV was responsible for my own hearing-impairment)
- “Does brain creatine content rely on exogenous creatine in healthy youth? A proof-of-principle study”, Merege-Filho et al 2016 (null)
- “Functional MRI in awake dogs predicts suitability for assistance work”, Berns et al 2016
- Home steam distillation of 99% pure nepetalactone from catnip leaves (0.03% yield compared to theoretical max of 0.3%, so can convert 0.45kg of leaves to 143mg nepetalactone)
Technology:
- “Sam Altman’s Manifest Destiny: Is the head of Y Combinator fixing the world, or trying to take over Silicon Valley?”
- “The Joinery”: animated explanations of traditional nail-less Japanese woodworking joints (background)
- “Uber’s Ad-Toting Drones Are Heckling Drivers Stuck in Traffic: Forget billboards-motorists now have ads buzzing a few feet above their windshields” (possibly the most cyberpunk thing I’ve seen since VR headsets)
- “Keystroke Recognition Using WiFi Signals”, Ali et al 2015
- “APEX: Automatic Programming Assignment Error Explanation”, Kim et al 2016
Economics:
- Abuse of ‘arbitration’ clauses in international trade treaties to escape criminal liability
- “The View from Above: Applications of Satellite Data in Economics”, Donaldson & Storeygard 2016
- “The Dizzying Grandeur of 21st-Century Agriculture”
- “Adam Smith, Watch Prices, and the Industrial Revolution”, Kelly & Grada 2016 (commentary)
- “My First Gulfstream”
- “Everything you need to know about whether money makes you happy”
- “Young Rural Women in India Chase Big-City Dreams: Experiments like one in Bangalore, luring migrants to fill factory jobs, collide with an old way of life that keeps women and girls in seclusion until an arranged marriage” (3500 years of female slavery shrinking under industrialization and smartphones: “all fixed, fast-frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new-formed ones become antiquated before they can ossify. All that is solid melts into air, all that is holy is profaned…”)
Books
Nonfiction:
- Montaillou: The Promised Land of Error, Le Roy Ladurie 1975
- The Riddle of the Labyrinth: The Quest to Crack an Ancient Code, Fox (review)
- Confessions of an English Opium-Eater, by Thomas De Quincey (review)
Fiction:
- Tales of Ise, Anonymous (review)
Music
Touhou:
- “Children on their Birthdays” (Asahi feat. Emaru; Lucky 7 {R13}) [acoustic]
- “breathless” (NAGI☆ feat. 美歌; Parallel Cross {C90}) [vocal]
Doujin:
- “Spring ephemeral” (Morrigan feat. Lily; Spring ephemeral {M3 33}) [acoustic]
- “Heure bleue” (RANDO:; RADIAL DIVERSE SYSTEM FIFTEENTH ANNIVERSARY {C90}) [electronic]
- “Hall of Mirrors” (sta; RADIAL DIVERSE SYSTEM FIFTEENTH ANNIVERSARY {C90}) [rock]
- “Parabola” (tigerlily; RADIAL DIVERSE SYSTEM FIFTEENTH ANNIVERSARY {C90}) [post-rock]
- “16777215” (b; RADIAL DIVERSE SYSTEM FIFTEENTH ANNIVERSARY {C90}) [electronic]
- “てぃんさぐぬ花” (Togo Project feat. Junko Wada(BE THE VOICE); AD:HOUSE 5 {C90}) [folk/house/acoustic]
- “Starlight,tonight” (Hommarju feat. Yukacco; Another Wor1d {C90}) [vocal]
- “Weigh the anchor” (Harito; ‘Sun Flowers’ {C90}) [house/electronic]
- “Epilogue” (n-buna; Walking on the Moon {2016}) [classical]
- “それは、本当に救済なのか” (Morrigan; from080723 DISC EDITION {C90}) [electronic/ambient/vocal]
Link Bibliography
Bibliography of page links in reading order (with annotations when available):
“September 2016 News”, (2016-08-18):
N/A
“Changelog”, (2013-09-15):
This page is a changelog for Gwern.net: a monthly reverse chronological list of recent major writings/changes/additions. Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/ subreddit submissions, I write up reasonably-interesting changes and send it out to the mailing list in addition to a compilation of links & reviews (archives).
Gwern “/r/gwern subreddit”, (2018-10-01):
A subreddit for posting links of interest and also for announcing updates to gwern.net (which can be used as a RSS feed). Submissions are categorized similar to the monthly newsletter and typically will be collated there.
“Genetic correlation”, (2020-12-22):
In multivariate quantitative genetics, a genetic correlation is the proportion of variance that two traits share due to genetic causes, the correlation between the genetic influences on a trait and the genetic influences on a different trait estimating the degree of pleiotropy or causal overlap. A genetic correlation of 0 implies that the genetic effects on one trait are independent of the other, while a correlation of 1 implies that all of the genetic influences on the two traits are identical. The bivariate genetic correlation can be generalized to inferring genetic latent variable factors across > 2 traits using factor analysis. Genetic correlation models were introduced into behavioral genetics in the 1970s–1980s.
http://ije.oxfordjournals.org/content/early/2016/09/28/ije.dyw216.full
“Educational attainment and personality are genetically intertwined”, (2016-09-28):
It is possible that heritable variance in personality characteristics does not reflect (only) genetic and biological processes specific to personality per se. We tested the possibility that Five-Factor Model personality domains and facets, as rated by people themselves and their knowledgeable informants, reflect polygenic influences that have been previously associated with educational attainment. In a sample of over 3,000 adult Estonians, polygenic scores for educational attainment, based on small contributions from more than 150,000 genetic variants, were correlated with various personality traits, mostly from the Neuroticism and Openness domains. The correlations of personality characteristics with educational attainment-related polygenic influences reflected almost entirely their correlations with phenotypic educational attainment. Structural equation modeling of the associations between polygenic risk, personality (a weighed aggregate of education-related facets) and educational attainment lent relatively strongest support to the possibility of educational attainment mediating (explaining) some of the heritable variance in personality traits.
“Genome-wide analyses of empathy and systemizing: heritability and correlates with sex, education, and psychiatric risk”, (2016-04-29):
Empathy is the drive to identify the mental states of others and respond to these with an appropriate emotion. Systemizing is the drive to analyse or build lawful systems. Difficulties in empathy have been identified in different psychiatric conditions including autism and schizophrenia. In this study, we conducted genome-wide association studies of empathy and systemizing using the Empathy Quotient (EQ) (n = 46,861) and the Systemizing Quotient-Revised (SQ-R) (n = 51,564) in participants from 23andMe, Inc. We confirmed significant sex-differences in performance on both tasks, with a male advantage on the SQ-R and female advantage on the EQ. We found highly significant heritability explained by single nucleotide polymorphisms (SNPs) for both the traits (EQ: 0.11±0.014; p = 1.7 × 10⁻¹⁴ and SQ-R: 0.12±0.012; p = 1.2 × 10⁻²⁰) and these were similar for males and females. However, genes with higher expression in the male brain appear to contribute to the male advantage for the SQ-R. Finally, we identified significant genetic correlations between high score for empathy and risk for schizophrenia (p = 2.5 × 10⁻⁵), and correlations between high score for systemizing and higher educational attainment (p = 5 × 10⁻⁴). These results shed light on the genetic contribution to individual differences in empathy and systemizing, two major cognitive functions of the human brain.
“Genome-wide meta-analysis of cognitive empathy: heritability, and correlates with sex, neuropsychiatric conditions and brain anatomy”, (2016-10-19):
We conducted a genome-wide meta-analysis of cognitive empathy using the ‘Reading the Mind in the Eyes’ Test (Eyes Test) in 88,056 Caucasian research participants (44,574 females and 43,482 males) from 23andMe Inc., and an additional 1,497 Caucasian participants (891 females and 606 males) from the Brisbane Longitudinal Twin Study (BLTS). We confirmed a female advantage on the Eyes Test (Cohen’s d = 0.21, p < 0.001), and identified a locus in 3p26.1 that is associated with scores on the Eyes Test in females (rs7641347, p_meta = 1.57 × 10⁻⁸). Common single nucleotide polymorphisms (SNPs) explained 20% of the twin heritability and 5.6% (±0.76; p = 1.72 × 10⁻¹³) of the total trait variance in both sexes. Finally, we identified significant genetic correlation between the Eyes Test and measures of empathy (the Empathy Quotient), openness (NEO-Five Factor Inventory), and different measures of educational attainment and cognitive aptitude, and show that the genetic determinants of striatal volumes (caudate nucleus, putamen, and nucleus accumbens) are positively correlated with the genetic determinants of performance on the Eyes Test.
https://economics.stanford.edu/sites/default/files/djprice_jmp.pdf
“Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes”, (2016-10-27):
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows, limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs faster and with less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories. As well, we show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
“Achieving Human Parity in Conversational Speech Recognition”, (2016-10-17):
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcribers is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state of the art, and edges past the human benchmark, achieving error rates of 5.8% and 11.0%, respectively. The key to our system’s performance is the use of various convolutional and LSTM acoustic model architectures, combined with a novel spatial smoothing method and lattice-free MMI acoustic training, multiple recurrent neural network language modeling approaches, and a systematic use of system combination.
“Video Pixel Networks”, (2016-10-03):
We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth. The VPN also produces detailed samples on the action-conditional Robotic Pushing benchmark and generalizes to the motion of novel objects.
“Asynchronous Methods for Deep Reinforcement Learning”, (2016-02-04):
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
“Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates”, (2016-10-03):
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted to simulated settings and relatively simple tasks, due to their apparent high sample complexity. In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots. We demonstrate that the training times can be further reduced by parallelizing the algorithm across multiple robots which pool their policy updates asynchronously. Our experimental evaluation shows that our method can learn a variety of 3D manipulation skills in simulation and a complex door opening skill on real robots without any prior demonstrations or manually designed representations.
https://research.googleblog.com/2016/10/how-robots-can-acquire-new-skills-from.html
“Sim-to-Real Robot Learning from Pixels with Progressive Nets”, (2016-10-13):
Applying end-to-end learning to solve complex, interactive, pixel-driven control tasks on a robot is an unsolved problem. Deep Reinforcement Learning algorithms are too slow to achieve performance on a real robot, but their potential has been demonstrated in simulated environments. We propose using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world. The progressive net approach is a general framework that enables reuse of everything from low-level visual features to high-level policies for transfer to new tasks, enabling a compositional, yet simple, approach to building complex skills. We present an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap. Unlike other proposed approaches, our real-world experiments demonstrate successful task learning from raw visual input on a fully actuated robot manipulator. Moreover, rather than relying on model-based trajectory optimisation, the task learning is accomplished using only deep reinforcement learning and sparse rewards.
http://www.allinea.com/blog/201610/deep-learning-episode-4-supercomputer-vs-pong-ii
“Image Synthesis from Yahoo's open_nsfw”, (2016):
Yahoo’s recently open sourced neural network, open_nsfw, is a fine tuned Residual Network which scores images on a scale of 0 to 1 on its suitability for use in the workplace…What makes an image NSFW, according to Yahoo? I explore this question with a clever new visualization technique by Nguyen et al…Like Google’s Deep Dream, this visualization trick works by maximally activating certain neurons of the classifier. Unlike deep dream, we optimize these activations by performing descent on a parameterization of the manifold of natural images. [Demonstration of an unusual use of backpropagation to ‘optimize’ a neural network: instead of taking a piece of data to input to a neural network and then updating the neural network to change its output slightly towards some desired output (such as a correct classification), one can instead update the input so as to make the neural net output slightly more towards the desired output. When using an image classification neural network, this reversed form of optimization will ‘hallucinate’ or ‘edit’ the ‘input’ to make it more like a particular class of images. In this case, a porn/NSFW-detecting NN is reversed so as to make images more (or less) “porn-like”. Goh runs this process on various images like landscapes, musical bands, or empty images; the maximally/minimally porn-like images are disturbing, hilarious, and undeniably pornographic in some sense.]
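The “update the input instead of the network” trick described in the annotation above can be shown on a deliberately tiny model. This is a sketch under stated assumptions, not the method used in the linked post: the classifier is a hypothetical fixed logistic unit with made-up weights `w` and bias `b`, and the optimization is plain gradient ascent on the raw input rather than on a parameterization of the manifold of natural images.

```python
import math


def score(x, w, b):
    """Fixed, hypothetical classifier: sigmoid(w . x + b), a value in (0, 1)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))


def ascend_input(x, w, b, steps=100, lr=0.5):
    """Gradient ascent on the input; the weights are never changed."""
    x = list(x)
    for _ in range(steps):
        s = score(x, w, b)
        # d sigmoid(w . x + b) / d x_i = s * (1 - s) * w_i
        x = [xi + lr * s * (1 - s) * wi for xi, wi in zip(x, w)]
    return x


w, b = [1.5, -2.0, 0.5], -1.0   # made-up parameters, for illustration only
x0 = [0.0, 0.0, 0.0]            # the "empty image"
x1 = ascend_input(x0, w, b)
before, after = score(x0, w, b), score(x1, w, b)
```

Flipping the sign of the update would instead push the input toward the opposite class, which corresponds to the “minimally” class-like images mentioned in the annotation.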
“Why Tool AIs Want to Be Agent AIs”, (2016-09-07):
Autonomous AI systems (Agent AIs) trained using reinforcement learning can do harm when they take wrong actions, especially superintelligent Agent AIs. One solution would be to eliminate their agency by not giving AIs the ability to take actions, confining them to purely informational or inferential tasks such as classification or prediction (Tool AIs), and have all actions be approved & executed by humans, giving equivalently superintelligent results without the risk.
I argue that this is not an effective solution for two major reasons. First, because Agent AIs will by definition be better at actions than Tool AIs, giving an economic advantage. Secondly, because Agent AIs will be better at inference & learning than Tool AIs, and this is inherently due to their greater agency: the same algorithms which learn how to perform actions can be used to select important datapoints to learn inference over, how long to learn, how to more efficiently execute inference, how to design themselves, how to optimize hyperparameters, how to make use of external resources such as long-term memories or external software or large databases or the Internet, and how best to acquire new data. All of these actions will result in Agent AIs more intelligent than Tool AIs, in addition to their greater economic competitiveness. Thus, Tool AIs will be inferior to Agent AIs in both actions and intelligence, implying use of Tool AIs is a even more highly unstable equilibrium than previously argued, as users of Agent AIs will be able to outcompete them on two dimensions (and not just one).
http://andrewgelman.com/2016/09/21/what-has-happened-down-here-is-the-winds-have-changed/
http://www.nature.com/news/close-but-no-nobel-the-scientists-who-never-won-1.20781
https://idl.cs.washington.edu/files/2015-BeyondWebersLaw-InfoVis.pdf
http://edition.cnn.com/2016/10/06/health/rapamycin-dog-live-longer/index.html
http://www.nytimes.com/2016/10/25/health/cmv-cytomegalovirus-pregnancy.html
“Functional MRI in awake dogs predicts suitability for assistance work”, (2016-10-12):
The overall goal of this work was to measure the efficacy of fMRI for predicting whether a dog would be a successful service dog. The training and imaging were performed in 50 dogs entering advanced training at 17-21 months of age. FMRI responses were measured while each dog observed hand signals indicating either reward or no reward and given by both a familiar handler and a stranger. 49 dogs successfully completed fMRI training and scanning. Of these, 33 eventually completed service training and were matched with a person, while 10 were released for behavioral reasons. Using anatomically defined regions-of-interest in the ventral caudate, amygdala, and visual cortex, we developed a classifier based on the dogs’ outcomes. We found that responses in the stranger condition were sufficient to develop an accurate brain-based classifier. On all data, the classifier had a positive predictive value of 96% with 10% false positives. The area under the receiver operating characteristic curve was 0.90 (0.79 with 4-fold cross-validation, p = 0.02), indicating a significant diagnostic capability. Within the stranger condition, the differential response to [reward – no reward] in ventral caudate was positively correlated with a successful outcome, while the differential response in the amygdala was negatively correlated to outcome. These results show that successful service dogs transfer knowledge to strangers as indexed by ventral caudate activity without excessive arousal as measured in the amygdala.
http://www.instructables.com/id/DIY-Kitty-Crack%3a--ultra-potent-catnip-extract/?ALLSTEPS
http://www.newyorker.com/magazine/2016/10/10/sam-altmans-manifest-destiny
http://www.spoon-tamago.com/2016/10/04/animated-gifs-illustrating-the-art-of-japanese-wood-joinery/
http://researcher.watson.ibm.com/researcher/files/us-liup/apex.pdf
http://www.nytimes.com/interactive/2016/10/09/magazine/big-food-photo-essay.html
http://econlog.econlib.org/archives/2016/10/what_do_crimina.html
http://www.nytimes.com/2016/09/25/world/asia/bangalore-india-women-factories.html
“Montaillou (book)”, (2020-12-28):
Montaillou is a book by the French historian Emmanuel Le Roy Ladurie first published in 1975. It was first translated into English in 1978 by Barbara Bray, and has been subtitled The Promised Land of Error and Cathars and Catholics in a French Village.
“Emmanuel Le Roy Ladurie”, (2020-12-28):
Emmanuel Bernard Le Roy Ladurie is a French historian whose work is mainly focused upon Languedoc in the Ancien Régime, particularly the history of the peasantry. One of the leading historians of France, Le Roy Ladurie has been called the "standard-bearer" of the third generation of the Annales school and the "rock star of the medievalists", noted for his work in social history.
https://www.amazon.com/Riddle-Labyrinth-Quest-Crack-Ancient/dp/0062228862
“Confessions of an English Opium-Eater”, (2020-12-28):
Confessions of an English Opium-Eater (1821) is an autobiographical account written by Thomas De Quincey, about his laudanum addiction and its effect on his life. The Confessions was "the first major work De Quincey published and the one which won him fame almost overnight..."
/Book-reviews#confessions-of-an-english-opium-eater-quincey-2003
“The Tales of Ise”, (2020-12-22):
The Tales of Ise is a Japanese uta monogatari, or collection of waka poems and associated narratives, dating from the Heian period. The current version collects 125 sections, with each combining poems and prose, giving a total of 209 poems in most versions.
https://www.dropbox.com/s/fslrukxwwz3gapc/b-radialdiversesystemfifteenthanniversary-16777215.ogg?
https://www.dropbox.com/s/co1x45wjnmfod2h/nbuna-walkingonthemoon-epilogue.ogg
“Gwern.net newsletter (Substack subscription page)”, (2013-12-01):
Subscription page for the monthly gwern.net newsletter. There are monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives since December 2013.
“Gwern.net newsletter archives”, (2013-12-01):
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews.)
“Reinforcement learning”, (2020-12-22):
Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.