“The Mastermind” (In-depth series about international criminal mastermind Paul Le Roux, who ran his group online while ordering hits on employees using mercenaries and smuggling drugs, until he is captured and becomes a USG asset to entrap his employees and assist who knows what covert operations. Oh, and he created TrueCrypt. His story is even wilder than Ross Ulbricht’s.)
“CORDYCEPS: Too clever for their own good” (SF/horror humor novella on overthinking things; early chapters are best in exploring acausal & amnesiac reasoning, somewhat like Utsuro no Hako to Zero no Maria, but goes on perhaps a bit too long and explains too much)
Significant Digits (Harry Potter and the Methods of Rationality: sequel intended to conclude the story; more action-focused with a literary bent, and much less didactic/“author tract” and Harry-focused than MoR. Highly recommended for anyone who liked MoR.)
Father Goose (peculiar Cary Grant WWII comedy-drama; it’s unusual to see Grant cast as a misanthropic drunkard and the movie can’t quite decide whether to be deadly serious or comedic, but most of the comedic beats are highly predictable)
Death Parade (expansion of Death Billiards, as an episodic series; stories remain a bit heavily focused on suicide and murder, but while the dark background story arc ultimately ends in a whimper, the main story arc ends in an emotionally satisfying way)
This page is a changelog for Gwern.net: a monthly reverse chronological list of recent major writings/changes/additions.
Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/Gwern subreddit submissions, I write up reasonably-interesting changes and send it out to the mailing list in addition to a compilation of links & reviews (archives).
A subreddit for posting links of interest and also for announcing updates to gwern.net (which can be used as an RSS feed). Submissions are categorized similarly to the monthly newsletter and typically will be collated there.
Genome-wide complex trait analysis (GCTA), or genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics which quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants. This is done by directly quantifying the chance genetic similarity of unrelated individuals and comparing it to their measured similarity on a trait; if two unrelated individuals are relatively similar genetically and also have similar trait measurements, then the measured genetics are likely to causally influence that trait, and the correlation can to some degree tell how much. This can be illustrated by plotting the squared pairwise trait differences between individuals against their estimated degree of relatedness. The GCTA framework can be applied in a variety of settings. For example, it can be used to examine changes in heritability over aging and development. It can also be extended to analyse bivariate genetic correlations between traits. There is an ongoing debate about whether GCTA generates reliable or stable estimates of heritability when used on current SNP data. The method is based on the outdated and false dichotomy of genes versus the environment. It also suffers from serious methodological weaknesses, such as susceptibility to population stratification.
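The plot of squared pairwise trait differences against relatedness can be turned into a rough heritability estimate by fitting a line through it. The sketch below does this with Haseman-Elston-style regression, which is a simpler relative of the REML fit and not GCTA's actual algorithm. It assumes phenotypes standardized to variance 1, so that the expected squared difference for a pair is roughly 2 − 2·h²·A, where A is the pair's estimated relatedness; the function names and the toy numbers are illustrative only.

```python
from itertools import combinations


def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def he_regression(trait, relatedness):
    """Estimate h^2 from squared pairwise trait differences.

    trait:        list of phenotypes, assumed standardized (variance 1)
    relatedness:  square matrix (list of lists); relatedness[i][j] is the
                  estimated genetic relatedness of individuals i and j
    Returns (h2_estimate, intercept). Under the assumed model the slope
    is -2 * h^2 and the intercept is about 2.
    """
    pairs = list(combinations(range(len(trait)), 2))
    x = [relatedness[i][j] for i, j in pairs]
    y = [(trait[i] - trait[j]) ** 2 for i, j in pairs]
    slope, intercept = ols_slope_intercept(x, y)
    return -slope / 2, intercept


if __name__ == "__main__":
    # Toy, made-up inputs purely to show the call shape.
    toy_trait = [-1.2, 0.3, 0.9, 0.0]
    toy_grm = [
        [1.00, 0.02, -0.01, 0.00],
        [0.02, 1.00, 0.03, -0.02],
        [-0.01, 0.03, 1.00, 0.01],
        [0.00, -0.02, 0.01, 1.00],
    ]
    print(he_regression(toy_trait, toy_grm))
```

With only a handful of people the estimate is meaningless; among unrelated individuals the relatedness values vary so little that thousands of individuals are needed before the slope is estimated with any precision.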
I analyze an A/B test from a mail-order company of two different kinds of box packaging from a Bayesian decision-theory perspective, balancing posterior probability of improvements & greater profit against the cost of packaging & risk of worse results, finding that as the company’s analysis suggested, the new box is unlikely to be sufficiently better than the old. Calculating expected values of information shows that it is not worth experimenting on further, and that such fixed-sample trials are unlikely to ever be cost-effective for packaging improvements. However, adaptive experiments may be worthwhile.
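The trade-off described here (posterior probability of improvement and extra profit versus packaging cost) can be sketched with a small Monte Carlo calculation. This is a minimal illustration under stated assumptions, not the analysis from the essay: it assumes a binary outcome per customer, uniform Beta(1, 1) priors on each arm's rate, and a fixed extra cost per customer for the new box. Every number and name in it is hypothetical.

```python
import random


def simulate_decision(conv_old, n_old, conv_new, n_new,
                      profit_per_conversion, extra_cost_per_customer,
                      future_customers, draws=5000, seed=0):
    """Return (expected net gain of switching, P(new rate > old rate)).

    Posterior for each arm is Beta(1 + successes, 1 + failures),
    i.e. a uniform prior updated with the observed counts.
    """
    rng = random.Random(seed)
    total_gain = 0.0
    new_better = 0
    for _ in range(draws):
        p_old = rng.betavariate(1 + conv_old, 1 + n_old - conv_old)
        p_new = rng.betavariate(1 + conv_new, 1 + n_new - conv_new)
        if p_new > p_old:
            new_better += 1
        # Net gain per customer: extra profit from the rate change
        # minus the extra packaging cost.
        per_customer = (p_new - p_old) * profit_per_conversion \
            - extra_cost_per_customer
        total_gain += future_customers * per_customer
    return total_gain / draws, new_better / draws


if __name__ == "__main__":
    # Hypothetical counts: both boxes perform about the same.
    gain, prob = simulate_decision(
        conv_old=100, n_old=1000, conv_new=104, n_new=1000,
        profit_per_conversion=10.0, extra_cost_per_customer=0.5,
        future_customers=1000)
    print(f"expected net gain: {gain:.0f}, P(new better): {prob:.2f}")
```

The useful property of framing it this way is that a modest posterior probability of improvement can still yield a negative expected gain once the per-unit cost is subtracted, which is the kind of conclusion the paragraph above reports.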
“Genome-wide association study identifies 74 loci associated with educational attainment”, Aysu Okbay, Jonathan P. Beauchamp, Mark Alan Fontana, James J. Lee, Tune H. Pers, Cornelius A. Rietveld, Patrick Turley, Guo-Bo Chen, Valur Emilsson, S. Fleur W. Meddens, Sven Oskarsson, Joseph K. Pickrell, Kevin Thom, Pascal Timshel, Ronald de Vlaming, Abdel Abdellaoui, Tarunveer S. Ahluwalia, Jonas Bacelis, Clemens Baumbach, Gyda Bjornsdottir, Johannes H. Brandsma, Maria Pina Concas, Jaime Derringer, Nicholas A. Furlotte, Tessel E. Galesloot, Giorgia Girotto, Richa Gupta, Leanne M. Hall, Sarah E. Harris, Edith Hofer, Momoko Horikoshi, Jennifer E. Huffman, Kadri Kaasik, Ioanna P. Kalafati, Robert Karlsson, Augustine Kong, Jari Lahti, Sven J. van der Lee, Christiaan de Leeuw, Penelope A. Lind, Karl-Oskar Lindgren, Tian Liu, Massimo Mangino, Jonathan Marten, Evelin Mihailov, Michael B. Miller, Peter J. van der Most, Christopher Oldmeadow, Antony Payton, Natalia Pervjakova, Wouter J. Peyrot, Yong Qian, Olli Raitakari, Rico Rueedi, Erika Salvi, Börge Schmidt, Katharina E. Schraut, Jianxin Shi, Albert V. Smith, Raymond A. Poot, Beate St Pourcain, Alexander Teumer, Gudmar Thorleifsson, Niek Verweij, Dragana Vuckovic, Juergen Wellmann, Harm-Jan Westra, Jingyun Yang, Wei Zhao, Zhihong Zhu, Behrooz Z. Alizadeh, Najaf Amin, Andrew Bakshi, Sebastian E. Baumeister, Ginevra Biino, Klaus Bønnelykke, Patricia A. Boyle, Harry Campbell, Francesco P. Cappuccio, Gail Davies, Jan-Emmanuel De Neve, Panos Deloukas, Ilja Demuth, Jun Ding, Peter Eibich, Lewin Eisele, Niina Eklund, David M. Evans, Jessica D. Faul, Mary F. Feitosa, Andreas J. Forstner, Ilaria Gandin, Bjarni Gunnarsson, Bjarni V. Halldórsson, Tamara B. Harris, Andrew C. Heath, Lynne J. Hocking, Elizabeth G. Holliday, Georg Homuth, Michael A. Horan, Jouke-Jan Hottenga, Philip L. de Jager, Peter K. Joshi, Astanand Jugessur, Marika A. Kaakinen, Mika Kähönen, Stavroula Kanoni, Liisa Keltigangas-Järvinen, Lambertus A. L. M. Kiemeney, Ivana Kolcic, Seppo Koskinen, Aldi T. Kraja, Martin Kroh, Zoltan Kutalik, Antti Latvala, Lenore J. Launer, Maël P. Lebreton, Douglas F. Levinson, Paul Lichtenstein, Peter Lichtner, David C. M. Liewald, LifeLines Cohort Study, Anu Loukola, Pamela A. Madden, Reedik Mägi, Tomi Mäki-Opas, Riccardo E. Marioni, Pedro Marques-Vidal, Gerardus A. Meddens, George McMahon, Christa Meisinger, Thomas Meitinger, Yusplitri Milaneschi, Lili Milani, Grant W. Montgomery, Ronny Myhre, Christopher P. Nelson, Dale R. Nyholt, William E. R. Ollier, Aarno Palotie, Lavinia Paternoster, Nancy L. Pedersen, Katja E. Petrovic, David J. Porteous, Katri Räikkönen, Susan M. Ring, Antonietta Robino, Olga Rostapshova, Igor Rudan, Aldo Rustichini, Veikko Salomaa, Alan R. Sanders, Antti-Pekka Sarin, Helena Schmidt, Rodney J. Scott, Blair H. Smith, Jennifer A. Smith, Jan A. Staessen, Elisabeth Steinhagen-Thiessen, Konstantin Strauch, Antonio Terracciano, Martin D. Tobin, Sheila Ulivi, Simona Vaccargiu, Lydia Quaye, Frank J. A. van Rooij, Cristina Venturini, Anna A. E. Vinkhuyzen, Uwe Völker, Henry Völzke, Judith M. Vonk, Diego Vozzi, Johannes Waage, Erin B. Ware, Gonneke Willemsen, John R. Attia, David A. Bennett, Klaus Berger, Lars Bertram, Hans Bisgaard, Dorret I. Boomsma, Ingrid B. Borecki, Ute Bültmann, Christopher F. Chabris, Francesco Cucca, Daniele Cusi, Ian J. Deary, George V. Dedoussis, Cornelia M. van Duijn, Johan G. Eriksson, Barbara Franke, Lude Franke, Paolo Gasparini, Pablo V. Gejman, Christian Gieger, Hans-Jörgen Grabe, Jacob Gratten, Patrick J. F. Groenen, Vilmundur Gudnason, Pim van der Harst, Caroline Hayward, David A. Hinds, Wolfgang Hoffmann, Elina Hyppönen, William G. Iacono, Bo Jacobsson, Marjo-Riitta Järvelin, Karl-Heinz Jöckel, Jaakko Kaprio, Sharon L. R. Kardia, Terho Lehtimäki, Steven F. Lehrer, Patrik K. E. Magnusson, Nicholas G. Martin, Matt McGue, Andres Metspalu, Neil Pendleton, Brenda W. J. H. Penninx, Markus Perola, Nicola Pirastu, Mario Pirastu, Ozren Polasek, Danielle Posthuma, Christine Power, Michael A. Province, Nilesh J. Samani, David Schlessinger, Reinhold Schmidt, Thorkild I. A. Sørensen, Tim D. Spector, Kari Stefansson, Unnur Thorsteinsdottir, A. Roy Thurik, Nicholas J. Timpson, Henning Tiemeier, Joyce Y. Tung, André G. Uitterlinden, Veronique Vitart, Peter Vollenweider, David R. Weir, James F. Wilson, Alan F. Wright, Dalton C. Conley, Robert F. Krueger, George Davey Smith, Albert Hofman, David I. Laibson, Sarah E. Medland, Michelle N. Meyer, Jian Yang, Magnus Johannesson, Tõnu Esko, Peter M. Visscher, Philipp D. Koellinger, David Cesarini, Daniel J. Benjamin (2016-05-11):
Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals¹. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample¹,² of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.
Statistical folklore asserts that “everything is correlated”: in any real-world dataset, most or all measured variables will have non-zero correlations, even between variables which appear to be completely independent of each other, and that these correlations are not merely sampling error flukes but will appear in large-scale datasets to arbitrarily designated levels of statistical-significance or posterior probability.
This raises serious questions for null-hypothesis statistical-significance testing, as it implies the null hypothesis of 0 will always be rejected with sufficient data, meaning that a failure to reject only implies insufficient data, and provides no actual test or confirmation of a theory. Even a directional prediction is minimally confirmatory since there is a 50% chance of picking the right direction at random.
It also has implications for conceptualizations of theories & causal models, interpretations of structural models, and other statistical principles such as the “sparsity principle”.
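The claim that a point null of exactly 0 will always be rejected with sufficient data can be made concrete with a little arithmetic. The sketch below uses the Fisher z approximation for a Pearson correlation (which assumes roughly bivariate-normal data) to compute a two-sided p-value, and the sample size at which a given true correlation would be expected to cross a significance threshold. The function names are mine, and the calculation treats the observed correlation as equal to the true one, so it is an idealization.

```python
import math
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for correlation r at sample size n (Fisher z)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # 2 * (1 - Phi(|z|)) written via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


def min_n_for_significance(r, alpha=0.05):
    """Smallest n at which an observed correlation r reaches p <= alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z_crit / math.atanh(abs(r))) ** 2 + 3)


if __name__ == "__main__":
    for r in (0.1, 0.01, 0.001):
        print(r, min_n_for_significance(r))
```

Because the required n grows roughly as 1/r², even a correlation of 0.01 becomes "significant" somewhere in the tens of thousands of observations, which is small by the standards of modern large-scale datasets.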
One of the best predictors of children’s educational achievement is their family’s socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children’s educational achievement and its association with family SES.
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously.
Results: In this manuscript, we describe LD Hub – a centralized database of summary-level GWAS results for 177 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
Availability and implementation
The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
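The relationship between test statistics and LD that the abstract refers to is usually written as E[χ²] ≈ N·h²·ℓ/M + N·a + 1 for a SNP with LD Score ℓ, where N is the sample size, M the number of SNPs, h² the SNP heritability, and a the contribution of confounding. Polygenic signal therefore shows up in the slope and bias in the intercept. The sketch below fits that line by plain unweighted least squares; as I understand it, the published software additionally uses regression weights and a block jackknife for standard errors, neither of which is attempted here. All names and numbers are illustrative.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ld_score_fit(ld_scores, chi2_stats, n_samples, n_snps):
    """Unweighted LD Score regression sketch.

    Returns (h2_estimate, intercept). The slope equals N * h2 / M under
    the model, so h2 = slope * M / N. An intercept above 1 suggests
    inflation from confounding rather than polygenicity.
    """
    slope, intercept = ols_slope_intercept(ld_scores, chi2_stats)
    return slope * n_snps / n_samples, intercept


if __name__ == "__main__":
    # Made-up, noise-free inputs generated from the model itself.
    N, M = 10_000, 1_000
    scores = [1.0, 5.0, 10.0, 50.0]
    chi2 = [N * 0.3 / M * l + 1.02 for l in scores]
    print(ld_score_fit(scores, chi2, N, M))
```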
“An Atlas of Genetic Correlations across Human Diseases and Traits”, Brendan Bulik-Sullivan, Hilary K. Finucane, Verneri Anttila, Alexander Gusev, Felix R. Day, ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3, Laramie Duncan, John R. B. Perry, Nick Patterson, Elise B. Robinson, Mark J. Daly, Alkes L. Price, Benjamin M. Neale (2015-04-06):
Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use our method to estimate 300 genetic correlations among 25 traits, totaling more than 1.5 million unique phenotype measurements. Our results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity and associations between educational attainment and several diseases. These results highlight the power of genome-wide analyses, since there currently are no genome-wide significant SNPs for anorexia nervosa and only three for educational attainment.
“Genetic contributions to self-reported tiredness”, Vincent Deary, Saskia P. Hagenaars, Sarah E. Harris, W. David Hill, Gail Davies, David CM Liewald, International Consortium for Blood Pressure GWAS, CHARGE consortium Aging, Longevity Group, Andrew M. McIntosh, Catharine R. Gale, Ian J. Deary (2016-04-05):
Self-reported tiredness and low energy, often called fatigue, is associated with poorer physical and mental health. Twin studies have indicated that this has a heritability between 6% and 50%. In the UK Biobank sample (n = 108 976) we carried out a genome-wide association study of responses to the question, “Over the last two weeks, how often have you felt tired or had little energy?” Univariate GCTA-GREML found that the proportion of variance explained by all common SNPs for this tiredness question was 8.4% (SE = 0.6%). GWAS identified one genome-wide significant hit (Affymetrix id 1:64178756_C_T; p = 1.36 × 10⁻¹¹). LD score regression and polygenic profile analysis were used to test for pleiotropy between tiredness and up to 28 physical and mental health traits from GWAS consortia. Significant genetic correlations were identified between tiredness and BMI, HDL cholesterol, forced expiratory volume, grip strength, HbA1c, longevity, obesity, self-rated health, smoking status, triglycerides, type 2 diabetes, waist-hip ratio, ADHD, bipolar disorder, major depressive disorder, neuroticism, schizophrenia, and verbal-numerical reasoning (absolute rg effect sizes between 0.11 and 0.78). Significant associations were identified between tiredness phenotypic scores and polygenic profile scores for BMI, HDL cholesterol, LDL cholesterol, coronary artery disease, HbA1c, height, obesity, smoking status, triglycerides, type 2 diabetes, and waist-hip ratio, childhood cognitive ability, neuroticism, bipolar disorder, major depressive disorder, and schizophrenia (standardised β’s between −0.016 and 0.03). These results suggest that tiredness is a partly-heritable, heterogeneous and complex phenomenon that is phenotypically and genetically associated with affective, cognitive, personality, and physiological processes.
“Hech, sirs! But I’m wabbit, I’m back frae the toon;
Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
Analyzing genetic differences between closely related populations can be a powerful way to detect recent adaptation. The very large sample size of the UK Biobank is ideal for detecting selection using population differentiation, and enables an analysis of UK population structure at fine resolution. In analyses of 113,851 UK Biobank samples, population structure in the UK is dominated by 5 principal components (PCs) spanning 6 clusters: Northern Ireland, Scotland, northern England, southern England, and two Welsh clusters. Analyses with ancient Eurasians show that populations in the northern UK have higher levels of Steppe ancestry, and that UK population structure cannot be explained as a simple mixture of Celts and Saxons. A scan for unusual population differentiation along top PCs identified a genome-wide significant signal of selection at the coding variant rs601338 in FUT2 (p = 9.16 × 10⁻⁹). In addition, by combining evidence of unusual differentiation within the UK with evidence from ancient Eurasians, we identified new genome-wide significant (p < 5 × 10⁻⁸) signals of recent selection at two additional loci: CYP1A2/CSK and F12. We detected strong associations to diastolic blood pressure in the UK Biobank for the variants with new selection signals at CYP1A2/CSK (p = 1.10 × 10⁻¹⁹) and for variants with ancient Eurasian selection signals in the ATXN2/SH2B3 locus (p = 8.00 × 10⁻³³), implicating recent adaptation related to blood pressure.
Quantitative genetics is primarily concerned with two subjects: the correlation between relatives and the response to selection. The correlation between relatives is used to determine the heritability of a trait—the key quantity that addresses the question of nature vs. nurture. Heritability, in turn, is used to predict the response to selection—the main driver of improvements in crops and livestock. The theory of quantitative genetics has been thoroughly tested and applied in plants and animals, but heritability and selection remain open questions in humans due to limited natural experimental designs.
The Donor Sibling Registry (DSR) is an organization that helps individuals conceived as a result of sperm, egg, or embryo donation make contact with genetically related individuals. Families who conceived children via anonymous sperm donation join the DSR and match with other families who used the same donor ID at the same sperm bank. The resulting donor pedigree consists of heterosexual, lesbian, and single mother families who are connected through the common anonymous sperm donor used to conceive their children.
Here, we introduce a new quantitative genetic study design based on the unprecedented family relationships found in the donor pedigree. We surveyed 945 individual families constituting 159 donor pedigrees from the Donor Sibling Registry and used their demographic, physical, and behavioral characteristics to conduct a quantitative genetic study of selection and heritability. A direct measurement of phenotypic assortment showed mothers actively selected mates for height, eye color, and religion. Artificial selection for donor height increased mean child height in a manner consistent with the selection differential. Reared-apart donor-conceived paternal half-siblings provided unbiased heritability estimates for traits influenced by maternal and contrast effects. Maternal effects were important in determining the variance of birth weight while eliminating contrast effects revealed sociability to be a highly heritable childhood temperament. Thus, the unprecedented family relationships in the donor pedigree enable a universal model for quantitative genetics.
Recent findings from molecular genetics now make it possible to test directly for natural selection by analyzing whether genetic variants associated with various phenotypes have been under selection. I leverage these findings to construct polygenic scores that use individuals’ genotypes to predict their body mass index, educational attainment (EA), glucose concentration, height, schizophrenia, total cholesterol, and (in females) age at menarche. I then examine associations between these scores and fitness to test whether natural selection has been occurring. My study sample includes individuals of European ancestry born between 1931 and 1953 in the Health and Retirement Study, a representative study of the US population. My results imply that natural selection has been slowly favoring lower EA in both females and males, and are suggestive that natural selection may have favored a higher age at menarche in females. For EA, my estimates imply a rate of selection of about -1.5 months of education per generation (which pales in comparison with the increases in EA observed in contemporary times). Though they cannot be projected over more than one generation, my results provide additional evidence that humans are still evolving—albeit slowly, especially when compared to the rapid secular changes that have occurred over the past few generations due to cultural and environmental factors.
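A polygenic score of the kind described here is, at its simplest, a weighted sum: each variant's effect-size estimate from a published GWAS is multiplied by the number of copies of the effect allele a person carries, and the products are added up. The sketch below shows only that core arithmetic. The variant IDs and weights are made up, and treating a missing genotype as 0 copies is a simplification of mine; real pipelines typically handle missingness, linkage disequilibrium, and strand alignment with considerably more care.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts.

    allele_counts: dict mapping variant ID -> copies of the effect
                   allele carried (0, 1, or 2)
    weights:       dict mapping variant ID -> per-allele effect estimate
    Variants missing from allele_counts are counted as 0 copies
    (a simplifying assumption, not a recommendation).
    """
    return sum(w * allele_counts.get(snp, 0) for snp, w in weights.items())


if __name__ == "__main__":
    # Hypothetical variants and weights, for illustration only.
    example_weights = {"snpA": 0.5, "snpB": -0.25, "snpC": 0.125}
    person = {"snpA": 2, "snpB": 1}
    print(polygenic_score(person, example_weights))
```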
“Molecular genetic contributions to self-rated health”, Sarah E. Harris, Saskia P. Hagenaars, Gail Davies, W. David Hill, David CM Liewald, Stuart J. Ritchie, Riccardo E. Marioni, METASTROKE consortium, International Consortium for Blood Pressure, CHARGE consortium Aging, Longevity Group, CHARGE consortium Cognitive Group, Cathie LM Sudlow, Joanna M. Wardlaw, Andrew M. McIntosh, Catharine R. Gale, Ian J. Deary (2016-04-12):
Background: Poorer self-rated health (SRH) predicts worse health outcomes, even when adjusted for objective measures of disease at time of rating. Twin studies indicate SRH has a heritability of up to 60% and that its genetic architecture may overlap with that of personality and cognition.
Methods: We carried out a genome-wide association study (GWAS) of SRH on 111 749 members of the UK Biobank sample. Univariate genome-wide complex trait analysis (GCTA)-GREML analyses were used to estimate the proportion of variance explained by all common autosomal SNPs for SRH. Linkage Disequilibrium (LD) score regression and polygenic risk scoring, two complementary methods, were used to investigate pleiotropy between SRH in UK Biobank and up to 21 health-related and personality and cognitive traits from published GWAS consortia.
Results: The GWAS identified 13 independent signals associated with SRH, including several in regions previously associated with diseases or disease-related traits. The strongest signal was on chromosome 2 (rs2360675, p = 1.77 × 10⁻¹⁰) close to KLF7, which has previously been associated with obesity and type 2 diabetes. A second strong peak was identified on chromosome 6 in the major histocompatibility region (rs76380179, p = 6.15 × 10⁻¹⁰). The proportion of variance in SRH that was explained by all common genetic variants was 13%. Polygenic scores for the following traits and disorders were associated with SRH: cognitive ability, education, neuroticism, BMI, longevity, ADHD, major depressive disorder, schizophrenia, lung function, blood pressure, coronary artery disease, large vessel disease stroke, and type 2 diabetes.
Conclusion: Individual differences in how people respond to a single item on SRH are partly explained by their genetic propensity to many common psychiatric and physical disorders and psychological traits.
Genetic variants associated with common diseases and psychological traits are associated with self-rated health.
The SNP-based heritability of self-rated health is 0.13 (SE 0.006).
There is pleiotropy between self-rated health and psychiatric and physical diseases and psychological traits.
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present an end-to-end differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex behaviours such as sequence sorting and addition. When connected to outputs of an LSTM and trained jointly, our interpreter achieves state-of-the-art accuracy for end-to-end reasoning about quantities expressed in natural language stories.
We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a simple expansion rule generates deep networks whose structural layouts are precisely truncated fractals. These networks contain interacting subpaths of different lengths, but do not include any pass-through or residual connections; every internal signal is transformed by a filter and nonlinearity before being seen by subsequent layers. In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks, thereby demonstrating that residual representations may not be fundamental to the success of extremely deep convolutional neural networks. Rather, the key may be the ability to transition, during training, from effectively shallow to deep. We note similarities with student-teacher behavior and develop drop-path, a natural extension of dropout, to regularize co-adaptation of subpaths in fractal architectures. Such regularization allows extraction of high-performance fixed-depth subnetworks. Additionally, fractal networks exhibit an anytime property: shallow subnetworks provide a quick answer, while deeper subnetworks, with higher latency, provide a more accurate answer.
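The abstract does not spell out the expansion rule, so the following is my reading of it rather than a quotation: a one-column block is a single convolution, and a block with C+1 columns joins two copies of the C-column block applied in sequence with one more single convolution in parallel. The sketch below builds that structure as nested tuples and measures it; no actual network is constructed, and the tuple encoding and function names are my own.

```python
def fractal(columns):
    """Nested-tuple description of a fractal block with `columns` columns."""
    if columns == 1:
        return "conv"
    smaller = fractal(columns - 1)
    # Join: (two smaller fractals in sequence) alongside (one conv).
    return ("join", ("seq", smaller, smaller), "conv")


def longest_path(block):
    """Number of convs on the deepest path through the block."""
    if block == "conv":
        return 1
    if block[0] == "seq":
        return sum(longest_path(b) for b in block[1:])
    return max(longest_path(b) for b in block[1:])  # join


def shortest_path(block):
    """Number of convs on the shallowest path through the block."""
    if block == "conv":
        return 1
    if block[0] == "seq":
        return sum(shortest_path(b) for b in block[1:])
    return min(shortest_path(b) for b in block[1:])  # join


def conv_count(block):
    """Total number of conv layers in the block."""
    if block == "conv":
        return 1
    return sum(conv_count(b) for b in block[1:])


if __name__ == "__main__":
    for c in range(1, 6):
        b = fractal(c)
        print(c, longest_path(b), shortest_path(b), conv_count(b))
```

Under this rule the deepest path doubles with each added column while a one-conv path always remains, which is consistent with the "effectively shallow to deep" transition and the anytime property the abstract mentions.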
In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer in still images and propose new initializations and loss functions applicable to videos. This allows us to generate consistent and stable stylized video sequences, even in cases with large motion and strong occlusion. We show that the proposed method clearly outperforms simpler baselines both qualitatively and quantitatively.
The educational, occupational, and creative accomplishments of the profoundly gifted participants (IQs ⩾ 160) in the Study of Mathematically Precocious Youth (SMPY) are astounding, but are they representative of equally able 12-year-olds? Duke University’s Talent Identification Program (TIP) identified 259 young adolescents who were equally gifted. By age 40, their life accomplishments also were extraordinary: Thirty-seven percent had earned doctorates, 7.5% had achieved academic tenure (4.3% at research-intensive universities), and 9% held patents; many were high-level leaders in major organizations. As was the case for the SMPY sample before them, differential ability strengths predicted their contrasting and eventual developmental trajectories—even though essentially all participants possessed both mathematical and verbal reasoning abilities far superior to those of typical Ph.D. recipients. Individuals, even profoundly gifted ones, primarily do what they are best at. Differences in ability patterns, like differences in interests, guide development along different paths, but ability level, coupled with commitment, determines whether and the extent to which noteworthy accomplishments are reached if opportunity presents itself. [Keywords intelligence, creativity, giftedness, replication, blink comparator]
There is a popular belief in neuroscience that we are primarily data limited, and that producing large, multimodal, and complex datasets will, with the help of advanced data analysis algorithms, lead to fundamental insights into the way the brain processes information. These datasets do not yet exist, and if they did we would have no way of evaluating whether or not the algorithmically-generated insights were sufficient or even correct. To address this, here we take a classical microprocessor as a model organism, and use our ability to perform arbitrary experiments on it to see if popular data analysis methods from neuroscience can elucidate the way it processes information. Microprocessors are among those artificial information processing systems that are both complex and that we understand at all levels, from the overall logical flow, via logical gates, to the dynamics of transistors. We show that the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor. This suggests current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems, regardless of the amount of data. Additionally, we argue for scientists using complex non-linear dynamical systems with known ground truth, such as the microprocessor, as a validation platform for time-series and structure discovery methods.
Neuroscience is held back by the fact that it is hard to evaluate if a conclusion is correct; the complexity of the systems under study and their experimental inaccessibility make the assessment of algorithmic and data analytic techniques challenging at best. We thus argue for testing approaches using known artifacts, where the correct interpretation is known. Here we present a microprocessor platform as one such test case. We find that many approaches in neuroscience, when used naïvely, fall short of producing a meaningful understanding.
The concept of multiple discovery is the hypothesis that most scientific discoveries and inventions are made independently and more or less simultaneously by multiple scientists and inventors. The concept of multiple discovery opposes a traditional view—the "heroic theory" of invention and discovery.
Gerard Manley Hopkins was an English poet and Jesuit priest, whose posthumous fame established him among the leading Victorian poets. His manipulation of prosody – particularly his concept of sprung rhythm – established him as an innovative writer of verse, as did his technique of praising God through vivid use of imagery and nature. Only after his death did Robert Bridges begin to publish a few of Hopkins's mature poems in anthologies, hoping to prepare the way for wider acceptance of his style. By 1930 his work was recognised as one of the most original literary accomplishments of his century. It had a marked influence on such leading 20th-century poets as T. S. Eliot, Dylan Thomas, W. H. Auden, Stephen Spender and Cecil Day-Lewis.
Father Goose is a 1964 American Technicolor romantic comedy film set in World War II, starring Cary Grant, Leslie Caron and Trevor Howard. The title derives from "Mother Goose", the code name assigned to Grant's character. The film won an Oscar for Best Original Screenplay. It introduced the song "Pass Me By" by Cy Coleman and Carolyn Leigh, later recorded by Peggy Lee, Frank Sinatra and others.
Short Peace is a multimedia project composed of four short anime films produced by Sunrise and Shochiku, and a video game developed by Crispy's! and Grasshopper Manufacture. The four films were released in Japanese theaters on July 20, 2013 and were screened in North America during April 2014. Sentai Filmworks have licensed the films for North America. The video game was released in January 2014 in Japan, April 2014 in Europe, and September 2014 in North America. The game’s physical releases in Japan and Europe include the four animated shorts as a bonus.
Death Parade is a 2015 Japanese anime television series created, written, and directed by Yuzuru Tachikawa and produced by Madhouse. The series spawned from a short film, Death Billiards, which was originally produced by Madhouse for the Young Animator Training Project's Anime Mirai 2013 and released on March 2, 2013. The television series aired in Japan between January 9, 2015 and March 27, 2015. It is licensed in North America by Funimation and in the United Kingdom by Anime Limited, the latter of which was eventually cancelled. The series was obtained by Madman Entertainment for digital distribution in Australia and New Zealand.
Subscription page for the monthly gwern.net newsletter. There are monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives since December 2013.
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews).
“GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment”, Cornelius A. Rietveld, Sarah E. Medland, Jaime Derringer, Jian Yang, Tõnu Esko, Nicolas W. Martin, Harm-Jan Westra, Konstantin Shakhbazov, Abdel Abdellaoui, Arpana Agrawal, Eva Albrecht, Behrooz Z. Alizadeh, Najaf Amin, John Barnard, Sebastian E. Baumeister, Kelly S. Benke, Lawrence F. Bielak, Jeffrey A. Boatman, Patricia A. Boyle, Gail Davies, Christiaan de Leeuw, Niina Eklund, Daniel S. Evans, Rudolf Ferhmann, Krista Fischer, Christian Gieger, Håkon K. Gjessing, Sara Hägg, Jennifer R. Harris, Caroline Hayward, Christina Holzapfel, Carla A. Ibrahim-Verbaas, Erik Ingelsson, Bo Jacobsson, Peter K. Joshi, Astanand Jugessur, Marika Kaakinen, Stavroula Kanoni, Juha Karjalainen, Ivana Kolcic, Kati Kristiansson, Zoltán Kutalik, Jari Lahti, Sang H. Lee, Peng Lin, Penelope A. Lind, Yongmei Liu, Kurt Lohman, Marisa Loitfelder, George McMahon, Pedro Marques Vidal, Osorio Meirelles, Lili Milani, Ronny Myhre, Marja-Liisa Nuotio, Christopher J. Oldmeadow, Katja E. Petrovic, Wouter J. Peyrot, Ozren Polašek, Lydia Quaye, Eva Reinmaa, John P. Rice, Thais S. Rizzi, Helena Schmidt, Reinhold Schmidt, Albert V. Smith, Jennifer A. Smith, Toshiko Tanaka, Antonio Terracciano, Matthijs J. H. M. van der Loos, Veronique Vitart, Henry Völzke, Jürgen Wellmann, Lei Yu, Wei Zhao, Jüri Allik, John R. Attia, Stefania Bandinelli, François Bastardot, Jonathan Beauchamp, David A. Bennett, Klaus Berger, Laura J. Bierut, Dorret I. Boomsma, Ute Bültmann, Harry Campbell, Christopher F. Chabris, Lynn Cherkas, Mina K. Chung, Francesco Cucca, Mariza de Andrade, Philip L. De Jager, Jan-Emmanuel De Neve, Ian J. Deary, George V. Dedoussis, Panos Deloukas, Maria Dimitriou, Guðný Eiríksdóttir, Martin F. Elderson, Johan G. Eriksson, David M. Evans, Jessica D. Faul, Luigi Ferrucci, Melissa E. Garcia, Henrik Grönberg, Vilmundur Guðnason, Per Hall, Juliette M. Harris, Tamara B. Harris, Nicholas D. Hastie, Andrew C. 
Heath, Dena G. Hernandez, Wolfgang Hoffmann, Adriaan Hofman, Rolf Holle, Elizabeth G. Holliday, Jouke-Jan Hottenga, William G. Iacono, Thomas Illig, Marjo-Riitta Järvelin, Mika Kähönen, Jaakko Kaprio, Robert M. Kirkpatrick, Matthew Kowgier, Antti Latvala, Lenore J. Launer, Debbie A. Lawlor, Terho Lehtimäki, Jingmei Li, Paul Lichtenstein, Peter Lichtner, David C. Liewald, Pamela A. Madden, Patrik K. E. Magnusson, Tomi E. Mäkinen, Marco Masala, Matt McGue, Andres Metspalu, Andreas Mielck, Michael B. Miller, Grant W. Montgomery, Sutapa Mukherjee, Dale R. Nyholt, Ben A. Oostra, Lyle J. Palmer, Aarno Palotie, Brenda W. J. H. Penninx, Markus Perola, Patricia A. Peyser, Martin Preisig, Katri Räikkönen, Olli T. Raitakari, Anu Realo, Susan M. Ring, Samuli Ripatti, Fernando Rivadeneira, Igor Rudan, Aldo Rustichini, Veikko Salomaa, Antti-Pekka Sarin, David Schlessinger, Rodney J. Scott, Harold Snieder, Beate St Pourcain, John M. Starr, Jae Hoon Sul, Ida Surakka, Rauli Svento, Alexander Teumer, The LifeLines Cohort Study, Henning Tiemeier, Frank J. A. van Rooij, David R. Van Wagoner, Erkki Vartiainen, Jorma Viikari, Peter Vollenweider, Judith M. Vonk, Gérard Waeber, David R. Weir, H.-Erich Wichmann, Elisabeth Widen, Gonneke Willemsen, James F. Wilson, Alan F. Wright, Dalton Conley, George Davey-Smith, Lude Franke, Patrick J. F. Groenen, Albert Hofman, Magnus Johannesson, Sharon L. R. Kardia, Robert F. Krueger, David Laibson, Nicholas G. Martin, Michelle N. Meyer, Danielle Posthuma, A. Roy Thurik, Nicholas J. Timpson, André G. Uitterlinden, Cornelia M. van Duijn, Peter M. Visscher, Daniel J. Benjamin, David Cesarini, Philipp D. Koellinger (2013-06-21):
A genome-wide association study (GWAS) of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent single-nucleotide polymorphisms (SNPs) are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (coefficient of determination R² ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
[A landmark study in behavioral genetics and intelligence: the first well-powered GWAS to detect genetic variants for intelligence and education which replicate out of sample and are proven to be causal in a between-sibling study.]
“Replicability and robustness of genome-wide-association studies for behavioral traits”, Cornelius A. Rietveld, Dalton Conley, Nicholas Eriksson, Tõnu Esko, Sarah E. Medland, Anna A. E. Vinkhuyzen, Jian Yang, Jason D. Boardman, Christopher F. Chabris, Christopher T. Dawes, Benjamin W. Domingue, David A. Hinds, Magnus Johannesson, Amy K. Kiefer, David Laibson, Patrik K. E. Magnusson, Joanna L. Mountain, Sven Oskarsson, Olga Rostapshova, Alexander Teumer, Joyce Y. Tung, Peter M. Visscher, Daniel J. Benjamin, David Cesarini, Philipp D. Koellinger (2014):
A recent genome-wide-association study of educational attainment identified three single-nucleotide polymorphisms (SNPs) whose associations, despite their small effect sizes (each R² ≈ 0.02%), reached genome-wide significance (p < 5 × 10⁻⁸) in a large discovery sample and were replicated in an independent sample (p < .05). The study also reported associations between educational attainment and indices of SNPs called "polygenic scores." In three studies, we evaluated the robustness of these findings. Study 1 showed that the associations with all three SNPs were replicated in another large (n = 34,428) independent sample. We also found that the scores remained predictive (R² ≈ 2%) in regressions with stringent controls for stratification (Study 2) and in new within-family analyses (Study 3). Our results show that large and therefore well-powered genome-wide-association studies can identify replicable genetic associations with behavioral traits. The small effect sizes of individual SNPs are likely to be a major contributing factor explaining the striking contrast between our results and the disappointing replication record of most candidate-gene studies.
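To make the link between small effect sizes and sample size concrete, here is a rough power sketch. It is not the papers' own analysis: it assumes the textbook approximation that a single-SNP test statistic is normal with mean sqrt(N·R²/(1−R²)), and the function name `snp_power` is hypothetical. The effect size (R² ≈ 0.02%), the thresholds (p < 5 × 10⁻⁸ and p < .05), and the sample sizes 101,069 and 34,428 come from the abstracts above; the sample sizes 1,000 and 300,000 are illustrative assumptions only.

```python
from statistics import NormalDist


def snp_power(n, r2, alpha):
    """Approximate two-sided power to detect one SNP explaining a
    fraction r2 of trait variance in a sample of n individuals.

    Assumption: the association z-statistic is ~ Normal(mu, 1) with
    mu = sqrt(n * r2 / (1 - r2)), a standard large-sample approximation.
    """
    std_normal = NormalDist()
    # Critical |z| for a two-sided test at level alpha; the lower tail
    # is used so that very small alpha values keep numerical precision.
    z_crit = -std_normal.inv_cdf(alpha / 2)
    # Expected z-statistic (square root of the noncentrality parameter).
    mu = (n * r2 / (1 - r2)) ** 0.5
    # Probability that the statistic lands in either rejection tail.
    return (1 - std_normal.cdf(z_crit - mu)) + std_normal.cdf(-z_crit - mu)


R2 = 0.0002          # R^2 of about 0.02%, as reported in the abstracts
GWAS_ALPHA = 5e-8    # genome-wide significance threshold
REPL_ALPHA = 0.05    # replication threshold

if __name__ == "__main__":
    # 1,000 and 300,000 are hypothetical sample sizes for comparison.
    for n in (1_000, 34_428, 101_069, 300_000):
        print(
            f"n={n:>7,}  "
            f"power at 5e-8: {snp_power(n, R2, GWAS_ALPHA):.4f}  "
            f"power at 0.05: {snp_power(n, R2, REPL_ALPHA):.4f}"
        )
```

Under these assumptions, a sample in the low thousands has essentially no chance of reaching genome-wide significance for a variant this small, which is consistent with the abstract's explanation for the weak replication record of small candidate-gene studies.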