Continuing the 2015 trends, 2016 was a banner year for AI & genetics.
In AI, demonstrating the potential for rapid advances, AlphaGo went from low professional level as of October 2015 to world-champion level, crushing Lee Sedol 4–1; then, just when everyone had stopped paying attention, a refined (presumably pure self-play) version of AlphaGo went 60–0 in online blitz matches against many of the top Go players (including Ke Jie). Translation RNNs finally made their long-awaited commercial debut in Google Translate, producing the largest jump in translation quality in decades and bringing many language pairs up to surprisingly high quality (even Japanese⟺English translations are now semi-comprehensible, as opposed to the status quo of total gibberish); combined with the rapid progress in voice transcription and the surprising results of human-level lipreading, one can now imagine a NN-powered Babelfish (which, combined with HUDs, could be revolutionary for the deaf & hearing-impaired). Generative adversarial networks (GANs) remained a central topic of AI research, with better theoretical understanding (linking them to reinforcement learning) and many tweaks and incremental refinements increasing the size of feasible generated images (eg StackGAN’s large bird/flower image generation); however, GANs have not yet delivered meaningful gains on any applied task & remain a solution in search of a problem, so that is something to hope for in 2017—a demonstration that the unsupervised or generative aspects of GANs can be usefully employed for planning or some other applied task.
Perhaps the most exciting work in 2016 was the long-term architectural work: providing large-scale memory mechanisms (in the form of efficient external memory, or encoded into the weights of large expanding or sharded NNs), learning to train large-scale NNs (“synthetic gradients”), and, in a particularly surprising set of papers, demonstrating that NNs + reinforcement learning can efficiently learn to design NN architectures & units. (This was not something anyone doubted could be done, but previous RL work suggested it was years away & that no one could manage it without whole GPU farms; but as far as Google was concerned… “You see, I told you it couldn’t be done without turning the whole country into a factory. You have done just that.”) Since NNs do not decay like biological neurons, are not hard-limited by skull volumes or calories, and since all tasks share mutual information & form informative priors for each other—critical for sample-efficient learning—there is a lot of inherent pressure towards large, growing, multi-task NNs which do transfer learning & can optimize at multiple levels end-to-end; as GPU RAM limits lift, we’ll see more of these. Aside from the important work in the “NNs all the way down” vein, reinforcement learning grew in importance: it is increasingly common to use RL methods to control memory or network components, to interact with an environment (often broadly interpreted as anything which can be turned into a tree—which goes far beyond games like Go or chess & includes theorem proving or program optimization), or to learn to optimize a non-differentiable reward/loss function. I am excited to see planning re-emerge as a theme after the dominance of model-free methods over the past 3 years; we will doubtless see more of that in 2017, especially as some of the architectural tweaks from 2016 (some of which claim as much as an order-of-magnitude improvement in ALE sample-efficiency) get tried out & reused.
In genetics, the growth of UK Biobank and the introduction of LD score regression & other summary-statistic-only methods continued driving large-scale results; the study of human genetic correlations made an absurd amount of progress in 2016, demonstrating shared genetic influences on countless phenotypic traits and pervasive intercorrelations among good traits and among disease traits, respectively. Detecting recent human evolution has been difficult due to the lack of ancient DNA to compare with, but the supply of ancient DNA has grown steadily, permitting some specific examples to be nailed down, and a new method based on contemporary whole genomes may blow the area wide open: whole genomes have recently crossed the $1,000 mark, and in coming years scientific projects & medical biobanks will shift over to whole genomes. Another possible field explosion is “genome synthesis”—I was astonished to learn that it is now feasible to synthesize from scratch entire chromosomes of arbitrary design, and that a human genome could potentially be synthesized for ~$1,000,000,000, which would render totally obsolete any considerations of embryo selection/CRISPR/iterated embryo selection; an active advocacy effort for launching a genome-synthesis project is underway. 2017 will bring further discoveries of how humans have adapted to local environments and their societies over the past centuries & millennia. Honorable mentions should also go to the steady (and disquieting) progress towards iterated embryo selection, and to a scattering of results from the continuously-growing-sample-size GWASes: as predicted, the education/intelligence hits have increased drastically with sample size, and the historically difficult targets of personality & depression have finally yielded some more hits. One particularly intriguing GWAS focused on violence & criminal behavior with good results, suggesting that trait will yield to further study as well.
Past GWASes continued to be applied; the results of Belsky et al 2016 will come as no surprise, but will frustrate the critics who insist that all non-disease results are methodological artifacts or merely reflect population structure. CRISPR progress continues as expected, with the first uses in humans in 2016 by Chinese & American scientists.
Less cosmically, one of the big tech stories of 2016 was the rollout of consumer VR—successful but not epochal, clearly the (or at least a) future of gaming, but with no killer app. Oculus had a rocky launch caused by its decision to ship prematurely, without motion controls—which the launch of HTC/Valve’s Vive made clear are not an optional feature for truly compelling VR (my own brief experience with an Oculus Rift at a Best Buy demo left me longing, after just 20 seconds in The Climb, for hand tracking)—and the lack of motion controls & compelling content made for a slow start. The Vive had a better launch with excellent motion controls & tracking; the comparable Oculus Touch controls only shipped half a year later in December, demonstrating why Oculus launched when it did—it was either bite the bullet of a bad launch, or let the Vive rule unopposed. Somewhat to my surprise, Sony’s quiet Project Morpheus launched successfully as PlayStation VR, making for 3 high-quality competing VR headsets/ecosystems. (Sony had not seemed serious about the whole VR thing, so I doubted it would launch in 2016, or at all.) While most gamers, much less most people, do not feel a burning need to get into VR at the moment (myself included, as I think the screen resolutions need improvement), what is notable is what didn’t happen: we did not see widespread reports of vomiting, of people swearing off VR forever, of VR being discarded as a 3D-TV-like gimmick, of developers flooding in & getting burned, of sales plummeting well below the million mark, of the initial trickle of games sputtering out… In short, none of the things that the naysayers predicted would doom consumer VR.
The worst that the early adopters, critics, and regular people have to say is that there are not enough good games (decreasingly true by the end of 2016), that the headsets and GPUs cost too much (true but will predictably be fixed as time passes), that the Oculus Rift lacked motion controls (fixed as of December 2016), and the resolution is too low / devices are wired / require external tracking (likely improved substantially in the second generation, possibly fixed entirely by the third or fourth)—nothing fatal or important, in other words. So it looks like VR is here to stay! It’s nice that at least one part of my childhood’s future has finally happened.
With genetic predictors of a phenotypic trait, it is possible to select embryos during an in vitro fertilization process to increase or decrease that trait. Extending the work of Shulman & Bostrom 2014/Hsu 2014, I consider the case of human intelligence using SNP-based genetic prediction, finding:
a meta-analysis of GCTA results indicates that SNPs can explain >33% of variance in current intelligence scores, and >44% with better-quality phenotype testing
this sets an upper bound on the effectiveness of SNP-based selection: a gain of 9 IQ points when selecting the top embryo out of 10
the best 2016 polygenic score could achieve a gain of ~3 IQ points when selecting out of 10
the marginal cost of embryo selection (assuming IVF is already being done) is modest, at $1500 + $200 per embryo, with the sequencing cost projected to drop rapidly
a model of the IVF process, incorporating number of extracted eggs, losses to abnormalities & vitrification & failed implantation & miscarriages from 2 real IVF patient populations, estimates feasible gains of 0.39 & 0.68 IQ points
embryo selection is currently unprofitable (mean: -$358) in the USA under the lowest estimate of the value of an IQ point, but profitable under the highest (mean: $6230). The main constraint on selection profitability is the polygenic score; under the highest value, the NPVEVPI of a perfect SNP predictor is $24b and the EVSI per education/SNP sample is $71k
under the worst-case estimate, selection can be made profitable with a better polygenic score, which would require n > 237,300 using education phenotype data (and much less using fluid intelligence measures)
selection can be made more effective by selecting on multiple phenotype traits: considering an example using 7 traits (IQ/height/BMI/diabetes/ADHD/bipolar/schizophrenia), selecting on all 7 yields a multiple of the gain from selecting on IQ alone; the outperformance of multiple selection remains after adjusting for genetic correlations & polygenic scores, and when using a broader set of 16 traits.
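The headline gain figures above can be approximated with the standard order-statistic formula for selection on a noisy predictor: expected gain ≈ E[max of n standard normals] × √(variance-explained / 2) × 15, the factor of ½ reflecting that only within-family (sibling) variance is available among embryos. A minimal sketch—note that the 3.5%-variance figure for the 2016 polygenic score is my assumption, chosen to be consistent with the ~3-point result:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_max_normal(n, draws=200_000):
    """Monte Carlo estimate of E[max of n standard normals]."""
    return rng.standard_normal((draws, n)).max(axis=1).mean()

def selection_gain(n_embryos, variance_explained, sd=15):
    """Expected IQ gain from picking the best of n embryos on a predictor
    explaining `variance_explained` of trait variance; the factor 1/2 is
    the within-sibship share of additive genetic variance."""
    return expected_max_normal(n_embryos) * np.sqrt(variance_explained / 2) * sd

print(selection_gain(10, 0.33))   # GCTA upper bound (33% of variance): ~9 points
print(selection_gain(10, 0.035))  # assumed ~3.5%-variance polygenic score: ~3 points
```

The same function shows the diminishing returns of more embryos: E[max] grows only logarithmically in n.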
Computational complexity theory describes the steep increase in computing power required for many algorithms to solve larger problems; frequently, the increase is large enough to render problems a few times larger totally intractable. Many of these algorithms are used in AI-relevant contexts. It has been argued that this implies that AIs will be fundamentally limited in accomplishing real-world tasks better than humans because they will run into the same computational complexity limits as humans, and so the consequences of developing AI will be small, as it is impossible for there to be any large fast global changes due to human or superhuman-level AIs. I examine the assumptions of this argument and find that it neglects the many conditions under which computational complexity theorems are valid, and so the argument does not work: problems can be solved more efficiently than their complexity classes would imply, large differences in problem solubility between humans and AIs are possible, greater resource consumption is possible, the real-world consequences of small differences on individual tasks can be large for agent impacts, such consequences can compound, and many agents can be created; any one of these independent objections being true destroys the argument.
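The first objection—that complexity classes bound exact worst-case solutions, not the approximations agents actually need—can be illustrated with the traveling salesman problem: exact solution is intractable in general, but a cheap O(n²) greedy heuristic typically comes within a modest factor of optimal. A toy comparison on a small random instance:

```python
import itertools, math, random

random.seed(42)
pts = [(random.random(), random.random()) for _ in range(8)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(order):
    return sum(dist(pts[order[i]], pts[order[(i + 1) % len(order)]])
               for i in range(len(order)))

# Exact: O(n!) brute force over all tours starting at city 0.
best = min(tour_length((0,) + p) for p in itertools.permutations(range(1, 8)))

# Heuristic: O(n^2) greedy nearest-neighbor from city 0.
unvisited, tour = set(range(1, 8)), [0]
while unvisited:
    nxt = min(unvisited, key=lambda c: dist(pts[tour[-1]], pts[c]))
    tour.append(nxt)
    unvisited.remove(nxt)
greedy = tour_length(tuple(tour))

print(f"optimal={best:.3f} greedy={greedy:.3f} ratio={greedy/best:.2f}")
```

The brute force is already painful at n = 12 and hopeless at n = 30, while the heuristic scales to millions of cities—the complexity theorem is true, but it bounds the wrong quantity for predicting real-world performance.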
Autonomous AI systems (Agent AIs) trained using reinforcement learning can do harm when they take wrong actions, especially superintelligent Agent AIs. One solution would be to eliminate their agency by not giving AIs the ability to take actions, confining them to purely informational or inferential tasks such as classification or prediction (Tool AIs), and have all actions be approved & executed by humans, giving equivalently superintelligent results without the risk.
I argue that this is not an effective solution, for two major reasons. First, because Agent AIs will by definition be better at actions than Tool AIs, giving them an economic advantage. Secondly, because Agent AIs will be better at inference & learning than Tool AIs, and this is inherently due to their greater agency: the same algorithms which learn how to perform actions can be used to select important datapoints to learn inference over, how long to learn, how to more efficiently execute inference, how to design themselves, how to optimize hyperparameters, how to make use of external resources such as long-term memories or external software or large databases or the Internet, and how best to acquire new data. All of these actions will result in Agent AIs more intelligent than Tool AIs, in addition to their greater economic competitiveness. Thus, Tool AIs will be inferior to Agent AIs in both actions and intelligence, implying that use of Tool AIs is an even more unstable equilibrium than previously argued, as users of Agent AIs will be able to outcompete them on two dimensions (and not just one).
I analyze an A/B test from a mail-order company of two different kinds of box packaging from a Bayesian decision-theory perspective, balancing posterior probability of improvements & greater profit against the cost of packaging & risk of worse results, finding that as the company’s analysis suggested, the new box is unlikely to be sufficiently better than the old. Calculating expected values of information shows that it is not worth experimenting on further, and that such fixed-sample trials are unlikely to ever be cost-effective for packaging improvements. However, adaptive experiments may be worthwhile.
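The approach can be sketched with a Beta-Binomial model—the damage counts, packaging cost, and damage cost below are made-up illustrative numbers, not the company's data. The posterior probability that the new box is better can be high even while the expected net saving, once the extra packaging cost is charged against it, is negative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: damaged shipments out of total (NOT the real A/B figures).
old_damaged, old_n = 30, 1000
new_damaged, new_n = 22, 1000
extra_cost_per_box = 0.10   # assumed extra cost of the new packaging ($)
cost_per_damage = 10.0      # assumed cost of one damaged shipment ($)

# Beta(1,1) prior -> Beta posterior on each damage rate; sample both.
old_rate = rng.beta(1 + old_damaged, 1 + old_n - old_damaged, 100_000)
new_rate = rng.beta(1 + new_damaged, 1 + new_n - new_damaged, 100_000)

p_better = (new_rate < old_rate).mean()
# Expected saving per box from switching, net of the packaging cost:
net_saving = ((old_rate - new_rate) * cost_per_damage - extra_cost_per_box).mean()

print(f"P(new box damages less) = {p_better:.2f}")
print(f"expected net saving per box = ${net_saving:+.3f}")
```

With these toy numbers the new box is probably better, yet switching still loses money in expectation—the decision-theoretic point that posterior probability alone does not settle the decision.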
Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata, such as genre or author, prefixed to each line of input, allowing the model to learn metadata associations better & more efficiently, and enabling more controllable sampling of generated output by feeding in the desired metadata. A 2015 experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn the associated prose styles, and often generate text visibly similar to that of a specified author.
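The preprocessing itself is deliberately trivial—the metadata is placed in-band so the char-RNN can condition on it; the separator character here is an arbitrary choice:

```python
def tag_lines(text, author, sep="|"):
    """Prefix each line of a training document with its metadata, so a
    char-RNN sees e.g. 'DICKENS|It was the best of times,'."""
    return "\n".join(f"{author}{sep}{line}" for line in text.splitlines())

corpus = tag_lines("It was the best of times,\nit was the worst of times.", "DICKENS")
print(corpus)
# At sampling time, seeding the RNN with "DICKENS|" requests that author's style.
```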
Genome-wide complex trait analysis (GCTA), implementing genome-based restricted maximum likelihood (GREML), is a statistical method for variance-component estimation in genetics which quantifies the total narrow-sense (additive) contribution of a particular subset of genetic variants to a trait's heritability. This is done by directly quantifying the chance genetic similarity of unrelated individuals and comparing it to their measured similarity on a trait: if two unrelated individuals are relatively similar genetically and also have similar trait measurements, then the measured genetics likely causally influence that trait, and the correlation can to some degree tell how much. This can be illustrated by plotting the squared pairwise trait differences between individuals against their estimated degree of relatedness. The GCTA framework can be applied in a variety of settings; for example, it can be used to examine changes in heritability over aging and development, or extended to analyze bivariate genetic correlations between traits. There is an ongoing debate about whether GCTA generates reliable or stable estimates of heritability when used on current SNP data; critics argue that the method rests on an outdated and false dichotomy of genes versus environment, and that it suffers from serious methodological weaknesses, such as susceptibility to population stratification.
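The core idea can be demonstrated in miniature with a Haseman–Elston-style regression (a method-of-moments cousin of GREML's REML fit, not GCTA's actual estimator): simulate unrelated individuals, compute their genetic relatedness from standardized SNPs, and regress pairwise phenotype products on relatedness—the slope estimates the SNP heritability. A toy simulation with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, h2 = 500, 2000, 0.5    # individuals, SNPs, true SNP heritability

# Standardized genotypes for nominally unrelated individuals.
freqs = rng.uniform(0.05, 0.95, m)
geno = rng.binomial(2, freqs, (n, m))
Z = (geno - 2 * freqs) / np.sqrt(2 * freqs * (1 - freqs))

# Additive phenotype: every SNP has a small effect, plus environmental noise.
beta = rng.standard_normal(m) * np.sqrt(h2 / m)
y = Z @ beta + rng.standard_normal(n) * np.sqrt(1 - h2)
y = (y - y.mean()) / y.std()

# Genetic relatedness matrix; regress phenotype products on off-diagonal relatedness.
A = Z @ Z.T / m
iu = np.triu_indices(n, k=1)
slope = np.polyfit(A[iu], (y[:, None] * y[None, :])[iu], 1)[0]
print(f"estimated SNP heritability ~ {slope:.2f}")  # noisy estimate of the true 0.5
```

The estimate is noisy at this scale (the chance relatedness of unrelated pairs is tiny, so enormous samples are needed for precision), which is exactly why GCTA results have wide standard errors until n reaches the thousands.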
In multivariate quantitative genetics, a genetic correlation is the proportion of variance that two traits share due to genetic causes, the correlation between the genetic influences on a trait and the genetic influences on a different trait estimating the degree of pleiotropy or causal overlap. A genetic correlation of 0 implies that the genetic effects on one trait are independent of the other, while a correlation of 1 implies that all of the genetic influences on the two traits are identical. The bivariate genetic correlation can be generalized to inferring genetic latent variable factors across > 2 traits using factor analysis. Genetic correlation models were introduced into behavioral genetics in the 1970s–1980s.
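The definition is easy to see in simulation: give two traits partially-shared effect sizes, and the genetic correlation is simply the correlation of the resulting genetic values (the toy parameters below are arbitrary; the shared component is sized so the expected rg = 1/1.25 = 0.8):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 1000, 2000   # individuals, causal variants

# Effect sizes for two traits with a shared (pleiotropic) component.
shared = rng.standard_normal(m)
beta1 = shared + 0.5 * rng.standard_normal(m)
beta2 = shared + 0.5 * rng.standard_normal(m)

geno = rng.binomial(2, 0.5, (n, m)).astype(float)
g1, g2 = geno @ beta1, geno @ beta2   # genetic values for each trait

rg = np.corrcoef(g1, g2)[0, 1]
print(f"genetic correlation ~ {rg:.2f}")  # expectation: 1/1.25 = 0.8
```

A fully shared effect vector would give rg = 1; independent effect vectors would give rg ≈ 0.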
My laptop in my apartment receives Internet via a WiFi repeater to another house, yielding slow speeds and frequent glitches. Replacing the obsolete WiFi router increased connection speeds somewhat, but they remained inadequate. For a better solution, I used a directional antenna to connect directly to the new WiFi router, which, contrary to my expectations, yielded a ~6× increase in speed. Extensive benchmarking of all possible arrangements of laptops/dongles/repeaters/antennas/routers/positions shows that the antenna+router combination is inexpensive and near-optimal in speed, and that the only possible improvement would be a hardwired Ethernet line, which I installed a few weeks later after learning it was not as difficult as I had thought.
Randomized experiments require more subjects the more variable each datapoint is to overcome the noise which obscures any effects of the intervention. Reducing noise enables better inferences with the same data, or less data to be collected, which can be done by balancing observed characteristics between control and experimental datapoints.
A particularly dramatic example of this approach is running experiments on identical twins rather than unrelated people, because twins vary far less from each other than random people do, due to shared genetics & family environment. In 1931, the great statistician Student (William Sealy Gosset) noted problems with an extremely large (n = 20,000) Scottish experiment in feeding children milk (to see if they grew more in height or weight), and claimed that the experiment could have been done far more cost-effectively—with >95% fewer children—had it been conducted on twins, asserting that 100 identical twins would have been more accurate than the 20,000 children. He did not, however, provide any calculations or data demonstrating this.
I revisit the issue and run a power calculation on height indicating that Student’s claims were correct and that the experiment would have required ~97% fewer children if run with twins.
This reduction is not unique to the Scottish milk experiment on height/weight, and in general, one can expect a reduction of 89% in experiment sample sizes using twins rather than regular people, demonstrating the benefits of using behavioral genetics in experiment design/power analysis.
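The arithmetic behind these reductions is just the paired-design variance formula: a within-pair comparison needs (1 − ρ) times the sample of an independent-groups design, where ρ is the twin intraclass correlation for the trait. A sketch using the usual normal-approximation power formula, with ρ = 0.89 assumed purely for illustration:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_sd, alpha=0.05, power=0.8, rho=0.0):
    """Subjects per arm to detect a mean difference of `effect_sd` SDs;
    rho > 0 models a matched/twin design, shrinking residual variance."""
    z = NormalDist().inv_cdf
    za, zb = z(1 - alpha / 2), z(power)
    return ceil(2 * (1 - rho) * (za + zb) ** 2 / effect_sd ** 2)

unpaired = n_per_group(0.1)              # ordinary unrelated children
twins    = n_per_group(0.1, rho=0.89)    # one twin per arm
print(unpaired, twins, f"reduction = {1 - twins / unpaired:.0%}")
```

The reduction factor is independent of the effect size: any trait with a twin correlation of ρ cuts the required sample by a fraction ρ.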
Genius Revisited documents the longitudinal results of a high-IQ/gifted-and-talented elementary school, Hunter College Elementary School (HCES); one of the most striking results is the general high education & income levels, but absence of great accomplishment on a national or global scale (eg a Nobel prize). The authors suggest that this may reflect harmful educational practices at their elementary school or the low predictive value of IQ.
I suggest that there is no puzzle to this absence nor anything for HCES to be blamed for, as the absence is fully explainable by their making two statistical errors: base-rate neglect, and regression to the mean.
First, their standards fall prey to a base-rate fallacy: even an extreme predictive value of IQ would not predict 1 or more Nobel prizes, because Nobel prize odds are measured at 1 in millions, and with a small total sample size of a few hundred, it is highly likely that there would simply be no Nobels.
Secondly, and more seriously, the lack of accomplishment is inherent and unavoidable as it is driven by the regression to the mean caused by the relatively low correlation of early childhood with adult IQs—which means their sample is far less elite as adults than they believe. Using early-childhood/adult IQ correlations, regression to the mean implies that HCES students will fall from a mean of 157 IQ in kindergarten (when selected) to somewhere around 133 as adults (and possibly lower). Further demonstrating the role of regression to the mean, in contrast, HCES’s associated high-IQ/gifted-and-talented high school, Hunter High, which has access to the adolescents’ more predictive IQ scores, has much higher achievement in proportion to its lesser regression to the mean (despite dilution by Hunter elementary students being grandfathered in).
This unavoidable statistical fact undermines the main rationale of HCES: extremely high-IQ adults cannot be accurately selected as kindergartners on the basis of a simple test. This greater-regression problem can be lessened by the use of additional variables in admissions, such as parental IQs or high-quality genetic polygenic scores; unfortunately, these are either politically unacceptable or dependent on future scientific advances. This suggests that such elementary schools may not be a good use of resources and HCES students should not be assigned scarce magnet high school slots.
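Both statistical errors reduce to one-line calculations. The 0.58 childhood–adult IQ correlation below is an assumed value consistent with the 157 → 133 figure, and the Nobel odds and sample size are round illustrative numbers:

```python
# Regression to the mean: expected adult IQ given childhood IQ, for an
# assumed childhood-adult IQ correlation r ~ 0.58.
r, child_iq = 0.58, 157
adult_iq = 100 + r * (child_iq - 100)
print(f"expected adult IQ ~ {adult_iq:.0f}")   # ~133

# Base-rate neglect: chance of >=1 Nobel among n students,
# at an assumed p per-lifetime probability.
n, p = 600, 1e-6
p_any = 1 - (1 - p) ** n
print(f"P(>=1 Nobel) ~ {p_any:.4f}")
```

Even granting a p a hundred times higher, the expected number of laureates in a sample this size is well below one, so "zero Nobels" is the unremarkable default outcome.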
A previous genome-wide association study (GWAS) of more than 100,000 individuals identified molecular-genetic predictors of educational attainment. We undertook in-depth life-course investigation of the polygenic score derived from this GWAS using the four-decade Dunedin Study (N = 918). There were five main findings. First, polygenic scores predicted adult economic outcomes even after accounting for educational attainments. Second, genes and environments were correlated: Children with higher polygenic scores were born into better-off homes. Third, children’s polygenic scores predicted their adult outcomes even when analyses accounted for their social-class origins; social-mobility analysis showed that children with higher polygenic scores were more upwardly mobile than children with lower scores. Fourth, polygenic scores predicted behavior across the life course, from early acquisition of speech and reading skills through geographic mobility and mate choice and on to financial planning for retirement. Fifth, polygenic-score associations were mediated by psychological characteristics, including intelligence, self-control, and interpersonal skill. Effect sizes were small. Factors connecting GWAS sequence with life outcomes may provide targets for interventions to promote population-wide positive development. [Keywords: genetics, behavior genetics, intelligence, personality, adult development]
People’s differences in cognitive functions are partly heritable and are associated with important life outcomes. Previous genome-wide association (GWA) studies of cognitive functions have found evidence for polygenic effects yet, to date, there are few replicated genetic associations. Here we use data from the UK Biobank sample to investigate the genetic contributions to variation in tests of three cognitive functions and in educational attainment. GWA analyses were performed for verbal-numerical reasoning (n = 36,035), memory (n = 112,067), reaction time (n = 111,483) and for the attainment of a college or a university degree (n = 111,114). We report genome-wide significant single-nucleotide polymorphism (SNP)-based associations in 20 genomic regions, and significant gene-based findings in 46 regions. These include findings in the ATXN2, CYP2D7, APBA1 and CADM2 genes. We report replication of these hits in published GWA studies of cognitive function, educational attainment and childhood intelligence. There is also replication, in UK Biobank, of SNP hits reported previously in GWA studies of educational attainment and cognitive function. GCTA-GREML analyses, using common SNPs (minor allele frequency > 0.01), indicated significant SNP-based heritabilities of 31% (s.e.m. = 1.8%) for verbal-numerical reasoning, 5% (s.e.m. = 0.6%) for memory, 11% (s.e.m. = 0.6%) for reaction time and 21% (s.e.m. = 0.6%) for educational attainment. Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort. The genomic regions identified include several novel loci, some of which have been associated with intracranial volume, neurodegeneration, Alzheimer’s disease and schizophrenia.
“Genome-wide association study identifies 74 loci associated with educational attainment”, Aysu Okbay, Jonathan P. Beauchamp, Mark Alan Fontana, James J. Lee, Tune H. Pers, Cornelius A. Rietveld, Patrick Turley, Guo-Bo Chen, Valur Emilsson, S. Fleur W. Meddens, Sven Oskarsson, Joseph K. Pickrell, Kevin Thom, Pascal Timshel, Ronald de Vlaming, Abdel Abdellaoui, Tarunveer S. Ahluwalia, Jonas Bacelis, Clemens Baumbach, Gyda Bjornsdottir, Johannes H. Brandsma, Maria Pina Concas, Jaime Derringer, Nicholas A. Furlotte, Tessel E. Galesloot, Giorgia Girotto, Richa Gupta, Leanne M. Hall, Sarah E. Harris, Edith Hofer, Momoko Horikoshi, Jennifer E. Huffman, Kadri Kaasik, Ioanna P. Kalafati, Robert Karlsson, Augustine Kong, Jari Lahti, Sven J. van der Lee, Christiaan de Leeuw, Penelope A. Lind, Karl-Oskar Lindgren, Tian Liu, Massimo Mangino, Jonathan Marten, Evelin Mihailov, Michael B. Miller, Peter J. van der Most, Christopher Oldmeadow, Antony Payton, Natalia Pervjakova, Wouter J. Peyrot, Yong Qian, Olli Raitakari, Rico Rueedi, Erika Salvi, Börge Schmidt, Katharina E. Schraut, Jianxin Shi, Albert V. Smith, Raymond A. Poot, Beate St Pourcain, Alexander Teumer, Gudmar Thorleifsson, Niek Verweij, Dragana Vuckovic, Juergen Wellmann, Harm-Jan Westra, Jingyun Yang, Wei Zhao, Zhihong Zhu, Behrooz Z. Alizadeh, Najaf Amin, Andrew Bakshi, Sebastian E. Baumeister, Ginevra Biino, Klaus Bønnelykke, Patricia A. Boyle, Harry Campbell, Francesco P. Cappuccio, Gail Davies, Jan-Emmanuel De Neve, Panos Deloukas, Ilja Demuth, Jun Ding, Peter Eibich, Lewin Eisele, Niina Eklund, David M. Evans, Jessica D. Faul, Mary F. Feitosa, Andreas J. Forstner, Ilaria Gandin, Bjarni Gunnarsson, Bjarni V. Halldórsson, Tamara B. Harris, Andrew C. Heath, Lynne J. Hocking, Elizabeth G. Holliday, Georg Homuth, Michael A. Horan, Jouke-Jan Hottenga, Philip L. de Jager, Peter K. Joshi, Astanand Jugessur, Marika A. Kaakinen, Mika Kähönen, Stavroula Kanoni, Liisa Keltigangas-Järvinen, Lambertus A. L. M. 
Kiemeney, Ivana Kolcic, Seppo Koskinen, Aldi T. Kraja, Martin Kroh, Zoltan Kutalik, Antti Latvala, Lenore J. Launer, Maël P. Lebreton, Douglas F. Levinson, Paul Lichtenstein, Peter Lichtner, David C. M. Liewald, LifeLines Cohort Study, Anu Loukola, Pamela A. Madden, Reedik Mägi, Tomi Mäki-Opas, Riccardo E. Marioni, Pedro Marques-Vidal, Gerardus A. Meddens, George McMahon, Christa Meisinger, Thomas Meitinger, Yusplitri Milaneschi, Lili Milani, Grant W. Montgomery, Ronny Myhre, Christopher P. Nelson, Dale R. Nyholt, William E. R. Ollier, Aarno Palotie, Lavinia Paternoster, Nancy L. Pedersen, Katja E. Petrovic, David J. Porteous, Katri Räikkönen, Susan M. Ring, Antonietta Robino, Olga Rostapshova, Igor Rudan, Aldo Rustichini, Veikko Salomaa, Alan R. Sanders, Antti-Pekka Sarin, Helena Schmidt, Rodney J. Scott, Blair H. Smith, Jennifer A. Smith, Jan A. Staessen, Elisabeth Steinhagen-Thiessen, Konstantin Strauch, Antonio Terracciano, Martin D. Tobin, Sheila Ulivi, Simona Vaccargiu, Lydia Quaye, Frank J. A. van Rooij, Cristina Venturini, Anna A. E. Vinkhuyzen, Uwe Völker, Henry Völzke, Judith M. Vonk, Diego Vozzi, Johannes Waage, Erin B. Ware, Gonneke Willemsen, John R. Attia, David A. Bennett, Klaus Berger, Lars Bertram, Hans Bisgaard, Dorret I. Boomsma, Ingrid B. Borecki, Ute Bültmann, Christopher F. Chabris, Francesco Cucca, Daniele Cusi, Ian J. Deary, George V. Dedoussis, Cornelia M. van Duijn, Johan G. Eriksson, Barbara Franke, Lude Franke, Paolo Gasparini, Pablo V. Gejman, Christian Gieger, Hans-Jörgen Grabe, Jacob Gratten, Patrick J. F. Groenen, Vilmundur Gudnason, Pim van der Harst, Caroline Hayward, David A. Hinds, Wolfgang Hoffmann, Elina Hyppönen, William G. Iacono, Bo Jacobsson, Marjo-Riitta Järvelin, Karl-Heinz Jöckel, Jaakko Kaprio, Sharon L. R. Kardia, Terho Lehtimäki, Steven F. Lehrer, Patrik K. E. Magnusson, Nicholas G. Martin, Matt McGue, Andres Metspalu, Neil Pendleton, Brenda W. J. H. 
Penninx, Markus Perola, Nicola Pirastu, Mario Pirastu, Ozren Polasek, Danielle Posthuma, Christine Power, Michael A. Province, Nilesh J. Samani, David Schlessinger, Reinhold Schmidt, Thorkild I. A. Sørensen, Tim D. Spector, Kari Stefansson, Unnur Thorsteinsdottir, A. Roy Thurik, Nicholas J. Timpson, Henning Tiemeier, Joyce Y. Tung, André G. Uitterlinden, Veronique Vitart, Peter Vollenweider, David R. Weir, James F. Wilson, Alan F. Wright, Dalton C. Conley, Robert F. Krueger, George Davey Smith, Albert Hofman, David I. Laibson, Sarah E. Medland, Michelle N. Meyer, Jian Yang, Magnus Johannesson, Tõnu Esko, Peter M. Visscher, Philipp D. Koellinger, David Cesarini, Daniel J. Benjamin (2016-05-11):
Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.
Individuals with lower socio-economic status (SES) are at increased risk of physical and mental illnesses and tend to die at an earlier age. Explanations for the association between SES and health typically focus on factors that are environmental in origin. However, common single nucleotide polymorphisms (SNPs) have been found collectively to explain around 18% (SE = 5%) of the phenotypic variance of an area-based social deprivation measure of SES. Molecular genetic studies have also shown that physical and psychiatric diseases are at least partly heritable. It is possible, therefore, that phenotypic associations between SES and health arise partly due to a shared genetic etiology. We conducted a genome-wide association study (GWAS) on social deprivation and on household income using the 112,151 participants of UK Biobank. We find that common SNPs explain 21% (SE = 0.5%) of the variation in social deprivation and 11% (SE = 0.7%) in household income. Two independent SNPs attained genome-wide significance for household income: rs187848990 on chromosome 2, and rs8100891 on chromosome 19. Genes in the regions of these SNPs have been associated with intellectual disabilities, schizophrenia, and synaptic plasticity. Extensive genetic correlations were found between both measures of socioeconomic status and illnesses, anthropometric variables, psychiatric disorders, and cognitive ability. These findings show that some SNPs associated with SES are involved in the brain and central nervous system. The genetic associations with SES are probably mediated via other partly-heritable variables, including cognitive ability, education, personality, and health.
One of the best predictors of children’s educational achievement is their family’s socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children’s educational achievement and its association with family SES.
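The core computation behind such a GWAS is simple to sketch: regress the phenotype on each SNP's allele dosage, one SNP at a time, and flag hits passing the genome-wide threshold of p < 5 × 10−8. A toy stdlib-Python illustration with simulated dosages and a normal approximation to the t-test (not any consortium's actual pipeline, which adds covariates, mixed models, and quality control):

```python
import math
import random

def gwas_scan(genotypes, phenotype):
    """Per-SNP association scan: simple linear regression of phenotype on
    allele dosage (0/1/2), returning (beta, z, p) for each SNP.
    genotypes: list of SNPs, each a list of dosages across individuals."""
    n = len(phenotype)
    ybar = sum(phenotype) / n
    results = []
    for snp in genotypes:
        xbar = sum(snp) / n
        sxx = sum((x - xbar) ** 2 for x in snp)
        sxy = sum((x - xbar) * (y - ybar) for x, y in zip(snp, phenotype))
        beta = sxy / sxx
        # residual sum of squares around the fitted line, then slope SE
        rss = sum((y - ybar - beta * (x - xbar)) ** 2
                  for x, y in zip(snp, phenotype))
        se = math.sqrt(rss / (n - 2) / sxx)
        z = beta / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
        results.append((beta, z, p))
    return results

random.seed(0)
n = 2000
causal = [random.choice([0, 1, 2]) for _ in range(n)]  # truly associated SNP
null = [random.choice([0, 1, 2]) for _ in range(n)]    # unassociated SNP
y = [0.5 * g + random.gauss(0, 1) for g in causal]
res = gwas_scan([causal, null], y)
# the causal SNP should pass 5e-8; the null SNP should not
```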
The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (GWAS: n = 370 cases, n = 5850 controls; replication sample: n = 173 cases, n = 3766 controls). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR) = 2.19 (1.53–3.14), p = 1.9 × 10−5). Two polymorphisms at the 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR = 1.59 (1.37–1.85), p = 1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β = 0.68, p = 0.012). Functional analysis in brain tissue in the open-access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced grey matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.
Higher paternal age at offspring conception increases de novo genetic mutations (Kong et al., 2012). Based on evolutionary genetic theory we predicted that the offspring of older fathers would be less likely to survive and reproduce, i.e. have lower fitness. In a sibling control study, we find clear support for negative paternal age effects on offspring survival, mating and reproductive success across four large populations with an aggregate n > 1.3 million in main analyses. Compared to a sibling born when the father was 10 years younger, individuals had 4-13% fewer surviving children in the four populations. Three populations were pre-industrial (1670-1850) Western populations and showed a pattern of paternal age effects across the offspring’s lifespan. In 20th-century Sweden, we found no negative paternal age effects on child survival or marriage odds. Effects survived tests for competing explanations, including maternal age and parental loss. To the extent that we succeeded in isolating a mutation-driven effect of paternal age, our results can be understood to show that de novo mutations reduce offspring fitness across populations and time. We can use this understanding to predict the effect of increasingly delayed reproduction on offspring genetic load, mortality and fertility.
Causes of the well-documented association between low levels of cognitive functioning and many adverse neuropsychiatric outcomes, poorer physical health and earlier death remain unknown. We used linkage disequilibrium regression and polygenic profile scoring to test for shared genetic aetiology between cognitive functions and neuropsychiatric disorders and physical health. Using information provided by many published genome-wide association study consortia, we created polygenic profile scores for 24 vascular–metabolic, neuropsychiatric, physiological–anthropometric and cognitive traits in the participants of UK Biobank, a very large population-based sample (n = 112 151). Pleiotropy between cognitive and health traits was quantified by deriving genetic correlations, using summary genome-wide association study statistics and the method of linkage disequilibrium score regression. Substantial and significant genetic correlations were observed between cognitive test scores in the UK Biobank sample and many of the mental and physical health-related traits and disorders assessed here. In addition, highly significant associations were observed between the cognitive test scores in the UK Biobank sample and many polygenic profile scores, including coronary artery disease, stroke, Alzheimer’s disease, schizophrenia, autism, major depressive disorder, body mass index, intracranial volume, infant head circumference and childhood cognitive ability. Where disease diagnosis was available for UK Biobank participants, we were able to show that these results were not confounded by those who had the relevant disease. These findings indicate that a substantial level of pleiotropy exists between cognitive abilities and many human mental and physical health disorders and traits and that it can be used to predict phenotypic variance across samples.
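Polygenic profile scoring itself is just a weighted sum: each person's score adds up their risk-allele dosages weighted by the discovery-GWAS effect sizes, optionally restricted to SNPs passing a p-value threshold in the discovery sample. A minimal sketch with hypothetical toy numbers:

```python
def polygenic_score(dosages, weights, pvals=None, p_threshold=1.0):
    """Weighted sum of risk-allele dosages per person; optionally restrict
    to SNPs passing a discovery-GWAS p-value threshold (profile scoring).
    dosages: one list of 0/1/2 allele counts per person."""
    snps = range(len(weights))
    if pvals is not None:
        snps = [j for j in snps if pvals[j] <= p_threshold]
    return [sum(weights[j] * person[j] for j in snps) for person in dosages]

people = [[0, 1, 2], [2, 2, 0]]      # two people, three SNPs (toy data)
betas = [0.1, -0.2, 0.05]            # discovery-GWAS effect sizes (toy)
scores = polygenic_score(people, betas)
# restricting to genome-wide significant SNPs only:
strict = polygenic_score(people, betas, pvals=[1e-9, 0.2, 0.9],
                         p_threshold=0.05)
```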
“An Atlas of Genetic Correlations across Human Diseases and Traits”, Brendan Bulik-Sullivan, Hilary K. Finucane, Verneri Anttila, Alexander Gusev, Felix R. Day, ReproGen Consortium, Psychiatric Genomics Consortium, Genetic Consortium for Anorexia Nervosa of the Wellcome Trust Case Control Consortium 3, Laramie Duncan, John R. B. Perry, Nick Patterson, Elise B. Robinson, Mark J. Daly, Alkes L. Price, Benjamin M. Neale (2015-04-06):
Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use our method to estimate 300 genetic correlations among 25 traits, totaling more than 1.5 million unique phenotype measurements. Our results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity and associations between educational attainment and several diseases. These results highlight the power of genome-wide analyses, since there currently are no genome-wide significant SNPs for anorexia nervosa and only three for educational attainment.
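The estimator's key trick can be sketched in a few lines: regress the per-SNP product of z-scores from the two traits on LD scores; the slope recovers the genetic covariance, while the intercept absorbs the bias from sample overlap. A toy simulation under stated assumptions (the z-score products are simulated directly and fed in with a dummy second z-vector, rather than derived from real GWAS data):

```python
import math
import random

def ols(x, y):
    """Slope and intercept of a simple least-squares fit."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar

def cross_trait_ldsc(z1, z2, ld_scores, n1, n2, m_snps):
    """Cross-trait LD score regression: E[z1j*z2j] is linear in the LD score
    l_j with slope sqrt(n1*n2)*rho_g/M; the intercept soaks up sample
    overlap, which is why overlapping cohorts do not bias the estimate."""
    slope, intercept = ols(ld_scores, [a * b for a, b in zip(z1, z2)])
    genetic_cov = slope * m_snps / math.sqrt(n1 * n2)
    return genetic_cov, intercept

# simulate z-score products under the model with known genetic covariance 0.4
random.seed(1)
M, n1, n2, rho_g = 5000, 50000, 50000, 0.4
ld = [random.uniform(1.0, 200.0) for _ in range(M)]
z_products = [math.sqrt(n1 * n2) * rho_g * l / M + random.gauss(0.0, 1.0)
              for l in ld]
rho_hat, intercept = cross_trait_ldsc(z_products, [1.0] * M, ld, n1, n2, M)
```

Dividing the estimated genetic covariance by the square root of the product of the two traits' heritabilities (estimated the same way from each trait's own z² statistics) gives the genetic correlation.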
Heritability estimation provides important information about the relative contribution of genetic and environmental factors to phenotypic variation, and provides an upper bound for the utility of genetic risk prediction models. Recent technological and statistical advances have enabled the estimation of additive heritability attributable to common genetic variants (SNP heritability) across a broad phenotypic spectrum. However, assessing the comparative heritability of multiple traits estimated in different cohorts may be misleading due to the population-specific nature of heritability. Here we report the SNP heritability for 551 complex traits derived from the large-scale, population-based UK Biobank, comprising both quantitative phenotypes and disease codes, and examine the moderating effect of three major demographic variables (age, sex and socioeconomic status) on the heritability estimates. Our study represents the first comprehensive phenome-wide heritability analysis in the UK Biobank, and underscores the importance of considering population characteristics in comparing and interpreting heritability.
“Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs”, S. Hong Lee, Stephan Ripke, Benjamin M. Neale, Stephen V. Faraone, Shaun M. Purcell, et al. (Cross-Disorder Group of the Psychiatric Genomics Consortium) (2013):
Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17-29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn's disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders.
Disorders of the brain can exhibit considerable epidemiological comorbidity and often share symptoms, provoking debate about their etiologic overlap. We quantified the genetic sharing of 25 brain disorders from genome-wide association studies of 265,218 patients and 784,643 control participants and assessed their relationship to 17 phenotypes from 1,191,588 individuals. Psychiatric disorders share common variant risk, whereas neurological disorders appear more distinct from one another and from the psychiatric disorders. We also identified significant sharing between disorders and a number of brain phenotypes, including cognitive measures. Further, we conducted simulations to explore how statistical power, diagnostic misclassification, and phenotypic heterogeneity affect genetic correlations. These results highlight the importance of common genetic variation as a risk factor for brain disorders and the value of heritability-based methods in understanding their etiology.
“Physical and neurobehavioral determinants of reproductive onset and success”, Felix R. Day, Hannes Helgason, Daniel I. Chasman, Lynda M. Rose, Po-Ru Loh, Robert A. Scott, Agnar Helgason, Augustine Kong, Gisli Masson, Olafur Th Magnusson, Daniel Gudbjartsson, Unnur Thorsteinsdottir, Julie E. Buring, Paul M. Ridker, Patrick Sulem, Kari Stefansson, Ken K. Ong & John R. B. Perry (2016-04-18):
The ages of puberty, first sexual intercourse and first birth signify the onset of reproductive ability, behavior and success, respectively. In a genome-wide association study of 125,667 UK Biobank participants, we identify 38 loci associated (p < 5 × 10−8) with age at first sexual intercourse. These findings were taken forward in 241,910 men and women from Iceland and 20,187 women from the Women’s Genome Health Study. Several of the identified loci also exhibit associations (p < 5 × 10−8) with other reproductive and behavioral traits, including age at first birth (variants in or near ESR1 and RBM6–SEMA3F), number of children (CADM2 and ESR1), irritable temperament (MSRA) and risk-taking propensity (CADM2). Mendelian randomization analyses infer causal influences of earlier puberty timing on earlier first sexual intercourse, earlier first birth and lower educational attainment. In turn, likely causal consequences of earlier first sexual intercourse include reproductive, educational, psychiatric and cardiometabolic outcomes.
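The Mendelian randomization step rests on the Wald ratio: a variant's effect on the outcome divided by its effect on the exposure estimates the causal effect, under the usual instrumental-variable assumptions, and per-variant ratios are combined by inverse-variance weighting. A minimal sketch with made-up effect sizes (not the paper's data):

```python
def wald_ratio(beta_exposure, beta_outcome):
    """Single-variant Mendelian-randomization estimate: the variant's effect
    on the outcome divided by its effect on the exposure."""
    return beta_outcome / beta_exposure

def ivw_estimate(betas_exp, betas_out, se_out):
    """Inverse-variance-weighted combination of per-variant Wald ratios,
    weighting each variant by (beta_exposure / se_outcome)^2."""
    ratios = [wald_ratio(bx, by) for bx, by in zip(betas_exp, betas_out)]
    weights = [(bx / se) ** 2 for bx, se in zip(betas_exp, se_out)]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)

# toy: every variant implies the same causal effect of 0.5
est = ivw_estimate([0.1, 0.2, 0.4], [0.05, 0.10, 0.20], [0.01, 0.01, 0.01])
```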
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously.
Results: In this manuscript, we describe LD Hub – a centralized database of summary-level GWAS results for 177 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
Availability and implementation: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
Educated people are generally healthier, have fewer comorbidities and live longer than people with less education. Previous evidence about the effects of education comes from observational studies, many of which are affected by residual confounding. Legal changes to the minimum school-leaving age are a potential natural experiment which provides a more robust source of evidence about the effects of schooling. Previous studies have exploited this natural experiment using population-level administrative data and relatively small surveys to investigate the effect on mortality. Here, we add to the evidence using data from a large sample from the UK Biobank. We exploit the raising of the school-leaving age in the UK in September 1972 as a natural experiment, and regression discontinuity and instrumental variable estimators, to identify the causal effects of staying on in school. Remaining in school was positively associated with 23 of 25 outcomes. After accounting for multiple hypothesis testing, we found evidence of causal effects on twelve outcomes; however, the associations of schooling with intelligence, smoking, and alcohol consumption may be due to genomic and socioeconomic confounding factors. Education affects some, but not all, health and socioeconomic outcomes. Differences between educated and less educated people may be partially due to residual genetic and socioeconomic confounding.
On average, people who choose to stay in education for longer are healthier, wealthier, and live longer. We investigated the causal effects of education on health, income, and well-being later in life. This is the largest study of its kind to date, and it has objective clinical measures of morbidity and aging. We found evidence that people who were forced to remain in school had higher wages and lower mortality. However, there was little evidence of an effect on intelligence later in life. Furthermore, estimates of the effects of education using conventionally adjusted regression analysis are likely to suffer from genomic confounding. In conclusion, education affects some, but not all, health outcomes later in life.
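The regression-discontinuity idea used here can be sketched as fitting a local linear regression on each side of the school-leaving-age cutoff and taking the difference of the two fitted values at the cutoff. A toy simulation (synthetic birth-cohort data with an assumed jump of 0.5, not UK Biobank data):

```python
import random

def rd_estimate(running, outcome, cutoff, bandwidth):
    """Sharp regression-discontinuity estimate: fit a local linear regression
    within the bandwidth on each side of the cutoff and return the difference
    of the two fitted values at the cutoff itself."""
    def fitted_value_at_cutoff(keep_side):
        pts = [(r - cutoff, y) for r, y in zip(running, outcome)
               if keep_side(r) and abs(r - cutoff) <= bandwidth]
        xs, ys = [p[0] for p in pts], [p[1] for p in pts]
        n = len(xs)
        xbar, ybar = sum(xs) / n, sum(ys) / n
        sxx = sum((x - xbar) ** 2 for x in xs)
        sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        slope = sxy / sxx
        return ybar - slope * xbar  # intercept = prediction at the cutoff
    return (fitted_value_at_cutoff(lambda r: r >= cutoff)
            - fitted_value_at_cutoff(lambda r: r < cutoff))

# toy data: smooth trend in birth-cohort index plus a jump of 0.5 at the cutoff
random.seed(2)
months = list(range(-300, 300))
outcome = [0.001 * m + 0.5 * (m >= 0) + random.gauss(0.0, 0.2) for m in months]
effect = rd_estimate(months, outcome, cutoff=0, bandwidth=120)
```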
The statistical code used to produce these results can be accessed here: (https://github.com/nmdavies/UKbiobankROSLA). The final analysis dataset used in this study is archived with UK Biobank, which can be accessed by contacting UK Biobank.
It is possible that heritable variance in personality characteristics does not reflect (only) genetic and biological processes specific to personality per se. We tested the possibility that Five-Factor Model personality domains and facets, as rated by people themselves and their knowledgeable informants, reflect polygenic influences that have been previously associated with educational attainment. In a sample of over 3,000 adult Estonians, polygenic scores for educational attainment, based on small contributions from more than 150,000 genetic variants, were correlated with various personality traits, mostly from the Neuroticism and Openness domains. The correlations of personality characteristics with educational attainment-related polygenic influences reflected almost entirely their correlations with phenotypic educational attainment. Structural equation modeling of the associations between polygenic risk, personality (a weighted aggregate of education-related facets) and educational attainment lent relatively strongest support to the possibility of educational attainment mediating (explaining) some of the heritable variance in personality traits.
Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
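The intuition behind the singleton density score can be illustrated with a toy contrast: haplotypes carrying a recently favored allele are younger, have had less time to accumulate private mutations, and so show fewer nearby singletons, which appears as a larger mean distance to the nearest singleton. A deliberately simplified sketch with hypothetical positions (the real SDS models tip-branch lengths and uses maximum likelihood, which this ignores):

```python
def singleton_density_contrast(test_pos, carrier_singletons, noncarrier_singletons):
    """Toy singleton-density contrast around a test SNP: compare the mean
    distance to each haplotype's nearest singleton between carriers and
    non-carriers of the allele. A positive contrast (carriers' singletons
    farther away) is the signature of recent selection favoring the allele."""
    def mean_nearest(haplotype_singleton_sets):
        dists = [min(abs(s - test_pos) for s in sng)
                 for sng in haplotype_singleton_sets]
        return sum(dists) / len(dists)
    return mean_nearest(carrier_singletons) - mean_nearest(noncarrier_singletons)

# carriers of the (hypothetically selected) allele: nearest singletons far away
carriers = [[5000, -7000], [6500], [-4800, 9000]]
# non-carriers: singletons close by (older, mutation-laden haplotypes)
noncarriers = [[300, -2000], [-450], [800, 1200]]
contrast = singleton_density_contrast(0, carriers, noncarriers)
# positive contrast -> consistent with recent selection for the allele
```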
Analyzing genetic differences between closely related populations can be a powerful way to detect recent adaptation. The very large sample size of the UK Biobank is ideal for detecting selection using population differentiation, and enables an analysis of UK population structure at fine resolution. In analyses of 113,851 UK Biobank samples, population structure in the UK is dominated by 5 principal components (PCs) spanning 6 clusters: Northern Ireland, Scotland, northern England, southern England, and two Welsh clusters. Analyses with ancient Eurasians show that populations in the northern UK have higher levels of Steppe ancestry, and that UK population structure cannot be explained as a simple mixture of Celts and Saxons. A scan for unusual population differentiation along top PCs identified a genome-wide significant signal of selection at the coding variant rs601338 in FUT2 (p = 9.16 × 10−9). In addition, by combining evidence of unusual differentiation within the UK with evidence from ancient Eurasians, we identified new genome-wide significant (p < 5 × 10−8) signals of recent selection at two additional loci: CYP1A2/CSK and F12. We detected strong associations to diastolic blood pressure in the UK Biobank for the variants with new selection signals at CYP1A2/CSK (p = 1.10 × 10−19) and for variants with ancient Eurasian selection signals in the ATXN2/SH2B3 locus (p = 8.00 × 10−33), implicating recent adaptation related to blood pressure.
Recent findings from molecular genetics now make it possible to test directly for natural selection by analyzing whether genetic variants associated with various phenotypes have been under selection. I leverage these findings to construct polygenic scores that use individuals’ genotypes to predict their body mass index, educational attainment (EA), glucose concentration, height, schizophrenia, total cholesterol, and (in females) age at menarche. I then examine associations between these scores and fitness to test whether natural selection has been occurring. My study sample includes individuals of European ancestry born between 1931 and 1953 in the Health and Retirement Study, a representative study of the US population. My results imply that natural selection has been slowly favoring lower EA in both females and males, and are suggestive that natural selection may have favored a higher age at menarche in females. For EA, my estimates imply a rate of selection of about -1.5 months of education per generation (which pales in comparison with the increases in EA observed in contemporary times). Though they cannot be projected over more than one generation, my results provide additional evidence that humans are still evolving—albeit slowly, especially when compared to the rapid secular changes that have occurred over the past few generations due to cultural and environmental factors.
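The underlying estimator is essentially the Robertson-Price identity: the expected per-generation change in a heritable score equals its covariance with relative fitness (offspring count divided by the mean offspring count). A minimal sketch with made-up numbers:

```python
def expected_score_change(scores, offspring_counts):
    """Robertson-Price estimate: expected per-generation change in a heritable
    score = covariance between the score and relative fitness
    (offspring count / mean offspring count)."""
    n = len(scores)
    wbar = sum(offspring_counts) / n
    rel_fitness = [w / wbar for w in offspring_counts]
    sbar = sum(scores) / n
    rbar = sum(rel_fitness) / n  # equals 1 by construction
    return sum((s - sbar) * (r - rbar)
               for s, r in zip(scores, rel_fitness)) / n

# toy: individuals with higher EA polygenic scores have slightly fewer children
scores = [-1.0, -0.5, 0.0, 0.5, 1.0]
kids = [3, 2, 2, 2, 1]
delta = expected_score_change(scores, kids)
# negative delta -> selection is slowly pushing the score downward
```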
We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no change in the model architecture from our base system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. The rest of the model, which includes encoder, decoder and attention, remains unchanged and is shared across all languages. Using a shared wordpiece vocabulary, our approach enables Multilingual NMT using a single model without any increase in parameters, which is significantly simpler than previous proposals for Multilingual NMT. Our method often improves the translation quality of all involved language pairs, even while keeping the total number of model parameters constant. On the WMT’14 benchmarks, a single multilingual model achieves comparable performance for English→French and surpasses state-of-the-art results for English→German. Similarly, a single multilingual model surpasses state-of-the-art results for French→English and German→English on WMT’14 and WMT’15 benchmarks respectively. On production corpora, multilingual models of up to twelve language pairs allow for better translation of many individual pairs. In addition to improving the translation quality of language pairs that the model was trained with, our models can also learn to perform implicit bridging between language pairs never seen explicitly during training, showing that transfer learning and zero-shot translation are possible for neural translation. Finally, we show analyses that hint at a universal interlingua representation in our models, and show some interesting examples when mixing languages.
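The mechanism is almost entirely preprocessing: prepend an artificial target-language token to the source sentence and leave the model itself untouched. A sketch (the `<2es>`-style token format is modeled on the paper's examples):

```python
def add_language_token(source_sentence, target_lang):
    """Multilingual-NMT preprocessing: prepend an artificial token telling the
    shared encoder-decoder which language to translate into; the model
    architecture itself is unchanged."""
    return f"<2{target_lang}> {source_sentence}"

pairs = [("How are you?", "es"), ("Hello", "ja")]
batch = [add_language_token(src, tgt) for src, tgt in pairs]
```

Because the token is just another vocabulary item, the same trick lets one request a target language the model never saw paired with that source language during training, which is what makes zero-shot translation possible.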
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows — limiting their applicability to real-world domains. Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs 1,000× faster and with 3,000× less physical memory than non-sparse models. SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories. As well, we show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
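The core of a sparse read can be sketched as content-based attention restricted to the top-k most similar memory slots, so the read cost scales with k rather than with the memory size. A toy stdlib-Python version (ignoring SAM's approximate nearest-neighbor index, sparse write machinery, and gradient sparsity):

```python
import math

def sparse_read(memory, query, k=2):
    """Sparse content-based read in the spirit of SAM: attend only over the
    top-k memory slots most similar to the query (dot-product similarity),
    returning a convex combination of just those k rows."""
    sims = [sum(q * m for q, m in zip(query, row)) for row in memory]
    top = sorted(range(len(memory)), key=lambda i: sims[i], reverse=True)[:k]
    exps = [math.exp(sims[i]) for i in top]         # softmax over top-k only
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(memory[0])
    return [sum(w * memory[i][d] for w, i in zip(weights, top))
            for d in range(dim)]

mem = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, 0.0]]
r = sparse_read(mem, query=[1.0, 0.0], k=2)
# the read vector mixes only the two slots most similar to the query
```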
“Progressive Neural Networks”, Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, Raia Hadsell (2016-06-15):
Learning to solve complex sequences of tasks, while both leveraging transfer and avoiding catastrophic forgetting, remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
Applying end-to-end learning to solve complex, interactive, pixel-driven control tasks on a robot is an unsolved problem. Deep Reinforcement Learning algorithms are too slow to achieve performance on a real robot, but their potential has been demonstrated in simulated environments. We propose using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world. The progressive net approach is a general framework that enables reuse of everything from low-level visual features to high-level policies for transfer to new tasks, enabling a compositional, yet simple, approach to building complex skills. We present an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap. Unlike other proposed approaches, our real-world experiments demonstrate successful task learning from raw visual input on a fully actuated robot manipulator. Moreover, rather than relying on model-based trajectory optimisation, the task learning is accomplished using only deep reinforcement learning and sparse rewards.
“Overcoming catastrophic forgetting in neural networks”, James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell (2016-12-02):
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST hand written digit dataset and by learning several Atari 2600 games sequentially.
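The core of the approach reduces to one extra loss term; here is a minimal sketch (diagonal Fisher information, invented numbers), pulling each weight toward its post-old-task value in proportion to how much the old task's performance depends on it:

```python
def ewc_loss(task_loss, params, star_params, fisher, lam=1000.0):
    """Elastic-weight-consolidation-style penalty (sketch of the idea).

    Adds a quadratic penalty anchoring each parameter to its value after
    the old task, weighted by that parameter's (diagonal) Fisher
    information -- i.e., "selectively slowing down learning on the
    weights important for those tasks". `lam` trades off retention of
    the old task against progress on the new one.
    """
    penalty = sum(f * (p - s) ** 2
                  for f, p, s in zip(fisher, params, star_params))
    return task_loss + (lam / 2.0) * penalty
```

When the parameters sit exactly at their old-task optimum, the penalty vanishes and only the new task's loss remains.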
The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed "Actor-Mimic", exploits the use of deep reinforcement learning and model compression techniques to train a single policy network that learns how to act in a set of distinct tasks by using the guidance of several expert teachers. We then show that the representations learnt by the deep policy network are capable of generalizing to new tasks with no prior expert guidance, speeding up learning in novel environments. Although our method can in general be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate these methods.
Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating error signal, to produce weight updates. All layers, or more generally, modules, of the network are therefore locked, in the sense that they must wait for the remainder of the network to execute forwards and propagate error backwards before they can be updated. In this work we break this constraint by decoupling modules by introducing a model of the future computation of the network graph. These models predict what the result of the modelled subgraph will produce using only local information. In particular we focus on modelling error gradients: by using the modelled synthetic gradient in place of true backpropagated error gradients we decouple subgraphs, and can update them independently and asynchronously i.e. we realise decoupled neural interfaces. We show results for feed-forward models, where every layer is trained asynchronously, recurrent neural networks (RNNs) where predicting one’s future gradient extends the time over which the RNN can effectively model, and also a hierarchical RNN system with ticking at different timescales. Finally, we demonstrate that in addition to predicting gradients, the same framework can be used to predict inputs, resulting in models which are decoupled in both the forward and backwards pass – amounting to independent networks which co-learn such that they can be composed into a single functioning corporation.
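The idea can be demonstrated on a scalar toy (every number here is invented for illustration): the downstream loss is L = (w2*h − y)², so the true gradient a layer would normally wait for is dL/dh = 2*w2*(w2*h − y); a small "synthetic gradient" model learns to predict that signal from h alone, letting the upstream module update immediately instead of staying locked until the backward pass arrives:

```python
def true_grad(h, w2=3.0, y=1.5):
    """Gradient of the downstream loss L = (w2*h - y)**2 w.r.t. activation h."""
    return 2.0 * w2 * (w2 * h - y)

# Synthetic-gradient module: g(h) = a*h + b, trained by regression to
# mimic the true backpropagated gradient. Once trained, the producer of
# h can update from g(h) without waiting -- a decoupled neural interface.
a, b = 0.0, 0.0
lr = 0.05
for step in range(5000):
    h = (step % 10) / 10.0                 # sweep activations in [0, 1)
    err = (a * h + b) - true_grad(h)       # prediction error vs true gradient
    a -= lr * err * h
    b -= lr * err
```

Because the true gradient is exactly linear in h here, the regression is realizable and the synthetic gradient converges to the true one (a → 18, b → −9).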
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present an end-to-end differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and learn complex behaviours such as sequence sorting and addition. When connected to outputs of an LSTM and trained jointly, our interpreter achieves state-of-the-art accuracy for end-to-end reasoning about quantities expressed in natural language stories.
We develop a first line of attack for solving programming competition-style problems from input-output examples using deep learning. The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network’s predictions to augment search techniques from the programming languages community, including enumerative search and an SMT-based solver. Empirically, we show that our approach leads to an order of magnitude speedup over the strong non-augmented baselines and a Recurrent Neural Network approach, and that we are able to solve problems of difficulty comparable to the simplest problems on programming competition websites.
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcribers is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state of the art, and edges past the human benchmark, achieving error rates of 5.8% and 11.0%, respectively. The key to our system’s performance is the systematic use of convolutional and LSTM acoustic model architectures, combined with a novel spatial smoothing method and lattice-free MMI acoustic training, multiple recurrent neural network language modeling approaches, and a systematic use of system combination.
The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem—unconstrained natural language sentences, and in the wild videos.
Our key contributions are: (1) a ’Watch, Listen, Attend and Spell’ (WLAS) network that learns to transcribe videos of mouth motion to characters; (2) a curriculum learning strategy to accelerate training and to reduce overfitting; (3) a ’Lip Reading Sentences’ (LRS) dataset for visual speech recognition, consisting of over 100,000 natural sentences from British television.
The WLAS model trained on the LRS dataset surpasses the performance of all previous work on standard lip reading benchmark datasets, often by a significant margin. This lip reading performance beats a professional lip reader on videos from BBC television, and we also demonstrate that visual information helps to improve speech recognition performance even when the audio is available.
“Mastering the game of Go with deep neural networks and tree search”, David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis (2016-01-28):
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
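The selection step that ties the three pieces together can be sketched as a scoring rule (a common PUCT-style form, simplified from the paper's exact exploration bonus): each simulated playout descends the tree by picking the child with the best value estimate Q plus a bonus that is large for moves the policy network favours (high prior P) but shrinks with visit count N:

```python
import math

def puct_select(stats, priors, c_puct=1.0):
    """AlphaGo-style move selection (sketch): pick the child maximizing
    Q(s,a) + c * P(a) * sqrt(sum_b N(b)) / (1 + N(a)),
    balancing the value network's estimate Q, the policy network's
    prior P, and Monte Carlo visit counts N.
    stats: {action: (Q, N)}; priors: {action: P}.
    """
    total_n = sum(n for _, n in stats.values())
    def score(a):
        q, n = stats[a]
        return q + c_puct * priors[a] * math.sqrt(total_n) / (1 + n)
    return max(stats, key=score)
```

A move with a strong prior and few visits gets explored even if its current Q is lower; as its visit count grows, the bonus decays and Q dominates.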
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
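The controller's training signal can be illustrated on a one-step toy (all accuracies invented): a softmax distribution over three candidate architectures stands in for the RNN controller, and validation accuracy is the reward. For a deterministic, testable sketch we ascend the *expected* reward directly, rather than the sampled REINFORCE estimate the paper uses:

```python
import math

def softmax(logits):
    z = max(logits)
    exps = [math.exp(l - z) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

rewards = [0.89, 0.94, 0.91]     # hypothetical validation accuracies
logits = [0.0, 0.0, 0.0]         # "controller": softmax over architectures
lr = 10.0
for _ in range(1000):
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))
    # Exact policy gradient: d E[R] / d logit_i = p_i * (r_i - baseline)
    logits = [l + lr * p * (r - baseline)
              for l, p, r in zip(logits, probs, rewards)]
probs = softmax(logits)
best = max(range(3), key=lambda i: probs[i])
```

The controller's probability mass concentrates on the architecture with the highest validation accuracy; the real system estimates the same gradient from sampled architectures trained to convergence.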
At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks. We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ϵ-greedy exploration strategy and experience replay. The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task. On image classification benchmarks, the agent-designed networks (consisting of only standard convolution, pooling, and fully-connected layers) beat existing networks designed with the same layer types and are competitive against the state-of-the-art methods that use more complex layer types. We also outperform existing meta-modeling approaches for network design on image classification tasks.
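A toy version of the MetaQNN loop (layer names and accuracies invented): the agent builds a 2-layer design by sequentially choosing layers from a menu, receives the design's validation accuracy as terminal reward, and learns with tabular Q-learning under ε-greedy exploration, as in the paper but over a tiny space:

```python
import random

LAYERS = ["conv3", "conv5", "pool"]
ACCURACY = {("conv3", "pool"): 0.92, ("conv5", "pool"): 0.90,
            ("conv3", "conv5"): 0.85}           # hypothetical results

def reward(arch):
    return ACCURACY.get(tuple(arch), 0.5)       # default for other designs

random.seed(0)
Q = {}                                          # Q[(state, action)]
alpha, eps = 0.3, 0.2
for _ in range(5000):
    arch = []
    while len(arch) < 2:
        s = tuple(arch)
        if random.random() < eps:               # explore
            a = random.choice(LAYERS)
        else:                                   # exploit
            a = max(LAYERS, key=lambda x: Q.get((s, x), 0.0))
        arch.append(a)
        if len(arch) == 2:                      # terminal: observe accuracy
            target = reward(arch)
        else:                                   # bootstrap from next state
            target = max(Q.get((tuple(arch), x), 0.0) for x in LAYERS)
        q = Q.get((s, a), 0.0)
        Q[(s, a)] = q + alpha * (target - q)

# Greedy rollout after training recovers the best design.
best_arch = []
while len(best_arch) < 2:
    s = tuple(best_arch)
    best_arch.append(max(LAYERS, key=lambda x: Q.get((s, x), 0.0)))
```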
“Learning to reinforcement learn”, Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick (2016-11-17):
In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.
We introduce the value iteration network (VIN): a fully differentiable neural network with a ‘planning module’ embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained end-to-end using standard backpropagation. We evaluate VIN based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains.
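The planning computation a VIN unrolls is just value iteration, where each Bellman backup is a max over neighbouring cells; the paper implements that max-over-shifted-copies pattern as a convolution plus channel-wise max, written out explicitly here on a toy gridworld:

```python
def value_iteration(rows, cols, goal, obstacles=(), gamma=0.9, iters=50):
    """Value iteration on a deterministic gridworld (sketch): reward 1 at
    the goal, each backup V(s) = gamma * max over the four neighbours.
    This is the computation a VIN embeds as a differentiable module."""
    V = [[0.0] * cols for _ in range(rows)]
    for _ in range(iters):
        new = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal:
                    new[r][c] = 1.0            # absorbing reward at the goal
                    continue
                if (r, c) in obstacles:
                    continue                   # obstacles stay at value 0
                neigh = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                vals = [V[nr][nc] for nr, nc in neigh
                        if 0 <= nr < rows and 0 <= nc < cols]
                new[r][c] = gamma * max(vals) if vals else 0.0
        V = new
    return V
```

The greedy policy with respect to the resulting values walks the shortest path to the goal; in the VIN the backup weights are learned end-to-end rather than hand-specified.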
One of the key challenges of artificial intelligence is to learn models that are effective in the context of planning. In this document we introduce the predictron architecture. The predictron consists of a fully abstract model, represented by a Markov reward process, that can be rolled forward multiple "imagined" planning steps. Each forward pass of the predictron accumulates internal rewards and values over multiple planning depths. The predictron is trained end-to-end so as to make these accumulated values accurately approximate the true value function. We applied the predictron to procedurally generated random mazes and a simulator for the game of pool. The predictron yielded significantly more accurate predictions than conventional deep neural network architectures.
We consider an agent’s uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma’s Revenge.
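The pseudo-count itself is a single formula over two density evaluations; here is a sketch, plus a sanity check with the simplest possible density model (empirical frequencies), for which the formula provably recovers the true visit count:

```python
def pseudo_count(rho, rho_prime):
    """Pseudo-count derived from a density model (the paper's formula):
    rho       = model's probability of the state before observing it,
    rho_prime = probability after one more observation of it (the
                'recoding probability').
    """
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

# Sanity check: with an empirical-frequency density model, after seeing a
# state c times out of n total, rho = c/n and rho' = (c+1)/(n+1), and the
# pseudo-count recovers the true count c exactly.
c, n = 7, 100
rho, rho_prime = c / n, (c + 1) / (n + 1)
```

With a richer density model the pseudo-count generalizes across similar states; an intrinsic reward that decays with the pseudo-count then drives the agent toward rarely-visited regions.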
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards. We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task. Our agent significantly outperforms the previous state-of-the-art on Atari, averaging 880% expert human performance, and a challenging suite of first-person, three-dimensional Labyrinth tasks leading to a mean speedup in learning of 10× and averaging 87% expert human performance on Labyrinth.
Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquitous and instantaneous supervision for representation learning even in the absence of reward. While current results show that learning from reward alone is feasible, pure reinforcement learning methods are constrained by computational and data efficiency issues that can be remedied by auxiliary losses. Self-supervised pre-training and joint optimization improve the data efficiency and policy returns of end-to-end reinforcement learning.
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independently of camera calibration or the current robot pose. This requires the network to observe the spatial relationship between the gripper and objects in the scene, thus learning hand-eye coordination. We then use this network to servo the gripper in real time to achieve successful grasps. To train our network, we collected over 800,000 grasp attempts over the course of two months, using between 6 and 14 robotic manipulators at any given time, with differences in camera placement and hardware. Our experimental evaluation demonstrates that our method achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing.
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted to simulated settings and relatively simple tasks, due to their apparent high sample complexity. In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots. We demonstrate that the training times can be further reduced by parallelizing the algorithm across multiple robots which pool their policy updates asynchronously. Our experimental evaluation shows that our method can learn a variety of 3D manipulation skills in simulation and a complex door opening skill on real robots without any prior demonstrations or manually designed representations.
A striking contrast runs through the last 60 years of biopharmaceutical discovery, research, and development. Huge scientific and technological gains should have increased the quality of academic science and raised industrial R&D efficiency. However, academia faces a “reproducibility crisis”; inflation-adjusted industrial R&D costs per novel drug increased nearly 100× between 1950 and 2010; and drugs are more likely to fail in clinical development today than in the 1970s. The contrast is explicable only if powerful headwinds reversed the gains and/or if many “gains” have proved illusory. However, discussions of reproducibility and R&D productivity rarely address this point explicitly.
The main objectives of the primary research in this paper are: (a) to provide quantitatively and historically plausible explanations of the contrast; and (b) identify factors to which R&D efficiency is sensitive.
We present a quantitative decision-theoretic model of the R&D process [a ‘leaky pipeline’; cf the log-normal]. The model represents therapeutic candidates (e.g., putative drug targets, molecules in a screening library, etc.) within a “measurement space”, with candidates’ positions determined by their performance on a variety of assays (e.g., binding affinity, toxicity, in vivo efficacy, etc.) whose results correlate to a greater or lesser degree. We apply decision rules to segment the space, and assess the probability of correct R&D decisions.
We find that when searching for rare positives (e.g., candidates that will successfully complete clinical development), changes in the predictive validity of screening and disease models that many people working in drug discovery would regard as small and/or unknowable (e.g., a 0.1 absolute change in correlation coefficient between model output and clinical outcomes in man) can offset large (e.g., 10×, even 100×) changes in models’ brute-force efficiency. We also show how validity and reproducibility correlate across a population of simulated screening and disease models.
We hypothesize that screening and disease models with high predictive validity are more likely to yield good answers and good treatments, so tend to render themselves and their diseases academically and commercially redundant. Perhaps there has also been too much enthusiasm for reductionist molecular models which have insufficient predictive validity. Thus we hypothesize that the average predictive validity of the stock of academically and industrially “interesting” screening and disease models has declined over time, with even small falls able to offset large gains in scientific knowledge and brute-force efficiency. The rate of creation of valid screening and disease models may be the major constraint on R&D productivity.
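The "measurement space" argument can be reproduced in a small Monte Carlo sketch (all numbers illustrative, not the paper's): a screening model's score correlates with true clinical quality with correlation `validity`; we advance the top-scoring fraction of candidates and ask how many are true positives:

```python
import math, random

def hit_rate(validity, n=100_000, screen_top=0.01, true_top=0.01, seed=0):
    """Fraction of screen-selected candidates that are true positives,
    when model score and true quality are bivariate normal with the
    given correlation ('predictive validity')."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        quality = rng.gauss(0, 1)
        score = (validity * quality
                 + math.sqrt(1 - validity ** 2) * rng.gauss(0, 1))
        pairs.append((score, quality))
    q_cut = sorted(q for _, q in pairs)[int(n * (1 - true_top))]
    pairs.sort(reverse=True)                    # best scores first
    selected = pairs[:int(n * screen_top)]
    return sum(q > q_cut for _, q in selected) / len(selected)
```

Running this at validities 0.4, 0.5, and 0.6 shows the paper's point: when positives are rare, a 0.1 change in predictive validity shifts the hit rate among advanced candidates by far more than a proportional amount, which is why modest losses in model validity can swamp large gains in brute-force screening throughput.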
Context: There is substantial debate about whether the results of nonrandomized studies are consistent with the results of randomized controlled trials on the same topic.
Objectives: To compare results of randomized and nonrandomized studies that evaluated medical interventions and to examine characteristics that may explain discrepancies between randomized and nonrandomized studies.
Data Sources: MEDLINE (1966–March 2000), the Cochrane Library (Issue 3, 2000), and major journals were searched.
Study Selection: Forty-five diverse topics were identified for which both randomized trials (n = 240) and nonrandomized studies (n = 168) had been performed and had been considered in meta-analyses of binary outcomes.
Data Extraction: Data on events per patient in each study arm and design and characteristics of each study considered in each meta-analysis were extracted and synthesized separately for randomized and nonrandomized studies.
Data Synthesis: Very good correlation was observed between the summary odds ratios of randomized and nonrandomized studies (r = 0.75; p < 0.001); however, nonrandomized studies tended to show larger treatment effects (28 vs 11; p = 0.009). Between-study heterogeneity was frequent among randomized trials alone (23%) and very frequent among nonrandomized studies alone (41%). The summary results of the 2 types of designs differed beyond chance in 7 cases (16%). Discrepancies beyond chance were less common when only prospective studies were considered (8%). Occasional differences in sample size and timing of publication were also noted between discrepant randomized and nonrandomized studies. In 28 cases (62%), the natural logarithm of the odds ratio differed by at least 50%, and in 15 cases (33%), the odds ratio varied at least 2-fold between nonrandomized studies and randomized trials.
Conclusions: Despite good correlation between randomized trials and nonrandomized studies—in particular, prospective studies—discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common.
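For concreteness, the effect measure being compared can be sketched directly (illustrative only; the actual meta-analyses pool study-level odds ratios with appropriate weighting):

```python
def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio from a two-arm 2x2 table (sketch; no continuity
    correction for zero cells) -- the summary effect measure compared
    across randomized and nonrandomized studies in the review."""
    a, b = events_t, n_t - events_t        # treatment arm: events / non-events
    c, d = events_c, n_c - events_c        # control arm
    return (a * d) / (b * c)

def two_fold_discrepancy(or1, or2):
    """The review's criterion for a large disagreement: two summary odds
    ratios differing by at least a factor of 2 in either direction."""
    ratio = or1 / or2
    return ratio >= 2.0 or ratio <= 0.5
```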
Measuring the causal effects of digital advertising remains challenging despite the availability of granular data. Unobservable factors make exposure endogenous, and advertising’s effect on outcomes tends to be small. In principle, these concerns could be addressed using randomized controlled trials (RCTs). In practice, few online ad campaigns rely on RCTs and instead use observational methods to estimate ad effects. We assess empirically whether the variation in data typically available in the advertising industry enables observational methods to recover the causal effects of online advertising. Using data from 15 U.S. advertising experiments at Facebook comprising 500 million user-experiment observations and 1.6 billion ad impressions, we contrast the experimental results to those obtained from multiple observational models. The observational methods often fail to produce the same effects as the randomized experiments, even after conditioning on extensive demographic and behavioral variables. In our setting, advances in causal inference methods do not allow us to isolate the exogenous variation needed to estimate the treatment effects. We also characterize the incremental explanatory power our data would require to enable observational methods to successfully measure advertising effects. Our findings suggest that commonly used observational approaches based on the data usually available in the industry often fail to accurately measure the true effect of advertising.
We introduce the network model as a formal psychometric model, conceptualizing the covariance between psychometric indicators as resulting from pairwise interactions between observable variables in a network structure. This contrasts with standard psychometric models, in which the covariance between test items arises from the influence of one or more common latent variables. Here, we present two generalizations of the network model that encompass latent variable structures, establishing network modeling as part of the more general framework of Structural Equation Modeling (SEM). In the first generalization, we model the covariance structure of latent variables as a network. We term this framework Latent Network Modeling (LNM) and show that, with LNM, a unique structure of conditional independence relationships between latent variables can be obtained in an explorative manner. In the second generalization, the residual variance-covariance structure of indicators is modeled as a network. We term this generalization Residual Network Modeling (RNM) and show that, within this framework, identifiable models can be obtained in which local independence is structurally violated. These generalizations allow for a general modeling framework that can be used to fit, and compare, SEM models, network models, and the RNM and LNM generalizations. This methodology has been implemented in the free-to-use software package lvnet, which contains confirmatory model testing as well as two exploratory search algorithms: stepwise search algorithms for low-dimensional datasets and penalized maximum likelihood estimation for larger datasets. We show in simulation studies that these search algorithms perform adequately in identifying the structure of the relevant residual or latent networks. We further demonstrate the utility of these generalizations in an empirical example on a personality inventory dataset.
The educational, occupational, and creative accomplishments of the profoundly gifted participants (IQs ⩾ 160) in the Study of Mathematically Precocious Youth (SMPY) are astounding, but are they representative of equally able 12-year-olds? Duke University’s Talent Identification Program (TIP) identified 259 young adolescents who were equally gifted. By age 40, their life accomplishments also were extraordinary: Thirty-seven percent had earned doctorates, 7.5% had achieved academic tenure (4.3% at research-intensive universities), and 9% held patents; many were high-level leaders in major organizations. As was the case for the SMPY sample before them, differential ability strengths predicted their contrasting and eventual developmental trajectories—even though essentially all participants possessed both mathematical and verbal reasoning abilities far superior to those of typical Ph.D. recipients. Individuals, even profoundly gifted ones, primarily do what they are best at. Differences in ability patterns, like differences in interests, guide development along different paths, but ability level, coupled with commitment, determines whether and the extent to which noteworthy accomplishments are reached if opportunity presents itself. [Keywords: intelligence, creativity, giftedness, replication, blink comparator]
Little is known about whether people make good choices when facing important decisions. This article reports on a large-scale randomized field experiment in which research subjects having difficulty making a decision flipped a coin to help determine their choice. For important decisions (e.g. quitting a job or ending a relationship), individuals who are told by the coin toss to make a change are more likely to make a change, more satisfied with their decisions, and happier six months later than those whose coin toss instructed maintaining the status quo. This finding suggests that people may be excessively cautious when facing life-changing choices. [Keywords: quitting, happiness, decision biases.]
Portia may be about the size of a fat raisin, with eyes no larger than sesame seeds, yet it has a visual acuity that beats a cat or a pigeon. The human eye is better, but only about five times better. So from a safe distance a foot or two away, Portia sits scanning Scytodes, looking to see if it is carrying an egg sac in its fangs… The retinas of its principal eyes have only about a thousand receptors compared to the 200 million or so of the human eyeball. But Portia can swivel these tiny eyes across the scene in systematic fashion, patiently building up an image point by point. Having rejected a few alternative routes, Portia makes up its mind and disappears from sight. A couple of hours later, the silent assassin is back, dropping straight down on Scytodes from a convenient rock overhang on a silk dragline—looking like something out of the movie Mission: Impossible. Once again, Portia’s guile wins the day.
…Undoubtedly many of Portia’s cognitive abilities are genetic. Laboratory tests carried out by Robert Jackson, chief of Canterbury’s spider unit, have shown that only Portia from the particular area where Scytodes is common can recognise the difference between an egg sac carrying and non-egg sac carrying specimen. And it is a visual skill they are born with. The same species of Portia trapped a few hundred miles away doesn’t show any evidence of seeing the egg sac. But as Jackson points out, this just deepens the mystery. First there is the fact that such a specific mental behaviour as looking for an egg sac could be wired into a spider’s genome. And then there is the realisation that this is a population-specific, not species-specific, trait! It is a bit of locally acquired genetic knowledge. How does any simple hardwiring story account for that?
… “The White Tail can pluck, but only in a programmed, stereotyped way. It doesn’t bother with tactics, or experimenting, or looking to see which way the other spider is facing. It just charges in and overpowers its prey with its size. Portia is a really weedy little spider and has to spend ages planning a careful attack. But its eyesight and trial and error approach means it can tackle any sort of web spider it comes across, even ones it has never met before in the history of its species,” says Harland. While Portia’s deception skills are impressive, the real admiration is reserved for its ability to plot a path to its victim. For an instinctive animal, out of sight is supposed to be out of mind. But Portia can take several hours to get into the right spot, even if it means losing sight of its prey for long periods.
…As a maze to be worked out from a single viewing—and with no previous experience of such mazes—this would be a tall order even for a rat or monkey. Yet more often than not, Portia could identify the right path. There was nothing quick about it. Portia would sit on top of the dowel for up to an hour, twisting to and fro as it appeared to track its eyes across the various possible routes. Sometimes it couldn’t decide and would just give up. However, once it had a plan, it would clamber down and pick the correct wire, even if this meant at first heading back behind where it had been perched. And walking right past the other wire. Harland says it seems that Portia can see where it has to get to in order to start its journey and ignore distractions along the way. This impression was strengthened by the fact that on trials where Portia made a wrong choice, it often gave up on reaching the first high bend of the wire—even though the bait was not yet in sight. It was as if Portia knew where it should be in the apparatus and could tell straight away when it had made a dumb mistake.
Crazy talk, obviously. There just ain’t room in Portia’s tiny head for anything approaching a plan, an expectation, or any other kind of inner life. The human brain has some 100 billion neurons, or brain cells, and even a mouse has around 70 million. Harland says no one has done a precise count on Portia but it is reckoned to have about 600,000 neurons, putting it midway between the quarter million of a housefly and the one million of a honey bee. Yet in the lab over the past few years, Portia has kept on surprising.
…Rather controversially, Li calls this the forming of a search image. Yet even if this mental priming is reduced to some thoroughly robotic explanation, such as an enhanced sensitivity of certain prey-recognising circuits and a matching damping of others, it still says that there is a general shift in the running state of Portia’s nervous system. Portia is responding in a globally cohesive fashion and is not just a loose bundle of automatic routines.
…Harland says Portia’s eyesight is the place to start. Jumping spiders already have excellent vision and Portia’s is ten times as good, making it sharper than most mammals. However, being so small, there is a trade-off in that Portia can only focus its eyes on a tiny spot. It has to build up a picture of the world by scanning almost pixel by pixel across the visual scene. Whatever Portia ends up seeing, the information is accumulated slowly, as if peering through a keyhole, over many minutes. So there might be something a little like visual experience, but nothing like a full and “all at once” experience of a visual field. Harland feels that the serial nature of this scanning vision also makes it easier to imagine how prey recognition and other such decision processes could be controlled by some quite stereotyped genetic programs. When Portia is looking for an egg sac obscuring the face of Scytodes, it wouldn’t need to be representing the scene as a visual whole. Instead it could be checking a template, ticking off critical features in a sequence of fixations. In such a case, the less the eye sees with each fixation, perhaps the better. The human brain has to cope with a flood of information. Much of the work lies in discovering what to ignore at any moment. So the laser-like focus of Portia’s eyes might do much of this filtering by default. Yet while much of Portia’s mental abilities may reduce to the way its carefully designed eyes are coupled to largely reflexive motor patterns, Harland says there is still a disconcerting plasticity in its gene-encoded knowledge of the world. If one population of Portia can recognise an egg-carrying Scytodes but specimens from another region can’t, then this seems something quite new—a level of learning somewhere in-between the brain of an individual and the genome of a species… As Harland says, Portia just doesn’t fit anyone’s theories right at the moment.
Policy-makers are interested in early-years interventions to ameliorate childhood risks. They hope for improved adult outcomes in the long run, bringing return on investment. How much return can be expected depends, partly, on how strongly childhood risks forecast adult outcomes. But there is disagreement about whether childhood determines adulthood. We integrated multiple nationwide administrative databases and electronic medical records with the four-decade Dunedin birth-cohort study to test child-to-adult prediction in a different way, by using a population-segmentation approach. A segment comprising one-fifth of the cohort accounted for 36% of the cohort's injury insurance-claims; 40% of excess obese-kilograms; 54% of cigarettes smoked; 57% of hospital nights; 66% of welfare benefits; 77% of fatherless childrearing; 78% of prescription fills; and 81% of criminal convictions. Childhood risks, including poor age-three brain health, predicted this segment with large effect sizes. Early-years interventions effective with this population segment could yield very large returns on investment.
Health and social scientists have documented the hospital revolving-door problem, the concentration of crime, and long-term welfare dependence. Have these distinct fields identified the same citizens? Using administrative databases linked to 1.7 million New Zealanders, we quantified and monetized inequality in distributions of health and social problems and tested whether they aggregate within individuals. Marked inequality was observed: Gini coefficients equalled 0.96 for criminal convictions, 0.91 for public-hospital nights, 0.86 for welfare benefits, 0.74 for prescription-drug fills and 0.54 for injury-insurance claims. Marked aggregation was uncovered: a small population segment accounted for a disproportionate share of use-events and costs across multiple sectors. These findings were replicated in 2.3 million Danes. We then integrated the New Zealand databases with the four-decade-long Dunedin Study. The high-need/high-cost population segment experienced early-life factors that reduce workforce readiness, including low education and poor mental health. In midlife they reported low life satisfaction. Investing in young people’s education and training potential could reduce health and social inequalities and enhance population wellbeing.
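The Gini coefficients reported above are standard concentration measures; as a minimal sketch (my illustration, not the paper’s methodology), a Gini coefficient can be computed directly from individual-level counts:

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality, ->1 = maximal concentration)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard formula over the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Equal use across individuals -> 0; one person accounting for everything -> near 1.
print(gini([1, 1, 1, 1]))    # 0.0
print(gini([0, 0, 0, 100]))  # 0.75 for n=4; approaches 1 as n grows
```

A Gini of 0.96 for criminal convictions thus means convictions are concentrated almost entirely within a tiny segment of the population, which is the paper’s point.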
Design principles and operational modes are explored that underlie the information processing capacity of the human brain. The hypothesis is put forward that in higher organisms, especially in primates, the complexity of the neural circuitry of the cerebral cortex is the neural correlate of the brain’s coherence and predictive power, and, thus, a measure of intelligence. It will be argued that with the evolution of the human brain we have nearly reached the limits of biological intelligence. [Keywords: Biological intelligence, Cognition, Consciousness, Cerebral cortex, Primates, Information processing, Neural networks, Cortical design, Human brain evolution]
“The Iron Law Of Evaluation And Other Metallic Rules” is a classic review paper by American sociologist Peter Rossi, “a dedicated progressive and the nation’s leading expert on social program evaluation from the 1960s through the 1980s”; it discusses the difficulties of creating a useful social program, and proposes some aphoristic summary rules, including most famously:
The Iron Law: “The expected value of any net impact assessment of any large scale social program is zero.”
The Stainless Steel Law: “The better designed the impact assessment of a social program, the more likely is the resulting estimate of net impact to be zero.”
Whether China and the United States are destined to compete for domination in international politics is one of the major questions facing DoD. In a competition with the People’s Republic of China, the United States must explore all of its advantages and all of the weaknesses of China that may provide an asymmetry for the United States. This study examines one such asymmetry, the strategic consequences of Chinese racism. After having examined the literature on China extensively, this author is not aware of a single study that addresses this important topic. This study explores the causes of Chinese racism, the strategic consequences of Chinese racism, and how the United States may use this situation to advance its interests in international politics.
the study finds that xenophobia, racism, and ethnocentrism are caused by human evolution. These behaviors are not unique to the Chinese. However, they are made worse by Chinese history and culture.
considers the Chinese conception of race in Chinese history and culture. It finds that Chinese religious-cultural and historical conceptions of race reinforce Chinese racism. In Chinese history and contemporary culture, the Chinese are seen to be unique and superior to the rest of the world. Other peoples and groups are seen to be inferior, with a sliding scale of inferiority. The major Chinese distinction is between degrees of barbarians, the “black devils,” or savage inferiors, beyond any hope of interaction and the “white devils” or tame barbarians with whom the Chinese can interact. These beliefs are widespread in Chinese society, and have been for its history…
evaluates the 9 strategic consequences of Chinese racism.
virulent racism and eugenics heavily inform Chinese perceptions of the world…
racism informs their view of the United States…
racism informs their view of international politics in three ways.
states are stable, and thus good for the Chinese, to the degree that they are unicultural.
Chinese ethnocentrism and racism drive their outlook to the rest of the world. Their expectation is of a tribute system where barbarians know that the Chinese are superior.
there is a strong, implicit, racialist view of international politics that is alien and anathema to Western policy-makers and analysts. The Chinese are comfortable using race to explain events and appealing to racist stereotypes to advance their interests. Most insidious is the Chinese belief that Africans in particular need Chinese leadership.
the Chinese will make appeals to Third World states based on “racial solidarity,”…
Chinese racism retards their relations with the Third World…
Chinese racism, and the degree to which the Chinese permit their view of the United States to be informed by racism, has the potential to hinder China in its competition with the United States because it contributes to their overconfidence…
as lamentable as it is, Chinese racism helps to make the Chinese a formidable adversary…
the Chinese are never going to go through a civil rights movement like the United States…
China’s treatment of Christians and ethnic minorities is poor…
considers the 5 major implications for United States decision-makers and asymmetries that may result from Chinese racism.
Chinese racism provides empirical evidence of how the Chinese will treat other international actors if China becomes dominant…
it allows the United States to undermine China in the Third World…
it permits a positive image of the United States to be advanced in contrast to China…
calling attention to Chinese racism allows political and ideological alliances of the United States to be strengthened…
United States defense decision-makers must recognize that racism is a cohesive force for the Chinese…
…The study’s fundamental conclusion is that endemic Chinese racism offers the United States a major asymmetry it may exploit with major countries, regions like Africa, as well as with important opinion makers in international politics. The United States is on the right side of the struggle against racism and China is not. The United States should call attention to this to aid its position in international politics.
Every product in the marketplace has substitutes and complements. A substitute is another product you might buy if the first product is too expensive. Chicken is a substitute for beef. If you’re a chicken farmer and the price of beef goes up, people will want more chicken, and you will sell more. A complement is a product that you usually buy together with another product. Gas and cars are complements. Computer hardware is a classic complement of computer operating systems. And babysitters are a complement of dinner at fine restaurants. In a small town, when the local five-star restaurant has a two-for-one Valentine’s Day special, the local babysitters double their rates. (Actually, the nine-year-olds get roped into early service.) All else being equal, demand for a product increases when the prices of its complements decrease.
Let me repeat that because you might have dozed off, and it’s important. Demand for a product increases when the prices of its complements decrease. For example, if flights to Miami become cheaper, demand for hotel rooms in Miami goes up—because more people are flying to Miami and need a room. When computers become cheaper, more people buy them, and they all need operating systems, so demand for operating systems goes up, which means the price of operating systems can go up.
…Once again: demand for a product increases when the price of its complements decreases. In general, a company’s strategic interest is going to be to get the price of their complements as low as possible. The lowest theoretically sustainable price would be the “commodity price”—the price that arises when you have a bunch of competitors offering indistinguishable goods. So:
Smart companies try to commoditize their products’ complements.
If you can do this, demand for your product will increase and you will be able to charge more and make more.
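The complement logic can be caricatured with a toy linear demand model (the numbers and function are illustrative inventions, not from the essay): demand falls with a product’s own price and with the price of its complement, so commoditizing the complement shifts demand up.

```python
# Toy linear demand for operating systems, whose complement is hardware.
def os_demand(os_price, hardware_price,
              base=1000.0, own_sensitivity=2.0, complement_sensitivity=1.0):
    """Units demanded under a simple (made-up) linear demand curve."""
    return max(0.0, base - own_sensitivity * os_price
                    - complement_sensitivity * hardware_price)

# Cheaper hardware -> more demand for operating systems at the same OS price.
print(os_demand(100, 500))  # 300.0
print(os_demand(100, 300))  # 500.0 — commoditizing the complement raised demand
```

The same sketch also shows why the OS vendor can then raise its own price somewhat without demand falling back to the original level.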
[HTTPS connections can provide third-party-verifiable signatures and so HTTPS is a valid Proof-of-Work and one can incentivize creating HTTPS connections and hence DDoSes. This could also be used non-maliciously to create a distributed anonymous uptime-checking service, by incentivizing only a few connections each time period for small bounties.]
Since its creation in 2009, Bitcoin has used a hash-based proof-of-work to generate new blocks, and create a single public ledger of transactions. The hash-based computational puzzle employed by Bitcoin is instrumental to its security, preventing Sybil attacks and making double-spending attacks more difficult. However, there have been concerns over the efficiency of this proof-of-work puzzle, and alternative “useful” proofs have been proposed. In this paper, we present DDoSCoin, which is a cryptocurrency with a malicious proof-of-work. DDoSCoin allows miners to prove that they have contributed to a distributed denial of service attack against specific target servers. This proof involves making a large number of TLS connections to a target server, and using cryptographic responses to prove that a large number of connections has been made. Like proof-of-work puzzles, these proofs are inexpensive to verify, and can be made arbitrarily difficult to solve.
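For contrast with DDoSCoin’s malicious TLS-based proof, the standard hash-based proof-of-work it replaces can be sketched in a few lines (a toy illustration, not the paper’s construction): solving requires brute-force search over nonces, while verification is a single hash, and difficulty is arbitrarily tunable.

```python
import hashlib

def solve_pow(data: bytes, difficulty_bits: int) -> int:
    """Search for a nonce so SHA-256(data || nonce) has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_pow(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs one hash, regardless of how hard the puzzle was to solve."""
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = solve_pow(b"block header", 12)   # low difficulty so the demo runs quickly
assert verify_pow(b"block header", nonce, 12)
```

DDoSCoin keeps the cheap-to-verify/hard-to-solve asymmetry but substitutes signed TLS handshakes with a victim server for the hash search.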
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short-term and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain’s specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
Some risks have extremely high stakes. For example, a worldwide pandemic or asteroid impact could potentially kill more than a billion people. Comfortingly, scientific calculations often put very low probabilities on the occurrence of such catastrophes. In this paper, we argue that there are important new methodological problems which arise when assessing global catastrophic risks and we focus on a problem regarding probability estimation. When an expert provides a calculation of the probability of an outcome, they are really providing the probability of the outcome occurring, given that their argument is watertight. However, their argument may fail for a number of reasons such as a flaw in the underlying theory, a flaw in the modeling of the problem, or a mistake in the calculations. If the probability estimate given by an argument is dwarfed by the chance that the argument itself is flawed, then the estimate is suspect. We develop this idea formally, explaining how it differs from the related distinctions of model and parameter uncertainty. Using the risk estimates from the Large Hadron Collider as a test case, we show how serious the problem can be when it comes to catastrophic risks and how best to address it.
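The paper’s core observation is an application of the law of total probability; with made-up numbers (mine, not the authors’), even a tiny chance that the argument itself is flawed can dominate the final risk estimate:

```python
# X = "catastrophe occurs", A = "the expert's argument is sound".
# P(X) = P(X|A)P(A) + P(X|not-A)P(not-A)  — illustrative values only.
p_argument_sound = 0.999   # even very careful arguments sometimes fail
p_x_given_sound = 1e-9     # the expert's calculated risk
p_x_given_flawed = 1e-4    # residual risk if the argument is wrong

p_x = (p_x_given_sound * p_argument_sound
       + p_x_given_flawed * (1 - p_argument_sound))
print(p_x)  # ~1e-07: dominated by the flawed-argument term, 100x the calculated risk
```

The calculated 10⁻⁹ is effectively irrelevant: the answer is driven almost entirely by the second term, which is the paper’s “probing the improbable” problem.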
Ted Chiang is an American science fiction writer. His work has won four Nebula awards, four Hugo awards, the John W. Campbell Award for Best New Writer, and four Locus awards. His short story "Story of Your Life" was the basis of the film Arrival (2016). He is also artist in residence at the University of Notre Dame.
This fantasy short story by Ted Chiang follows Fuwaad ibn Abbas, a fabric merchant in the ancient city of Baghdad. It begins when he is searching for a gift to give a business associate and happens to discover a new shop in the marketplace. The shop owner, who makes and sells a variety of very interesting items, invites Fuwaad into the back workshop to see a mysterious black stone arch which serves as a gateway into the future, which the shop owner has made by the use of alchemy. Fuwaad is intrigued, and the shop owner tells him 3 stories of others who have traveled through the gate to meet and have conversation with their future selves. When Fuwaad learns that the shop owner has another gate in Cairo that will allow people to travel even into the past, he makes the journey there to try to rectify a mistake he made 20 years earlier. [Summary adapted from Wikipedia]
Gerard Manley Hopkins was an English poet and Jesuit priest, whose posthumous fame established him among the leading Victorian poets. His manipulation of prosody – particularly his concept of sprung rhythm – established him as an innovative writer of verse, as did his technique of praising God through vivid use of imagery and nature. Only after his death did Robert Bridges begin to publish a few of Hopkins's mature poems in anthologies, hoping to prepare the way for wider acceptance of his style. By 1930 his work was recognised as one of the most original literary accomplishments of his century. It had a marked influence on such leading 20th-century poets as T. S. Eliot, Dylan Thomas, W. H. Auden, Stephen Spender and Cecil Day-Lewis.
De rerum natura is a first-century BC didactic poem by the Roman poet and philosopher Lucretius with the goal of explaining Epicurean philosophy to a Roman audience. The poem, written in some 7,400 dactylic hexameters, is divided into six untitled books, and explores Epicurean physics through poetic language and metaphors. Namely, Lucretius explores the principles of atomism; the nature of the mind and soul; explanations of sensation and thought; the development of the world and its phenomena; and explains a variety of celestial and terrestrial phenomena. The universe described in the poem operates according to these physical principles, guided by fortuna ("chance"), and not the divine intervention of the traditional Roman deities.
The Pirahã are an indigenous people of the Amazon Rainforest in Brazil. They are the sole surviving subgroup of the Mura people, and are hunter-gatherers. They live mainly on the banks of the Maici River in Humaitá and Manicoré in the state of Amazonas. As of 2018, they number 800 individuals. The Pirahã people do not call themselves Pirahã but instead the Hi'aiti'ihi, roughly translated as "the straight ones."
Montaillou is a book by the French historian Emmanuel Le Roy Ladurie first published in 1975. It was first translated into English in 1978 by Barbara Bray, and has been subtitled The Promised Land of Error and Cathars and Catholics in a French Village.
Emmanuel Bernard Le Roy Ladurie is a French historian whose work is mainly focused upon Languedoc in the Ancien Régime, particularly the history of the peasantry. One of the leading historians of France, Le Roy Ladurie has been called the "standard-bearer" of the third generation of the Annales school and the "rock star of the medievalists", noted for his work in social history.
William Harry McRaven is a retired United States Navy four-star admiral who last served as the ninth commander of the United States Special Operations Command from August 8, 2011, to August 28, 2014. From 2015 to 2018, he was the chancellor of The University of Texas System.
A Beautiful Planet is a 2016 American documentary film that explores Earth by showing IMAX footage, recorded over the course of fifteen months by astronauts aboard the International Space Station. It is narrated by actress Jennifer Lawrence.
The Kingdom of Dreams and Madness is a 2013 Japanese documentary film directed by Mami Sunada. The film follows the routines of those employed at Studio Ghibli, including filmmakers Hayao Miyazaki, Isao Takahata, and Toshio Suzuki as they work to release two films simultaneously, The Wind Rises and The Tale of the Princess Kaguya.
Florence Foster Jenkins is a 2016 biographical film directed by Stephen Frears and written by Nicholas Martin. It stars Meryl Streep as Florence Foster Jenkins, a New York heiress known for her poor singing. Hugh Grant plays her manager and long-time companion, St. Clair Bayfield. Other cast members include Simon Helberg, Rebecca Ferguson, and Nina Arianda.
The Tale of the Princess Kaguya is a 2013 Japanese animated fantasy drama film co-written and directed by Isao Takahata, based on The Tale of the Bamboo Cutter, a 10th-century Japanese literary tale. It was produced by Studio Ghibli for Nippon Television Network, Dentsu, Hakuhodo DYMP, Walt Disney Japan, Mitsubishi, Toho and KDDI, and distributed by Toho.
Short Peace is a multimedia project composed of four short anime films produced by Sunrise and Shochiku, and a video game developed by Crispy's! and Grasshopper Manufacture. The four films were released in Japanese theaters on July 20, 2013 and were screened in North America during April 2014. Sentai Filmworks have licensed the films for North America. The video game was released in January 2014 in Japan, April 2014 in Europe, and September 2014 in North America. The game’s physical releases in Japan and Europe include the four animated shorts as a bonus.
Death Parade is a 2015 Japanese anime television series created, written, and directed by Yuzuru Tachikawa and produced by Madhouse. The series spawned from a short film, Death Billiards, which was originally produced by Madhouse for the Young Animator Training Project's Anime Mirai 2013 and released on March 2, 2013. The television series aired in Japan between January 9, 2015 and March 27, 2015. It is licensed in North America by Funimation and in the United Kingdom by Anime Limited, the latter of which was eventually cancelled. The series was obtained by Madman Entertainment for digital distribution in Australia and New Zealand.
The Wind Rises is a 2013 Japanese animated historical drama film written and directed by Hayao Miyazaki, animated by Studio Ghibli for the Nippon Television Network, Dentsu, Hakuhodo DY Media Partners, Walt Disney Japan, Mitsubishi, Toho and KDDI and distributed by Toho. It was released on 20 July 2013, in Japan, and was released by Touchstone Pictures in North America on 21 February 2014.
Basilisk is a Japanese manga series written and illustrated by Masaki Segawa. It was published in Japan in 2003 and 2004 in Kodansha's Young Magazine Uppers magazine, based on the novel The Kouga Ninja Scrolls by Futaro Yamada published in 1958. The anime, produced in 2005 by Gonzo, closely follows the manga aside from a handful of distinctions. The manga won the 2004 Kodansha Manga Award for general manga. Segawa continued producing serialized adaptations of Futaro Yamada's novels with The Yagyu Ninja Scrolls in 2005, Yama Fu-Tang in 2010, and Jū: Ninpō Makai Tensei in 2012. Additionally, a two-part novel sequel titled The Ouka Ninja Scrolls: Basilisk New Chapter, penned by Masaki Yamada, was published in 2015 with illustrations by Segawa; a manga adaptation, Basilisk: The Ouka Ninja Scrolls, illustrated by Tatsuya Shihira with character designs by Masaki Segawa, was serialized between 2017 and 2019, and an anime adaptation by Seven Arcs Pictures aired in 2018.
Ringing Bell is a 1978 Japanese anime adventure-drama short film adaptation of the storybook of the same name written by Takashi Yanase, the creator of Anpanman. It is most notable among fans and critics as a family film which makes a sharp sudden turn into a dark and violent story that criticizes and reflects upon the themes of revenge and war. It is also recognized as one of the only Japanese shock films directed towards children and families.
How do genes affect cognitive ability or other human quantitative traits such as height or disease risk? Progress on this challenging question is likely to be significant in the near future. I begin with a brief review of psychometric measurements of intelligence, introducing the idea of a "general factor" or g score. The main results concern the stability, validity (predictive power), and heritability of adult g. The largest component of genetic variance for both height and intelligence is additive (linear), leading to important simplifications in predictive modeling and statistical estimation. Due mainly to the rapidly decreasing cost of genotyping, it is possible that within the coming decade researchers will identify loci which account for a significant fraction of total g variation. In the case of height analogous efforts are well under way. I describe some unpublished results concerning the genetic architecture of height and cognitive ability, which suggest that roughly 10k moderately rare causal variants of mostly negative effect are responsible for normal population variation. Using results from Compressed Sensing (L1-penalized regression), I estimate the statistical power required to characterize both linear and nonlinear models for quantitative traits. The main unknown parameter s (sparsity) is the number of loci which account for the bulk of the genetic variation. The required sample size is of order 100s, or roughly a million in the case of cognitive ability.
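The sparse additive architecture the author describes can be caricatured in a toy simulation (my illustration, not the author’s code): when a trait is driven by a few additive loci, even simple marginal association picks them out once the sample size is large relative to the sparsity.

```python
import random

random.seed(0)
p, s, n = 200, 5, 2000      # p loci, s causal, n individuals (toy scale)
causal = set(range(s))      # effects concentrated in the first s loci

# Genotypes: 0/1/2 minor-allele counts; trait = additive causal effects + noise.
genos = [[random.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
trait = [sum(g[j] for j in causal) + random.gauss(0, 1) for g in genos]

def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Marginal association of each locus with the trait; take the s strongest.
scores = [abs(corr([g[j] for g in genos], trait)) for j in range(p)]
top = set(sorted(range(p), key=lambda j: -scores[j])[:s])
print(top == causal)  # with n >> s, the causal loci are recovered
```

Real GWAS must contend with linkage disequilibrium, rare variants, and tiny effect sizes, which is why the sample sizes the author estimates (of order 100 per causal locus, i.e. roughly a million for cognitive ability) are so much larger than this toy’s.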
Genome-wide complex trait analysis (GCTA), also known as genome-based restricted maximum likelihood (GREML), is a statistical method for variance component estimation in genetics which quantifies the total narrow-sense (additive) contribution to a trait's heritability of a particular subset of genetic variants. This is done by directly quantifying the chance genetic similarity of unrelated individuals and comparing it to their measured similarity on a trait; if two unrelated individuals are relatively similar genetically and also have similar trait measurements, then the measured genetics are likely to causally influence that trait, and the correlation can to some degree tell how much. This can be illustrated by plotting the squared pairwise trait differences between individuals against their estimated degree of relatedness. The GCTA framework can be applied in a variety of settings. For example, it can be used to examine changes in heritability over aging and development. It can also be extended to analyse bivariate genetic correlations between traits. There is an ongoing debate about whether GCTA generates reliable or stable estimates of heritability when used on current SNP data. Critics argue that the method rests on an outdated and false dichotomy of genes versus environment, and that it suffers from serious methodological weaknesses, such as susceptibility to population stratification.
Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
In February 2019, following up on my 2015–2016 text-generation experiments with char-RNNs, I experiment with the cutting-edge Transformer NN architecture for language modeling & text generation. Using OpenAI’s GPT-2-117M model pre-trained on a large Internet corpus and nshepperd’s finetuning code, I retrain GPT-2-117M on a large (117MB) Project Gutenberg poetry corpus. I demonstrate how to train 2 variants: “GPT-2-poetry”, trained on the poems as a continuous stream of text, and “GPT-2-poetry-prefix”, with each line prefixed with the metadata of the PG book it came from. In May 2019, I trained the next-largest GPT-2, GPT-2-345M, similarly, for a further quality boost in generated poems. In October 2019, I retrained GPT-2-117M on a Project Gutenberg corpus with improved formatting, and combined it with a contemporary poem dataset based on the Poetry Foundation’s website. With just a few GPU-days on 1080ti GPUs, GPT-2-117M finetuning can produce high-quality poetry which is more thematically consistent than my char-RNN poems, capable of modeling subtle features like rhyming, and sometimes even a pleasure to read. I list the many possible ways to improve poem generation and further approach human-level poems. For the highest-quality AI poetry to date, see my followup page, “GPT-3 Creative Writing”.
I continue my AI poetry generation experiments with OpenAI’s 2020 GPT-3, which is 116× larger, and much more powerful, than the 2019 GPT-2. GPT-3, however, is not merely a quantitative tweak yielding “GPT-2 but better”—it is qualitatively different, exhibiting eerie runtime learning capabilities allowing even the raw model, with zero finetuning, to “meta-learn” many textual tasks purely by example or instruction. One does not train or program GPT-3 in a normal way, but one engages in dialogue and writes prompts to teach GPT-3 what one wants.
Experimenting through the OpenAI Beta API in June 2020, I find that GPT-3 does not just match my finetuned GPT-2-1.5b-poetry for poem-writing quality, but exceeds it, while being versatile in handling poetry, Tom Swifty puns, science fiction, dialogue like Turing’s Turing-test dialogue, literary style parodies… As the pièce de résistance, I recreate Stanislaw Lem’s Cyberiad’s “Trurl’s Electronic Bard” poetry using GPT-3. (Along the way, I document instances of how the BPE text encoding unnecessarily damages GPT-3’s performance on a variety of tasks, how to best elicit the highest-quality responses, common errors people make in using GPT-3, and test out GPT-3’s improvements in NN weak points like logic or commonsense knowledge.)
GPT-3’s samples are not just close to human level: they are creative, witty, deep, meta, and often beautiful. They demonstrate an ability to handle abstractions, like style parodies, I have not seen in GPT-2 at all. Chatting with GPT-3 feels uncannily like chatting with a human. I was impressed by the results reported in the GPT-3 paper, and after spending a week trying it out, I remain impressed.
This page records GPT-3 samples I generated in my explorations, and thoughts on how to use GPT-3 and its remaining weaknesses. I hope you enjoy them even a tenth as much as I enjoyed testing GPT-3 and watching the completions scroll across my screen.
The design of experiments is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.
The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis when a specific alternative hypothesis is true — i.e., it indicates the probability of avoiding a type II error. Statistical power ranges from 0 to 1, and as statistical power increases, the probability of making a type II error decreases.
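As a quick numerical illustration (a one-sided z-test on a mean; the effect size, SD, sample size, and alpha are arbitrary assumptions), power is one minus the type II error probability:

```python
# Power of a one-sided z-test of H0: mu = 0, using only the standard library.
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Probability of correctly rejecting H0 when the true mean is `effect`."""
    z_crit = NormalDist().inv_cdf(1 - alpha)           # one-sided rejection threshold
    se = sd / n ** 0.5                                 # standard error of the mean
    return 1 - NormalDist().cdf(z_crit - effect / se)  # power = 1 - P(type II error)

print(round(z_test_power(effect=0.5, sd=1.0, n=30), 3))  # → 0.863
```

Increasing n (or the effect size) raises the power, i.e. shrinks the type II error, exactly as the definition states.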
The base rate fallacy, also called base rate neglect or base rate bias, is a type of fallacy in which people presented with both related base rate information and specific information tend to ignore the base rate in favor of the individuating information, rather than correctly integrating the two.
In statistics, regression toward the mean is the phenomenon whereby, if a sample point of a random variable is extreme, a future point is likely to be closer to the mean or average on further measurement. To avoid making incorrect inferences, regression toward the mean must be considered when designing scientific experiments and interpreting data. Historically, what is now called regression toward the mean was also called reversion to the mean and reversion to mediocrity.
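A small simulation (all numbers are assumed for illustration) shows the effect: when two measurements of the same underlying quantity correlate imperfectly, individuals selected for extreme first scores average closer to the mean on the second measurement.

```python
# Regression toward the mean: select extreme scorers on measurement 1 and
# observe that their measurement-2 mean has shrunk toward the population mean.
import numpy as np

rng = np.random.default_rng(2)
r, N = 0.6, 100_000                      # test-retest correlation, sample size
true = rng.normal(0, 1, N)
m1 = np.sqrt(r) * true + np.sqrt(1 - r) * rng.normal(0, 1, N)
m2 = np.sqrt(r) * true + np.sqrt(1 - r) * rng.normal(0, 1, N)

top = m1 > 2                             # extreme scorers on the first test
print(round(m1[top].mean(), 2), round(m2[top].mean(), 2))
```

The second-test mean of the selected group is approximately r times the first-test mean, the classic linear-shrinkage prediction.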
“GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment”, Cornelius A. Rietveld, Sarah E. Medland, Jaime Derringer, Jian Yang, Tõnu Esko, Nicolas W. Martin, Harm-Jan Westra, Konstantin Shakhbazov, Abdel Abdellaoui, Arpana Agrawal, Eva Albrecht, Behrooz Z. Alizadeh, Najaf Amin, John Barnard, Sebastian E. Baumeister, Kelly S. Benke, Lawrence F. Bielak, Jeffrey A. Boatman, Patricia A. Boyle, Gail Davies, Christiaan de Leeuw, Niina Eklund, Daniel S. Evans, Rudolf Ferhmann, Krista Fischer, Christian Gieger, Håkon K. Gjessing, Sara Hägg, Jennifer R. Harris, Caroline Hayward, Christina Holzapfel, Carla A. Ibrahim-Verbaas, Erik Ingelsson, Bo Jacobsson, Peter K. Joshi, Astanand Jugessur, Marika Kaakinen, Stavroula Kanoni, Juha Karjalainen, Ivana Kolcic, Kati Kristiansson, Zoltán Kutalik, Jari Lahti, Sang H. Lee, Peng Lin, Penelope A. Lind, Yongmei Liu, Kurt Lohman, Marisa Loitfelder, George McMahon, Pedro Marques Vidal, Osorio Meirelles, Lili Milani, Ronny Myhre, Marja-Liisa Nuotio, Christopher J. Oldmeadow, Katja E. Petrovic, Wouter J. Peyrot, Ozren Polašek, Lydia Quaye, Eva Reinmaa, John P. Rice, Thais S. Rizzi, Helena Schmidt, Reinhold Schmidt, Albert V. Smith, Jennifer A. Smith, Toshiko Tanaka, Antonio Terracciano, Matthijs J. H. M. van der Loos, Veronique Vitart, Henry Völzke, Jürgen Wellmann, Lei Yu, Wei Zhao, Jüri Allik, John R. Attia, Stefania Bandinelli, François Bastardot, Jonathan Beauchamp, David A. Bennett, Klaus Berger, Laura J. Bierut, Dorret I. Boomsma, Ute Bültmann, Harry Campbell, Christopher F. Chabris, Lynn Cherkas, Mina K. Chung, Francesco Cucca, Mariza de Andrade, Philip L. De Jager, Jan-Emmanuel De Neve, Ian J. Deary, George V. Dedoussis, Panos Deloukas, Maria Dimitriou, Guðný Eiríksdóttir, Martin F. Elderson, Johan G. Eriksson, David M. Evans, Jessica D. Faul, Luigi Ferrucci, Melissa E. Garcia, Henrik Grönberg, Vilmundur Guðnason, Per Hall, Juliette M. Harris, Tamara B. Harris, Nicholas D. Hastie, Andrew C. 
Heath, Dena G. Hernandez, Wolfgang Hoffmann, Adriaan Hofman, Rolf Holle, Elizabeth G. Holliday, Jouke-Jan Hottenga, William G. Iacono, Thomas Illig, Marjo-Riitta Järvelin, Mika Kähönen, Jaakko Kaprio, Robert M. Kirkpatrick, Matthew Kowgier, Antti Latvala, Lenore J. Launer, Debbie A. Lawlor, Terho Lehtimäki, Jingmei Li, Paul Lichtenstein, Peter Lichtner, David C. Liewald, Pamela A. Madden, Patrik K. E. Magnusson, Tomi E. Mäkinen, Marco Masala, Matt McGue, Andres Metspalu, Andreas Mielck, Michael B. Miller, Grant W. Montgomery, Sutapa Mukherjee, Dale R. Nyholt, Ben A. Oostra, Lyle J. Palmer, Aarno Palotie, Brenda W. J. H. Penninx, Markus Perola, Patricia A. Peyser, Martin Preisig, Katri Räikkönen, Olli T. Raitakari, Anu Realo, Susan M. Ring, Samuli Ripatti, Fernando Rivadeneira, Igor Rudan, Aldo Rustichini, Veikko Salomaa, Antti-Pekka Sarin, David Schlessinger, Rodney J. Scott, Harold Snieder, Beate St Pourcain, John M. Starr, Jae Hoon Sul, Ida Surakka, Rauli Svento, Alexander Teumer, The LifeLines Cohort Study, Henning Tiemeier, Frank J. A. van Rooij, David R. Van Wagoner, Erkki Vartiainen, Jorma Viikari, Peter Vollenweider, Judith M. Vonk, Gérard Waeber, David R. Weir, H.-Erich Wichmann, Elisabeth Widen, Gonneke Willemsen, James F. Wilson, Alan F. Wright, Dalton Conley, George Davey-Smith, Lude Franke, Patrick J. F. Groenen, Albert Hofman, Magnus Johannesson, Sharon L. R. Kardia, Robert F. Krueger, David Laibson, Nicholas G. Martin, Michelle N. Meyer, Danielle Posthuma, A. Roy Thurik, Nicholas J. Timpson, André G. Uitterlinden, Cornelia M. van Duijn, Peter M. Visscher, Daniel J. Benjamin, David Cesarini, Philipp D. Koellinger (2013-06-21):
A genome-wide association study (GWAS) of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent single-nucleotide polymorphisms (SNPs) are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (coefficient of determination R2 ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
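The abstract's suggestion that these effect sizes "can anchor power analyses" can be made concrete with a standard approximation (the 80% power target and the genome-wide significance threshold of 5 × 10⁻⁸ are conventional assumptions, not values from the paper):

```python
# Approximate GWAS sample size needed to detect a variant of a given R^2,
# via the classic (z_alpha + z_beta)^2 / r^2 rule for a correlation test.
from statistics import NormalDist

def gwas_sample_size(r2, alpha=5e-8, power=0.80):
    """Approximate n needed to detect a variant explaining r2 of trait variance."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)   # two-sided genome-wide threshold
    z_beta = nd.inv_cdf(power)
    return (z_alpha + z_beta) ** 2 / r2

print(round(gwas_sample_size(0.0002)))    # R^2 = 0.02%, as in the abstract
```

For R² ≈ 0.02% this gives on the order of 200,000 individuals, which makes clear why samples of this study's size were needed and why earlier small candidate-gene studies were hopelessly underpowered.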
[A landmark study in behavioral genetics and intelligence: the first well-powered GWAS to detect genetic variants for intelligence and education which replicate out of sample and are proven to be causal in a between-sibling study.]
“Replicability and robustness of genome-wide-association studies for behavioral traits”, Rietveld, Cornelius A.; Conley, Dalton; Eriksson, Nicholas; Esko, Tõnu; Medland, Sarah E.; Vinkhuyzen, Anna A. E.; Yang, Jian; Boardman, Jason D.; Chabris, Christopher F.; Dawes, Christopher T.; Domingue, Benjamin W.; Hinds, David A.; Johannesson, Magnus; Kiefer, Amy K.; Laibson, David; Magnusson, Patrik K. E.; Mountain, Joanna L.; Oskarsson, Sven; Rostapshova, Olga; Teumer, Alexander; Tung, Joyce Y.; Visscher, Peter M.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D. (2014):
A recent genome-wide-association study of educational attainment identified three single-nucleotide polymorphisms (SNPs) whose associations, despite their small effect sizes (each R2 ≈ 0.02%), reached genome-wide significance (p < 5 × 10−8) in a large discovery sample and were replicated in an independent sample (p < .05). The study also reported associations between educational attainment and indices of SNPs called "polygenic scores." In three studies, we evaluated the robustness of these findings. Study 1 showed that the associations with all three SNPs were replicated in another large (n = 34,428) independent sample. We also found that the scores remained predictive (R2 ≈ 2%) in regressions with stringent controls for stratification (Study 2) and in new within-family analyses (Study 3). Our results show that large and therefore well-powered genome-wide-association studies can identify replicable genetic associations with behavioral traits. The small effect sizes of individual SNPs are likely to be a major contributing factor explaining the striking contrast between our results and the disappointing replication record of most candidate-gene studies.
Portia is a genus of jumping spiders that feed on other spiders. They are remarkable for their intelligent hunting behaviour, which suggests that they are capable of learning and problem solving, traits normally attributed to much larger animals.
Portia is a behaviourally complex and aberrant salticid genus. The genus is of unusual importance because it is morphologically primitive. Five species were studied in nature (Australia, Kenya, Malaysia, Sri Lanka) and in the laboratory in an effort to clarify the origins of the salticids and of their unique, complex eyes. All the species of Portia studied were both web builders and cursorial.
Portia was also an araneophagic web invader, and it was a highly effective predator on diverse types of alien webs. Portia was an aggressive mimic, using a complex repertoire of vibratory behaviour to deceive the host spiders on which it fed. The venom of Portia was unusually potent to other spiders; its easily autotomised legs may have helped Portia escape if attacked by its frequently dangerous prey. Portia was also kleptoparasitic and oophagic when occupying alien webs. P. fimbriata from Queensland, where cursorial salticids were superabundant, used a unique manner of stalking and capturing other salticids.
The display repertoires used during intraspecific interactions were complex and varied between species. Both visual (typical of other salticids) and vibratory (typical of other web spiders) displays were used. Portia copulated both on and away from webs and frequently with the female hanging from a dragline. Males cohabited with subadult females on webs, mating after the female matured. Adult and subadult females sometimes used specialised predatory attacks against courting or mating males. Sperm induction in Portia was similar to that in other cursorial spiders.
Portia mimicked detritus in shape and colour, and its slow, mechanical locomotion preserved concealment. Portia occasionally used a special defensive behaviour (wild leaping) if disturbed by a potential predator. Two types of webs were spun by all species (Type 1, small resting platforms; Type 2, large prey-capture webs). Two types of egg sacs were made, both of which were highly aberrant for a salticid. Responses of different species and both sexes of Portia were quantitatively compared for different types of prey.
Many of the trends in behaviour within the genus, including quantitative differences in predatory behaviour, seemed to be related to differences in the effectiveness of the cryptic morphology of Portia in concealing the spider in its natural habitat (‘effective crypsis’).
The results of the study supported, in general, Jackson & Blest’s (1982a) hypothesis of salticid evolution which, in part, proposes that salticid ancestors were web builders with poorly developed vision and that acute vision evolved in conjunction with the ancestral spiders becoming proficient as araneophagic invaders of diverse types of webs.
The influence of prey movement on the performance of simple detours by salticids was investigated. Seven species were studied. Two subject species, Portia fimbriata and Portia labiata, are specialized web-invading species that eat other spiders. The other five species investigated (Euryattus sp., Euophrys parvula, Marpissa marina, Trite auricoma and Trite planiceps) are more typical cursorial hunters of insects. We provide evidence that:
salticids will initiate detours toward motionless prey;
salticids are more inclined to initiate detours toward moving than toward motionless prey;
salticids tend to complete detours even when prey that had been moving at the start remains stationary during the detour;
prey movement makes the salticid more likely to stalk and attack when prey is only a few centimetres away and in a position from which it can be reached by a straight-line pursuit;
Portia is more inclined than the other salticids to initiate detours toward motionless prey, and to stalk and attack motionless prey when close.
Mechanisms that might account for Portia being different from the other salticids are discussed.
Portia is a jumping spider that invades other spiders’ webs, makes vibratory signals that deceive the resident spider (aggressive mimicry), then attacks and eats the spider. Portia exploits a wide range of prey-spider species.
Evidence is provided from observation and experimentation that Portia uses a trial-and-error method as part of its strategy for deriving appropriate signals for different prey. To use this method, Portia first broadcasts an array of different signals, then narrows to particular signals as a consequence of feedback from the prey spider. Feedback can be web vibration or seeing spiders move, or both.
This appears to be an example of deception involving at least a limited form of learning, an uncommon phenomenon in invertebrates.
The terms “reversed-route detours” and “forward-route detours” are introduced to distinguish between detours that require moving away from a goal and those that do not. We provide the first evidence under controlled laboratory conditions that salticids can perform reversed-route detours.
Two species were tested: 1. Portia fimbriata, a web-invading salticid from Queensland, Australia, that normally preys on web-building spiders; 2. Trite planiceps, an insectivorous cursorial salticid from New Zealand.
Although both of these species completed reversed-route detours, Trite planiceps was much more dependent on prey movement than Portia fimbriata. Interspecific differences appear to be related to the different predatory styles of these two salticids.
Portia is a web-invading araneophagic spider that uses aggressive mimicry to deceive its prey. The present paper is a first step toward clarifying experimentally the cues that govern Portia’s decisions of whether to enter a web, whether to make signals once in a web, and whether to persist at signalling once started.
The following conclusions are supported: cues from seeing a web elicit web entry, but volatile chemical cues from webs of prey spiders are not important; seeing a spider in a web increases Portia’s inclination to enter the web; after web entry, cues from webs of prey spiders are sufficient to elicit signalling behaviour, even in the absence of other cues coming directly from the prey spider; seeing a prey spider or detecting vibrations on the web make Portia more prone to signal, but volatile chemical cues from prey spiders are not important; once Portia is on a web and signalling, seeing a moving spider and detecting vibrations on the web encourage Portia to persist in signalling; on the basis of visual cues alone, Portia can distinguish between quiescent spiders, insects and eggsacs.
The stalking behaviour of four species of jumping spiders, Portia fimbriata, P. labiata, P. schultzi and P. africana, was examined to determine whether Portia opportunistically exploits situations in which the prey spider is distracted by environmental disturbances.
Disturbances were created mainly by wind blowing on webs and a magnet shaking webs. All four Portia species moved significantly further during disturbance than during non-disturbance, a behaviour labeled ‘opportunistic smokescreen behaviour’. Portia can discriminate between spiders and other prey such as live insects, wrapped-up insects in the web, and egg sacs, because Portia used opportunistic smokescreen behaviour only against spiders and not against these other types of prey. If the location of disturbances and the location of prey differ, Portia can accurately discriminate between them. Portia’s smokescreen behaviour apparently is a true predatory tactic because Portia attacked prey more often during disturbances than at other times.
Smokescreen behaviour appears to work in part because the disturbances that Portia uses for smokescreens interfere with the prey’s ability to sense Portia’s stalking movements.
Salticids, the largest family of spiders, have unique eyes, acute vision, and elaborate vision-mediated predatory behavior, which is more pronounced than in any other spider group. Diverse predatory strategies have evolved, including araneophagy, aggressive mimicry, myrmicophagy, and prey-specific prey-catching behavior. Salticids are also distinctive for development of behavioral flexibility, including conditional predatory strategies, the use of trial-and-error to solve predatory problems, and the undertaking of detours to reach prey. Predatory behavior of araneophagic salticids has undergone local adaptation to local prey, and there is evidence of predator-prey coevolution. Trade-offs between mating and predatory strategies appear to be important in ant-mimicking and araneophagic species. [Keywords: salticids, salticid eyes, Portia, predatory versatility, aggressive mimicry]
In a laboratory study, 12 different experimental set-ups were used to examine the ability of Portia fimbriata, a web-invading araneophagic jumping spider from Queensland, Australia, to choose between two detour paths, only one of which led to a lure (a dead, dried spider). Regardless of set-up, the spider could see the lure when on the starting platform of the apparatus, but not after leaving the starting platform. The spider consistently chose the ‘correct route’ (the route that led to the lure) more often than the ‘wrong route’ (the route that did not lead to the lure). In these tests, the spider was able to make detours that required walking about 180° away from the lure and walking past where the incorrect route began. There was also a pronounced relationship between time of day when tests were carried out and the spider’s tendency to choose a route. Furthermore, those spiders that chose the wrong route abandoned the detour more frequently than those that chose the correct route, despite both groups being unable to see the lure when the decision was made to abandon the detour.
This chapter illustrates the cognitive abilities of araneophagic jumping spiders. “Portia”, a genus of araneophagic jumping spiders (family Salticidae), appears to have the most versatile and flexible predatory strategy known for an arthropod. A dominant feature of Portia’s predatory strategy is aggressive mimicry, a system in which the predator communicates deceitfully with its prey. Typical salticids do not build webs. Instead, they are hunters that catch their prey in stalk-and-leap sequences guided by vision. Salticids differ from all other spiders by having large anteromedial eyes and acute vision. However, the behavior of Portia is anything but typical for a salticid. Besides hunting its prey cursorily, Portia also builds a prey-catching web. The typical prey of a salticid is insects, but Portia’s preferred prey is other spiders. Portia frequently hunts web-building spiders from other families by invading their webs and deceiving them with aggressive-mimicry signals. While in the other spider’s web, it makes aggressive-mimicry signals by moving legs, palps, abdomen, or some combination of these to make web-borne vibrations. Portia’s typical victim, a web-building spider but not a salticid, typically lacks acute vision and instead perceives the world it lives in by interpreting tension and vibration patterns in its web.
Table of Contents: Introduction · Spiders that eat other spiders · Predator-prey interactions between Portia fimbriata and Euryattus sp. · Detecting Portia’s footsteps · Smokescreen tactics · Flexibly adjusting signals to prey behavior · Making detours and planning ahead · Cognitive levels · Levels of deception · Design options for animal brains
Jumping spiders Portia labiata were tested in the laboratory on three different kinds of detours. In one, both routes led to the lure. In the other variants, one of the routes had a gap, making that route impassable. When tested with only one complete route, Portia chose this route after visually inspecting both routes. An analysis of scanning showed that, at the beginning of the scanning routine, the spiders scanned both the complete and the incomplete route but that, by the end of the scanning routine, they predominantly scanned only the complete route. Two rules seemed to govern their scanning: (1) they would continue turning in one direction when scanning away from the lure along horizontal features of the detour route; and (2) when the end of the horizontal feature being scanned was reached, they would change direction and turn back towards the lure. These rules ‘channeled’ the spiders’ scanning on to the complete route, and they then overwhelmingly chose to head towards the route they had fixated most while scanning.
Portia fimbriata is a web-invading araneophagic jumping spider (Salticidae). The use of signal-generating behaviours is characteristic of how P. fimbriata captures its prey, with three basic categories of signal-generating behaviours being prevalent when the prey spider is in an orb web. The predatory behaviour of P. fimbriata has been referred to as “aggressive mimicry”, but no previous studies have provided details concerning the characteristics of P. fimbriata’s signals.
We attempt to determine the model signals for P. fimbriata’s ‘aggressive mimicry’ signals. Using laser Doppler vibrometer and the orb webs of Zygiella x-notata and Zosis geniculatus, P. fimbriata’s signals are compared with signals from other sources. Each of P. fimbriata’s three categories of behaviour makes a signal that resembles one of three signals from other sources: prey of the web spider (insects) ensnared in the capture zone of the web, prey making faint contact with the periphery of the web and large-scale disturbance of the web (jarring the spider’s cage).
Experimental evidence from testing P. fimbriata with two sizes of lure made from Zosis (dead, mounted in a lifelike posture in standard-size orb web) clarifies P. fimbriata’s signal-use strategy:
when the resident spider is small, begin by simulating signals from an insect ensnared in the capture zone (attempt to lure in the resident spider);
when the resident spider is large, start by simulating signals from an insect brushing against the periphery of the web (keep the resident spider out in the web, but avoid provoking from it a full-scale predatory attack);
when walking in the resident spider’s web, regardless of the resident spider’s size, step toward the spider while making a signal that simulates a large-scale disturbance of the web (mask footsteps with a self-made vibratory smokescreen).
Portia fimbriata from Queensland, Australia, is an araneophagic jumping spider (Salticidae) that includes in its predatory strategy a tactic (cryptic stalking) enabling it to prey effectively on a wide range of salticids from other genera.
Optical cues used by P. fimbriata to identify the salticid species on which it most commonly preys, Jacksonoides queenslandicus, were investigated experimentally in the laboratory using odorless lures made from dead prey on which various combinations of features were altered. P. fimbriata adopted cryptic stalking only against intact salticid lures and modified lures on which the large anterior-median eyes were visible. Ordinary stalking was usually adopted when the lure did not have the anterior-median eyes visible. There was no evidence that cues from the legs of prey salticids influence the choice of stalking style of P. fimbriata, but cues from the legs do appear to influence strongly whether a prey is stalked at all. Cues from the cephalothorax and abdomen also influenced the stalking tendency, but to a lesser degree than cues from the legs.
An algorithm to describe the perceptual processes of P. fimbriata when visually discriminating between salticid and non-salticid prey is discussed.
Recent research on the eyes and vision-guided behaviour of jumping spiders (Salticidae) is reviewed. Special attention is given to Portia Karsch. The species in this African, Asian and Australian genus have especially complex predatory strategies. Portia’s preferred prey are other spiders, which are captured through behavioural sequences based on making aggressive-mimicry web signals, problem solving and planning. Recent research has used Portia to study cognitive attributes more often associated with large predatory mammals such as lions and rarely considered in studies on spiders. In salticids, complex behaviour and high-spatial-acuity vision are tightly interrelated. Salticid eyes are unique and complex. How salticid eyes function is reviewed. Size constraints are discussed.
Portia fimbriata, an araneophagic jumping spider (Salticidae), makes undirected leaps (erratic leaping with no particular target being evident) in the presence of chemical cues from Jacksonoides queenslandicus, another salticid and a common prey of P. fimbriata. Whether undirected leaping by P. fimbriata functions as hunting by speculation is investigated experimentally.
Our first hypothesis, that undirected leaps provoke movement by J. queenslandicus, was investigated using living P. fimbriata and three types of lures made from dead, dry arthropods (P. fimbriata, J. queenslandicus, and Musca domestica). When a living P. fimbriata made undirected leaps or a spring-driven device made the lures suddenly move up and down, simulating undirected leaping, J. queenslandicus responded by waving its palps and starting to walk. There was no statistical evidence that the species from which the lure was made influenced J. queenslandicus’ response in these tests.
Our second hypothesis, that J. queenslandicus reveals its location to P. fimbriata by moving, was investigated by recording P. fimbriata’s reaction to J. queenslandicus when J. queenslandicus reacted to lures simulating undirected leaping. In these tests, P. fimbriata responded by turning toward J. queenslandicus and waving its palps.
Portia is a genus of web-invading araneophagic jumping spiders known from earlier studies to derive aggressive-mimicry signals by using a generate-and-test algorithm (trial-and-error tactic). Here P. fimbriata’s use of trial-and-error to solve a confinement problem (how to escape from an island surrounded by water) is investigated. Spiders choose between two potential escape tactics (leap or swim), one of which will fail (bring spider no closer to edge of tray) and the other of which will partially succeed (bring spider closer to edge of tray). The particular choice that will partially succeed is unknown to the spider. Using trial-and-error, P. fimbriata solves the confinement problem both when correct choices are rewarded (i.e. when the spider is moved closer to edge of tray) and when incorrect choices are punished (i.e. when the spider gets no closer to edge of tray).
Three species of Portia (Portia africana from Kenya, Portia fimbriata from Australia and Portia labiata from the Philippines) were tested with flies Drosophila immigrans and Musca domestica and with web-building spiders Badumna longinquus and Pholcus phalangioides. Badumna longinquus has powerful chelicerae, but not especially long legs, whereas Ph. phalangioides has exceptionally long legs, but only small, weak chelicerae. Typically, Portia sighted flies, walked directly towards them and attacked without adjusting orientation. However, Portia’s attacks on the spiders were aimed primarily at the cephalothorax instead of the legs or abdomen. Portia usually targeted the posterior-dorsal region of B. longinquus’ cephalothorax by attacking this species from above and behind. When the prey was Ph. phalangioides, attack orientation was defined primarily by opportunistic gaps between this species’ long legs (gaps through which Portia could contact the pholcid’s body without contacting one of the pholcid’s legs). Portia’s attack strategy appears to be an adjustment to the different types of risk posed by different types of prey.
Portia is a genus of web-invading araneophagic (spider eating) jumping spiders known from earlier studies to derive aggressive-mimicry signals by using a generate-and-test (trial and error) algorithm. We studied individuals of Portia labiata from two populations (Los Baños and Sagada) in the Philippines that have previously been shown to differ in the level to which they rely on trial-and-error derivation of signals for prey capture (Los Baños relied on trial and error more strongly than Sagada P. labiata).
Here we investigated P. labiata’s use of trial and error in a novel situation (a confinement problem: how to escape from an island surrounded by water) that is unlikely to correspond closely to anything the spider would encounter in nature. During Experiment 1, spiders chose between two potential escape tactics (leap or swim), one of which was set at random to fail (brought the spider no closer to the edge of the tray) and the other of which was set to partially succeed (brought the spider closer to the edge of the tray). By using trial and error, the Los Baños P. labiata solved the confinement problem significantly more often than the Sagada P. labiata in Experiment 1, both when correct choices were positively reinforced (i.e., when the spider was moved closer to the edge of the tray) and when incorrect choices were punished (i.e., when the spider got no closer to the edge of the tray). In Experiment 2, the test individual’s first choice was always set to fail, and P. labiata was given repeated opportunities to respond to feedback, yet the Sagada P. labiata continued to place little reliance on trial and error for solving the confinement problem.
That the Los Baños P. labiata relied more strongly on trial-and-error problem solving than the Sagada P. labiata has now been demonstrated across two different tasks.
Our objective was to use expectancy-violation methods for determining whether Portia africana, a salticid spider that specializes in eating other spiders, is proficient at representing exact numbers of prey. In our experiments, we relied on this predator’s known capacity to gain access to prey by following pre-planned detours. After Portia first viewed a scene consisting of a particular number of prey items, it could then take a detour during which the scene went out of view. Upon reaching a tower at the end of the detour, Portia could again view a scene, but now the number of prey items might be different. We found that, compared with control trials in which the number was the same as before, Portia’s behaviour was significantly different in most instances when we made the following changes in number: 1 versus 2, 1 versus 3, 1 versus 4, 2 versus 3, 2 versus 4 or 2 versus 6. These effects were independent of whether the larger number was seen first or second. No significant effects were evident when the number of prey changed between 3 versus 4 or 3 versus 6. When we changed prey size and arrangement while keeping prey number constant, no significant effects were detected. Our findings suggest that Portia represents 1 and 2 as discrete number categories, but categorizes 3 or more as a single category that we call ‘many’.
Most of you probably know about Turing machines: hypothetical gizmos built of paper punch-tape, read-write heads, and imagination, which can—step by laborious step—emulate the operation of any computer. And some of you may be old enough to remember the Sinclair ZX-80— a sad little personal computer so primitive that it couldn’t even run its video display and its keyboard at the same time (typing would cause the screen to go dark). Peer into the darkness between these artifacts, stir in a little DNA, and what do you get? This hairy little spider right here. A pinpoint brain with less than a million neurons, somehow capable of mammalian-level problem-solving. And just maybe, a whole new approach to cognition.
Here’s the thumbnail sketch: we have here a spider who eats other spiders, who changes her foraging strategy on the fly, who resorts to trial and error techniques to lure prey into range. She will brave a full frontal assault against prey carrying an egg sac, but sneak up upon an unencumbered target of the same species…Portia improvises. But it’s not just this flexible behavioral repertoire that’s so amazing. It’s not the fact that somehow, this dumb little spider with its crude compound optics has visual acuity to rival a cat’s (even though a cat’s got orders of magnitude more neurons in one retina than our spider has in her whole damn head). It’s not even the fact that this little beast can figure out a maze which entails recognizing prey, then figuring out an approach path along which that prey is not visible (i.e., the spider can’t just keep her eyes on the ball: she has to develop and remember a search image), then follow her best-laid plans by memory including recognizing when she’s made a wrong turn and retracing her steps, all the while out of sight of her target. No, the really amazing thing is how she does all this with a measly 600,000 neurons— how she pulls off cognitive feats that would challenge a mammal with 70 million or more.
She does it like a Turing Machine, one laborious step at a time. She does it like a Sinclair ZX-80: running one part of the system then another, because she doesn’t have the circuitry to run both at once. She does it all sequentially, by timesharing. She’ll sit there for two fucking hours, just watching. It takes that long to process the image, you see: whereas a cat or a mouse would assimilate the whole hi-res vista in an instant, Portia’s poor underpowered graphics driver can only hold a fraction of the scene at any given time. So she scans, back and forth, back and forth, like some kind of hairy multilimbed Cylon centurion, scanning each little segment of the game board in turn…Portia won’t be deterred by the fact that she only has a few percent of a real brain: she emulates the brain she needs, a few percent at a time.
I wonder what the limits are to Portia’s painstaking intellect. Suppose we protected her from predators, and hooked her up to a teensy spider-sized glucose drip so she wouldn’t starve. It takes her a couple of hours to capture a snapshot; how long will it take the fuzzy-legged little beauty to compose a sonnet? Are we looking at a whole new kind of piecemeal, modular intellect here? And why the hell didn’t I think of it first? [Watts would reuse this idea in his 2014 SF novel Echopraxia.]
Peter Watts is a Canadian science fiction author. He specializes in hard science fiction. He earned a Ph.D from the University of British Columbia in Vancouver, British Columbia in 1991, from the Department of Zoology and Resource Ecology. He went on to hold several academic research and teaching positions, and worked as a marine-mammal biologist. He began publishing fiction around the time he finished graduate school.
Sociology is the study of human behavior. Sociology refers to social behavior, society, patterns of social relationships, social interaction, and culture that surrounds everyday life. It is a social science that uses various methods of empirical investigation and critical analysis to develop a body of knowledge about social order and social change. Sociology can also be defined as the general science of society. While some sociologists conduct research that may be applied directly to social policy and welfare, others focus primarily on refining the theoretical understanding of social processes. Subject matter can range from micro-level analyses of society to macro-level analyses.
Welfare is a type of government support intended to ensure that members of a society can meet basic human needs such as food and shelter. Social security may either be synonymous with welfare, or refer specifically to social insurance programs, which provide support only to those who have previously contributed, as opposed to social assistance programs, which provide support on the basis of need alone. The International Labour Organization defines social security as covering support for those in old age, support for the maintenance of children, medical treatment, parental and sick leave, unemployment and disability benefits, and support for sufferers of occupational injury.
Long-standing problems in standard scientific methodology have exploded as the “Replication Crisis”: the discovery that many results in fields as diverse as psychology, economics, medicine, biology, and sociology are in fact false, or measured far less accurately than reported. I cover here a handful of the issues and publications on this large, important, and rapidly developing topic up to about 2013, at which point the Replication Crisis became too large a topic to cover more than cursorily.
The crisis is caused by methods & publishing procedures which interpret random noise as important results, far too small datasets, selective analysis by an analyst trying to reach expected/desired results, publication bias, poor implementation of existing best-practices, nontrivial levels of research fraud, software errors, philosophical beliefs among researchers that false positives are acceptable, neglect of known confounding like genetics, and skewed incentives (financial & professional) to publish ‘hot’ results.
Thus, any individual piece of research typically establishes little. Scientific validation comes not from small p-values, but from discovering a regular feature of the world which disinterested third parties can discover with straightforward research done independently on new data with new procedures—replication.
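The multiple-comparisons mechanism behind many of these false positives is easy to demonstrate. A toy Monte Carlo sketch (my illustration, not taken from any of the papers covered): under the null hypothesis, p-values are uniformly distributed, so an analyst who tries 20 independent analyses at α = 0.05 will "discover" at least one significant result in roughly 64% of datasets containing no real effect at all.

```python
import random

random.seed(0)

def experiment(n_tests=20, alpha=0.05):
    # Under the null hypothesis, every p-value is uniform on [0, 1];
    # the "experiment" yields a publication if any analysis crosses alpha.
    return any(random.random() < alpha for _ in range(n_tests))

n_experiments = 10_000
false_positive_rate = sum(experiment() for _ in range(n_experiments)) / n_experiments
print(round(false_positive_rate, 2))  # close to the analytic 1 - 0.95**20 ≈ 0.64
```

This is why selective analysis plus publication bias alone, with no fraud required, suffices to fill a literature with noise.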
The Poetry Foundation is a Chicago-based American foundation created to promote poetry in the wider culture. It was formed from Poetry magazine, which it continues to publish, with a 2003 gift of $200 million from philanthropist Ruth Lilly.
In November 2019, I experimented with training a GPT-2 neural net model to generate folk music in the high-level ABC music text format, following previous work in 2016 which used a char-RNN trained on a dataset from ‘The Session’. A GPT-2 hypothetically can improve on an RNN by better global coherence & copying of patterns, without problems with the hidden-state bottleneck.
I encountered problems with the standard GPT-2 model’s encoding of text which damaged results, but after fixing that, I successfully trained it on n = 205,304 ABC music pieces taken from The Session & ABCnotation.com. The resulting music samples are in my opinion quite pleasant. (A similar model was later retrained by Geerlings & Meroño-Peñuela 2020.)
We followed the ABC folk model with an ABC-MIDI model: a dataset of 453k ABC pieces decompiled from MIDI pieces, which fit into GPT-2-117M with an expanded context window when trained on TPUs. The MIDI pieces are far more diverse and challenging, and GPT-2 underfits and struggles to produce valid samples, but when sampling succeeds, it can generate even better musical samples.
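An aside on the format: ABC’s plain-text headers make crude automated filtering of generated samples straightforward. A minimal validity sketch (a hypothetical helper for illustration, not the filtering actually used for these models): a plausible piece needs a tune index (`X:`), a key (`K:`), and at least one line of music body.

```python
import re

# Hypothetical sanity filter for generated ABC pieces: require a tune index
# (X:), a key (K:), and at least one line of actual music after the headers.
def looks_like_abc(piece: str) -> bool:
    lines = [l.strip() for l in piece.strip().splitlines() if l.strip()]
    has_index = any(re.match(r"X:\s*\d+", l) for l in lines)
    has_key = any(l.startswith("K:") for l in lines)
    body = [l for l in lines if not re.match(r"[A-Za-z]:", l)]
    return has_index and has_key and len(body) > 0

sample = """X:1
T:Generated Reel
M:4/4
K:D
DFA dfa|bag fed|"""
print(looks_like_abc(sample))  # True
```

A real pipeline would hand survivors to an ABC parser such as `abc2midi`, but even this syntax check discards most malformed samples cheaply.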
Standard language generation neural network models, like GPT-2, are trained via likelihood training to imitate human text corpuses. Generated text suffers from persistent flaws like repetition, due to myopic generation word-by-word, and cannot improve on the training data because they are trained to predict ‘realistic’ completions of the training data.
A proposed alternative is to use reinforcement learning to train the NNs, to encourage global properties like coherence & lack of repetition, and potentially improve over the original corpus’s average quality. Preference learning trains a reward function on human ratings, and uses that as the ‘environment’ for a blackbox DRL algorithm like PPO.
OpenAI released a codebase implementing this dual-model preference learning approach for textual generation, based on GPT-2. Having previously used GPT-2 for poetry & music generation, I experimented with GPT-2 preference learning for unconditional music and poetry generation.
I found that preference learning seemed to work better for music than poetry, and seemed to reduce the presence of repetition artifacts, but the results, at n≅7,400 ratings compiled over 23 iterations of training+sampling November 2019–January 2020, are not dramatically better than alternative improvements like scaling up models or more thorough data-cleaning or more stringent sample curation. My blind ratings using n≅200 comparisons showed no large advantage for the RL-tuned samples (winning only 93 of 210 comparisons, or ~44%).
This may be due to insufficient ratings, bad hyperparameters, or not using samples generated with common prefixes, but I suspect it’s the first, as some NLP tasks in Ziegler et al 2019 required up to 60k ratings for good performance, and the reward model appeared to achieve poor performance & succumb to adversarial examples easily.
Working with it, I suspect that preference learning is unnecessarily sample-inefficient & data-inefficient, and that the blackbox reinforcement learning approach is inferior to directly using the reward model to optimize text samples, and propose two major architectural overhauls: have the reward model directly model the implied ranking of every datapoint, and drop the agent model entirely in favor of backprop-powered gradient ascent which optimizes sequences to maximize the reward model’s output.
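The reward-model half of this setup is pairwise preference fitting: given a human judgment "A is better than B", the model is trained so that sigmoid(r_A − r_B) matches the observed preference, the Bradley-Terry formulation used by Ziegler et al 2019. A minimal numeric sketch of that loss (illustrative only, with scalar rewards standing in for the reward model's outputs):

```python
import math

# Bradley-Terry pairwise preference loss, the core of reward-model training:
# P(A preferred over B) = sigmoid(r_A - r_B), fit by negative log-likelihood.
def preference_loss(r_preferred: float, r_other: float) -> float:
    return -math.log(1.0 / (1.0 + math.exp(-(r_preferred - r_other))))

# A reward model that already scores the human-preferred sample higher
# incurs low loss; one that ranks it lower is penalized heavily:
print(round(preference_loss(2.0, 0.0), 3))  # 0.127
print(round(preference_loss(0.0, 2.0), 3))  # 2.127
```

The ranking overhaul proposed above amounts to applying this loss over every pair implied by a total ordering of the rated samples, rather than only over the pairs a human happened to compare.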
…Black is GPT-2. Its excuse [for this chess blunder] is that it’s a text prediction program with no concept of chess. As far as it knows, it’s trying to predict short alphanumeric strings like “e2e4” or “Nb7”. Nobody told it this represents a board game. It doesn’t even have a concept of 2D space that it could use to understand such a claim. But it still captured my rook! Embarrassing!…Last month, I asked him if he thought GPT-2 could play chess. I wondered if he could train it on a corpus of chess games written in standard notation (where, for example, e2e4 means “move the pawn at square e2 to square e4”). There are literally millions of games written up like this. GPT-2 would learn to predict the next string of text, which would correspond to the next move in the chess game. Then you would prompt it with a chessboard up to a certain point, and it would predict how the chess masters who had produced its training data would continue the game – ie make its next move using the same heuristics they would. Gwern handed the idea to his collaborator Shawn Presser, who had a working GPT-2 chess engine running within a week:…You can play against GPT-2 yourself by following the directions in the last tweet, though it won’t be much of a challenge for anyone better than I am.
…What does this imply? I’m not sure (and maybe it will imply more if someone manages to make it actually good). It was already weird to see something with no auditory qualia learn passable poetic meter. It’s even weirder to see something with no concept of space learn to play chess. Is any of this meaningful? How impressed should we be that the same AI can write poems, compose music, and play chess, without having been designed for any of those tasks? I still don’t know.
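The framing that makes the chess experiment work is that a game is nothing but a text string of moves, so "playing" reduces to next-token prediction plus a syntax filter on the output. A toy sketch of that representation (hypothetical helper names; this validates coordinate-notation syntax only, not move legality):

```python
import re

# A game is just a space-separated string of coordinate moves, so "playing"
# is next-token prediction plus a syntax check on the predicted token.
def is_coordinate_move(token: str) -> bool:
    # from-square + to-square, with an optional promotion piece, e.g. "e7e8q"
    return re.fullmatch(r"[a-h][1-8][a-h][1-8][qrbn]?", token) is not None

prompt = "e2e4 e7e5 g1f3"   # the game so far, exactly as the model sees it
candidate = "b8c6"          # a predicted continuation
print(is_coordinate_move(candidate))  # True
print(is_coordinate_move("Nc6"))      # False: algebraic, not coordinate, notation
```

A full engine wrapper would re-sample whenever the predicted token is syntactically valid but illegal on the current board, which is the filtering approach the live-play interface described below takes.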
“Language Models are Few-Shot Learners”, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei (2020-05-28):
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions—something which current NLP systems still largely struggle to do.
Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10× more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
Compared to GPT-2, GPT-3 improves performance on character-level tasks like rhyming, alliteration, punning, anagrams or permutations, acrostic poems, and arithmetic less than expected, despite being very good at many other closely-related kinds of writing like satire.
Why? A plausible explanation is an obscure technical detail: as a performance optimization, GPT does not see characters but sub-word-chunks called “byte-pair encodings” (BPEs). Because GPTs never see characters but opaque partial-words, which vary chaotically based on the specific word and even the surrounding context, they are unable to easily learn about character-level aspects of language, like similar spellings or sounds, and are forced to learn relationships much more indirectly, like by brute-force memorization of word pairs.
Some experiments with reformatting GPT-3’s poorest-performing tasks to avoid inconsistent BPE encodings of strings show small to large performance gains, consistent with this theory.
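A toy byte-pair-encoding sketch (hand-picked merges, not GPT-2’s actual 50k-merge vocabulary) shows the mechanism: once characters are fused into opaque chunks, words with nearly identical spellings can share no tokens at all, so spelling-level regularities become invisible to the model.

```python
# Toy BPE: greedily apply each learned merge rule, in order, to a word.
merges = [("t", "h"), ("th", "e"), ("i", "n"), ("in", "g")]

def bpe(word):
    tokens = list(word)
    for a, b in merges:
        i = 0
        while i < len(tokens) - 1:
            if tokens[i] == a and tokens[i + 1] == b:
                tokens[i:i + 2] = [a + b]  # fuse the adjacent pair
            else:
                i += 1
    return tokens

print(bpe("the"))    # ['the']           -- one opaque chunk
print(bpe("lathe"))  # ['l', 'a', 'the'] -- 'lathe' and 'the' share a chunk...
print(bpe("thin"))   # ['th', 'in']      -- ...but 'thin' and 'the' share none
```

From the model’s perspective, `the` and `thin` are unrelated token IDs despite sharing three of four letters, which is exactly why rhyming, anagrams, and arithmetic on digit-strings suffer.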
The GPT-3 neural network is so large a model in terms of power and dataset that it exhibits qualitatively different behavior: you do not apply it to a fixed set of tasks which were in the training dataset, requiring retraining on additional data if one wants to handle a new task (as one would have to retrain GPT-2); instead, you interact with it, expressing any task in terms of natural language descriptions, requests, and examples, tweaking the prompt until it “understands” & it meta-learns the new task based on the high-level abstractions it learned from the pretraining.
This is a rather different way of using a DL model, and it’s better to think of it as a new kind of programming, where the prompt is now a “program” which programs GPT-3 to do new things.
Jumping spiders or the Salticidae are a family of spiders. As of 2019, it contained over 600 described genera and over 6000 described species, making it the largest family of spiders at 13% of all species. Jumping spiders have some of the best vision among arthropods and use it in courtship, hunting, and navigation. Although they normally move unobtrusively and fairly slowly, most species are capable of very agile jumps, notably when hunting, but sometimes in response to sudden threats or crossing long gaps. Both their book lungs and tracheal system are well-developed, and they use both systems. Jumping spiders are generally recognized by their eye pattern. All jumping spiders have four pairs of eyes, with the anterior median pair being particularly large.
Echopraxia is a hard science fiction novel by Canadian writer Peter Watts. It is a "sidequel" to his 2006 novel Blindsight. It follows the story of a biologist who gets caught up in a voyage into the heart of the Solar System among members of a transcendentalist monastic order and allies to investigate a mysterious signal seemingly coming from the mission sent to initiate first contact in Watts' previous novel.
The replication crisis is, as of 2020, an ongoing methodological crisis in which it has been found that many scientific studies are difficult or impossible to replicate or reproduce. The replication crisis affects the social sciences and medicine most severely. The crisis has long-standing roots; the phrase was coined in the early 2010s as part of a growing awareness of the problem. The replication crisis represents an important body of research in the field of metascience.
While training a GPT-2-117M on a folk music corpus written in ABC format, we found that the otherwise-high-quality model persistently generated syntax errors: random spaces would be generated, rendering a music piece either erroneous or lower-quality. Why? It seems to be an issue with the GPT BPE encoder’s handling of spaces, which makes it difficult to emit the right space-separated characters. We found that ABC does not actually require spaces, and we simply removed all spaces from the corpus—noticeably improving the quality of generated pieces.
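A sketch of the preprocessing fix (one plausible implementation with a hypothetical helper name; the actual fix simply removed every space, while this variant keeps the header fields readable, since only the music body is BPE-fragile):

```python
# Despace ABC pieces so the BPE encoder never has to decide how to attach
# a space token: header lines (X:, T:, K:, ...) are left intact, the music
# body loses all spaces.
def strip_abc_spaces(piece: str) -> str:
    header_prefixes = tuple(c + ":" for c in "XTMLKQRCNOAZ")
    out = []
    for line in piece.splitlines():
        if line.startswith(header_prefixes):
            out.append(line)                   # keep headers readable
        else:
            out.append(line.replace(" ", ""))  # despace the music body
    return "\n".join(out)

print(strip_abc_spaces("K:D\nDFA dfa | bag fed"))  # "K:D" then "DFAdfa|bagfed"
```

Since ABC parsers treat body spaces as purely cosmetic, the despaced corpus decompiles to identical music while presenting the model a cleaner token stream.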
Generating symbolic music with language models is a promising research area, with potential applications in automated music composition. Recent work shows that Transformer architectures can learn to generate compelling four-instrument scores from large MIDI datasets. In this paper, we re-train the small (117M) GPT-2 model with a large dataset in ABC notation, and generate samples of single-instrument folk music. Our BLEU and ROUGE based quantitative, and survey based qualitative, evaluations suggest that ABC notation is learned with syntactical and semantic correctness, and that samples contain robust and believable n-grams.
To expand the ABC GPT-2 model to cover a wider variety of musical genres, I turn to the next-most compact widespread music encoding format: MIDI. There are hundreds of thousands of MIDIs which can be decompiled to ABC format, averaging ~10k BPEs—within GPT-2-117M’s feasible context window when trained on TPUs (which permit training of context windows up to 30k wide).
We compile the ABC from before and 2 large MIDI datasets, and convert to ABC, yielding ~453k usable ABC-MIDI musical files (~5.1GB of text). We trained January–April 2020 on our TPU swarm (with many interruptions), achieving a final loss of ~0.2 (underfit).
Sampling from the final model is hit-or-miss, as it is prone to the likelihood repetition trap, and because it generates instruments one by one, it is common for instruments to be cut off or otherwise broken during sampling (indicating that sampling is increasingly a bigger problem than training for long-range sequence modeling). However, successful pieces are possible, and are musically far more diverse than the folk ABC corpus, with many pleasingly complex samples.
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset—matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples.
The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text.
These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
This work applies natural language modeling to generate plausible strategic moves in the ancient game of Go. We train the Generative Pretrained Transformer (GPT-2) to mimic the style of Go champions as archived in Smart Game Format (SGF), which offers a text description of move sequences. The trained model further generates valid but previously unseen strategies for Go. Because GPT-2 preserves punctuation and spacing, the raw output of the text generator provides inputs to game visualization and creative patterns, such as the Sabaki project’s game engine using auto-replays. Results demonstrate that language modeling can capture both the sequencing format of championship Go games and their strategic formations. Compared to random game boards, the GPT-2 fine-tuning shows efficient opening move sequences favoring corner play over less advantageous center and side play. Game generation as a language modeling task offers novel approaches to more than 40 other board games where historical text annotation provides training data (e.g., Amazons & Connect 4/6).
This work demonstrates that natural language transformers can support more generic strategic modeling, particularly for text-archived games. In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard. With further fine-tuning, the transformer learns complex gameplay by training on 2.8 million chess games in Portable Game Notation. After 30,000 training steps, OpenAI’s Generative Pre-trained Transformer (GPT-2) optimizes weights for 774 million parameters. This fine-tuned Chess Transformer generates plausible strategies and displays game formations identifiable as classic openings, such as English or the Slav Exchange. Finally, in live play, the novel model demonstrates a human-to-transformer interface that correctly filters illegal moves and provides a novel method to challenge the transformer’s chess strategies. We anticipate future work will build on this transformer’s promise, particularly in other strategy games where features can capture the underlying complex rule syntax from simple but expressive player annotations.
A decompiler is a computer program that takes an executable file as input, and attempts to create a high level source file which can be recompiled successfully. It is therefore the opposite of a compiler, which takes a source file and makes an executable. Decompilers are usually unable to perfectly reconstruct the original source code, and as such, will frequently produce obfuscated code. Nonetheless, decompilers remain an important tool in the reverse engineering of computer software.