2016-belsky.pdf: “The Genetics of Success: How Single-Nucleotide Polymorphisms Associated With Educational Attainment Relate to Life-Course Development”, (2016-06-01; ):
A previous genome-wide association study (GWAS) of more than 100,000 individuals identified molecular-genetic predictors of educational attainment.
We undertook in-depth life-course investigation of the polygenic score derived from this GWAS using the 4-decade Dunedin Study (N = 918). There were 5 main findings.
- Polygenic scores predicted adult economic outcomes even after accounting for educational attainments.
- Genes and environments were correlated: children with higher polygenic scores were born into better-off homes.
- Children’s polygenic scores predicted their adult outcomes even when analyses accounted for their social-class origins; social-mobility analysis showed that children with higher polygenic scores were more upwardly mobile than children with lower scores.
- Polygenic scores predicted behavior across the life course, from early acquisition of speech and reading skills through geographic mobility and mate choice and on to financial planning for retirement.
- Polygenic-score associations were mediated by psychological characteristics, including intelligence, self-control, and interpersonal skill. Effect sizes were small.
Factors connecting DNA sequence with life outcomes may provide targets for interventions to promote population-wide positive development.
[Keywords: genetics, behavior genetics, intelligence, personality, adult development]
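A polygenic score of the kind used in this study is, at bottom, just a weighted sum of a person’s effect-allele counts, with weights taken from GWAS summary statistics. A minimal sketch (the SNP effect sizes and genotypes below are invented for illustration, not values from the study):

```python
import numpy as np

# Hypothetical per-SNP effect sizes (betas) from GWAS summary statistics
# for years of education; these numbers are illustrative only.
gwas_betas = np.array([0.02, -0.01, 0.015, 0.03])

# Genotypes: counts (0, 1, or 2) of the effect allele for each of
# 3 individuals at the 4 SNPs above.
genotypes = np.array([
    [0, 1, 2, 1],
    [2, 0, 1, 0],
    [1, 1, 0, 2],
])

# The polygenic score is the beta-weighted sum of allele counts;
# scores are then typically standardized within the analysis sample.
raw_scores = genotypes @ gwas_betas
polygenic_scores = (raw_scores - raw_scores.mean()) / raw_scores.std()
```

Real analyses sum over hundreds of thousands of SNPs and handle LD and missingness, but the arithmetic is the same.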
People’s differences in cognitive functions are partly heritable and are associated with important life outcomes. Previous genome-wide association (GWA) studies of cognitive functions have found evidence for polygenic effects yet, to date, there are few replicated genetic associations. Here we use data from the UK Biobank sample to investigate the genetic contributions to variation in tests of three cognitive functions and in educational attainment. GWA analyses were performed for verbal-numerical reasoning (n = 36,035), memory (n = 112,067), reaction time (n = 111,483) and for the attainment of a college or a university degree (n = 111,114). We report genome-wide statistically-significant single-nucleotide polymorphism (SNP)-based associations in 20 genomic regions, and statistically-significant gene-based findings in 46 regions. These include findings in the ATXN2, CYP2D6, APBA1 and CADM2 genes. We report replication of these hits in published GWA studies of cognitive function, educational attainment and childhood intelligence. There is also replication, in UK Biobank, of hits reported previously in GWA studies of educational attainment and cognitive function. GCTA-GREML analyses, using common SNPs (minor allele frequency > 0.01), indicated statistically-significant SNP-based heritabilities of 31% (s.e.m. = 1.8%) for verbal-numerical reasoning, 5% (s.e.m. = 0.6%) for memory, 11% (s.e.m. = 0.6%) for reaction time and 21% (s.e.m. = 0.6%) for educational attainment. Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort. The genomic regions identified include several novel loci, some of which have been associated with intracranial volume, neurodegeneration, Alzheimer’s disease and schizophrenia.
2016-okbay-2.pdf: “Genome-wide association study identifies 74 loci associated with educational attainment”, (2016-05-11; ):
Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals1. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample1,2 of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide statistically-significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.
Individuals with lower socio-economic status (SES) are at increased risk of physical and mental illnesses and tend to die at an earlier age. Explanations for the association between SES and health typically focus on factors that are environmental in origin. However, common single nucleotide polymorphisms (SNPs) have been found collectively to explain around 18% (SE = 5%) of the phenotypic variance of an area-based social deprivation measure of SES. Molecular genetic studies have also shown that physical and psychiatric diseases are at least partly heritable. It is possible, therefore, that phenotypic associations between SES and health arise partly due to a shared genetic etiology.
We conducted genome-wide association studies (GWAS) on social deprivation and on household income using the 112,151 participants of UK Biobank. We find that common SNPs explain 21% (SE = 0.5%) of the variation in social deprivation and 11% (SE = 0.7%) in household income. 2 independent SNPs attained genome-wide statistical-significance for household income: rs187848990 on chromosome 2, and rs8100891 on chromosome 19. Genes in the regions of these SNPs have been associated with intellectual disabilities, schizophrenia, and synaptic plasticity. Extensive genetic correlations were found between both measures of socioeconomic status and illnesses, anthropometric variables, psychiatric disorders, and cognitive ability.
These findings show that some SNPs associated with SES are involved in the brain and central nervous system. The genetic associations with SES are probably mediated via other partly-heritable variables, including cognitive ability, education, personality, and health.
One of the best predictors of children’s educational achievement is their family’s socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3,000 UK-representative unrelated children, we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% of the variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children’s educational achievement and its association with family SES.
The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (n = 370, n = 5,850 for controls; replication sample: n = 173, n = 3,766 for controls). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR) = 2.19 (1.53–3.14), p = 1.9 × 10−5). Two polymorphisms at the 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide statistical-significance (OR = 1.59 (1.37–1.85), p = 1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family environment (β = 0.68, p = 0.012). Functional analysis in brain tissue in the open-access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide statistically-significant and replicable findings on genetic variants associated with any personality disorder.
2016-robinson.pdf: “Genetic risk for autism spectrum disorders and neuropsychiatric variation in the general population”, Elise B. Robinson, Beate St Pourcain, Verneri Anttila, Jack A. Kosmicki, Brendan Bulik-Sullivan, Jakob Grove, Julian Maller, Kaitlin E. Samocha, Stephan J. Sanders, Stephan Ripke, Joanna Martin, Mads V. Hollegaard, Thomas Werge, David M. Hougaard, Benjamin M. Neale, David M. Evans, David Skuse, Preben Bo Mortensen, Anders D. Børglum, Angelica Ronald, George Davey Smith, Mark J. Daly
2016-kendall.pdf: “Cognitive Performance Among Carriers of Pathogenic Copy Number Variants: Analysis of 152,000 UK Biobank Subjects”, Kimberley M. Kendall, Elliott Rees, Valentina Escott-Price, Mark Einon, Rhys Thomas, Jonathan Hewitt, Michael C. O’Donovan, Michael J. Owen, James T. R. Walters, George Kirov
Higher paternal age at offspring conception increases de novo genetic mutations (Kong et al., 2012). Based on evolutionary genetic theory we predicted that the offspring of older fathers would be less likely to survive and reproduce, i.e. have lower fitness. In a sibling control study, we find clear support for negative paternal age effects on offspring survival, mating and reproductive success across four large populations with an aggregate N > 1.3 million in main analyses. Compared to a sibling born when the father was 10 years younger, individuals had 4–13% fewer surviving children in the four populations. Three populations were pre-industrial (1670-1850) Western populations and showed a pattern of paternal age effects across the offspring’s lifespan. In 20th-century Sweden, we found no negative paternal age effects on child survival or marriage odds. Effects survived tests for competing explanations, including maternal age and parental loss. To the extent that we succeeded in isolating a mutation-driven effect of paternal age, our results can be understood to show that de novo mutations reduce offspring fitness across populations and time. We can use this understanding to predict the effect of increasingly delayed reproduction on offspring genetic load, mortality and fertility.
Causes of the well-documented association between low levels of cognitive functioning and many adverse neuropsychiatric outcomes, poorer physical health and earlier death remain unknown. We used linkage disequilibrium score regression and polygenic profile scoring to test for shared genetic aetiology between cognitive functions and neuropsychiatric disorders and physical health. Using information provided by many published genome-wide association study consortia, we created polygenic profile scores for 24 vascular-metabolic, neuropsychiatric, physiological-anthropometric and cognitive traits in the participants of UK Biobank, a very large population-based sample (n = 112,151). Pleiotropy between cognitive and health traits was quantified by deriving genetic correlations using summary genome-wide association study statistics and the method of linkage disequilibrium score regression. Substantial and statistically-significant genetic correlations were observed between cognitive test scores in the UK Biobank sample and many of the mental and physical health-related traits and disorders assessed here. In addition, highly statistically-significant associations were observed between the cognitive test scores in the sample and many polygenic profile scores, including coronary artery disease, stroke, Alzheimer’s disease, schizophrenia, autism, major depressive disorder, body mass index, intracranial volume, infant head circumference and childhood cognitive ability. Where disease diagnosis was available for participants, we were able to show that these results were not confounded by those who had the relevant disease. These findings indicate that a substantial level of pleiotropy exists between cognitive abilities and many human mental and physical health disorders and traits, and that it can be used to predict phenotypic variance across samples.
2016-pickrell.pdf: “Detection and interpretation of shared genetic influences on 42 human traits”, Tomaz Berisa, Jimmy Z. Liu, Laure Ségurel, Joyce Y. Tung, David A. Hinds, Joseph K. Pickrell
Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlations from genome-wide association study (GWAS) data with current methods are the lack of availability of individual genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique for estimating genetic correlations that requires only summary statistics and is not biased by sample overlap. We use our method to estimate 300 genetic correlations among 25 traits, totaling more than 1.5 million unique phenotype measurements. Our results include genetic correlations between anorexia nervosa and schizophrenia, between anorexia and obesity, and associations between educational attainment and several diseases. These results highlight the power of genome-wide analyses, since there currently are no genome-wide statistically-significant SNPs for anorexia nervosa and only three for educational attainment.
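The quantity being estimated throughout these studies, the genetic correlation, is simply the genetic covariance between two traits normalized by the geometric mean of their heritabilities. A numerical sketch (the input values are illustrative, not estimates from any of the papers above):

```python
import math

def genetic_correlation(genetic_cov: float, h2_trait1: float, h2_trait2: float) -> float:
    """Genetic correlation r_g: genetic covariance divided by the
    geometric mean of the two traits' (SNP) heritabilities."""
    return genetic_cov / math.sqrt(h2_trait1 * h2_trait2)

# Illustrative numbers only: genetic covariance 0.12 between two traits
# with SNP heritabilities of 25% and 36%.
rg = genetic_correlation(genetic_cov=0.12, h2_trait1=0.25, h2_trait2=0.36)
# 0.12 / sqrt(0.25 * 0.36) = 0.12 / 0.3 = 0.4
```

Like an ordinary correlation, r_g is bounded by ±1; the summary-statistic methods above estimate the numerator and denominators without individual-level genotypes.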
“Phenome-wide Heritability Analysis of the UK Biobank”, (2016-08-18):
Heritability estimation provides important information about the relative contribution of genetic and environmental factors to phenotypic variation, and provides an upper bound for the utility of genetic risk prediction models. Recent technological and statistical advances have enabled the estimation of additive heritability attributable to common genetic variants (SNP heritability) across a broad phenotypic spectrum. However, assessing the comparative heritability of multiple traits estimated in different cohorts may be misleading due to the population-specific nature of heritability. Here we report the SNP heritability for 551 complex traits derived from the large-scale, population-based UK Biobank, comprising both quantitative phenotypes and disease codes, and examine the moderating effect of three major demographic variables (age, sex and socioeconomic status) on the heritability estimates. Our study represents the first comprehensive phenome-wide heritability analysis in the UK Biobank, and underscores the importance of considering population characteristics in comparing and interpreting heritability.
Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17–29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn’s disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders.
Disorders of the brain can exhibit considerable epidemiological comorbidity and often share symptoms, provoking debate about their etiologic overlap. We quantified the genetic sharing of 25 brain disorders from genome-wide association studies of 265,218 patients and 784,643 control participants and assessed their relationship to 17 phenotypes from 1,191,588 individuals. Psychiatric disorders share common variant risk, whereas neurological disorders appear more distinct from one another and from the psychiatric disorders. We also identified statistically-significant sharing between disorders and a number of brain phenotypes, including cognitive measures. Further, we conducted simulations to explore how statistical power, diagnostic misclassification, and phenotypic heterogeneity affect genetic correlations. These results highlight the importance of common genetic variation as a risk factor for brain disorders and the value of heritability-based methods in understanding their etiology.
2016-okbay.pdf: “Genetic variants associated with subjective well–being, depressive symptoms, and neuroticism identified through genome–wide analyses”, (2018-04-26; ):
Major depressive disorder (MDD) is a common illness accompanied by considerable morbidity, mortality, costs, and heightened risk of suicide.
We conducted a genome-wide association meta-analysis based in 135,458 cases and 344,901 controls and identified 44 independent and statistically-significant loci. The genetic findings were associated with clinical features of major depression and implicated brain regions exhibiting anatomical differences in cases. Targets of antidepressant medications and genes involved in gene splicing were enriched for smaller association signal. We found important relationships of genetic risk for major depression with educational attainment, body mass, and schizophrenia: lower educational attainment and higher body mass were putatively causal, whereas major depression and schizophrenia reflected a partly shared biological etiology.
All humans carry lesser or greater numbers of genetic risk factors for major depression. These findings help refine the basis of major depression and imply that a continuous measure of risk underlies the clinical phenotype.
2016-day.pdf: “Physical and neurobehavioral determinants of reproductive onset and success”, (2016-04-18; ):
The ages of puberty, first sexual intercourse and first birth signify the onset of reproductive ability, behavior and success, respectively. In a genome-wide association study of 125,667 UK Biobank participants, we identify 38 loci associated (p < 5 × 10−8) with age at first sexual intercourse. These findings were taken forward in 241,910 men and women from Iceland and 20,187 women from the Women’s Genome Health Study. Several of the identified loci also exhibit associations (p < 5 × 10−8) with other reproductive and behavioral traits, including age at first birth (variants in or near ESR1 and RBM6-SEMA3F), number of children (CADM2 and ESR1), irritable temperament (MSRA) and risk-taking propensity (CADM2). Mendelian randomization analyses infer causal influences of earlier puberty timing on earlier first sexual intercourse, earlier first birth and lower educational attainment. In turn, likely causal consequences of earlier first sexual intercourse include reproductive, educational, psychiatric and cardiometabolic outcomes.
“LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis”, (2016-05-03):
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously.
Results: In this manuscript, we describe LD Hub—a centralized database of summary-level GWAS results for 177 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub, and estimated heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
Availability and implementation
The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
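The core of LD score regression can be sketched in a few lines: under the model, a SNP’s expected GWAS χ² statistic increases linearly with its LD score, and the slope of that regression recovers SNP heritability. A toy simulation (all data simulated; this sketch fixes the confounding intercept to its null value, whereas the real method estimates it):

```python
import numpy as np

rng = np.random.default_rng(0)

M = 1000          # number of SNPs
N = 50_000        # GWAS sample size
h2_true = 0.2     # true SNP heritability used to simulate the data
ld_scores = rng.uniform(1, 100, size=M)

# LD score regression model (no confounding):
#   E[chi^2_j] = 1 + (N * h2 / M) * l_j
chi2 = 1 + N * h2_true / M * ld_scores + rng.normal(0, 0.5, size=M)

# Regress chi^2 on LD score; the fitted slope estimates N * h2 / M,
# so rescaling by M / N recovers the heritability.
slope, intercept = np.polyfit(ld_scores, chi2, 1)
h2_est = slope * M / N
```

The intercept is what makes the method robust in practice: inflation from confounding raises the intercept above 1 rather than the slope, which is why real analyses (e.g. the `ldsc` software behind LD Hub) report it separately.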
Educated people are generally healthier, have fewer comorbidities and live longer than people with less education. Previous evidence about the effects of education comes from observational studies, many of which are affected by residual confounding. Legal changes to the minimum school-leaving age are a potential natural experiment which provides a more robust source of evidence about the effects of schooling. Previous studies have exploited this natural experiment using population-level administrative data to investigate mortality, and relatively small surveys to investigate the effect on morbidity.
Here, we add to the evidence using data from a large sample from the UK Biobank. We exploit the raising of the school-leaving age in the UK in September 1972 as a natural experiment, and use regression discontinuity and instrumental variable estimators to identify the causal effects of staying on in school. Remaining in school was positively associated with 23 of 25 outcomes. After accounting for multiple hypothesis testing, we found evidence of causal effects on 12 outcomes; however, the associations of schooling with intelligence, smoking, and alcohol consumption may be due to genomic and socioeconomic confounding factors. Education affects some, but not all, health and socioeconomic outcomes.
Differences between educated and less educated people may be partially due to residual genetic and socioeconomic confounding.
Significance Statement: On average, people who choose to stay in education for longer are healthier, wealthier, and live longer. We investigated the causal effects of education on health, income, and well-being later in life. This is the largest study of its kind to date, and it has objective clinical measures of morbidity and aging. We found evidence that people who were forced to remain in school had higher wages and lower mortality. However, there was little evidence of an effect on intelligence later in life. Furthermore, estimates of the effects of education using conventionally adjusted regression analysis are likely to suffer from genomic confounding. In conclusion, education affects some, but not all, health outcomes later in life.
Funding: The Medical Research Council (MRC) and the University of Bristol fund the MRC Integrative Epidemiology Unit [MC_UU_12013/1, MC_UU_12013/9]. NMD is supported by the Economics and Social Research Council (ESRC) via a Future Research Leaders Fellowship [ES/N000757/1]. The research described in this paper was specifically funded by a grant from the Economics and Social Research Council for Transformative Social Science. No funding body has influenced data collection, analysis or its interpretations. This publication is the work of the authors, who serve as the guarantors for the contents of this paper. This work was carried out using the computational facilities of the Advanced Computing Research Centre and the Research Data Storage Facility of the University of Bristol. This research was conducted using the Resource.
Data access: The statistical code used to produce these results can be accessed here. The final analysis dataset used in this study is archived with , which can be accessed by contacting UK Biobank firstname.lastname@example.org.
It is possible that heritable variance in personality characteristics does not reflect (only) genetic and biological processes specific to personality per se. We tested the possibility that Five-Factor Model personality domains and facets, as rated by people themselves and their knowledgeable informants, reflect polygenic influences that have been previously associated with educational attainment. In a sample of over 3,000 adult Estonians, polygenic scores for educational attainment, based on small contributions from more than 150,000 genetic variants, were correlated with various personality traits, mostly from the Neuroticism and Openness domains. The correlations of personality characteristics with educational attainment-related polygenic influences reflected almost entirely their correlations with phenotypic educational attainment. Structural equation modeling of the associations between polygenic risk, personality (a weighted aggregate of education-related facets) and educational attainment lent relatively strongest support to the possibility of educational attainment mediating (explaining) some of the heritable variance in personality traits.
Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
Analyzing genetic differences between closely related populations can be a powerful way to detect recent adaptation. The very large sample size of the UK Biobank is ideal for detecting selection using population differentiation, and enables an analysis of UK population structure at fine resolution. In analyses of 113,851 UK Biobank samples, population structure in the UK is dominated by 5 principal components (PCs) spanning 6 clusters: Northern Ireland, Scotland, northern England, southern England, and two Welsh clusters. Analyses with ancient Eurasians show that populations in the northern UK have higher levels of Steppe ancestry, and that UK population structure cannot be explained as a simple mixture of Celts and Saxons. A scan for unusual population differentiation along top PCs identified a genome-wide statistically-significant signal of selection at the coding variant rs601338 in FUT2 (p = 9.16 × 10−9). In addition, by combining evidence of unusual differentiation within the UK with evidence from ancient Eurasians, we identified new genome-wide statistically-significant (p < 5 × 10−8) signals of recent selection at two additional loci: CYP1A2/CSK and F12. We detected strong associations to diastolic blood pressure in the UK Biobank for the variants with new selection signals at CYP1A2/CSK (p = 1.10 × 10−19) and for variants with ancient Eurasian selection signals in the ATXN2/SH2B3 locus (p = 8.00 × 10−33), implicating recent adaptation related to blood pressure.
Recent findings from molecular genetics now make it possible to test directly for natural selection by analyzing whether genetic variants associated with various phenotypes have been under selection. I leverage these findings to construct polygenic scores that use individuals’ genotypes to predict their body mass index, educational attainment (EA), glucose concentration, height, schizophrenia, total cholesterol, and (in females) age at menarche. I then examine associations between these scores and fitness to test whether natural selection has been occurring. My study sample includes individuals of European ancestry born between 1931 and 1953 in the Health and Retirement Study, a representative study of the US population. My results imply that natural selection has been slowly favoring lower EA in both females and males, and are suggestive that natural selection may have favored a higher age at menarche in females. For EA, my estimates imply a rate of selection of about −1.5 months of education per generation (which pales in comparison with the increases in EA observed in contemporary times). Though they cannot be projected over more than one generation, my results provide additional evidence that humans are still evolving—albeit slowly, especially when compared to the rapid secular changes that have occurred over the past few generations due to cultural and environmental factors.
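This style of analysis reduces, at its core, to regressing relative lifetime reproductive success (rLRS, number of children divided by the cohort mean) on a polygenic score: the slope is the selection differential per standard deviation of the score. A simulation sketch (all numbers invented for illustration; the actual study uses HRS fertility data and adjusts for score measurement error):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical standardized polygenic scores for 5,000 individuals,
# and completed fertility with a small simulated negative effect of the score.
pgs = rng.normal(size=n)
children = rng.poisson(lam=np.clip(2.0 - 0.05 * pgs, 0.1, None))

# Relative lifetime reproductive success: children relative to the cohort mean.
rlrs = children / children.mean()

# The selection differential is the slope of rLRS on the polygenic score;
# a negative slope means selection against the trait the score predicts.
slope, intercept = np.polyfit(pgs, rlrs, 1)
```

With real data, this per-generation slope is then converted into units of the phenotype (e.g. months of education) via the score's predictive validity.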
2016-kleinstiver.pdf: “High-fidelity CRISPR–Cas9 nucleases with no detectable genome-wide off-target effects”, Benjamin P. Kleinstiver, Vikram Pattanayak, Michelle S. Prew, Shengdar Q. Tsai, Nhu T. Nguyen, Zongli Zheng, J. Keith Joung
2016-kang.pdf: “Introducing precise genetic modifications into human 3PN embryos by CRISPR / Cas-mediated genome editing”, Xiangjin Kang, Wenyin He, Yuling Huang, Qian Yu, Yaoyong Chen, Xingcheng Gao, Xiaofang Sun, Yong Fan
We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages.
Our solution requires no change in the model architecture from our base system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. The rest of the model, which includes encoder, decoder and attention, remains unchanged and is shared across all languages. Using a shared wordpiece vocabulary, our approach enables Multilingual NMT using a single model without any increase in parameters, which is substantially simpler than previous proposals for Multilingual NMT.
Our method often improves the translation quality of all involved language pairs, even while keeping the total number of model parameters constant. On the WMT’14 benchmarks, a single multilingual model achieves comparable performance for English→French and surpasses state-of-the-art results for English→German. Similarly, a single multilingual model surpasses state-of-the-art results for French→English and German→English on WMT’14 and WMT’15 benchmarks respectively. On production corpora, multilingual models of up to twelve language pairs allow for better translation of many individual pairs.
In addition to improving the translation quality of language pairs that the model was trained with, our models can also learn to perform implicit bridging between language pairs never seen explicitly during training, showing that transfer learning and zero-shot translation is possible for neural translation. Finally, we show analyses that hints at a universal interlingua representation in our models and show some interesting examples when mixing languages.
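The mechanism described above is tiny in code terms: the only change to the input pipeline is prepending an artificial target-language token (the paper uses tokens of the form `<2es>`); the model itself is untouched. A sketch:

```python
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial token telling the shared multilingual model
    which language to translate into; the rest of the input is unchanged."""
    return f"<2{target_lang}> {source_sentence}"

# The same English sentence routed to two different target languages:
to_spanish = add_target_token("Hello, how are you?", "es")
to_japanese = add_target_token("Hello, how are you?", "ja")
# to_spanish == "<2es> Hello, how are you?"
```

Because the token is just another vocabulary item, zero-shot translation falls out for free: at inference time one can request a target language for a source language the pair was never trained on.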
2016-graves.pdf#deepmind: “Hybrid computing using a neural network with dynamic external memory”, Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, Adrià Puigdomènech Badia, Karl Moritz Hermann, Yori Zwols, Georg Ostrovski, Adam Cain, Helen King, Christopher Summerfield, Phil Blunsom, Koray Kavukcuoglu, Demis Hassabis
Neural networks augmented with external memory have the ability to learn algorithmic solutions to complex tasks. These models appear promising for applications such as language modeling and machine translation. However, they scale poorly in both space and time as the amount of memory grows—limiting their applicability to real-world domains.
Here, we present an end-to-end differentiable memory access scheme, which we call Sparse Access Memory (SAM), that retains the representational power of the original approaches whilst training efficiently with very large memories. We show that SAM achieves asymptotic lower bounds in space and time complexity, and find that an implementation runs 1,000× faster and with 3,000× less physical memory than non-sparse models.
SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories. As well, we show how our approach can be adapted for models that maintain temporal associations between memories, as with the recently introduced Differentiable Neural Computer.
“Progressive Neural Networks”, (2016-06-15):
Learning to solve complex sequences of tasks–while both leveraging transfer and avoiding catastrophic forgetting–remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
“Sim-to-Real Robot Learning from Pixels with Progressive Nets”, (2016-10-13):
Applying end-to-end learning to solve complex, interactive, pixel-driven control tasks on a robot is an unsolved problem. Deep Reinforcement Learning algorithms are too slow to achieve performance on a real robot, but their potential has been demonstrated in simulated environments. We propose using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world. The progressive net approach is a general framework that enables reuse of everything from low-level visual features to high-level policies for transfer to new tasks, enabling a compositional, yet simple, approach to building complex skills. We present an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap. Unlike other proposed approaches, our real-world experiments demonstrate successful task learning from raw visual input on a fully actuated robot manipulator. Moreover, rather than relying on model-based trajectory optimisation, the task learning is accomplished using only deep reinforcement learning and sparse rewards.
“Overcoming catastrophic forgetting in neural networks”, (2016-12-02):
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST hand written digit dataset and by learning several Atari 2600 games sequentially.
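The "selective slowing" above can be sketched as a quadratic penalty on movement away from old-task weights, weighted by a per-weight importance term (the paper derives importance from the Fisher information; the values below are assumed stand-ins, not computed):

```python
def ewc_penalty(weights, old_weights, fisher, lam=0.5):
    """Elastic-weight-consolidation-style penalty:
    (lam/2) * sum_i F_i * (w_i - w*_i)^2,
    so weights with large F_i (important to old tasks) are slowed most."""
    return 0.5 * lam * sum(
        f * (w - w0) ** 2 for w, w0, f in zip(weights, old_weights, fisher)
    )

def total_loss(task_loss, weights, old_weights, fisher, lam=0.5):
    """New-task loss plus the consolidation penalty on old-task weights."""
    return task_loss + ewc_penalty(weights, old_weights, fisher, lam)

# Moving a high-importance weight costs far more than moving a
# low-importance one (Fisher values here are illustrative):
old = [1.0, 1.0]
fisher = [10.0, 0.1]
print(ewc_penalty([2.0, 1.0], old, fisher))  # perturb the important weight
print(ewc_penalty([1.0, 2.0], old, fisher))  # perturb the unimportant weight
```

The asymmetry is the whole idea: gradient descent on the new task remains free to move unimportant weights while the penalty anchors the ones old tasks depend on.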
The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed “Actor-Mimic”, exploits the use of deep reinforcement learning and model compression techniques to train a single policy network that learns how to act in a set of distinct tasks by using the guidance of several expert teachers. We then show that the representations learnt by the deep policy network are capable of generalizing to new tasks with no prior expert guidance, speeding up learning in novel environments. Although our method can in general be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate these methods.
“Decoupled Neural Interfaces using Synthetic Gradients”, (2016-08-18):
Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating error signal, to produce weight updates. All layers, or more generally, modules, of the network are therefore locked, in the sense that they must wait for the remainder of the network to execute forwards and propagate error backwards before they can be updated. In this work we break this constraint by decoupling modules by introducing a model of the future computation of the network graph. These models predict what the result of the modelled subgraph will produce using only local information. In particular we focus on modelling error gradients: by using the modelled synthetic gradient in place of true backpropagated error gradients we decouple subgraphs, and can update them independently and asynchronously i.e. we realise decoupled neural interfaces. We show results for feed-forward models, where every layer is trained asynchronously, recurrent neural networks (RNNs) where predicting one’s future gradient extends the time over which the RNN can effectively model, and also a hierarchical system with ticking at different timescales. Finally, we demonstrate that in addition to predicting gradients, the same framework can be used to predict inputs, resulting in models which are decoupled in both the forward and backwards pass—amounting to independent networks which co-learn such that they can be composed into a single functioning corporation.
“One-shot Learning with Memory-Augmented Neural Networks”, (2016-05-19):
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
“Programming with a Differentiable Forth Interpreter”, (2016-05-21):
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present a differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of program structure and learn complex behaviours such as sequence sorting and addition. When connected to outputs of an LSTM and trained jointly, our interpreter achieves state-of-the-art accuracy for end-to-end reasoning about quantities expressed in natural language stories.
“DeepCoder: Learning to Write Programs”, (2016-11-07):
We develop a first line of attack for solving programming competition-style problems from input-output examples using deep learning. The approach is to train a neural network to predict properties of the program that generated the outputs from the inputs. We use the neural network’s predictions to augment search techniques from the programming languages community, including enumerative search and an SMT-based solver. Empirically, we show that our approach leads to an order of magnitude speedup over the strong non-augmented baselines and a Recurrent Neural Network approach, and that we are able to solve problems of difficulty comparable to the simplest problems on programming competition websites.
“Achieving Human Parity in Conversational Speech Recognition”, (2016-10-17):
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcribers is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state of the art, and edges past the human benchmark, achieving error rates of 5.8% and 11.0%, respectively. The key to our system’s performance is the use of various convolutional and acoustic model architectures, combined with a novel spatial smoothing method and lattice-free MMI acoustic training, multiple language modeling approaches, and a systematic use of system combination.
“Lip Reading Sentences in the Wild”, (2016-11-16):
The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem—unconstrained natural language sentences, and in the wild videos.
Our key contributions are: (1) a ‘Watch, Listen, Attend and Spell’ (WLAS) network that learns to transcribe videos of mouth motion to characters; (2) a curriculum learning strategy to accelerate training and to reduce overfitting; (3) a ‘Lip Reading Sentences’ (LRS) dataset for visual speech recognition, consisting of over 100,000 natural sentences from British television.
The WLAS model trained on the LRS dataset surpasses the performance of all previous work on standard lip reading benchmark datasets, often by a significant margin. This lip reading performance beats a professional lip reader on videos from BBC television, and we also demonstrate that visual information helps to improve speech recognition performance even when the audio is available.
2016-silver.pdf#deepmind: “Mastering the game of Go with deep neural networks and tree search”, (2016-01-28; ):
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
[Anecdote: I hear from Groq that, about a month beforehand, the original AlphaGo GPU implementation was not on track to defeat Lee Sedol, when they happened to gamble on implementing TPUv1 support. The additional compute led to drastic performance gains: the TPU model could beat the GPU model in ~98 of 100 games, and the final model solidly defeated Lee Sedol. (Since TPUv1s reportedly only did inferencing/forward-mode, presumably they were not used for the initial imitation learning or the policy-gradient self-play, but for generating the ~30 million self-play games on which the value network was trained: regression/prediction of ‘board → P(win)’ requires no state or activations from the self-play games, just an extremely large corpus which could easily be used for training.)]
“Neural Architecture Search with Reinforcement Learning”, (2016-11-05):
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05× faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks.
We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose layers using Q-learning with an ε-greedy exploration strategy and experience replay. The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task.
On image classification benchmarks, the agent-designed networks (consisting of only standard convolution, pooling, and fully-connected layers) beat existing networks designed with the same layer types and are competitive against the state-of-the-art methods that use more complex layer types. We also outperform existing meta-modeling approaches for network design on image classification tasks.
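A toy version of the MetaQNN loop: tabular Q-learning with ε-greedy layer choices, where states are partial architectures and actions are layer types. The "accuracy" reward here is an assumed stand-in for actually training each sampled CNN:

```python
# Toy MetaQNN-style architecture search: epsilon-greedy Q-learning over
# layer choices. toy_accuracy() is an illustrative reward; a real agent
# would train the sampled network and use its validation accuracy.
import random

random.seed(0)
LAYERS = ["conv", "pool", "fc"]
MAX_DEPTH = 3

def toy_accuracy(arch):
    """Assumed reward: favor conv layers early and an fc layer last."""
    score = 0.5 + 0.1 * arch[:2].count("conv")
    if arch[-1] == "fc":
        score += 0.2
    return score

Q = {}  # (partial-architecture state, layer action) -> value
def q(s, a):
    return Q.get((s, a), 0.0)

eps, alpha = 0.2, 0.1  # exploration rate, learning rate
for _ in range(2000):
    arch = []
    while len(arch) < MAX_DEPTH:
        s = tuple(arch)
        if random.random() < eps:
            a = random.choice(LAYERS)               # explore
        else:
            a = max(LAYERS, key=lambda x: q(s, x))  # exploit
        arch.append(a)
    r = toy_accuracy(arch)  # terminal reward only
    for i, a in enumerate(arch):  # pull each decision toward the reward
        s = tuple(arch[:i])
        Q[(s, a)] = q(s, a) + alpha * (r - q(s, a))

# Greedy rollout of the learned layer-selection policy:
best = []
while len(best) < MAX_DEPTH:
    best.append(max(LAYERS, key=lambda x: q(tuple(best), x)))
print(best, toy_accuracy(best))
```

The real system adds experience replay and a much larger (but still finite) state space of layer hyperparameters; the control flow is the same.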
“Learning to reinforcement learn”, (2016-11-17):
In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.
“Value Iteration Networks”, (2016-02-09):
We introduce the value iteration network (VIN): a fully differentiable neural network with a ‘planning module’ embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained using standard backpropagation. We evaluate VIN based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains.
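For reference, the computation a VIN approximates: each Bellman backup combines a local neighborhood of values with a max over actions, which is what makes it expressible as convolution plus a channel-wise max. A plain value-iteration loop on a toy deterministic grid (grid size, step reward, and discount are illustrative):

```python
# Classical value iteration on a 5x5 gridworld with a goal cell: the
# Bellman backup at each cell is a max over neighboring cells' values,
# i.e. a local (convolution-like) operation, repeated to a fixed depth.
N = 5
goal = (4, 4)
gamma = 0.9                      # discount factor
V = [[0.0] * N for _ in range(N)]

def neighbors(i, j):
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < N and 0 <= nj < N:
            yield ni, nj

for _ in range(50):              # repeated Bellman backups
    newV = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if (i, j) == goal:
                continue         # goal value stays 0
            # reward of -1 per step; act greedily over neighbor values
            newV[i][j] = max(-1.0 + gamma * V[ni][nj]
                             for ni, nj in neighbors(i, j))
    V = newV

# Value falls off with discounted distance from the goal:
print(round(V[4][3], 4), round(V[0][0], 4))
```

A VIN replaces this hand-written loop with a learned convolutional backup applied a fixed number of times, so the planner itself is trained by backpropagation.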
“The Predictron: End-To-End Learning and Planning”, (2016-12-28):
One of the key challenges of artificial intelligence is to learn models that are effective in the context of planning. In this document we introduce the predictron architecture. The predictron consists of a fully abstract model, represented by a Markov reward process, that can be rolled forward multiple “imagined” planning steps. Each forward pass of the predictron accumulates internal rewards and values over multiple planning depths. The predictron is trained so as to make these accumulated values accurately approximate the true value function. We applied the predictron to procedurally generated random mazes and a simulator for the game of pool. The predictron yielded statistically-significantly more accurate predictions than conventional deep neural network architectures.
“Unifying Count-Based Exploration and Intrinsic Motivation”, (2016-06-06):
We consider an agent’s uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma’s Revenge.
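The pseudo-count itself has a closed form: if a density model assigns an observation probability rho before seeing it and rho' after one update on it, the derived pseudo-count is N-hat = rho * (1 - rho') / (rho' - rho). A sanity check with an exact empirical-count model, where the formula recovers the true count (the toy observation space below is illustrative):

```python
# Pseudo-count from a density model: with an exact empirical-count
# model the formula recovers the true visit count; the paper's point
# is that any density model (e.g. over raw pixels) can stand in.

def pseudo_count(rho, rho_prime):
    """N-hat = rho * (1 - rho') / (rho' - rho)."""
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

# Empirical density model over a toy observation space:
counts = {"a": 3, "b": 1}
total = sum(counts.values())  # 4 observations so far

x = "a"
rho = counts[x] / total                    # probability before seeing x
rho_prime = (counts[x] + 1) / (total + 1)  # probability after one update on x
print(pseudo_count(rho, rho_prime))        # recovers the true count of "a"
```

The exploration bonus is then something like a count-based bonus (e.g. proportional to 1/sqrt(N-hat)) computed from these pseudo-counts instead of literal state-visit counts.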
“Asynchronous Methods for Deep Reinforcement Learning”, (2016-02-04):
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
“Reinforcement Learning with Unsupervised Auxiliary Tasks”, (2016-11-16):
Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards. We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task. Our agent significantly outperforms the previous state-of-the-art on Atari, averaging 880% expert human performance, and a challenging suite of first-person, three-dimensional Labyrinth tasks leading to a mean speedup in learning of 10× and averaging 87% expert human performance on Labyrinth.
Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquitous and instantaneous supervision for representation learning even in the absence of reward. While current results show that learning from reward alone is feasible, pure reinforcement learning methods are constrained by computational and data efficiency issues that can be remedied by auxiliary losses. Self-supervised pre-training and joint optimization improve the data efficiency and policy returns of end-to-end reinforcement learning.
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independently of camera calibration or the current robot pose. This requires the network to observe the spatial relationship between the gripper and objects in the scene, thus learning hand-eye coordination. We then use this network to servo the gripper in real time to achieve successful grasps. To train our network, we collected over 800,000 grasp attempts over the course of two months, using between 6 and 14 robotic manipulators at any given time, with differences in camera placement and hardware. Our experimental evaluation demonstrates that our method achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing.
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted to simulated settings and relatively simple tasks, due to their apparent high sample complexity. In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots. We demonstrate that the training times can be further reduced by parallelizing the algorithm across multiple robots which pool their policy updates asynchronously. Our experimental evaluation shows that our method can learn a variety of 3D manipulation skills in simulation and a complex door opening skill on real robots without any prior demonstrations or manually designed representations.
A striking contrast runs through the last 60 years of biopharmaceutical discovery, research, and development. Huge scientific and technological gains should have increased the quality of academic science and raised industrial R&D efficiency. However, academia faces a “reproducibility crisis”; inflation-adjusted industrial R&D costs per novel drug increased nearly 100× between 1950 and 2010; and drugs are more likely to fail in clinical development today than in the 1970s. The contrast is explicable only if powerful headwinds reversed the gains and/or if many “gains” have proved illusory. However, discussions of reproducibility and R&D productivity rarely address this point explicitly.
The main objectives of the primary research in this paper are: (a) to provide quantitatively and historically plausible explanations of the contrast; and (b) identify factors to which R&D efficiency is sensitive.
We present a quantitative decision-theoretic model of the R&D process [a ‘leaky pipeline’; cf the log-normal]. The model represents therapeutic candidates (eg., putative drug targets, molecules in a screening library, etc.) within a “measurement space”, with candidates’ positions determined by their performance on a variety of assays (eg., binding affinity, toxicity, in vivo efficacy, etc.) whose results correlate to a greater or lesser degree. We apply decision rules to segment the space, and assess the probability of correct R&D decisions.
We find that when searching for rare positives (eg., candidates that will successfully complete clinical development), changes in the predictive validity of screening and disease models that many people working in drug discovery would regard as small and/or unknowable (ie., an 0.1 absolute change in correlation coefficient between model output and clinical outcomes in man) can offset large (eg., 10×, even 100×) changes in models’ brute-force efficiency. We also show how validity and reproducibility correlate across a population of simulated screening and disease models.
We hypothesize that screening and disease models with high predictive validity are more likely to yield good answers and good treatments, so tend to render themselves and their diseases academically and commercially redundant. Perhaps there has also been too much enthusiasm for reductionist molecular models which have insufficient predictive validity. Thus we hypothesize that the average predictive validity of the stock of academically and industrially “interesting” screening and disease models has declined over time, with even small falls able to offset large gains in scientific knowledge and brute-force efficiency. The rate of creation of valid screening and disease models may be the major constraint on R&D productivity.
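The sensitivity claim (small losses in predictive validity offsetting large gains in brute-force throughput) can be illustrated with a toy Monte Carlo, not the paper's actual model: candidates' assay scores correlate with true quality at a given validity, the top scorer is advanced, and "success" means the advanced candidate is a rare true positive. All parameters below are illustrative:

```python
# Toy simulation in the spirit of the paper's decision-theoretic model:
# a higher-validity screen on a small library vs a lower-validity screen
# brute-forcing 10x as many candidates. Validities, library sizes, and
# the rarity threshold are assumptions for illustration.
import math
import random

random.seed(1)

def hit_rate(validity, n_candidates, trials=200, threshold=2.326):
    """P(the top assay scorer is a rare true positive).
    `validity` = correlation between assay score and true quality;
    threshold 2.326 marks roughly the top 1% of true quality."""
    noise_sd = math.sqrt(1.0 - validity ** 2)
    hits = 0
    for _ in range(trials):
        best_score = -float("inf")
        best_quality = 0.0
        for _ in range(n_candidates):
            quality = random.gauss(0, 1)  # unobserved true quality
            score = validity * quality + random.gauss(0, noise_sd)
            if score > best_score:
                best_score, best_quality = score, quality
        hits += best_quality > threshold
    return hits / trials

p_valid = hit_rate(0.8, 500)    # higher-validity screen, smaller library
p_brute = hit_rate(0.4, 5000)   # lower-validity screen, 10x the candidates
print(p_valid, p_brute)
```

Even with a tenth of the throughput, the higher-validity screen advances a true positive far more often, which is the qualitative shape of the paper's argument about model validity dominating brute-force efficiency.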
2001-ioannidis.pdf: “Comparison of Evidence of Treatment Effects in Randomized and Nonrandomized Studies”, (2001-08-01; ):
Context: There is substantial debate about whether the results of nonrandomized studies are consistent with the results of randomized controlled trials on the same topic.
Objectives: To compare results of randomized and nonrandomized studies that evaluated medical interventions and to examine characteristics that may explain discrepancies between randomized and nonrandomized studies.
Data Sources: MEDLINE (1966–March 2000), the Cochrane Library (Issue 3, 2000), and major journals were searched.
Study Selection: Forty-five diverse topics were identified for which both randomized trials (n = 240) and nonrandomized studies (n = 168) had been performed and had been considered in meta-analyses of binary outcomes.
Data Extraction: Data on events per patient in each study arm and design and characteristics of each study considered in each meta-analysis were extracted and synthesized separately for randomized and nonrandomized studies.
Data Synthesis: Very good correlation was observed between the summary odds ratios of randomized and nonrandomized studies (r = 0.75; p < 0.001); however, nonrandomized studies tended to show larger treatment effects (28 vs 11; p = 0.009). Between-study heterogeneity was frequent among randomized trials alone (23%) and very frequent among nonrandomized studies alone (41%). The summary results of the 2 types of designs differed beyond chance in 7 cases (16%). Discrepancies beyond chance were less common when only prospective studies were considered (8%). Occasional differences in sample size and timing of publication were also noted between discrepant randomized and nonrandomized studies. In 28 cases (62%), the natural logarithm of the odds ratio differed by at least 50%, and in 15 cases (33%), the odds ratio varied at least 2-fold between nonrandomized studies and randomized trials.
Conclusions: Despite good correlation between randomized trials and nonrandomized studies—in particular, prospective studies—discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common.
2019-gordon.pdf: “A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook”, (2019-05-04; ):
Measuring the causal effects of digital advertising remains challenging despite the availability of granular data. Unobservable factors make exposure endogenous, and advertising’s effect on outcomes tends to be small. In principle, these concerns could be addressed using randomized controlled trials (RCTs). In practice, few online ad campaigns rely on RCTs and instead use observational methods to estimate ad effects. We assess empirically whether the variation in data typically available in the advertising industry enables observational methods to recover the causal effects of online advertising. Using data from 15 U.S. advertising experiments at Facebook comprising 500 million user-experiment observations and 1.6 billion ad impressions, we contrast the experimental results to those obtained from multiple observational models. The observational methods often fail to produce the same effects as the randomized experiments, even after conditioning on extensive demographic and behavioral variables. In our setting, advances in causal inference methods do not allow us to isolate the exogenous variation needed to estimate the treatment effects. We also characterize the incremental explanatory power our data would require to enable observational methods to successfully measure advertising effects. Our findings suggest that commonly used observational approaches based on the data usually available in the industry often fail to accurately measure the true effect of advertising.
We introduce the network model as a formal psychometric model, conceptualizing the covariance between psychometric indicators as resulting from pairwise interactions between observable variables in a network structure. This contrasts with standard psychometric models, in which the covariance between test items arises from the influence of one or more common latent variables. Here, we present two generalizations of the network model that encompass latent variable structures, establishing network modeling as part of the more general framework of Structural Equation Modeling (SEM). In the first generalization, we model the covariance structure of latent variables as a network. We term this framework Latent Network Modeling (LNM) and show that, with LNM, a unique structure of conditional independence relationships between latent variables can be obtained in an explorative manner. In the second generalization, the residual variance-covariance structure of indicators is modeled as a network. We term this generalization Residual Network Modeling (RNM) and show that, within this framework, identifiable models can be obtained in which local independence is structurally violated. These generalizations allow for a general modeling framework that can be used to fit, and compare, SEM models, network models, and the RNM and LNM generalizations. This methodology has been implemented in the free-to-use software package lvnet, which contains confirmatory model testing as well as two exploratory search algorithms: stepwise search algorithms for low-dimensional datasets and penalized maximum likelihood estimation for larger datasets. We show in simulation studies that these search algorithms perform adequately in identifying the structure of the relevant residual or latent networks. We further demonstrate the utility of these generalizations in an empirical example on a personality inventory dataset.
Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity.
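The core result—that unreliable measures of correlated constructs yield spurious "incremental validity"—can be reproduced in a small Monte Carlo sketch (the parameter values here are illustrative, not the paper's): only construct A drives the outcome, yet the noisy measure of a correlated construct B comes out "significant" in nearly every replicate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 500
rel = 0.7    # reliability of each observed measure
r = 0.8      # correlation between constructs A and B

false_pos = 0
for _ in range(reps):
    A = rng.normal(size=n)
    B = r * A + np.sqrt(1 - r**2) * rng.normal(size=n)   # B has NO effect on y
    y = A + rng.normal(size=n)                            # only A drives y
    # Unreliable observed measures of each construct:
    x1 = np.sqrt(rel) * A + np.sqrt(1 - rel) * rng.normal(size=n)
    x2 = np.sqrt(rel) * B + np.sqrt(1 - rel) * rng.normal(size=n)
    # OLS of y on both measures; t-test on the x2 coefficient.
    X = np.column_stack([np.ones(n), x1, x2])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (n - 3)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
    false_pos += abs(beta[2] / se) > 1.96

print(f"Type I error rate for 'incremental validity' of x2: {false_pos / reps:.2f}")
```

With these settings the nominal 5% test rejects in the vast majority of replicates: because x1 measures A imperfectly, x2 picks up leftover A-signal through its correlation with B, exactly the mechanism the paper analyzes.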
1936-stouffer.pdf: “Evaluating the Effect of Inadequately Measured Variables in Partial Correlation Analysis”, (1936; ):
It is not generally recognized that such an analysis [using regression] assumes that each of the variables is perfectly measured, such that a second measure X’i, of the variable measured by Xi, has a correlation of unity with Xi. If some of the measures are more accurate than others, the analysis is impaired [by measurement error]. For example, the sociologist may have a problem in which an index of economic status and an index of nativity are independent variables. What is the effect, if the index of economic status is much less satisfactory than the index of nativity? Ordinarily, the effect will be to underestimate the [coefficient] of the less adequately measured variable and to overestimate the [coefficient] of the more adequately measured variable.
If either the reliability or validity of an index is in question, at least two measures of the variable are required to permit an evaluation. The purpose of this paper is to provide a logical basis and a simple arithmetical procedure (a) for measuring the effect of the use of 2 indexes, each of one or more variables, in partial and multiple correlation analysis and (b) for estimating the likely effect if 2 indexes, not available, could be secured.
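Stouffer's claim can be checked numerically. In this sketch (toy reliabilities, not his data), two true variables with equal coefficients are measured with unequal reliability; OLS then underestimates the coefficient of the poorly measured index and overestimates that of the well-measured one, as he describes.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
rho = 0.6                  # correlation between the two true variables
rel1, rel2 = 0.5, 0.95     # index 1 poorly measured, index 2 well measured

T1 = rng.normal(size=n)
T2 = rho * T1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
y = T1 + T2 + rng.normal(size=n)     # both true coefficients equal 1

# Observed indexes attenuated by measurement error:
x1 = np.sqrt(rel1) * T1 + np.sqrt(1 - rel1) * rng.normal(size=n)
x2 = np.sqrt(rel2) * T2 + np.sqrt(1 - rel2) * rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"b1 (poorly measured) = {b[1]:.2f}, b2 (well measured) = {b[2]:.2f}")
```

With both true coefficients equal to 1, the regression recovers roughly 0.6 for the unreliable index and 1.3 for the reliable one: the well-measured variable absorbs part of the effect of its poorly measured, correlated companion.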
1942-thorndike.pdf: “85_1.tif”, Robert L. Thorndike
1965-kahneman.pdf: “Control of spurious association and the reliability of the controlled variable”, Daniel Kahneman
2016-makel.pdf: “When Lightning Strikes Twice”, (2016-07-01; ):
The educational, occupational, and creative accomplishments of the profoundly gifted participants (IQs ⩾ 160) in the Study of Mathematically Precocious Youth (SMPY) are astounding, but are they representative of equally able 12-year-olds? Duke University’s Talent Identification Program (TIP) identified 259 young adolescents who were equally gifted. By age 40, their life accomplishments also were extraordinary: Thirty-seven percent had earned doctorates, 7.5% had achieved academic tenure (4.3% at research-intensive universities), and 9% held patents; many were high-level leaders in major organizations. As was the case for the sample before them, differential ability strengths predicted their contrasting and eventual developmental trajectories—even though essentially all participants possessed both mathematical and verbal reasoning abilities far superior to those of typical Ph.D. recipients. Individuals, even profoundly gifted ones, primarily do what they are best at. Differences in ability patterns, like differences in interests, guide development along different paths, but ability level, coupled with commitment, determines whether and the extent to which noteworthy accomplishments are reached if opportunity presents itself.
[Keywords: intelligence, creativity, giftedness, replication, blink comparator]
2020-levitt.pdf: “Heads or Tails: The Impact of a Coin Toss on Major Life Decisions and Subsequent Happiness”, (2020-05-19; ):
Little is known about whether people make good choices when facing important decisions. This article reports on a large-scale randomized field experiment in which research subjects having difficulty making a decision flipped a coin to help determine their choice. For important decisions (e.g., quitting a job or ending a relationship), individuals who are told by the coin toss to make a change are more likely to make a change, more satisfied with their decisions, and happier six months later than those whose coin toss instructed maintaining the status quo. This finding suggests that people may be excessively cautious when facing life-changing choices.
[Keywords: quitting, happiness, decision biases]
2006-mccrone.pdf: “Smarter Than The Average Bug”, (2006-05-27; ):
Portia may be about the size of a fat raisin, with eyes no larger than sesame seeds, yet it has a visual acuity that beats a cat or a pigeon. The human eye is better, but only about five times better. So from a safe distance a foot or two away, Portia sits scanning Scytodes, looking to see if it is carrying an egg sac in its fangs… The retinas of its principal eyes have only about a thousand receptors compared to the 200 million or so of the human eyeball. But Portia can swivel these tiny eyes across the scene in systematic fashion, patiently building up an image point by point. Having rejected a few alternative routes, Portia makes up its mind and disappears from sight. A couple of hours later, the silent assassin is back, dropping straight down on Scytodes from a convenient rock overhang on a silk dragline—looking like something out of the movie, Mission Impossible. Once again, Portia’s guile wins the day.
…Undoubtedly many of Portia’s cognitive abilities are genetic. Laboratory tests carried out by Robert Jackson, chief of Canterbury’s spider unit, have shown that only Portia from the particular area where Scytodes is common can recognise the difference between an egg sac carrying and non-egg sac carrying specimen. And it is a visual skill they are born with. The same species of Portia trapped a few hundred miles away doesn’t show any evidence of seeing the egg sac. But as Jackson points out, this just deepens the mystery. First there is the fact that such a specific mental behaviour as looking for an egg sac could be wired into a spider’s genome. And then there is the realisation that this is a population-specific, not species-specific, trait! It is a bit of locally acquired genetic knowledge. How does any simple hardwiring story account for that?
… “The White Tail can pluck, but only in a programmed, stereotyped, way. It doesn’t bother with tactics, or experimenting, or looking to see which way the other spider is facing. It just charges in and overpowers its prey with its size. Portia is a really weedy little spider and has to spend ages planning a careful attack. But its eyesight and trial and error approach means it can tackle any sort of web spider it comes across, even ones it has never met before in the history of its species”, says Harland. While Portia’s deception skills are impressive, the real admiration is reserved for its ability to plot a path to its victim. For an instinctive animal, out of sight is supposed to be out of mind. But Portia can take several hours to get into the right spot, even if it means losing sight of its prey for long periods.
…As a maze to be worked out from a single viewing—and with no previous experience of such mazes—this would be a tall order even for a rat or monkey. Yet more often than not, Portia could identify the right path. There was nothing quick about it. Portia would sit on top of the dowel for up to an hour, twisting to and fro as it appeared to track its eyes across the various possible routes. Sometimes it couldn’t decide and would just give up. However, once it had a plan, it would clamber down and pick the correct wire, even if this meant at first heading back behind where it had been perched. And walking right past the other wire. Harland says it seems that Portia can see where it has to get to in order to start its journey and ignore distractions along the way. This impression was strengthened by the fact that on trials where Portia made a wrong choice, it often gave up on reaching the first high bend of the wire—even though the bait was not yet in sight. It was as if Portia knew where it should be in the apparatus and could tell straight away when it had made a dumb mistake.
Crazy talk, obviously. There just ain’t room in Portia’s tiny head for anything approaching a plan, an expectation, or any other kind of inner life. The human brain has some 100 billion neurons, or brain cells, and even a mouse has around 70 million. Harland says no one has done a precise count on Portia but it is reckoned to have about 600,000 neurons, putting it midway between the quarter million of a housefly and the one million of a honey bee. Yet in the lab over the past few years, Portia has kept on surprising.
…Rather controversially, Li calls this the forming of a search image. Yet even if this mental priming is reduced to some thoroughly robotic explanation, such as an enhanced sensitivity of certain prey-recognising circuits and a matching damping of others, it still says that there is a general shift in the running state of Portia’s nervous system. Portia is responding in a globally cohesive fashion and is not just a loose bundle of automatic routines.
…Harland says Portia’s eyesight is the place to start. Jumping spiders already have excellent vision and Portia’s is ten times as good, making it sharper than most mammals. However being so small, there is a trade-off in that Portia can only focus its eyes on a tiny spot. It has to build up a picture of the world by scanning almost pixel by pixel across the visual scene. Whatever Portia ends up seeing, the information is accumulated slowly, as if peering through a keyhole, over many minutes. So there might be something a little like visual experience, but nothing like a full and “all at once” experience of a visual field. Harland feels that the serial nature of this scanning vision also makes it easier to imagine how prey recognition and other such decision processes could be controlled by some quite stereotyped genetic programs. When Portia is looking for an egg sac obscuring the face of Scytodes, it wouldn’t need to be representing the scene as a visual whole. Instead it could be checking a template, ticking off critical features in a sequence of fixations. In such a case, the less the eye sees with each fixation, perhaps the better. The human brain has to cope with a flood of information. Much of the work lies in discovering what to ignore about any moment. So the laser-like focus of Portia’s eyes might do much of this filtering by default. Yet while much of Portia’s mental abilities may reduce to the way its carefully designed eyes are coupled to largely reflexive motor patterns, Harland says there is still a disconcerting plasticity in its gene-encoded knowledge of the world. If one population of Portia can recognise an egg-carrying Scytodes but specimens from another region can’t, then this seems something quite new—a level of learning somewhere in-between the brain of an individual and the genome of a species… As Harland says, Portia just doesn’t fit anyone’s theories right at the moment.
Policy-makers are interested in early-years interventions to ameliorate childhood risks. They hope for improved adult outcomes in the long run, bringing return on investment. How much return can be expected depends, partly, on how strongly childhood risks forecast adult outcomes. But there is disagreement about whether childhood determines adulthood. We integrated multiple nationwide administrative databases and electronic medical records with the four-decade Dunedin birth-cohort study to test child-to-adult prediction in a different way, by using a population-segmentation approach. A segment comprising one-fifth of the cohort accounted for 36% of the cohort’s injury insurance-claims; 40% of excess obese-kilograms; 54% of cigarettes smoked; 57% of hospital nights; 66% of welfare benefits; 77% of fatherless childrearing; 78% of prescription fills; and 81% of criminal convictions. Childhood risks, including poor age-three brain health, predicted this segment with large effect sizes. Early-years interventions effective with this population segment could yield very large returns on investment.
2020-richmondrakerd.pdf: “Clustering of health, crime and social-welfare inequality in 4 million citizens from two nations”, (2020-01-20; ):
Health and social scientists have documented the hospital revolving-door problem, the concentration of crime, and long-term welfare dependence. Have these distinct fields identified the same citizens? Using administrative databases linked to 1.7 million New Zealanders, we quantified and monetized inequality in distributions of health and social problems and tested whether they aggregate within individuals. Marked inequality was observed: Gini coefficients equalled 0.96 for criminal convictions, 0.91 for public-hospital nights, 0.86 for welfare benefits, 0.74 for prescription-drug fills and 0.54 for injury-insurance claims. Marked aggregation was uncovered: a small population segment accounted for a disproportionate share of use-events and costs across multiple sectors. These findings were replicated in 2.3 million Danes. We then integrated the New Zealand databases with the four-decade-long Dunedin Study. The high-need/high-cost population segment experienced early-life factors that reduce workforce readiness, including low education and poor mental health. In midlife they reported low life satisfaction. Investing in young people’s education and training potential could reduce health and social inequalities and enhance population wellbeing.
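For readers unfamiliar with the statistic, the Gini coefficients reported above (0.96 for criminal convictions, 0.54 for injury-insurance claims, etc.) summarize how concentrated a distribution of events or costs is across individuals; a minimal sketch of the computation:

```python
import numpy as np

def gini(x):
    """Gini coefficient of a non-negative distribution (0 = perfectly equal, ~1 = maximally concentrated)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Discrete formula derived from the Lorenz curve:
    # G = 2 * sum_i(i * x_i) / (n * sum(x)) - (n + 1) / n
    return (2 * np.arange(1, n + 1) @ x) / (n * x.sum()) - (n + 1) / n

equal = np.ones(1000)                 # everyone has exactly one event
concentrated = np.zeros(1000)
concentrated[:10] = 100.0             # 1% of people account for all events
print(f"equal: {gini(equal):.2f}, concentrated: {gini(concentrated):.2f}")
```

The uniform distribution yields a Gini of 0, while the distribution where 1% of the population accounts for everything yields ~0.99, the regime the criminal-convictions result (0.96) sits in.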
Traumatic brain injury (TBI) is the leading cause of disability and mortality in children and young adults worldwide. It remains unclear, however, how TBI in childhood and adolescence is associated with adult mortality, psychiatric morbidity, and social outcomes.
Methods and Findings:
In a Swedish birth cohort between 1973 and 1985 of 1,143,470 individuals, we identified all those who had sustained at least one TBI (n = 104,290 or 9.1%) up to age 25 y and their unaffected siblings (n = 68,268) using patient registers. We subsequently assessed these individuals for the following outcomes using multiple national registries: disability pension, specialist diagnoses of psychiatric disorders and psychiatric inpatient hospitalisation, premature mortality (before age 41 y), low educational attainment (not having achieved secondary school qualifications), and receiving means-tested welfare benefits. We used logistic and Cox regression models to quantify the association between TBI and specified adverse outcomes on the individual level. We further estimated population attributable fractions (PAF) for each outcome measure. We also compared differentially exposed siblings to account for unobserved genetic and environmental confounding. In addition to relative risk estimates, we examined absolute risks by calculating prevalence and Kaplan-Meier estimates. In complementary analyses, we tested whether the findings were moderated by injury severity, recurrence, and age at first injury (ages 0–4, 5–9, 10–14, 15–19, and 20–24 y).
TBI exposure was associated with elevated risks of impaired adult functioning across all outcome measures. After a median follow-up period of 8 y from age 26 y, we found that TBI contributed to absolute risks of over 10% for specialist diagnoses of psychiatric disorders and low educational attainment, approximately 5% for disability pension, and 2% for premature mortality. The highest relative risks, adjusted for sex, birth year, and birth order, were found for psychiatric inpatient hospitalisation (adjusted relative risk [aRR] = 2.0; 95% CI: 1.9–2.0; 6,632 versus 37,095 events), disability pension (aRR = 1.8; 95% CI: 1.7–1.8; 4,691 versus 29,778 events), and premature mortality (aRR = 1.7; 95% CI: 1.6–1.9; 799 versus 4,695 events). These risks were only marginally attenuated when the comparisons were made with their unaffected siblings, which implies that the effects of TBI were consistent with a causal inference. A dose-response relationship was observed with injury severity. Injury recurrence was also associated with higher risks—in particular, for disability pension we found that recurrent TBI was associated with nearly a threefold risk increase (aRR = 2.6; 95% CI: 2.4–2.8) compared to a single-episode TBI. Higher risks for all outcomes were observed for those who had sustained their first injury at an older age (ages 20–24 y) with more than 25% increase in relative risk across all outcomes compared to the youngest age group (ages 0–4 y). On the population level, TBI explained between 2%–6% of the variance in the examined outcomes.
Using hospital data underestimates milder forms of TBI, but such misclassification bias suggests that the reported estimates are likely conservative. The sibling-comparison design accounts for unmeasured familial confounders shared by siblings, including half of their genes. Thus, residual genetic confounding remains a possibility but will unlikely alter our main findings, as associations were only marginally attenuated within families.
Given our findings, which indicate potentially causal effects between TBI exposure in childhood and later impairments across a range of health and social outcomes, age-sensitive clinical guidelines should be considered and preventive strategies should be targeted at children and adolescents.
In a population-wide observational cohort, Seena Fazel and colleagues use a sibling-matched design to examine the burden of long-term outcomes associated with traumatic brain injury.
Why Was This Study Done?:
Traumatic brain injury (TBI) constitutes the leading cause of morbidity and mortality in individuals under the age of 45 y globally.
Research on the long-term effects of TBI is limited to more severe injuries and medical outcomes.
There is uncertainty whether children and adolescents experiencing milder forms of TBI may have significant medical and social problems in adulthood.
What Did the Researchers Do and Find?:
We used national registers in Sweden covering 1.1 million individuals born between 1973–1985
In the 9.1% who sustained at least one TBI before the age of 25 y, we examined later risk of six medical and social outcomes.
We compared TBI patients with their unaffected siblings in order to account for the possibility that the risk for these outcomes runs in families.
We found TBI consistently predicted later risk of premature mortality, psychiatric inpatient admission, psychiatric outpatient visits, disability pension, welfare recipiency, and low educational attainment in the sibling-comparison analyses, and the effects were stronger for those with greater injury severity, recurrence, and older age at first injury.
What Do These Findings Mean?:
Consideration should be given to reviewing the cognitive, psychiatric, and social development of all children and adolescents who sustain head injuries.
Guidelines should consider age-specific recommendations for follow-up.
The public health benefits of preventing TBIs should include social outcomes.
2015-hofman.pdf: “Evolution of the Human Brain: From Matter to Mind”, (2015; ):
Design principles and operational modes are explored that underlie the information processing capacity of the human brain.
The hypothesis is put forward that in higher organisms, especially in primates, the complexity of the neural circuitry of the cerebral cortex is the neural correlate of the brain’s coherence and predictive power, and, thus, a measure of intelligence. It will be argued that with the evolution of the human brain we have nearly reached the limits of biological intelligence.
[Keywords: biological intelligence, cognition, consciousness, primates, information processing, neural networks, cortical design, human brain evolution]
2013-anonymous-strategicconsequencesofchineseracism.pdf: “The Strategic Consequences of Chinese Racism: A Strategic Asymmetry for the United States”, (2013-01-07; ):
Whether China and the United States are destined to compete for domination in international politics is one of the major questions facing DoD. In a competition with the People’s Republic of China, the United States must explore all of its advantages and all of the weaknesses of China that may provide an asymmetry for the United States. This study examines one such asymmetry, the strategic consequences of Chinese racism. After having examined the literature on China extensively, this author is not aware of a single study that addresses this important topic. This study explores the causes of Chinese racism, the strategic consequences of Chinese racism, and how the United States may use this situation to advance its interests in international politics.
The study finds that xenophobia, racism, and ethnocentrism are caused by human evolution. These behaviors are not unique to the Chinese. However, they are made worse by Chinese history and culture.
It considers the Chinese conception of race in Chinese history and culture. It finds that Chinese religious-cultural and historical conceptions of race reinforce Chinese racism. In Chinese history and contemporary culture, the Chinese are seen to be unique and superior to the rest of the world. Other peoples and groups are seen to be inferior, with a sliding scale of inferiority. The major Chinese distinction is between degrees of barbarians, the “black devils”, or savage inferiors, beyond any hope of interaction and the “white devils” or tame barbarians with whom the Chinese can interact. These beliefs are widespread in Chinese society, and have been throughout its history…
It evaluates the 9 strategic consequences of Chinese racism.
- virulent racism and eugenics heavily inform Chinese perceptions of the world…
- racism informs their view of the United States…
- racism informs their view of international politics in three ways.
  - states are stable, and thus good for the Chinese, to the degree that they are unicultural.
  - Chinese ethnocentrism and racism drive their outlook to the rest of the world. Their view is of a tribute system where barbarians know that the Chinese are superior.
  - there is a strong, implicit, racialist view of international politics that is alien and anathema to Western policy-makers and analysts. The Chinese are comfortable using race to explain events and appealing to racist stereotypes to advance their interests. Most insidious is the Chinese belief that Africans in particular need Chinese leadership.
- the Chinese will make appeals to Third World states based on “racial solidarity”…
- Chinese racism retards their relations with the Third World…
- Chinese racism, and the degree to which the Chinese permit their view of the United States to be informed by racism, has the potential to hinder China in its competition with the United States because it contributes to their overconfidence…
- as lamentable as it is, Chinese racism helps to make the Chinese a formidable adversary…
- the Chinese are never going to go through a civil rights movement like the United States…
- China’s treatment of Christians and ethnic minorities is poor…
It considers the 5 major implications for United States decision-makers and asymmetries that may result from Chinese racism.
- Chinese racism provides empirical evidence of how the Chinese will treat other international actors if China becomes dominant…
- it allows the United States to undermine China in the Third World…
- it permits a positive image of the United States to be advanced in contrast to China…
- calling attention to Chinese racism allows political and ideological alliances of the United States to be strengthened…
- United States defense decision-makers must recognize that racism is a cohesive force for the Chinese…
…The study’s fundamental conclusion is that endemic Chinese racism offers the United States a major asymmetry it may exploit with major countries, regions like Africa, as well as with important opinion makers in international politics. The United States is on the right side of the struggle against racism and China is not. The United States should call attention to this to aid its position in international politics.
1998-iannaccone.pdf: “Introduction to the Economics of Religion”, Laurence R. Iannaccone
“Joel on Software: Strategy Letter V”, (2002-06-11):
Every product in the marketplace has substitutes and complements. A substitute is another product you might buy if the first product is too expensive. Chicken is a substitute for beef. If you’re a chicken farmer and the price of beef goes up, people will want more chicken, and you will sell more. A complement is a product that you usually buy together with another product. Gas and cars are complements. Computer hardware is a classic complement of computer operating systems. And babysitters are a complement of dinner at fine restaurants. In a small town, when the local five-star restaurant has a two-for-one Valentine’s day special, the local babysitters double their rates. (Actually, the nine-year-olds get roped into early service.) All else being equal, demand for a product increases when the prices of its complements decrease.
Let me repeat that because you might have dozed off, and it’s important. Demand for a product increases when the prices of its complements decrease. For example, if flights to Miami become cheaper, demand for hotel rooms in Miami goes up—because more people are flying to Miami and need a room. When computers become cheaper, more people buy them, and they all need operating systems, so demand for operating systems goes up, which means the price of operating systems can go up.
…Once again: demand for a product increases when the price of its complements decreases. In general, a company’s strategic interest is going to be to get the price of their complements as low as possible. The lowest theoretically sustainable price would be the “commodity price”—the price that arises when you have a bunch of competitors offering indistinguishable goods. So:
Smart companies try to commoditize their products’ complements.
If you can do this, demand for your product will increase and you will be able to charge more and make more.
“DDoSCoin: Cryptocurrency with a Malicious Proof-of-Work”, (2016-08-08):
[HTTPS connections can provide third-party-verifiable signatures and so HTTPS is a valid Proof-of-Work and one can incentivize creating HTTPS connections and hence DDoSes. This could also be used non-maliciously to create a distributed anonymous uptime-checking service, by incentivizing only a few connections each time period for small bounties.]
Since its creation in 2009, Bitcoin has used a hash-based proof-of-work to generate new blocks, and create a single public ledger of transactions. The hash-based computational puzzle employed by Bitcoin is instrumental to its security, preventing Sybil attacks and making double-spending attacks more difficult. However, there have been concerns over the efficiency of this proof-of-work puzzle, and alternative “useful” proofs have been proposed. In this paper, we present DDoSCoin, which is a cryptocurrency with a malicious proof-of-work. DDoSCoin allows miners to prove that they have contributed to a distributed denial of service attack against specific target servers. This proof involves making a large number of TLS connections to a target server, and using cryptographic responses to prove that a large number of connections has been made. Like proof-of-work puzzles, these proofs are inexpensive to verify, and can be made arbitrarily difficult to solve.
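The asymmetry the abstract relies on—puzzles expensive to solve but cheap to verify—can be sketched with a simplified Bitcoin-style hash puzzle (this is an illustration of the general proof-of-work idea, not DDoSCoin's actual TLS-based protocol):

```python
import hashlib

def verify(data: bytes, nonce: int, difficulty: int) -> bool:
    """Check a puzzle solution with a single hash (cheap)."""
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    # Solution is valid if the hash has `difficulty` leading zero bits.
    return int.from_bytes(digest, "big") >> (256 - difficulty) == 0

def solve(data: bytes, difficulty: int) -> int:
    """Brute-force a valid nonce: ~2**difficulty hashes on average (expensive)."""
    nonce = 0
    while not verify(data, nonce, difficulty):
        nonce += 1
    return nonce

nonce = solve(b"block header", 16)     # tens of thousands of hashes to solve...
print(verify(b"block header", nonce, 16))  # ...but a single hash to verify
```

Raising `difficulty` by one bit doubles the expected solving cost while verification stays at one hash; DDoSCoin substitutes "make many verifiable TLS connections to the target" for "compute many hashes" while preserving this solve/verify asymmetry.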
2016-baade.pdf: “Going for the Gold: The Economics of the Olympics”, (2016-01-01; ):
In this paper, we explore the costs and benefits of hosting the Olympic Games. On the cost side, there are three major categories: general infrastructure such as transportation and housing to accommodate athletes and fans; specific sports infrastructure required for competition venues; and operational costs, including general administration as well as the opening and closing ceremony and security. Three major categories of benefits also exist: the short-run benefits of tourist spending during the Games; the long-run benefits or the “Olympic legacy” which might include improvements in infrastructure and increased trade, foreign investment, or tourism after the Games; and intangible benefits such as the “feel-good effect” or civic pride. Each of these costs and benefits will be addressed in turn, but the overwhelming conclusion is that in most cases the Olympics are a money-losing proposition for host cities; they result in positive net benefits only under very specific and unusual circumstances. Furthermore, the cost–benefit proposition is worse for cities in developing countries than for those in the industrialized world. In closing, we discuss why what looks like an increasingly poor investment decision on the part of cities still receives significant bidding interest and whether changes in the bidding process of the International Olympic Committee (IOC) will improve outcomes for potential hosts.
2016-donaldson.pdf: “The View from Above: Applications of Satellite Data in Economics”, (2016-09-01; ):
The past decade or so has seen a dramatic change in the way that economists can learn by watching our planet from above. A revolution has taken place in remote sensing and allied fields such as computer science, engineering, and geography.
Petabytes of satellite imagery have become publicly accessible at increasing resolution, many algorithms for extracting meaningful social science information from these images are now routine, and modern cloud-based processing power allows these algorithms to be run at global scale.
This paper seeks to introduce economists to the science of remotely sensed data, and to give a flavor of how this new source of data has been used by economists so far and what might be done in the future.
We group the main advantages of such remote sensing data to economists into 3 categories:
access to information difficult to obtain by other means:
The first advantage is simply that remote sensing technologies can collect panel data at low marginal cost, repeatedly, and at large scale on proxies for a wide range of hard-to-measure characteristics. We discuss below economic analysis that already uses remotely sensed data on night lights, precipitation, wind speed, flooding, topography, forest cover, crop choice, agricultural productivity, urban development, building type, roads, pollution, beach quality, and fish abundance. Many more characteristics of potential interest to economists have already been measured remotely and used in other fields. Most of these variables would be prohibitively expensive to measure accurately without remote sensing, and there are settings in which the official government counterparts of some remotely sensed statistics (such as pollution or forestry) may be subject to manipulation…
unusually high spatial resolution:
The second advantage of remote sensing data sources is that they are typically available at a substantially higher degree of spatial resolution than are traditional data. Much of the publicly available satellite imagery used by economists provides readings for each of the hundreds of billions of 30-meter-by-30-meter grid cells of land surface on Earth. Many economic decisions (particularly land use decisions such as zoning, building types, or crop choice) are made at approximately this same level of spatial resolution. But since 1999, private companies have offered submeter imagery and, following a 2014 US government ruling, American companies are able to sell imagery at resolutions below 0.5 meters to nongovernment customers for the first time. This is important because even when a coarser unit of analysis is appropriate, 900 1-meter pixels provide far more information available for signal extraction than a single 30-meter pixel covering the same area. In addition, some innovative identification strategies used by economists exploit stark policy changes that occur at geographic boundaries; these high-spatial-resolution research designs rely intimately on high-spatial-resolution outcome data (for example, Turner, Haughwout, and van der Klaauw 2014)…
Wide geographic coverage:
The third key advantage of remotely sensed data lies in their wide geographic coverage. Only rarely do social scientists enjoy the opportunities, afforded by satellites, to study data that have been collected in a consistent manner—without regard for local events like political strife or natural disasters—across borders and with uniform spatial sampling on every inhabited continent. Equally important, many research satellites (or integrated series of satellites) offer substantial temporal coverage, capturing data from the same location at weekly or even daily frequency for several decades and counting.
An example of this third feature—global scope—can be seen in work on the economic impacts of climate change in agriculture by Costinot, Donaldson, and Smith (2016). These authors draw on an agronomic model that is partly based on remotely sensed data. The agronomic model, when evaluated under both contemporary and expected (2070–2099) climates, predicts a change in agricultural productivity for any crop in any location on Earth. For example, the relative impact for two of the world’s most important crops, rice and wheat, is shown in Figure 5. Costinot, Donaldson, and Smith feed these pixel-by-pixel changes into a general equilibrium model of world agricultural trade and then use the model to estimate that climate change can be expected to reduce global agricultural output by about one-sixth (and that international trade is unlikely to mitigate this damage, despite the inherently transnational nature of the shock seen in Figure 5). Given the rate at which algorithms for crop classification and yield measurement have improved in recent years, future applications of satellite data are likely to be particularly rich in the agricultural arena.
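The first step of this pipeline—turning per-pixel yield projections into a relative-impact map and a naive global aggregate—can be sketched in a few lines. This is a toy example with invented numbers, not the Costinot, Donaldson, and Smith model (which additionally lets trade and cropping patterns adjust in general equilibrium):

```python
# Hypothetical per-pixel yields (tons/ha) for one crop on a toy 2x2 grid,
# under the contemporary climate and a projected 2070-2099 climate.
# Numbers are invented for illustration only.
import numpy as np

current = np.array([[2.0, 3.0],
                    [4.0, 1.0]])
future  = np.array([[1.5, 3.3],
                    [3.0, 0.5]])

# Pixel-by-pixel relative productivity change (the kind of map in Figure 5).
relative_change = (future - current) / current

# Naive global aggregate: total output change holding land use and trade fixed.
global_change = future.sum() / current.sum() - 1

print(relative_change)
print(f"aggregate output change: {global_change:.1%}")
```

In this contrived example the aggregate change happens to be −17%, close to the paper's roughly one-sixth figure; the general-equilibrium step exists precisely because such fixed-allocation aggregates overstate damages when producers can switch crops or locations.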
“Towards an integration of deep learning and neuroscience”, (2016-08-22):
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short-term and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain’s specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
“Probing the Improbable: Methodological Challenges for Risks with Low Probabilities and High Stakes”, (2008-10):
Some risks have extremely high stakes. For example, a worldwide pandemic or asteroid impact could potentially kill more than a billion people. Comfortingly, scientific calculations often put very low probabilities on the occurrence of such catastrophes. In this paper, we argue that there are important new methodological problems which arise when assessing global catastrophic risks and we focus on a problem regarding probability estimation. When an expert provides a calculation of the probability of an outcome, they are really providing the probability of the outcome occurring, given that their argument is watertight. However, their argument may fail for a number of reasons such as a flaw in the underlying theory, a flaw in the modeling of the problem, or a mistake in the calculations. If the probability estimate given by an argument is dwarfed by the chance that the argument itself is flawed, then the estimate is suspect. We develop this idea formally, explaining how it differs from the related distinctions of model and parameter uncertainty. Using the risk estimates from the Large Hadron Collider as a test case, we show how serious the problem can be when it comes to catastrophic risks and how best to address it.
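The abstract's central point can be formalized by conditioning on whether the expert's argument is sound. The following sketch is my own illustration with invented numbers, not the authors' calculation: the law of total probability gives P(X) = P(X | sound)·P(sound) + P(X | flawed)·P(flawed), and the second term can dominate even when the expert's headline estimate is tiny.

```python
# Hypothetical values illustrating the argument-flaw problem.
p_x_given_sound = 1e-9   # expert's calculated catastrophe probability
p_flawed = 1e-3          # chance the argument itself is flawed (assumed)
p_x_given_flawed = 1e-4  # catastrophe probability if the argument fails (assumed)

# Total probability, marginalizing over whether the argument is sound.
p_x = (1 - p_flawed) * p_x_given_sound + p_flawed * p_x_given_flawed

# The flaw term (1e-3 * 1e-4 = 1e-7) swamps the headline estimate (~1e-9):
# the true risk is roughly two orders of magnitude larger than reported.
print(f"headline estimate: {p_x_given_sound:.1e}")
print(f"total probability: {p_x:.3e}")
```

The practical upshot matches the paper's conclusion: for catastrophic risks, reducing the reported probability below the probability of a flawed argument buys no additional reassurance.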
“The Merchant and the Alchemist's Gate”, (2007-09):
This fantasy short story by Ted Chiang follows Fuwaad ibn Abbas, a fabric merchant in the ancient city of Baghdad. It begins when he is searching for a gift to give a business associate and happens to discover a new shop in the marketplace. The shop owner, who makes and sells a variety of very interesting items, invites Fuwaad into the back workshop to see a mysterious black stone arch which serves as a gateway into the future, which the shop owner has made by the use of alchemy. Fuwaad is intrigued, and the shop owner tells him 3 stories of others who have traveled through the gate to meet and have conversation with their future selves. When Fuwaad learns that the shop owner has another gate in Cairo that will allow people to travel even into the past, he makes the journey there to try to rectify a mistake he made 20 years earlier. [Summary adapted from Wikipedia]
1940-nippongakujutsushinkokai-manyoshu.pdf#page=197: “The Manyoshu: The Nippon Gakujutsu Shinkokai Translation of One Thousand Poems with the Texts in Romaji”, Nippon Gakujutsu Shinkokai, Donald Keene