2015 saw two important fields go from strength to strength: genetics (behavioral genetics in particular) and AI. The pace of progress was dizzying, with each month, it seemed, bringing major results, to the point where it drowned out important news from other fields and made it difficult to pick just 10; for example, who has time to note a major milestone in cryonics like proof of preservation of long-term memory in C. elegans when CRISPR is increasing in power on a monthly basis and researchers are offhandedly producing feats like myostatin-enhanced beagles or “micropigs” that would have been major R&D efforts just years ago? And the flood of deep learning results has continued to the point where end-of-year roundups of major breakthroughs accidentally omit discoveries like MSR’s residual networks, which enable powerful neural networks with literally hundreds of layers to be trained. As Karpathy put it: “BatchNorm, STN, DCGAN, DRAW, soft/hard attention, char-rnn, DeepDream, NeuralStyle, TensorFlow, ResNet, AlphaGo.. a lot happened over 1 year”. Not to mention behavioral genetics’ findings being repeatedly vindicated in large-scale genetic studies, confuting the critics, but also going further and making surprising new discoveries like a pervasive web of genetic correlations between intelligence and many other traits; population genetics in general is increasingly finding that there are meaningful differences between even closely related populations, indicating the importance of ‘soft selection sweeps’ and the cumulative effect of small differences on many genes, which have led to changes as large as domestication. And based on just January 2016’s news in both areas, it seems that 2015 will not be exceptional but marks a new normal for these two fields, and we can look forward to many exciting new results consolidating & extending 2015.
Do No Harm, Marsh (elegantly written and moving neurosurgeon memoir on the theme of iatrogenics; I did disagree with his comments on the cost-benefit of operating in one case, though)
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews.)
Dark Net Markets (DNM) are online markets typically hosted as Tor hidden services providing escrow services between buyers & sellers transacting in Bitcoin or other cryptocoins, usually for drugs or other illegal/regulated goods; the most famous DNM was Silk Road 1, which pioneered the business model in 2011.
From 2013–2015, I scraped/mirrored on a weekly or daily basis all existing English-language DNMs as part of my research into their usage, lifetimes/characteristics, & legal riskiness; these scrapes covered vendor pages, feedback, images, etc. In addition, I made or obtained copies of as many other datasets & documents related to the DNMs as I could.
This uniquely comprehensive collection is now publicly released as a 50GB (~1.6TB uncompressed) collection covering 89 DNMs & 37+ related forums, representing <4,438 mirrors, and is available for any research.
This page documents the download, contents, interpretation, and technical methods behind the scrapes.
Mail is delivered by the USPS mailman at a regular but not observed time; what is observed is whether the mail has been delivered at a time, yielding somewhat-unusual “interval-censored data”. I describe the problem of estimating when the mailman delivers, write a simulation of the data-generating process, and demonstrate analysis of interval-censored data in R using maximum-likelihood (survival analysis with Gaussian regression using the survival library), MCMC (Bayesian model in JAGS), and likelihood-free Bayesian inference (custom ABC, using the simulation). This allows estimation of the distribution of mail delivery times. I compare those estimates from the interval-censored data with estimates from a (smaller) set of exact delivery-times provided by USPS tracking & personal observation, using a multilevel model to deal with heterogeneity apparently due to a change in USPS routes/postmen. Finally, I define a loss function on mail checks, enabling: a choice of optimal time to check the mailbox to minimize loss (exploitation); optimal time to check to maximize information gain (exploration); Thompson sampling (balancing exploration & exploitation indefinitely); and estimates of the value-of-information of another datapoint (to estimate when to stop exploration and start exploitation after a finite amount of data).
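Since the essay’s R code is not reproduced here, the core idea can be sketched in a few lines of Python instead: simulate the data-generating process, then maximize the interval-censored Gaussian likelihood (all parameters below are invented for illustration):

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)

# Simulate the data-generating process: delivery times are Gaussian, but we only
# observe an interval (last check with no mail, first check with mail).
true_mu, true_sd, n = 11.0, 0.5, 200                  # ~11am deliveries, 200 days
deliveries = rng.normal(true_mu, true_sd, n)
checks = np.sort(rng.uniform(9, 14, (n, 8)), axis=1)  # 8 random mailbox checks/day
lower = np.where(checks < deliveries[:, None], checks, -np.inf).max(axis=1)
upper = np.where(checks >= deliveries[:, None], checks, np.inf).min(axis=1)

# Interval-censored log-likelihood: each day contributes P(lower < T <= upper).
def neg_log_lik(params):
    mu, log_sd = params
    sd = np.exp(log_sd)
    p = stats.norm.cdf(upper, mu, sd) - stats.norm.cdf(lower, mu, sd)
    return -np.sum(np.log(np.clip(p, 1e-300, None)))

fit = optimize.minimize(neg_log_lik, x0=[10.0, 0.0], method="Nelder-Mead")
print(f"estimated delivery time: {fit.x[0]:.2f}h ± {np.exp(fit.x[1]):.2f}h")
```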
I compile a table and discussion of all known arrests and prosecutions related to English-language Tor-Bitcoin darknet markets (DNMs) such as Silk Road 1, primarily 2011–2015, along with discussion of how they came to be arrested.
Despite a century of research on complex traits in humans, the relative importance and specific nature of the influences of genes and environment on human traits remain controversial. We report a meta-analysis of twin correlations and reported variance components for 17,804 traits from 2,748 publications including 14,558,903 partly dependent twin pairs, virtually all published twin studies of complex traits. Estimates of heritability cluster strongly within functional domains, and across all traits the reported heritability is 49%. For a majority (69%) of traits, the observed twin correlations are consistent with a simple and parsimonious model where twin resemblance is solely due to additive genetic variation. The data are inconsistent with substantial influences from shared environment or non-additive genetic variation. This study provides the most comprehensive analysis of the causes of individual differences in human traits thus far and will guide future gene-mapping efforts. All the results can be visualized using the MaTCH webtool.
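The additive-model logic can be made concrete via Falconer’s formula: if twin resemblance were solely due to additive genetics, the MZ correlation equals h² and the DZ correlation h²/2. A sketch in Python, with hypothetical twin correlations chosen to match the abstract’s 49% figure:

```python
# Falconer's formula under an ACE decomposition (illustrative correlations).
r_mz, r_dz = 0.49, 0.245     # hypothetical MZ/DZ twin correlations
h2 = 2 * (r_mz - r_dz)       # additive genetic share: 0.49
c2 = 2 * r_dz - r_mz         # shared-environment share: 0.0
e2 = 1 - r_mz                # unique environment + measurement error: 0.51
print(h2, c2, e2)            # the r_MZ = 2 * r_DZ pattern is what the abstract
                             # means by "solely due to additive genetic variation"
```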
Predisposition to respond to placebo treatment may be in part a stable heritable trait.
Candidate placebo response pathways may interact with drugs to modify outcomes in both the placebo and drug treatment arms of clinical trials.
Genomic analysis of randomized placebo and no-treatment controlled trials is needed to fully realize the potential of the placebome.
Placebos are indispensable controls in randomized clinical trials (RCTs), and placebo responses significantly contribute to routine clinical outcomes. Recent neurophysiological studies reveal neurotransmitter pathways that mediate placebo effects. Evidence that genetic variations in these pathways can modify placebo effects raises the possibility of using genetic screening to identify placebo responders and thereby increase RCT efficacy and improve therapeutic care. Furthermore, the possibility of interaction between placebo and drug molecular pathways warrants consideration in RCT design. The study of genomic effects on placebo response, 'the placebome', is in its infancy. Here, we review evidence from placebo studies and RCTs to identify putative genes in the placebome, examine evidence for placebo-drug interactions, and discuss implications for RCTs and clinical care.
“Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N = 53,949)”, G. Davies, N. Armstrong, J. C. Bis, J. Bressler, V. Chouraki, S. Giddaluru, E. Hofer, C. A Ibrahim-Verbaas, M. Kirin, J. Lahti, S. J. van der Lee, S. Le Hellard, T. Liu, R. E. Marioni, C. Oldmeadow, I. Postmus, A. V. Smith, J. A Smith, A. Thalamuthu, R. Thomson, V. Vitart, J. Wang, L. Yu, L. Zgaga, W. Zhao, R. Boxall, S. E. Harris, W. D. Hill, D. C. Liewald, M. Luciano, H. Adams, D. Ames, N. Amin, P. Amouyel, A. A Assareh, R. Au, J. T. Becker, A. Beiser, C. Berr, L. Bertram, E. Boerwinkle, B. M. Buckley, H. Campbell, J. Corley, P. L. De Jager, C. Dufouil, J. G. Eriksson, T. Espeseth, J. D. Faul, I. Ford, Generation Scotland, R. F. Gottesman, M. E. Griswold, V. Gudnason, T. B. Harris, G. Heiss, A. Hofman, E. G. Holliday, J. Huffman, S. L. R. Kardia, N. Kochan, D. S. Knopman, J. B. Kwok, J-C Lambert, T. Lee, G. Li, S-C Li, M. Loitfelder, O. L. Lopez, A. J. Lundervold, A. Lundqvist, K. A Mather, S. S. Mirza, L. Nyberg, B. A Oostra, A. Palotie, G. Papenberg, A. Pattie, K. Petrovic, O. Polasek, B. M. Psaty, P. Redmond, S. Reppermund, J. I Rotter, H. Schmidt, M. Schuur, P. W. Schofield, R. J. Scott, V. M. Steen, D. J. Stott, J. C. van Swieten, K. D. Taylor, J. Trollor, S. Trompet, A. G. Uitterlinden, G. Weinstein, E. Widen, B. G. Windham, J. W. Jukema, A. F. Wright, M. J. Wright, Q. Yang, H. Amieva, J. R. Attia, D. A Bennett, H. Brodaty, A. J. M. de Craen, C. Hayward, M. A Ikram, U. Lindenberger, L-G Nilsson, D. J. Porteous, K. Räikkönen, I. Reinvang, I. Rudan, P. S. Sachdev, R. Schmidt, P. R. Schofield, V. Srikanth, J. M. Starr, S. T. Turner, D. R. Weir, J. F. Wilson, C. van Duijn, L. Launer, A. L. Fitzpatrick, S. Seshadri, T. H. Mosley Jr, I. J. Deary (2015-02-03):
General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health-related and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N = 53,949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, p = 3.93 × 10−9, MIR2113; rs17522122, p = 2.55 × 10−8, AKAP6; rs10119, p = 5.67 × 10−9, APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (p = 1 × 10−6). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N = 6617) and the Health and Retirement Study (N = 5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e. = 5%) and 28% (s.e. = 7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N = 5487; p = 1.5 × 10−17). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer’s disease: TOMM40, APOE, ABCG1 and MEF2C.
Environmental measures used widely in the behavioral sciences show nearly as much genetic influence as behavioral measures, a critical finding for interpreting associations between environmental factors and children's development. This research depends on the twin method that compares monozygotic and dizygotic twins, but key aspects of children's environment such as socioeconomic status (SES) cannot be investigated in twin studies because they are the same for children growing up together in a family. Here, using a new technique applied to DNA from 3000 unrelated children, we show significant genetic influence on family SES, and on its association with children's IQ at ages 7 and 12. In addition to demonstrating the ability to investigate genetic influence on between-family environmental measures, our results emphasize the need to consider genetics in research and policy on family SES and its association with children's IQ.
“Population genetic differentiation of height and body mass index across Europe”, Matthew R. Robinson, Gibran Hemani, Carolina Medina-Gomez, Massimo Mezzavilla, Tonu Esko, Konstantin Shakhbazov, Joseph E. Powell, Anna Vinkhuyzen, Sonja I. Berndt, Stefan Gustafsson, Anne E. Justice, Bratati Kahali, Adam E. Locke, Tune H. Pers, Sailaja Vedantam, Andrew R. Wood, Wouter van Rheenen, Ole A. Andreassen, Paolo Gasparini, Andres Metspalu, Leonard H. van den Berg, Jan H. Veldink, Fernando Rivadeneira, Thomas M. Werge, Goncalo R. Abecasis, Dorret I. Boomsma, Daniel I. Chasman, Eco J. C. de Geus, Timothy M. Frayling, Joel N. Hirschhorn, Jouke Jan Hottenga, Erik Ingelsson, Ruth J. F. Loos, Patrik K. E. Magnusson, Nicholas G. Martin, Grant W. Montgomery, Kari E. North, Nancy L. Pedersen, Timothy D. Spector, Elizabeth K. Speliotes, Michael E. Goddard, Jian Yang, Peter M. Visscher (2015-09-14):
Across-nation differences in the mean values for complex traits are common, but the reasons for these differences are unknown. Here we find that many independent loci contribute to population genetic differences in height and body mass index (BMI) in 9,416 individuals across 14 European countries. Using discovery data on over 250,000 individuals and unbiased effect size estimates from 17,500 sibling pairs, we estimate that 24% (95% credible interval (CI) = 9%, 41%) and 8% (95% CI = 4%, 16%) of the captured additive genetic variance for height and BMI, respectively, reflect population genetic differences. Population genetic divergence differed significantly from that in a null model (height, p < 3.94 × 10−8; BMI, p < 5.95 × 10−4), and we find an among-population genetic correlation for tall and slender individuals (r = −0.80, 95% CI = −0.95, −0.60), consistent with correlated selection for both phenotypes. Observed differences in height among populations reflected the predicted genetic means (r = 0.51; p < 0.001), but environmental differences across Europe masked genetic differentiation for BMI (p < 0.58).
“Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication”, Michael J. Montague, Gang Li, Barbara Gandolfi, Razib Khan, Bronwen L. Aken, Steven M. J. Searle, Patrick Minx, LaDeana W. Hillier, Daniel C. Koboldt, Brian W. Davis, Carlos A. Driscoll, Christina S. Barr, Kevin Blackistone, Javier Quilez, Belen Lorente-Galdos, Tomas Marques-Bonet, Can Alkan, Gregg W. C. Thomas, Matthew W. Hahn, Marilyn Menotti-Raymond, Stephen J. O’Brien, Richard K. Wilson, Leslie A. Lyons, William J. Murphy, and Wesley C. Warren (2014-10-03):
Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae.
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.
The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
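The residual reformulation is simple enough to state in code; here is a minimal sketch of one residual block in PyTorch (a framework post-dating the paper; the layer shapes are illustrative, not the paper’s exact architecture):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Learn the residual F(x) and output F(x) + x, rather than learning
    the full unreferenced mapping H(x) directly."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        f = self.bn2(self.conv2(torch.relu(self.bn1(self.conv1(x)))))
        return torch.relu(f + x)  # identity shortcut: gradients flow through '+ x',
                                  # which is what makes 100+ layer networks trainable
```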
Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control. In this paper, we aim to answer the following question: does training the perception and control systems jointly end-to-end provide better performance than training each component separately? To this end, we develop a method that can be used to learn policies that map raw image observations directly to torques at the robot’s motors. The policies are represented by deep convolutional neural networks (CNNs) with 92,000 parameters, and are trained using a partially observed guided policy search method, which transforms policy search into supervised learning, with supervision provided by a simple trajectory-centric reinforcement learning method. We evaluate our method on a range of real-world manipulation tasks that require close coordination between vision and control, such as screwing a cap onto a bottle, and present simulated comparisons to a range of prior policy search methods.
This paper addresses the general problem of reinforcement learning (RL) in partially observable environments. In 2013, our large RL recurrent neural networks (RNNs) learned from scratch to drive simulated cars from high-dimensional video input. However, real brains are more powerful in many ways. In particular, they learn a predictive model of their initially unknown environment, and somehow use it for abstract (e.g., hierarchical) planning and reasoning. Guided by algorithmic information theory, we describe RNN-based AIs (RNNAIs) designed to do the same. Such an RNNAI can be trained on never-ending sequences of tasks, some of them provided by the user, others invented by the RNNAI itself in a curious, playful fashion, to improve its RNN-based world model. Unlike our previous model-building RNN-based RL machines dating back to 1990, the RNNAI learns to actively query its model for abstract reasoning and planning and decision making, essentially "learning to think." The basic ideas of this report can be applied to many other cases where one RNN-like system exploits the algorithmic information content of another. They are taken from a grant proposal submitted in Fall 2014, and also explain concepts such as "mirror neurons." Experimental results will be described in separate papers.
The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed "Actor-Mimic", exploits the use of deep reinforcement learning and model compression techniques to train a single policy network that learns how to act in a set of distinct tasks by using the guidance of several expert teachers. We then show that the representations learnt by the deep policy network are capable of generalizing to new tasks with no prior expert guidance, speeding up learning in novel environments. Although our method can in general be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate these methods.
We introduce a neural network with a recurrent attention model over a possibly large external memory. The architecture is a form of Memory Network (Weston et al., 2015) but unlike the model in that work, it is trained end-to-end, and hence requires significantly less supervision during training, making it more generally applicable in realistic settings. It can also be seen as an extension of RNNsearch to the case where multiple computational steps (hops) are performed per output symbol. The flexibility of the model allows us to apply it to tasks as diverse as (synthetic) question answering and to language modeling. For the former our approach is competitive with Memory Networks, but with less supervision. For the latter, on the Penn TreeBank and Text8 datasets our approach demonstrates comparable performance to RNNs and LSTMs. In both cases we show that the key concept of multiple computational hops yields improved results.
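The “multiple computational hops” are just repeated soft attention over the memory; a small numpy sketch of the mechanism as described (embedding sizes and weights invented):

```python
import numpy as np

def memory_hop(u, mem_in, mem_out):
    """One hop: attend over memory slots, add the read-out to the controller state.
    u: (d,) controller state; mem_in/mem_out: (n, d) input/output embeddings."""
    scores = mem_in @ u                 # match the query against each memory slot
    p = np.exp(scores - scores.max())
    p /= p.sum()                        # softmax attention weights
    o = p @ mem_out                     # weighted sum of output embeddings
    return u + o                        # next-hop controller state

rng = np.random.default_rng(0)
u = rng.normal(size=8)
mem_in, mem_out = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
for _ in range(3):                      # three computational hops
    u = memory_hop(u, mem_in, mem_out)
```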
Motivated by the recent progress in generative models, we introduce a model that generates images from natural language descriptions. The proposed model iteratively draws patches on a canvas, while attending to the relevant words in the description. After training on Microsoft COCO, we compare our model with several baseline generative models on image generation and retrieval tasks. We demonstrate that our model produces higher quality samples than other approaches and generates images with novel scene compositions corresponding to previously unseen captions in the dataset.
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks—demonstrating their applicability as general image representations.
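The adversarial training loop at the heart of a DCGAN, sketched with stand-in MLPs rather than the paper’s convolutional architectures (the Adam settings follow the paper; everything else is illustrative):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())    # generator: noise -> image
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())   # discriminator: image -> P(real)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

real = torch.rand(64, 784)              # stand-in for a batch of real images
z = torch.randn(64, 100)                # noise input to the generator

# Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
opt_d.zero_grad()
loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(G(z).detach()), torch.zeros(64, 1))
loss_d.backward(); opt_d.step()

# Generator step: push D(G(z)) toward 1, i.e. fool the discriminator.
opt_g.zero_grad()
loss_g = bce(D(G(z)), torch.ones(64, 1))
loss_g.backward(); opt_g.step()
```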
[Exploration of char-RNN neural nets for generating text. Karpathy codes a simple recurrent NN which generates character-by-character, and discovers that it is able to generate remarkably plausible text (at the syntactic level) for Paul Graham, Shakespeare, Wikipedia, LaTeX, Linux C code, and baby names—all using the same generic architecture. Visualizing the internal activity of the char-RNNs, they seem to be genuinely understanding some of the recursive syntactic structure of the text in a way that other text-generation methods like n-grams cannot. Inspired by this post, I began tinkering with char-RNNs for poetry myself; as of 2019, char-RNNs have been largely obsoleted by the new Transformer architecture, but recurrency will make a comeback and Karpathy’s post is still a valuable and fun read.]
There’s something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience I’ve in fact reached the opposite conclusion). Fast forward about a year: I’m training RNNs all the time and I’ve witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me. This post is about sharing some of that magic with you. We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?”
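For readers wondering what generating text “character by character” means mechanically, here is an untrained toy in the spirit of Karpathy’s min-char-rnn, in Python (all sizes and weights invented; the backpropagation training loop is omitted):

```python
import numpy as np

text = "hello world"
chars = sorted(set(text))
V, H = len(chars), 32                          # vocabulary size, hidden size
ix = {c: i for i, c in enumerate(chars)}

rng = np.random.default_rng(0)
Wxh, Whh, Why = (rng.normal(0, 0.01, s) for s in [(H, V), (H, H), (V, H)])
bh, by = np.zeros(H), np.zeros(V)

def sample(seed_char, n):
    """Generate n characters, feeding each sampled output back in as input."""
    h = np.zeros(H)
    x = np.zeros(V); x[ix[seed_char]] = 1
    out = []
    for _ in range(n):
        h = np.tanh(Wxh @ x + Whh @ h + bh)    # recurrent hidden-state update
        p = np.exp(Why @ h + by); p /= p.sum() # softmax over the next character
        i = rng.choice(V, p=p)
        out.append(chars[i])
        x = np.zeros(V); x[i] = 1
    return "".join(out)

print(sample("h", 20))  # gibberish until trained, but this is the whole mechanism
```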
This article presents an emerging architectural hypothesis of the brain as a biological implementation of a Universal Learning Machine. I present a rough but complete architectural view of how the brain works under the universal learning hypothesis. I also contrast this new viewpoint—which comes from computational neuroscience and machine learning—with the older evolved modularity hypothesis popular in evolutionary psychology and the heuristics and biases literature. These two conceptions of the brain lead to very different predictions for the likely route to AGI, the value of neuroscience, the expected differences between AGI and humans, and thus any consequent safety issues and dependent strategies.
Intro · Two viewpoints on the Mind · Universal Learning Machines · Historical Interlude · Dynamic Rewiring · Brain Architecture (the whole brain in one picture and a few pages of text) · The Basal Ganglia · Implications for AGI · Conclusion
…The roots of the universal learning hypothesis can be traced back to Mountcastle’s discovery of the simple uniform architecture of the cortex. The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer). In this view the mind (software) is distinct from the brain (hardware). The mind is a complex software system built out of a general learning mechanism…The key takeaway is that the data is what matters—and in the end it is all that matters. Train a universal learner on image data and it just becomes a visual system. Train it on speech data and it becomes a speech recognizer. Train it on ATARI and it becomes a little gamer agent.
Conclusion: Ray Kurzweil has been predicting for decades that AGI will be built by reverse engineering the brain, and this particular prediction is not especially unique—this has been a popular position for quite a while. My own investigation of neuroscience and machine learning led me to a similar conclusion some time ago.
The recent progress in deep learning, combined with the emerging modern understanding of the brain, provides further evidence that AGI could arrive around the time when we can build and train ANNs with similar computational power as measured very roughly in terms of neuron/synapse counts. In general the evidence from the last four years or so supports Hanson’s viewpoint from the Foom debate. More specifically, his general conclusion:
Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. By comparison, their general architectural innovations will be minor additions.
The ULH supports this conclusion. Current ANN engines can already train and run models with around 10 million neurons and 10 billion (compressed/shared) synapses on a single GPU, which suggests that the goal could soon be within the reach of a large organization. Furthermore, Moore’s Law for GPUs still has some steam left, and software advances are currently improving simulation performance at a faster rate than hardware. These trends imply that Anthropomorphic/Neuromorphic AGI could be surprisingly close, and may appear suddenly. What kind of leverage can we exert on a short timescale?
John Laing Leal was a physician and water treatment expert who, in 1908, was responsible for conceiving and implementing the first disinfection of a U.S. drinking water supply using chlorine. He was one of the principal expert witnesses at two trials which examined the quality of the water supply in Jersey City, New Jersey, and which evaluated the safety and utility of chlorine for production of "pure and wholesome" drinking water. The second trial verdict approved the use of chlorine to disinfect drinking water which led to an explosion of its use in water supplies across the U.S.
Water chlorination is the process of adding chlorine or chlorine compounds such as sodium hypochlorite to water. This method is used to kill bacteria, viruses and other microbes in water. In particular, chlorination is used to prevent the spread of waterborne diseases such as cholera, dysentery, and typhoid.
Background: Next generation sequencing (NGS) is now being used for detecting chromosomal abnormalities in blastocyst trophectoderm (TE) cells from in vitro fertilized embryos. However, few data are available regarding the clinical outcome, which provides vital reference for further application of the methodology. Here, we present a clinical evaluation of NGS-based preimplantation genetic diagnosis/screening (PGD/PGS) compared with single nucleotide polymorphism (SNP) array-based PGD/PGS as a control.
Results: A total of 395 couples participated. They were carriers of either translocation or inversion mutations, or were patients with recurrent miscarriage and/or advanced maternal age. A total of 1,512 blastocysts were biopsied on D5 after fertilization, with 1,058 blastocysts set aside for SNP array testing and 454 blastocysts for NGS testing. In the NGS cycles group, the implantation, clinical pregnancy and miscarriage rates were 52.6% (60⁄114), 61.3% (49⁄80) and 14.3% (7⁄49), respectively. In the SNP array cycles group, the implantation, clinical pregnancy and miscarriage rates were 47.6% (139⁄292), 56.7% (115⁄203) and 14.8% (17⁄115), respectively. The outcome measures of the NGS and SNP array cycles did not differ significantly. There were 150 blastocysts that underwent both NGS and SNP array analysis, of which seven blastocysts were found with inconsistent signals. All other signals obtained from NGS analysis were confirmed to be accurate by validation with qPCR. The relative copy number of mitochondrial DNA (mtDNA) for each blastocyst that underwent NGS testing was evaluated, and a significant difference was found between the copy number of mtDNA for the euploid and the chromosomally abnormal blastocysts. So far, out of 42 ongoing pregnancies, 24 babies have been born in NGS cycles; all of these babies are healthy and free of any developmental problems.
Conclusions: This study provides the first evaluation of the clinical outcomes of NGS-based pre-implantation genetic diagnosis/screening, and shows the reliability of this method in a clinical and array-based laboratory setting. NGS provides an accurate approach to detect embryonic imbalanced segmental rearrangements, to avoid the potential risks of false signals from SNP array in this study. [Keywords: preimplantation genetic diagnosis/screening, next generation sequencing, blastocyst, cryopreserved embryo transfer, clinical outcome]
Earlier work described a mutation in DEC2, also known as BHLHE41 (basic helix-loop-helix family member e41), as causal in a family of short sleepers, who needed just 6 h sleep per night. We evaluated whether there were other variants of this gene in two well-phenotyped cohorts, using sequencing of the BHLHE41 gene, electroencephalographic data and delta power analysis, and functional studies using cell-based luciferase assays. We identified new variants of the BHLHE41 gene in two cohorts who had either acute sleep deprivation (n = 200) or chronic partial sleep deprivation (n = 217). One variant, Y362H, at another location in the same exon, occurred in one twin of a dizygotic twin pair and was associated with reduced sleep duration, less recovery sleep following sleep deprivation, and fewer performance lapses during sleep deprivation than in the homozygous twin. Both twins had almost identical amounts of non rapid eye movement (NREM) sleep. This variant reduced the ability of BHLHE41 to suppress CLOCK/BMAL1 and NPAS2/BMAL1 transactivation in vitro. Another variant in the same exon had no effect on sleep or response to sleep deprivation and no effect on CLOCK/BMAL1 transactivation. Random mutagenesis identified a number of other variants of BHLHE41 that affect its function. There are a number of mutations of BHLHE41. Mutations reduce total sleep while maintaining NREM sleep and provide resistance to the effects of sleep loss. Mutations that affect sleep also modify the normal inhibition by BHLHE41 of CLOCK/BMAL1 transactivation. Thus, clock mechanisms are likely involved in setting sleep length and the magnitude of sleep homeostasis. Pellegrino R, Kavakli IH, Goel N, Cardinale CJ, Dinges DF, Kuna ST, Maislin G, Van Dongen HP, Tufik S, Hogenesch JB, Hakonarson H, Pack AI. A novel BHLHE41 variant is associated with short sleep and resistance to sleep deprivation in humans. SLEEP 2014;37(8):1327-1336.
Comparative studies of the brain in mammals suggest that there are general architectural principles governing its growth and evolutionary development. We are beginning to understand the geometric, biophysical and energy constraints that have governed the evolution and functional organization of the brain and its underlying neuronal network. The object of this review is to present current perspectives on primate brain evolution, especially in humans, and to examine some hypothetical organizing principles that underlie the brain's complex organization. Some of the design principles and operational modes that underlie the information processing capacity of the cerebral cortex in primates will be explored. It is shown that the development of the cortex coordinates folding with connectivity in a way that produces smaller and faster brains than otherwise would have been possible. In view of the central importance placed on brain evolution in explaining the success of our own species, one may wonder whether there are physical limits that constrain its processing power and evolutionary potential. It will be argued that at a brain size of about 3,500 cm³, corresponding to a brain volume two to three times that of modern man, the brain seems to reach its maximum processing capacity. The larger the brain grows beyond this critical size, the less efficient it will become, thus limiting any improvement in cognitive power.
[130 epigrams on computer science and technology, published in 1982, for ACM’s SIGPLAN journal, by noted computer scientist and programming language researcher Alan Perlis. The epigrams are a series of short, programming-language-neutral, humorous statements about computers and programming, distilling lessons he had learned over his career, which are widely quoted.]
8. A programming language is low level when its programs require attention to the irrelevant.…19. A language that doesn’t affect the way you think about programming, is not worth knowing.…54. Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.
15. Everything should be built top-down, except the first time.…30. In programming, everything we do is a special case of something more general—and often we know it too quickly.…31. Simplicity does not precede complexity, but follows it.…58. Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.…65. Make no mistake about it: Computers process numbers—not symbols. We measure our understanding (and control) by the extent to which we can arithmetize an activity.…56. Software is under a constant tension. Being symbolic it is arbitrarily perfectible; but also it is arbitrarily changeable.
1. One man’s constant is another man’s variable. 34. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information.
36. The use of a program to prove the 4-color theorem will not change mathematics—it merely demonstrates that the theorem, a challenge for a century, is probably not important to mathematics.
39. Re graphics: A picture is worth 10K words—but only those to describe the picture. Hardly any sets of 10K words can be adequately described with pictures.
48. The best book on programming for the layman is Alice in Wonderland; but that’s because it’s the best book on anything for the layman.
77. The cybernetic exchange between man, computer and algorithm is like a game of musical chairs: The frantic search for balance always leaves one of the 3 standing ill at ease.…79. A year spent in artificial intelligence is enough to make one believe in God.…84. Motto for a research laboratory: What we work on today, others will first think of tomorrow.
91. The computer reminds one of Lon Chaney—it is the machine of a thousand faces.
7. It is easier to write an incorrect program than understand a correct one.…93. When someone says “I want a programming language in which I need only say what I wish done,” give him a lollipop.…102. One can’t proceed from the informal to the formal by formal means.
100. We will never run out of things to program as long as there is a single program around.
108. Whenever 2 programmers meet to criticize their programs, both are silent.…112. Computer Science is embarrassed by the computer.…115. Most people find the concept of programming obvious, but the doing impossible. 116. You think you know when you can learn, are more sure when you can write, even more when you can teach, but certain when you can program. 117. It goes against the grain of modern education to teach children to program. What fun is there in making plans, acquiring discipline in organizing thoughts, devoting attention to detail and learning to be self-critical?
Tacit Knowledge, embodied in people rather than words, equations, or diagrams, plays a vital role in science. The historical record of the development and spread of nuclear weapons and the recollections of their designers suggest that tacit knowledge is also crucial to nuclear weapons development. Therefore, if design ceases, and if there is no new generation of designers to whom that tacit knowledge can be passed, then in an important (though qualified) sense nuclear weapons will have been uninvented. Their renewed development would thus have some of the characteristics of reinvention rather than simply copying. In addition, knowledge may be lost not only as a result of complete disarmament, but also as a consequence of likely measures such as a nuclear test ban.
Although the Internet Archive’s Wayback Machine is the largest and most well-known web archive, there have been a number of public web archives that have emerged in the last several years. With varying resources, audiences and collection development policies, these archives have varying levels of overlap with each other. While individual archives can be measured in terms of number of URIs, number of copies per URI, and intersection with other archives, to date there has been no answer to the question "How much of the Web is archived?" We study the question by approximating the Web using sample URIs from DMOZ, Delicious, Bitly, and search engine indexes, and counting the number of copies of the sample URIs that exist in various public web archives. Each sample set provides its own bias. The results from our sample sets indicate that from 35%-90% of the Web has at least one archived copy, 17%-49% has between 2-5 copies, 1%-8% has 6-10 copies, and 8%-63% has more than 10 copies in public web archives. The number of URI copies varies as a function of time, but no more than 31.3% of URIs are archived more than once per month.
In this study we quantify economic benefits from projected improvements in worker productivity resulting from the reduction in children’s exposure to lead in the United States since 1976. We calculated the decline in blood lead levels (BLLs) from 1976 to 1999 on the basis of nationally representative National Health and Nutrition Examination Survey (NHANES) data collected during 1976 through 1980, 1991 through 1994, and 1999. The decline in mean BLL in 1- to 5-year-old U.S. children from 1976-1980 to 1991-1994 was 12.3 µg/dL, and the estimated decline from 1976 to 1999 was 15.1 µg/dL. We assumed the change in cognitive ability resulting from declines in BLLs, on the basis of published meta-analyses, to be between 0.185 and 0.323 IQ points for each 1 µg/dL of blood lead concentration. These calculations imply that, because of falling BLLs, U.S. preschool-aged children in the late 1990s had IQs that were, on average, 2.2-4.7 points higher than they would have been if they had the blood lead distribution observed among U.S. preschool-aged children in the late 1970s. We estimated that each IQ point raises worker productivity 1.76-2.38%. With discounted lifetime earnings of $723,300 for each 2-year-old in 2000 dollars, the estimated economic benefit for each year’s cohort of 3.8 million 2-year-old children ranges from $110 billion to $319 billion.
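The headline range can be roughly reconstructed from the abstract’s own figures (small discrepancies from the published $110-319 billion presumably reflect rounding in the abstract):

```python
# Worked arithmetic from the abstract's figures (2000 dollars).
iq_low, iq_high = 2.2, 4.7            # IQ points gained from lower blood lead
prod_low, prod_high = 0.0176, 0.0238  # productivity gain per IQ point (1.76-2.38%)
earnings = 723_300                    # discounted lifetime earnings per 2-year-old
cohort = 3.8e6                        # 2-year-olds per yearly cohort

low = iq_low * prod_low * earnings * cohort
high = iq_high * prod_high * earnings * cohort
print(f"${low/1e9:.0f}B to ${high/1e9:.0f}B per cohort")  # ~ $106B to $307B
```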
Background: Results from previous studies show that the cognitive ability of offspring might be irreversibly damaged as a result of their mother’s mild iodine deficiency during pregnancy. A reduced intelligence quotient (IQ) score has broad economic and societal cost implications because intelligence affects wellbeing, income, and education outcomes. Although pregnancy and lactation lead to increased iodine needs, no UK recommendations for iodine supplementation have been issued to pregnant women. We aimed to investigate the cost-effectiveness of iodine supplementation versus no supplementation for pregnant women in a mildly to moderately iodine-deficient population for which a population-based iodine supplementation programme—for example, universal salt iodisation—did not exist.
Methods: We systematically searched MEDLINE, Embase, EconLit, and NHS EED for economic studies that linked IQ and income published in all languages until Aug 21, 2014. We took clinical data relating to iodine deficiency in pregnant women and the effect on IQ in their children aged 8–9 years from primary research. A decision tree was developed to compare the treatment strategies of iodine supplementation in tablet form with no iodine supplementation for pregnant women in the UK. Analyses were done from a health service perspective (analysis 1; taking direct health service costs into account) and societal perspective (analysis 2; taking education costs and the value of an IQ point itself into account), and presented in terms of cost (in sterling, relevant to 2013) per IQ point gained in the offspring. We made data-supported assumptions to complete these analyses, but used a conservative approach that limited the benefits of iodine supplementation and overestimated its potential harms.
Findings: Our systematic search identified 1361 published articles, of which eight were assessed to calculate the monetary value of an IQ point. A discounted lifetime value of an additional IQ point based on earnings was estimated to be £3297 (study estimates range from £1319 to £11 967) for the offspring cohort. Iodine supplementation was cost saving from both a health service perspective (saving £199 per pregnant woman [sensitivity analysis range –£42 to £229]) and societal perspective (saving £4476 per pregnant woman [sensitivity analysis range £540 to £4495]), with a net gain of 1·22 IQ points in each analysis. Base case results were robust to sensitivity analyses.
Interpretation: Iodine supplementation for pregnant women in the UK is potentially cost saving. This finding also has implications for the 1·88 billion people in the 32 countries with iodine deficiency worldwide. Valuation of IQ points should consider non-earnings benefits—eg, health benefits associated with a higher IQ not germane to earnings.
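Part of the Findings arithmetic can be checked directly from the abstract’s own figures; a rough sketch (the gap between this and the reported £4476 societal saving reflects avoided health-service and education costs not itemized in the abstract):

```python
iq_point_value = 3297   # discounted lifetime earnings value per IQ point (GBP)
iq_gain = 1.22          # net IQ points gained per offspring
print(f"earnings value of IQ gain: ~£{iq_point_value * iq_gain:,.0f}")  # ~£4,022
```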
The Barbary slave trade refers to slave markets on the Barbary Coast of North Africa, which included the Ottoman provinces of Algeria, Tunisia and Tripolitania and the independent sultanate of Morocco, between the 16th and middle of the 18th century. The Ottoman provinces in North Africa were nominally under Ottoman suzerainty, but in reality they were mostly autonomous.
Artificial reinforcement learning (RL) is a widely used technique in artificial intelligence that provides a general method for training agents to perform a wide variety of behaviours. RL as used in computer science has striking parallels to reward and punishment learning in animal and human brains. I argue that present-day artificial RL agents have a very small but nonzero degree of ethical importance. This is particularly plausible for views according to which sentience comes in degrees based on the abilities and complexities of minds, but even binary views on consciousness should assign nonzero probability to RL programs having morally relevant experiences. While RL programs are not a top ethical priority today, they may become more significant in the coming decades as RL is increasingly applied to industry, robotics, video games, and other areas. I encourage scientists, philosophers, and citizens to begin a conversation about our ethical duties to reduce the harm that we inflict on powerless, voiceless RL agents.
[Fierce but witty critique by David Stove of philosophy throughout the ages and defense of Logical Positivism, with Christian theology, Neoplatonism, and German Idealism as examples. Logical Positivists took the easy way out: the problem with these philosophies is not that they are gibberish or meaningless, because at least then they would all be wrong in the same way and could perhaps be refuted in the same way, but that they each are wrong in a myriad of different ways, ways for which we have no existing “fallacy” defined, entire universes of new errors—undermining the hope of using reason or philosophy to make any kind of progress. What is wrong with philosophy, and ourselves, if we cannot even explain why these are so badly wrong after millennia of thought and debate?]
(WP; TVTropes; LW discussion; non-PDF version) 2007 short story, set in a Renaissance-esque fantasy historical setting, featuring a cambist (a money-exchanger) who is set three dangerous tasks by a bored and dissolute aristocrat. The 3 challenges illustrate principles of economics:
the exchange theory of value: the value of something is what you can exchange it for in the market
revealed preferences: the choices individuals and groups make reveal the true value they set on things, regardless of what they may say
gains from trade: a trade of 2 things, which remain unchanged, can make both parties better off
Borges considers the problem of whether Argentinian writing on non-Argentinian subjects can still be truly “Argentine”. His conclusion: “…we should not be alarmed, and we should feel that our patrimony is the universe; we should essay all themes, and we cannot limit ourselves to purely Argentine subjects in order to be Argentine; for either being Argentine is an inescapable act of fate—and in that case we shall be so in all events—or being Argentine is a mere affectation, a mask. I believe that if we surrender ourselves to that voluntary dream which is artistic creation, we shall be Argentine and we shall also be good or tolerable writers.”
Still Alice is a 2007 novel by Lisa Genova. The novel is about a woman who suffers early-onset Alzheimer's disease. Alice Howland, a 50-year-old woman, is a cognitive psychology professor at Harvard University and is a world-renowned linguistics expert. She is married to an equally successful husband, and they have three grown children. The disease takes hold swiftly, and it changes Alice’s relationship with her family and the world. It was Genova's first novel.
The Fractal Prince is the second science fiction novel by Hannu Rajaniemi and the second novel to feature the post-human gentleman thief Jean le Flambeur. It was published in Britain by Gollancz in September 2012, and by Tor in the same year in the US. The novel is the second in the trilogy, following The Quantum Thief (2010) and preceding The Causal Angel (2014).
The Causal Angel is the third science fiction novel by Hannu Rajaniemi featuring the protagonist Jean le Flambeur. It was published in July 2014 by Gollancz in the UK and by Tor in the US. The novel is the finale of a trilogy. The previous novels in the series are The Quantum Thief (2010) and The Fractal Prince (2012).
A Perfect Vacuum is a 1971 book by Polish author Stanisław Lem, his largest and best-known collection of fictitious criticism of nonexistent books. It was translated into English by Michael Kandel. Some of the reviews remind the reader of drafts of his science fiction novels, some read like philosophical pieces across scientific topics, from cosmology to the pervasiveness of computers, while others satirize and parody everything from the nouveau roman to pornography, Ulysses, authorless writing, and Dostoevsky.
The Martian is a 2011 science fiction novel written by Andy Weir. It was his debut novel under his own name. It was originally self-published in 2011; Crown Publishing purchased the rights and re-released it in 2014. The story follows an American astronaut, Mark Watney, as he becomes stranded alone on Mars in 2035 and must improvise in order to survive. The Martian, a film adaptation directed by Ridley Scott and starring Matt Damon, was released in October 2015.
Ready Player One is a 2011 science fiction novel, and the debut novel of American author Ernest Cline. The story, set in a dystopia in 2045, follows protagonist Wade Watts on his search for an Easter egg in a worldwide virtual reality game, the discovery of which would lead him to inherit the game creator's fortune. Cline sold the rights to publish the novel in June 2010, in a bidding war to the Crown Publishing Group. The book was published on August 16, 2011. An audiobook was released the same day; it was narrated by Wil Wheaton, who was mentioned briefly in one of the chapters. In 2012, the book received an Alex Award from the Young Adult Library Services Association division of the American Library Association and won the 2011 Prometheus Award. A film adaptation, screenwritten by Cline and Zak Penn and directed by Steven Spielberg, was released on March 29, 2018. A sequel, Ready Player Two, was released on November 24, 2020.
Christopher Murray is an American researcher in global health and public health at the University of Washington in Seattle and is the institute director of the Institute for Health Metrics and Evaluation (IHME). Beginning in 1990, he has worked on ways to measure the burden of disease and disability around the globe. He has led several projects to gather that data, disease-by-disease, country by country. The aim of these efforts, which involve the work of hundreds of researchers, is to provide data for policy makers around the world to allocate healthcare resources.
Disease burden is the impact of a health problem as measured by financial cost, mortality, morbidity, or other indicators. It is often quantified in terms of quality-adjusted life years (QALYs) or disability-adjusted life years (DALYs). Both of these metrics quantify the number of years lost due to disability (YLDs), sometimes also known as years lost due to disease or years lived with disability/disease. One DALY can be thought of as one year of healthy life lost, and the overall disease burden can be thought of as a measure of the gap between current health status and the ideal health status. According to an article published in The Lancet in June 2015, low back pain and major depressive disorder were among the top ten causes of YLDs and were the cause of more health loss than diabetes, chronic obstructive pulmonary disease, and asthma combined. The study, based on data from 188 countries and considered to be the largest and most detailed analysis to quantify levels, patterns, and trends in ill health and disability, concluded that "the proportion of disability-adjusted life years due to YLDs increased globally from 21.1% in 1990 to 31.2% in 2013." The environmental burden of disease is defined as the number of DALYs that can be attributed to environmental factors; similarly, the work-related burden of disease is defined as the number of deaths and DALYs that can be attributed to occupational risk factors. These measures allow for comparison of disease burdens, and have also been used to forecast the possible impacts of health interventions. By 2014, DALYs per head were "40% higher in low-income and middle-income regions."
Ivor Armstrong Richards, known as I. A. Richards, was an English educator, literary critic, and rhetorician. His work contributed to the foundations of the New Criticism, a formalist movement in literary theory which emphasized the close reading of a literary text, especially poetry, in an effort to discover how a work of literature functions as a self-contained and self-referential æsthetic object.
Nim Chimpsky was a chimpanzee and the subject of an extended study of animal language acquisition at Columbia University. The project was led by Herbert S. Terrace with the linguistic analysis headed up by psycholinguist Thomas Bever. Chimpsky was given his name as a pun on linguist Noam Chomsky, who posits that humans are "wired" to develop language. Though usually called Nim Chimpsky, his full name was Neam Chimpsky, or Nim for short.
Dead Birds is a 1963 American documentary film by Robert Gardner (1925-2014) about the ritual warfare cycle of the Dugum Dani people, who live in the Baliem Valley in present-day Irian Jaya province on the western half of the island of New Guinea, part of present-day Indonesia. The film presents footage of battles between the Willihiman-Wallalua clan and the Wittaia clan, with scenes of the funeral of a small boy killed by a raiding party, the women's work that goes on while battles continue, and the wait for the enemy to appear. In 1964 the film received the Grand Prize "Marzocco d'Oro" at the 5th Festival dei Popoli rassegna internazionale del film etnografico e sociologico in Florence, Italy, and the Robert J. Flaherty Award given by the City College of New York, and was a featured film at the Melbourne Film Festival. In 1998, Dead Birds was included in the annual selection of 25 motion pictures added to the National Film Registry of the Library of Congress, being deemed "culturally, historically, or aesthetically significant" and recommended for preservation. Dead Birds has come to hold canonical status among ethnographic films.
A State of Mind is a 2004 documentary film directed by Daniel Gordon and produced by Nicholas Bonner. It follows two North Korean child gymnasts and their families for over eight months during training for the 2003 Pyongyang mass games. The film won two awards at the North Korean Pyongyang International Film Festival in 2004 and was shown at 11 other film festivals worldwide before being released in a theatrical run in 2005.
How the Grinch Stole Christmas! is a 1966 animated television special, directed and co-produced by Chuck Jones. It is based on the 1957 children's book of the same name by Dr. Seuss, and tells the story of the Grinch, who tries to ruin Christmas for the townsfolk of Whoville below his mountain hideaway. Originally telecast in the United States on CBS on December 18, 1966, it went on to become a perennial holiday special. The special features the voice of Boris Karloff as both the Grinch and the narrator.
The Theory of Everything is a 2014 biographical romantic drama film directed by James Marsh. Set at the University of Cambridge, it details the life of the theoretical physicist Stephen Hawking. It was adapted by Anthony McCarten from the 2007 memoir Travelling to Infinity: My Life with Stephen by Jane Hawking, which deals with her relationship with her ex-husband Stephen Hawking, his diagnosis of amyotrophic lateral sclerosis, and his success in the field of physics. The film stars Eddie Redmayne and Felicity Jones, with Charlie Cox, Emily Watson, Simon McBurney, Christian McKay, Harry Lloyd, and David Thewlis featured in supporting roles. The film had its world premiere at the 2014 Toronto International Film Festival on 7 September 2014. It had its UK premiere on 1 January 2015.
The Martian is a 2015 science fiction film directed by Ridley Scott and starring Matt Damon. The screenplay by Drew Goddard is adapted from Andy Weir's 2011 novel of the same name. The film depicts an astronaut's lone struggle to survive on Mars after being left behind, and the efforts to rescue him and bring him home to Earth. It also stars Jessica Chastain, Jeff Daniels, Kristen Wiig, Chiwetel Ejiofor, Sean Bean, Michael Peña, Kate Mara, Sebastian Stan, Aksel Hennie, Mackenzie Davis, Donald Glover, and Benedict Wong.
Big Hero 6 is a 2014 American 3D computer animated superhero film produced by Walt Disney Animation Studios and released by Walt Disney Pictures. Loosely based on the superhero team of the same name by Marvel Comics, the film is the 54th Disney animated feature film. Directed by Don Hall and Chris Williams, the film tells the story of Hiro Hamada, a young robotics prodigy, and Baymax, his late brother's healthcare-provider robot, who form a superhero team to combat a masked villain. The film features the voices of Scott Adsit, Ryan Potter, Daniel Henney, T.J. Miller, Jamie Chung, Damon Wayans Jr., Genesis Rodriguez, Alan Tudyk, James Cromwell, and Maya Rudolph.
The Great Gatsby is a 2013 romantic drama film based on F. Scott Fitzgerald's 1925 novel of the same name. The film was co-written and directed by Baz Luhrmann and stars Leonardo DiCaprio as the eponymous Jay Gatsby, with Tobey Maguire, Carey Mulligan, Joel Edgerton, Isla Fisher, Jason Clarke, Elizabeth Debicki and Jack Thompson. Jay-Z served as executive producer. Production began in 2011 and took place in Australia, with a $105 million net production budget. The film follows the life and times of millionaire Jay Gatsby (DiCaprio) and his neighbor Nick Carraway (Maguire), who recounts his encounter with Gatsby at the height of the Roaring Twenties on Long Island.
Cowboy Bebop is a Japanese science-fiction anime television series animated by Sunrise featuring a production team led by director Shinichirō Watanabe, screenwriter Keiko Nobumoto, character designer Toshihiro Kawamoto, mechanical designer Kimitoshi Yamane, and composer Yoko Kanno. The twenty-six episodes ("sessions") of the series are set in the year 2071, and follow the lives of a bounty hunter crew traveling in their spaceship called the Bebop. Although it covers a wide range of genres throughout its run, Cowboy Bebop draws most heavily from science fiction, western and noir films, and its most recurring thematic focal points include adult existential ennui, loneliness and the difficulties of trying to escape one's past.
Mushishi is a Japanese manga series written and illustrated by Yuki Urushibara. It was serialized in Afternoon Season Zōkan from 1999 to 2002, and in Monthly Afternoon from December 2002 to August 2008. The individual chapters were collected and released into ten tankōbon volumes by Kodansha. Those volumes were localized to North America by Del Rey between January 2007 and August 2010. The series follows Ginko, a man who dedicates himself to keeping people protected from supernatural creatures called Mushi.
Hozuki's Coolheadedness is a Japanese manga series written and illustrated by Natsumi Eguchi. The plot revolves around Hozuki, a demon who works for the King and Head Judge of Hell. It was serialized by Kodansha in the magazine Weekly Morning between March 2011 and January 2020, with chapters collected in thirty tankōbon volumes. The manga was adapted into a television anime series; Wit Studio produced the first season in 2014, and Studio Deen was responsible for a second season in 2017–2018. The former studio also produced three original animation DVDs (OADs) in 2015, while the latter produced one OAD in 2017, and Pine Jam produced three more OADs in 2019 and 2020.
Expelled from Paradise is a 2014 Japanese anime science fiction film directed by Seiji Mizushima from a screenplay by Gen Urobuchi; it was produced by Toei Animation, animated by Graphinica, and distributed by T-Joy in cooperation with Toei Company.
Monthly Girls' Nozaki-kun is a Japanese four-panel romantic comedy manga written and illustrated by Izumi Tsubaki. The chapters are serialized online in Gangan Online and have been published by Square Enix in both physical and digital releases of Shoujo Romance Girly and in tankōbon volumes. An anime adaptation by Doga Kobo began airing in July 2014.
Shirobako is a 24-episode anime television series produced by P.A.Works and directed by Tsutomu Mizushima. It aired in Japan between October 9, 2014 and March 26, 2015. A manga adaptation began serialization in ASCII Media Works's Dengeki Daioh magazine in September 2014, and a novel was published by Shueisha in January 2015. An anime film premiered on February 29, 2020.
Fate/stay night: Unlimited Blade Works is a 2010 Japanese animated fantasy action film directed by Yūji Yamaguchi. Unlimited Blade Works covers the events of the second route of the visual novel Fate/stay night by Type-Moon. The film primarily focuses on two young mages, Shirou Emiya and Rin Tohsaka, and their servants, who participate in a conflict known as the Holy Grail War. During the fights, Shirou often crosses paths with Rin's servant, Archer, who seeks his death despite being an ally.
This page is a changelog for Gwern.net: a monthly reverse chronological list of recent major writings/changes/additions.
Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/Gwern subreddit submissions, I write up the reasonably-interesting changes and send them out to the mailing list together with a compilation of links & reviews (archives).
A subreddit for posting links of interest and for announcing updates to gwern.net (it can also be used as an RSS feed). Submissions are categorized similarly to the monthly newsletter and typically will be collated there.
The cypherpunk movement laid the ideological roots of Bitcoin and the online drug market Silk Road; balancing the previous emphasis on cryptography, I emphasize the non-cryptographic market aspects of Silk Road, which are rooted in cypherpunk economic reasoning. I give a fully-detailed account of how a buyer might use market information to buy rationally, and finish by discussing the strengths and weaknesses of Silk Road, and what future developments are predicted by cypherpunk ideas.
I compile a dataset of 87 public English-language darknet markets (DNMs) 2011–2016 in the vein of the famous Silk Road 1, recording their openings/closings and relevant characteristics. A survival analysis indicates the markets follow a Type TODO lifespan, with a median life of TODO months. Risk factors include TODO. With the best model, I generate lifespan estimates for the currently-operating markets.
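To give a concrete sense of the kind of analysis involved (the page's own code may well use R's survival library instead, and the file and column names here are hypothetical), a minimal nonparametric survival sketch in Python with the lifelines library:

```python
# Minimal sketch of a DNM survival analysis, assuming a hypothetical CSV with
# one row per market: `lifespan` (months open) and `closed` (1 = observed
# closure, 0 = still operating, i.e. right-censored).
import pandas as pd
from lifelines import KaplanMeierFitter

dnms = pd.read_csv("dnm-lifespans.csv")
kmf = KaplanMeierFitter()
kmf.fit(durations=dnms["lifespan"], event_observed=dnms["closed"])
print(kmf.median_survival_time_)   # estimated median market lifetime (months)
kmf.plot_survival_function()       # survival curve over market age (newer lifelines)
```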
Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata such as genre or author prefixed to each line of input, allowing for better & more efficient learning of metadata, and more controllable sampling of generated output by feeding in the desired metadata. A 2015 experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn the associated prose styles, and often generate text visibly similar to that of a specified author.
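A minimal sketch of the corpus preparation (the filenames and the `|` separator are my own illustrative choices, not the experiment's actual format): prefix each training line with its metadata so the char-RNN sees the author label as ordinary characters, and later sampling can be steered by seeding with the desired prefix. The same trick carries over to the "GPT-2-poetry-prefix" model described next.

```python
# Illustrative corpus preparation: prefix every line with its author so the
# char-RNN learns the association. Separator and filenames are hypothetical.
from pathlib import Path

books = {"austen": "pride-and-prejudice.txt", "melville": "moby-dick.txt"}
with open("corpus.txt", "w") as out:
    for author, path in books.items():
        for line in Path(path).read_text().splitlines():
            if line.strip():                       # skip blank lines
                out.write(f"{author}|{line}\n")    # e.g. "austen|It is a truth..."

# At sampling time, seeding generation with "austen|" steers the model
# toward that author's style.
```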
In February 2019, following up on my 2015–2016 text-generation experiments with char-RNNs, I experiment with the cutting-edge Transformer NN architecture for language modeling & text generation. Using OpenAI’s GPT-2-117M model pre-trained on a large Internet corpus and nshepperd’s finetuning code, I retrain GPT-2-117M on a large (117MB) Project Gutenberg poetry corpus. I demonstrate how to train 2 variants: “GPT-2-poetry”, trained on the poems as a continuous stream of text, and “GPT-2-poetry-prefix”, with each line prefixed with the metadata of the PG book it came from. In May 2019, I trained the next-largest GPT-2, GPT-2-345M, similarly, for a further quality boost in generated poems. In October 2019, I retrained GPT-2-117M on a Project Gutenberg corpus with improved formatting, and combined it with a contemporary poem dataset based on the Poetry Foundation’s website. With just a few GPU-days on 1080ti GPUs, GPT-2-117M finetuning can produce high-quality poetry which is more thematically consistent than my char-RNN poems, capable of modeling subtle features like rhyming, and sometimes even a pleasure to read. I list the many possible ways to improve poem generation and further approach human-level poems. For the highest-quality AI poetry to date, see my followup page, “GPT-3 Creative Writing”.
Alan Jay Perlis was an American computer scientist and professor at Purdue University, Carnegie Mellon University and Yale University. He is best known for his pioneering work in programming languages and was the first recipient of the Turing Award.
The Cambist and Lord Iron: A Fairy Tale of Economics is a 2007 novelette by Daniel Abraham. It was originally published in the anthology Logorrhea: Good Words Make Good Stories, and subsequently republished in The Year's Best Fantasy and Horror 2008: 21st Annual Collection (2008), in Fantasy: The Best of the Year (2008), in The Best Science Fiction and Fantasy of the Year Volume Two (2008), and in Lightspeed (2013); as well, an audio version was made available via PodCastle in 2009.
In political economy and especially Marxian economics, exchange value refers to one of four major attributes of a commodity, i.e., an item or service produced for, and sold on the market. The other three aspects are use value, economic value, and price. Thus, a commodity has:
a value, represented by the socially necessary labour time to produce it;
a use value;
an exchange value, which is the proportion at which a commodity can be exchanged for other commodities; and a price (an actual selling price, or an imputed ideal price).
In economics, economic value is a measure of the benefit provided by a good or service to an economic agent. It is generally measured relative to units of currency, and the interpretation is therefore "what is the maximum amount of money a specific actor is willing and able to pay for the good or service?"
Revealed preference theory, pioneered by economist Paul Samuelson, is a method of analyzing choices made by individuals, mostly used for comparing the influence of policies on consumer behavior. Revealed preference models assume that the preferences of consumers can be revealed by their purchasing habits.
In economics, gains from trade are the net benefits to economic agents from being allowed an increase in voluntary trading with each other. In technical terms, they are the increase of consumer surplus plus producer surplus from lower tariffs or otherwise liberalizing trade.
Subscription page for the monthly gwern.net newsletter, which includes summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives back to December 2013.
I continue my AI poetry generation experiments with OpenAI’s 2020 GPT-3, which is 116× larger, and much more powerful, than the 2019 GPT-2. GPT-3, however, is not merely a quantitative tweak yielding “GPT-2 but better”—it is qualitatively different, exhibiting eerie runtime learning capabilities allowing even the raw model, with zero finetuning, to “meta-learn” many textual tasks purely by example or instruction. One does not train or program GPT-3 in a normal way, but one engages in dialogue and writes prompts to teach GPT-3 what one wants.
Experimenting through the OpenAI Beta API in June 2020, I find that GPT-3 does not just match my finetuned GPT-2-1.5b-poetry for poem-writing quality, but exceeds it, while being versatile in handling poetry, Tom Swifty puns, science fiction, dialogue like Turing’s Turing-test dialogue, literary style parodies… As the pièce de résistance, I recreate Stanislaw Lem’s Cyberiad’s “Trurl’s Electronic Bard” poetry using GPT-3. (Along the way, I document instances of how the BPE text encoding unnecessarily damages GPT-3’s performance on a variety of tasks, how to best elicit the highest-quality responses, common errors people make in using GPT-3, and test out GPT-3’s improvements in NN weak points like logic or commonsense knowledge.)
GPT-3’s samples are not just close to human level: they are creative, witty, deep, meta, and often beautiful. They demonstrate an ability to handle abstractions, like style parodies, I have not seen in GPT-2 at all. Chatting with GPT-3 feels uncannily like chatting with a human. I was impressed by the results reported in the GPT-3 paper, and after spending a week trying it out, I remain impressed.
This page records GPT-3 samples I generated in my explorations, and thoughts on how to use GPT-3 and its remaining weaknesses. I hope you enjoy them even a tenth as much as I enjoyed testing GPT-3 and watching the completions scroll across my screen.
The Poetry Foundation is a Chicago-based American foundation created to promote poetry in the wider culture. It was formed from Poetry magazine, which it continues to publish, with a 2003 gift of $200 million from philanthropist Ruth Lilly.
In November 2019, I experimented with training a GPT-2 neural net model to generate folk music in the high-level ABC music text format, following previous work in 2016 which used a char-RNN trained on a dataset from ‘The Session’. A GPT-2 hypothetically can improve on an RNN by better global coherence & copying of patterns, without problems with the hidden-state bottleneck.
I encountered problems with the standard GPT-2 model’s encoding of text which damaged results, but after fixing that, I successfully trained it on n = 205,304 ABC music pieces taken from The Session & ABCnotation.com. The resulting music samples are in my opinion quite pleasant. (A similar model was later retrained by Geerlings & Meroño-Peñuela 2020.)
We followed the ABC folk model with an ABC-MIDI model: a dataset of 453k ABC pieces decompiled from MIDI pieces, which fit into GPT-2-117M with an expanded context window when trained on TPUs. The MIDI pieces are far more diverse and challenging, and GPT-2 underfits and struggles to produce valid samples but when sampling succeeds, it can generate even better musical samples.
Standard language generation neural network models, like GPT-2, are trained via likelihood training to imitate human text corpuses. Generated text suffers from persistent flaws like repetition, due to myopic generation word-by-word, and cannot improve on the training data because they are trained to predict ‘realistic’ completions of the training data.
A proposed alternative is to use reinforcement learning to train the NNs, to encourage global properties like coherence & lack of repetition, and potentially improve over the original corpus’s average quality. Preference learning trains a reward function on human ratings, and uses that as the ‘environment’ for a blackbox DRL algorithm like PPO.
OpenAI released a codebase implementing this dual-model preference learning approach for textual generation, based on GPT-2. Having previously used GPT-2 for poetry & music generation, I experimented with GPT-2 preference learning for unconditional music and poetry generation.
I found that preference learning seemed to work better for music than poetry, and seemed to reduce the presence of repetition artifacts, but the results, at n≅7,400 ratings compiled over 23 iterations of training+sampling November 2019–January 2020, are not dramatically better than alternative improvements like scaling up models, more thorough data-cleaning, or more stringent sample curation. My blind ratings using n≅200 comparisons showed no large advantage for the RL-tuned samples (winning only 93 of 210 comparisons, or ~44%).
This may be due to insufficient ratings, bad hyperparameters, or not using samples generated with common prefixes, but I suspect it’s the first, as some NLP tasks in Ziegler et al 2019 required up to 60k ratings for good performance, and the reward model appeared to achieve poor performance & succumb to adversarial examples easily.
Working with it, I suspect that preference learning is unnecessarily sample-inefficient & data-inefficient, and that the blackbox reinforcement learning approach is inferior to directly using the reward model to optimize text samples, and propose two major architectural overhauls: have the reward model directly model the implied ranking of every datapoint, and drop the agent model entirely in favor of backprop-powered gradient ascent which optimizes sequences to maximize the reward model’s output.
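A toy sketch of the second overhaul (my own illustration, not code from these experiments): replace the blackbox RL agent with direct gradient ascent on a "soft" token sequence, using the reward model itself as a differentiable objective. The reward model below is a trivial stand-in for the GPT-2-based one:

```python
import torch
import torch.nn as nn

# Trivial stand-in for the learned reward model: scores a (soft) token sequence.
class RewardModel(nn.Module):
    def __init__(self, vocab=256, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, 1))

    def forward(self, soft_tokens):                # (batch, seq, vocab) soft one-hots
        emb = soft_tokens @ self.embed.weight      # differentiable embedding lookup
        return self.score(emb.mean(dim=1))         # one scalar reward per sequence

reward = RewardModel()

# Instead of training a PPO agent, optimize the sequence itself:
logits = torch.randn(1, 32, 256, requires_grad=True)  # 32 free "soft tokens"
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    r = reward(torch.softmax(logits, dim=-1)).sum()
    (-r).backward()                                # gradient *ascent* on reward
    opt.step()

sample = logits.argmax(dim=-1)                     # discretize the optimized sequence
```

The point of the sketch is that the reward model's gradients flow straight into the candidate text, so every rating is used to shape samples directly, rather than indirectly through a sample-hungry RL loop.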
…Black is GPT-2. Its excuse [for this chess blunder] is that it’s a text prediction program with no concept of chess. As far as it knows, it’s trying to predict short alphanumeric strings like “e2e4” or “Nb7”. Nobody told it this represents a board game. It doesn’t even have a concept of 2D space that it could use to understand such a claim. But it still captured my rook! Embarrassing!…Last month, I asked him if he thought GPT-2 could play chess. I wondered if he could train it on a corpus of chess games written in standard notation (where, for example, e2e4 means “move the pawn at square e2 to square e4”). There are literally millions of games written up like this. GPT-2 would learn to predict the next string of text, which would correspond to the next move in the chess game. Then you would prompt it with a chessboard up to a certain point, and it would predict how the chess masters who had produced its training data would continue the game – ie make its next move using the same heuristics they would. Gwern handed the idea to his collaborator Shawn Presser, who had a working GPT-2 chess engine running within a week:…You can play against GPT-2 yourself by following the directions in the last tweet, though it won’t be much of a challenge for anyone better than I am.
…What does this imply? I’m not sure (and maybe it will imply more if someone manages to make it actually good). It was already weird to see something with no auditory qualia learn passable poetic meter. It’s even weirder to see something with no concept of space learn to play chess. Is any of this meaningful? How impressed should we be that the same AI can write poems, compose music, and play chess, without having been designed for any of those tasks? I still don’t know.
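The mechanics are easy to sketch with the transformers library (a toy illustration only: the stock GPT-2 weights here have not been finetuned on chess, so the predicted "move" will be nonsense; Presser's actual engine used a chess-finetuned model and filtered out illegal moves):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")   # a real engine would load chess-finetuned weights

prompt = "e2e4 e7e5 g1f3 b8c6 "                   # the game so far, in coordinate notation
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=4, do_sample=False)
print(tok.decode(out[0][ids.shape[1]:]))          # the model's predicted continuation, i.e. its "move"
```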
“Language Models are Few-Shot Learners”, Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei (2020-05-28):
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions—something which current NLP systems still largely struggle to do.
Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10× more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora.
Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
Compared to GPT-2, GPT-3 improves performance on character-level tasks like rhyming, alliteration, punning, anagrams or permutations, acrostic poems, and arithmetic less than expected, despite being very good at many other closely-related kinds of writings like satire.
Why? A plausible explanation is an obscure technical detail: as a performance optimization, GPT does not see characters but sub-word-chunks called “byte-pair encodings” (BPEs). Because GPTs never see characters but opaque partial-words, which vary chaotically based on the specific word and even the surrounding context, they are unable to easily learn about character-level aspects of language, like similar spellings or sounds, and are forced to learn relationships much more indirectly, like by brute-force memorizing of pairs of words.
Some experiments with reformatting GPT-3’s poorest-performing tasks to avoid inconsistent BPE encodings of strings show small to large performance gains, consistent with this theory.
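The opacity of BPEs is easy to see directly with the GPT-2 tokenizer from the transformers library (the exact splits depend on the tokenizer version, so the example just prints them rather than asserting particular outputs):

```python
# Requires: pip install transformers
from transformers import GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
# The same letters can map to entirely different BPE tokens depending on
# capitalization and whether a leading space is present:
for s in ["rhyme", " rhyme", "Rhyme", "RHYME"]:
    print(repr(s), tok.tokenize(s))
# The model never sees the shared spelling, only these opaque chunks, so
# character-level regularities like rhyme or spelling must be inferred
# very indirectly (e.g. by memorizing pairs of words).
```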
The GPT-3 neural network is so large a model in terms of power and dataset that it exhibits qualitatively different behavior: you do not apply it to a fixed set of tasks which were in the training dataset, requiring retraining on additional data if one wants to handle a new task (as one would have to retrain GPT-2); instead, you interact with it, expressing any task in terms of natural language descriptions, requests, and examples, tweaking the prompt until it “understands” & it meta-learns the new task based on the high-level abstractions it learned from the pretraining.
This is a rather different way of using a DL model, and it’s better to think of it as a new kind of programming, where the prompt is now a “program” which programs GPT-3 to do new things.
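As a concrete (if toy) illustration of prompt-as-program, here is roughly what few-shot prompting looked like through the 2020-era OpenAI Python client (the modern client API differs, and the prompt contents are my own example, not one from the page):

```python
import openai  # 2020-era client; the current OpenAI API differs
openai.api_key = "sk-..."  # placeholder

# The prompt *is* the program: a task specified by instruction and example.
prompt = """English: I have no mouth and I must scream.
French: Je n'ai pas de bouche et je dois crier.
English: The cat sat on the mat.
French:"""

completion = openai.Completion.create(
    engine="davinci",   # the base GPT-3 model
    prompt=prompt,
    max_tokens=20,
    temperature=0.0,    # greedy-ish: take the most likely continuation
    stop="\n",          # stop after the translated line
)
print(completion.choices[0].text)
```

Nothing about translation was trained or hard-coded here; the two-line example alone "programs" the task, and swapping in different examples reprograms the model for a different task.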
While training a GPT-2-117M on a folk music corpus written in ABC format, an otherwise-high-quality model kept generating persistent syntax errors: random spaces would be generated, rendering a music piece either erroneous or lower-quality. Why? The cause seems to be the GPT BPE encoder’s handling of spaces, which makes it difficult to emit the right space-separated characters. We found that ABC does not actually require spaces, and simply removed all spaces from the corpus—noticeably improving the quality of generated pieces.
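A minimal sketch of the fix (the tune is a made-up example; the actual preprocessing simply removed all spaces, while this variant conservatively leaves the ABC header fields untouched):

```python
# A tiny illustrative ABC tune (hypothetical, not from the corpus):
TUNE = """X:1
T:Example Reel
M:4/4
K:G
G2 BG dG BG | c2 ec gc ec |"""

def strip_abc_spaces(abc_text: str) -> str:
    out = []
    for line in abc_text.splitlines():
        # Header lines look like "X:1", "T:Example Reel", "K:G", etc.
        if len(line) >= 2 and line[1] == ":" and line[0].isalpha():
            out.append(line)                   # keep header fields intact
        else:
            out.append(line.replace(" ", ""))  # spaces are optional in ABC bodies
    return "\n".join(out)

print(strip_abc_spaces(TUNE))
```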
Generating symbolic music with language models is a promising research area, with potential applications in automated music composition. Recent work shows that Transformer architectures can learn to generate compelling four-instrument scores from large MIDI datasets. In this paper, we re-train the small (117M) GPT-2 model with a large dataset in ABC notation, and generate samples of single-instrument folk music. Our BLEU and ROUGE based quantitative, and survey based qualitative, evaluations suggest that ABC notation is learned with syntactical and semantic correctness, and that samples contain robust and believable n-grams.
To expand the ABC GPT-2 model to cover a wider variety of musical genres, I turn to the next-most compact widespread music encoding format: MIDI. There are hundreds of thousands of MIDIs which can be decompiled to ABC format, averaging ~10k BPEs—within GPT-2-117M’s feasible context window when trained on TPUs (which permit training of context windows up to 30k wide).
We combine the earlier ABC corpus with 2 large MIDI datasets converted to ABC, yielding ~453k usable ABC-MIDI musical files (~5.1GB of text). We trained January–April 2020 on our TPU swarm (with many interruptions), achieving a final loss of ~0.2 (underfit).
Sampling from the final model is hit-or-miss, as it is prone to the likelihood repetition trap, and because it generates instruments one-by-one, it is common for instruments to be cut off or otherwise broken during sampling (indicating that sampling is increasingly a bigger problem than training for long-range sequence modeling). However, successful pieces are possible, and are musically far more diverse than the folk ABC corpus, with many pleasingly complex samples.
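A sketch of the decompilation step (I assume the standard midi2abc tool from the abcMIDI package, which prints ABC to stdout; the pipeline's actual tooling and directory layout are not specified here):

```python
# Batch-decompile MIDI files to ABC text using the abcMIDI tool `midi2abc`
# (assumed installed); directory layout is illustrative.
import subprocess
from pathlib import Path

Path("abc").mkdir(exist_ok=True)
for midi in Path("midi").glob("**/*.mid"):
    try:
        result = subprocess.run(["midi2abc", str(midi)],
                                capture_output=True, text=True,
                                timeout=60, check=True)
        (Path("abc") / (midi.stem + ".abc")).write_text(result.stdout)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        pass  # skip MIDIs that fail to decompile
```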
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset—matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples.
The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text.
These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
This work applies natural language modeling to generate plausible strategic moves in the ancient game of Go. We train the Generative Pretrained Transformer (GPT-2) to mimic the style of Go champions as archived in Smart Game Format (SGF), which offers a text description of move sequences. The trained model further generates valid but previously unseen strategies for Go. Because GPT-2 preserves punctuation and spacing, the raw output of the text generator provides inputs to game visualization and creative patterns, such as the Sabaki project’s game engine using auto-replays. Results demonstrate that language modeling can capture both the sequencing format of championship Go games and their strategic formations. Compared to random game boards, the GPT-2 fine-tuning shows efficient opening move sequences favoring corner play over less advantageous center and side play. Game generation as a language modeling task offers novel approaches to more than 40 other board games where historical text annotation provides training data (e.g., Amazons & Connect 4/6).
This work demonstrates that natural language transformers can support more generic strategic modeling, particularly for text-archived games. In addition to learning natural language skills, the abstract transformer architecture can generate meaningful moves on a chessboard. With further fine-tuning, the transformer learns complex gameplay by training on 2.8 million chess games in Portable Game Notation. After 30,000 training steps, OpenAI’s Generative Pre-trained Transformer (GPT-2) optimizes weights for 774 million parameters. This fine-tuned Chess Transformer generates plausible strategies and displays game formations identifiable as classic openings, such as English or the Slav Exchange. Finally, in live play, the novel model demonstrates a human-to-transformer interface that correctly filters illegal moves and provides a novel method to challenge the transformer’s chess strategies. We anticipate future work will build on this transformer’s promise, particularly in other strategy games where features can capture the underlying complex rule syntax from simple but expressive player annotations.
A decompiler is a computer program that takes an executable file as input and attempts to create a high-level source file which can be recompiled successfully. It is therefore the opposite of a compiler, which takes a source file and makes an executable. Decompilers are usually unable to perfectly reconstruct the original source code, and as such will frequently produce obfuscated code. Nonetheless, decompilers remain an important tool in the reverse engineering of computer software.