“What my deep model doesn’t know…” (Something of a tour de force: isomorphism between Gaussian processes and deep neural networks showing dropout is equivalent to using 1 NN to average over a whole family of similar models (explaining why dropout improves results, since we all know the advantages of ensembles) and showing how variation in NN output when wiggled by dropout gives an indication of uncertainty in predictions and from there yields useful results and stuff like usable Thompson sampling (!) in the famous deep Q reinforcement-learner for optimizing exploration and learning faster. Phew.)
This page is a changelog for Gwern.net: a monthly reverse chronological list of recent major writings/changes/additions.
Following my writing can be a little difficult because it is often so incremental. So every month, in addition to my regular /r/Gwern subreddit submissions, I write up reasonably-interesting changes and send them out to the mailing list along with a compilation of links & reviews (archives).
A subreddit for posting links of interest and also for announcing updates to gwern.net (which can be used as an RSS feed). Submissions are categorized similarly to the monthly newsletter and typically will be collated there.
Local archives are useful for personal purposes, but sometimes, in investigations that may be controversial, you want to be able to prove that the copy you downloaded was not modified and you need to timestamp it and prove the exact file existed on or before a certain date. This can be done by creating a cryptographic hash of the file and then publishing that hash to global chains like centralized digital timestampers or the decentralized Bitcoin blockchain. Current timestamping mechanisms tend to be centralized, manual, cumbersome, or cost too much to use routinely. Centralization can be overcome by timestamping to Bitcoin; costing too much can be overcome by batching up an arbitrary number of hashes and creating just 1 hash/timestamp covering them all; manual & cumbersome can be overcome by writing programs to handle all of this and incorporating them into one’s workflow. So using an efficient cryptographic timestamping service (the OriginStamp Internet service), we can write programs to automatically & easily timestamp arbitrary files & strings, timestamp every commit to a Git repository, and timestamp webpages downloaded for archival purposes. We can implement the same idea offline, without reliance on OriginStamp, but at the cost of additional software dependencies like a Bitcoin client.
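The batching idea (many hashes, 1 timestamp) can be sketched in a few lines. This is a minimal illustration, not OriginStamp's API or the scripts described above: the function names `sha256_hex` and `batch_digest` are hypothetical, and the "manifest" format (sorted hex digests joined by newlines) is an assumption chosen for simplicity.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def batch_digest(items):
    """Hash each item, then hash the sorted list of digests.

    Returns (root, digests). Only `root` would need to be published;
    the digest list must be kept locally to prove membership later.
    The manifest layout here is an assumption, not a standard.
    """
    digests = sorted(sha256_hex(item) for item in items)
    manifest = "\n".join(digests).encode("ascii")
    return sha256_hex(manifest), digests


if __name__ == "__main__":
    root, leaves = batch_digest([b"page one", b"page two", b"some commit ID"])
    print("hash to timestamp:", root)
    print("hashes covered:", len(leaves))
```

To later show that a particular file was covered, one would present the file, the saved digest list, and the published hash: anyone can recompute both levels and compare.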
Pleiotropy occurs when one gene influences two or more seemingly unrelated phenotypic traits. Such a gene that exhibits multiple phenotypic expression is called a pleiotropic gene. Mutation in a pleiotropic gene may have an effect on several traits simultaneously, due to the gene coding for a product used by a myriad of cells or different targets that have the same signaling function.
Arthur Jensen argues that the failure of recent compensatory education efforts to produce lasting effects on children's IQ and achievement suggests that the premises on which these efforts have been based should be reexamined. He begins by questioning a central notion upon which these and other educational programs have recently been based: that IQ differences are almost entirely a result of environmental differences and the cultural bias of IQ tests. After tracing the history of IQ tests, Jensen carefully defines the concept of IQ, pointing out that it appears as a common factor in all tests that have been devised thus far to tap higher mental processes. Having defined the concept of intelligence and related it to other forms of mental ability, Jensen employs an analysis of variance model to explain how IQ can be separated into genetic and environmental components. He then discusses the concept of "heritability," a statistical tool for assessing the degree to which individual differences in a trait like intelligence can be accounted for by genetic factors. He analyzes several lines of evidence which suggest that the heritability of intelligence is quite high (i.e., genetic factors are much more important than environmental factors in producing IQ differences). After arguing that environmental factors are not nearly as important in determining IQ as are genetic factors, Jensen proceeds to analyze the environmental influences which may be most critical in determining IQ. He concludes that prenatal influences may well contribute the largest environmental influence on IQ. He then discusses evidence which suggests that social class and racial variations in intelligence cannot be accounted for by differences in environment but must be attributed partially to genetic differences. 
After he has discussed the influence of the distribution of IQ in a society on its functioning, Jensen examines in detail the results of educational programs for young children, and finds that the changes in IQ produced by these programs are generally small. A basic conclusion of Jensen's discussion of the influence of environment on IQ is that environment acts as a "threshold variable." Extreme environmental deprivation can keep the child from performing up to his genetic potential, but an enriched educational program cannot push the child above that potential. Finally, Jensen examines other mental abilities that might be capitalized on in an educational program, discussing recent findings on diverse patterns of mental abilities between ethnic groups and his own studies of associative learning abilities that are independent of social class. He concludes that educational attempts to boost IQ have been misdirected and that the educational process should focus on teaching much more specific skills. He argues that this will be accomplished most effectively if educational methods are developed which are based on other mental abilities besides IQ.
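For readers unfamiliar with the term, the simplest textbook form of the variance decomposition behind "heritability" is sketched below. This is an assumption-laden simplification, not the model in the paper: it ignores gene-environment covariance, interaction, and measurement error, and the symbols are the conventional ones rather than ones taken from the abstract.

```latex
% Simplest additive decomposition of phenotypic variance (an illustrative
% simplification; fuller models add covariance, interaction, and error terms).
\begin{align}
  V_P &= V_G + V_E \\
  H^2 &= \frac{V_G}{V_P}
\end{align}
% V_P: total (phenotypic) variance of the trait in a population
% V_G: variance attributable to genetic differences
% V_E: variance attributable to environmental differences
% H^2: broad-sense heritability, a population statistic between 0 and 1
```

Note that this is a statement about variance within a particular population, not about any individual.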
Purpose: The main aim of the present study (FITFATTWIN) was to investigate how physical activity level is associated with body composition, glucose homeostasis, and brain morphology in young adult male monozygotic twin pairs discordant for physical activity.
Methods: From a population-based twin cohort, we systematically selected 10 young adult male monozygotic twin pairs (age range, 32–36 yr) discordant for leisure time physical activity during the past 3 yr. On the basis of interviews, we calculated a mean sum index for leisure time and commuting activity during the past 3 yr (3-yr LTMET index expressed as MET-hours per day). We conducted extensive measurements on body composition (including fat percentage measured by dual-energy x-ray absorptiometry), glucose homeostasis including homeostatic model assessment index and insulin sensitivity index (Matsuda index, calculated from glucose and insulin values from an oral glucose tolerance test), and whole brain magnetic resonance imaging for regional volumetric analyses.
Results: According to pairwise analysis, the active twins had lower body fat percentage (p = 0.029) and homeostatic model assessment index (p = 0.031) and higher Matsuda index (p = 0.021) compared with their inactive co-twins. Striatal and prefrontal cortex (subgyral and inferior frontal gyrus) brain gray matter volumes were larger in the nondominant hemisphere in active twins compared with those in inactive co-twins, with a statistical threshold of p < 0.001.
Conclusions: Among healthy adult male twins in their mid-30s, a greater level of physical activity is associated with improved glucose homeostasis and modulation of striatum and prefrontal cortex gray matter volume, independent of genetic background. The findings may contribute to later reduced risk of type 2 diabetes and mobility limitations.
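The two glucose-homeostasis indices named in the Methods are simple formulas. The sketch below uses the commonly cited definitions (HOMA-IR from fasting values, and the Matsuda composite index from fasting and mean OGTT values); the abstract does not say which exact variants or units the study used, so treat the function names, argument names, and units as assumptions.

```python
import math


def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """HOMA-IR, common form: glucose (mmol/L) * insulin (uU/mL) / 22.5.

    Higher values suggest more insulin resistance.
    """
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


def matsuda_index(
    fasting_glucose_mg_dl: float,
    fasting_insulin_uU_ml: float,
    mean_glucose_mg_dl: float,
    mean_insulin_uU_ml: float,
) -> float:
    """Matsuda composite index, common form.

    10000 / sqrt(fasting glucose * fasting insulin * mean OGTT glucose
    * mean OGTT insulin). Higher values suggest better insulin sensitivity.
    """
    product = (
        fasting_glucose_mg_dl
        * fasting_insulin_uU_ml
        * mean_glucose_mg_dl
        * mean_insulin_uU_ml
    )
    return 10000.0 / math.sqrt(product)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    print(homa_ir(5.0, 9.0))
    print(matsuda_index(100.0, 10.0, 100.0, 10.0))
```

The two indices point in opposite directions, which matches the Results: the active twins had a lower HOMA index and a higher Matsuda index.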
In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. every finite linear combination of them is normally distributed. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.
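The "distribution over functions" idea can be made concrete by drawing one random function at a finite set of inputs. The sketch below is illustrative only: it assumes a zero mean and a squared-exponential (RBF) kernel, adds a small `jitter` to the diagonal for numerical stability, and uses a hand-written Cholesky factorization so that it needs nothing outside the standard library. All function names are hypothetical.

```python
import math
import random


def rbf_kernel(x1: float, x2: float, length_scale: float = 1.0) -> float:
    """Squared-exponential covariance between two inputs (an assumed kernel)."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def covariance_matrix(xs, jitter: float = 1e-6):
    """Covariance of the process at the points `xs`, plus diagonal jitter."""
    n = len(xs)
    return [
        [rbf_kernel(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def cholesky(a):
    """Lower-triangular L with L * L^T == a, for a positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_gp(xs, rng: random.Random):
    """Draw one zero-mean sample of the process at `xs` as L * z, z ~ N(0, I)."""
    low = cholesky(covariance_matrix(xs))
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]


if __name__ == "__main__":
    points = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(sample_gp(points, random.Random(0)))
```

Because the values at any finite set of points are jointly normal, sampling at those points is just sampling a multivariate normal; nearby inputs get similar values because the kernel makes them highly correlated.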
Dilution is a regularization technique for reducing overfitting in artificial neural networks by preventing complex co-adaptations on training data. It is an efficient way of performing model averaging with neural networks. The term dilution refers to the thinning of the weights. The term dropout refers to randomly "dropping out", or omitting, units during the training process of a neural network. Both the thinning of weights and dropping out units trigger the same type of regularization, and often the term dropout is used when referring to the dilution of weights.
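A minimal sketch of "dropping out" units, assuming the common "inverted dropout" variant in which surviving activations are rescaled during training so that nothing needs to change at test time. The function name and signature are hypothetical and not taken from any particular library.

```python
import random


def dropout(activations, p_drop: float, rng: random.Random, training: bool = True):
    """Zero each activation with probability `p_drop` during training.

    Survivors are divided by the keep probability (inverted dropout, an
    assumed variant) so the expected value of each unit is unchanged.
    At test time the activations pass through untouched.
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)
    layer_output = [0.3, 1.2, -0.7, 2.0]
    print(dropout(layer_output, p_drop=0.5, rng=rng))
    print(dropout(layer_output, p_drop=0.5, rng=rng, training=False))
```

Each training pass sees a different random sub-network, which is the sense in which dropout behaves like averaging over many related models.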
Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
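For the simplest case, a Bernoulli bandit, "choosing the action that maximizes the expected reward with respect to a randomly drawn belief" can be sketched as follows. The uniform Beta(1, 1) prior, the function name, and the simulated payout rates are all assumptions made for illustration.

```python
import random


def thompson_bernoulli(true_rates, n_rounds: int, rng: random.Random):
    """Run Thompson sampling on a simulated Bernoulli bandit.

    Each arm's belief is a Beta(wins + 1, losses + 1) posterior (a uniform
    prior is assumed). Every round, draw one sample per arm from its
    posterior and pull the arm whose sample is largest.
    """
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(n_rounds):
        draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n_arms)]
        arm = max(range(n_arms), key=draws.__getitem__)
        if rng.random() < true_rates[arm]:  # simulated reward
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses


if __name__ == "__main__":
    wins, losses = thompson_bernoulli([0.2, 0.5, 0.8], 2000, random.Random(0))
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Arms with uncertain posteriors occasionally produce high draws and so get explored; as evidence accumulates, the draws concentrate and the best arm is pulled most of the time.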
This paper addresses the general problem of reinforcement learning (RL) in partially observable environments. In 2013, our large RL recurrent neural networks (RNNs) learned from scratch to drive simulated cars from high-dimensional video input. However, real brains are more powerful in many ways. In particular, they learn a predictive model of their initially unknown environment, and somehow use it for abstract (e.g., hierarchical) planning and reasoning. Guided by algorithmic information theory, we describe RNN-based AIs (RNNAIs) designed to do the same. Such an RNNAI can be trained on never-ending sequences of tasks, some of them provided by the user, others invented by the RNNAI itself in a curious, playful fashion, to improve its RNN-based world model. Unlike our previous model-building RNN-based RL machines dating back to 1990, the RNNAI learns to actively query its model for abstract reasoning and planning and decision making, essentially "learning to think." The basic ideas of this report can be applied to many other cases where one RNN-like system exploits the algorithmic information content of another. They are taken from a grant proposal submitted in Fall 2014, and also explain concepts such as "mirror neurons." Experimental results will be described in separate papers.
Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects, such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."
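The central quantity of AIT, the length of the shortest program that outputs a string, is uncomputable, but the size of a compressed file gives a crude, computable upper bound that conveys the intuition. The sketch below uses `zlib` as a stand-in compressor; this is an illustrative proxy and an assumption of this example, not a definition from the field.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at maximum compression level.

    A rough upper bound on the information content of `data`,
    used here only as a proxy for algorithmic complexity.
    """
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    rng = random.Random(0)
    repetitive = b"ab" * 5000  # generated by a very short rule
    noisy = bytes(rng.getrandbits(8) for _ in range(10000))  # no pattern zlib can use
    print("repetitive:", len(repetitive), "->", compressed_size(repetitive))
    print("noisy:     ", len(noisy), "->", compressed_size(noisy))
```

The repetitive string shrinks to a small fraction of its length, while the pseudo-random bytes barely shrink at all. (Strictly speaking the pseudo-random bytes do have a short description, namely the generator plus its seed, which zlib cannot discover; that gap is one reason compression is only an upper bound.)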
The magnitude and direction of reported physiological effects induced using transcranial magnetic stimulation (TMS) to modulate human motor cortical excitability have proven difficult to replicate routinely. We conducted an online survey on the prevalence and possible causes of these reproducibility issues. A total of 153 researchers were identified via their publications and invited to complete an anonymous internet-based survey that asked about their experience trying to reproduce published findings for various TMS protocols. The prevalence of questionable research practices known to contribute to low reproducibility was also determined. We received 47 completed surveys from researchers with an average of 16.4 published papers (95% CI 10.8–22.0) that used TMS to modulate motor cortical excitability. Respondents also had a mean of 4.0 (2.5–5.7) relevant completed studies that would never be published. Across a range of TMS protocols, 45–60% of respondents found similar results to those in the original publications; the other respondents were able to reproduce the original effects only sometimes or not at all. Only 20% of respondents used formal power calculations to determine study sample sizes. Others relied on previously published studies (25%), personal experience (24%) or flexible post-hoc criteria (41%). Approximately 44% of respondents knew researchers who engaged in questionable research practices (range 32–70%), yet only 18% admitted to engaging in them (range 6–38%). These practices included screening subjects to find those that respond in a desired way to a TMS protocol, selectively reporting results and rejecting data based on a gut feeling. In a sample of 56 published papers that were inspected, not a single questionable research practice was reported. Our survey revealed that approximately 50% of researchers are unable to reproduce published TMS effects. 
Researchers need to start increasing study sample size and eliminating—or at least reporting—questionable research practices in order to make the outcomes of TMS research reproducible.
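As an illustration of what a "formal power calculation" involves, here is the standard normal-approximation formula for the sample size of a two-group, two-sided comparison of means. It is a generic sketch rather than anything taken from the survey: many TMS studies use within-subject designs that call for a different formula, and the function name and defaults are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per group for a two-sample comparison.

    Uses n = 2 * ((z_{1 - alpha/2} + z_{power}) / d)^2, where d is the
    standardized effect size (Cohen's d). Normal approximation; an exact
    t-test calculation gives slightly larger numbers.
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} per group")
```

The required sample grows with the inverse square of the effect size, so halving the expected effect roughly quadruples the number of subjects needed.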
Fluid intelligence involves novel problem-solving and may be susceptible to poor sleep. This study examined relationships between adolescent sleep, fluid intelligence, and academic achievement. Participants were 217 adolescents (42% male) aged 13 to 18 years (mean age, 14.9 years; SD=1.0) in grades 9–11. Fluid intelligence was predicted to mediate the relationship between adolescent sleep and academic achievement. Students completed online questionnaires of self-reported sleep, fluid intelligence (Letter Sets and Number Series), and self-reported grades. Total sleep time was not significantly related to fluid intelligence nor academic achievement (both p > 0.05); however, sleep difficulty (e.g. difficulty initiating sleep, unrefreshing sleep) was related to both (p < 0.05). The strength of the relationship between sleep difficulty and grades was reduced when fluid intelligence was introduced into the model; however, the z-score was not significant to confirm mediation. Nevertheless, fluid intelligence is a cognitive ability integral in academic achievement, and in this study has been shown to be susceptible to sleep impairments (but not duration) in adolescents.
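The abstract does not say which test produced the mediation z-score. One common choice is the Sobel test, sketched below as an assumption: `a` is the estimated effect of the predictor (here, sleep difficulty) on the mediator (fluid intelligence), `b` is the effect of the mediator on the outcome (grades) controlling for the predictor, and `se_a`, `se_b` are their standard errors. The example numbers are invented.

```python
import math
from statistics import NormalDist


def sobel_test(a: float, se_a: float, b: float, se_b: float):
    """Sobel z-score and two-sided p-value for the indirect effect a * b.

    z = (a * b) / sqrt(b^2 * se_a^2 + a^2 * se_b^2)
    """
    z = (a * b) / math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


if __name__ == "__main__":
    # Invented path estimates, not values from the study.
    z, p = sobel_test(a=0.1, se_a=0.1, b=0.1, se_b=0.1)
    print(f"z = {z:.3f}, p = {p:.3f}")
```

A result like the one in the abstract, where the direct relationship weakens once the mediator is added but the z-score is not significant, corresponds to an indirect effect `a * b` that is small relative to its standard error.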
Archaeoacoustics is the use of acoustical study as a methodological approach within the field of archaeology. Archaeoacoustics examines the acoustics of archaeological sites and artifacts. It is an interdisciplinary field that includes archaeology, ethnomusicology, acoustics and digital modelling, and is part of the wider field of music archaeology, with a particular interest in prehistoric music. Since many cultures explored through archaeology were focused on the oral and therefore the aural, researchers believe that studying the sonic nature of archaeological sites and artifacts may reveal new information on the civilizations scrutinized.
Borges considers the problem of whether Argentinian writing on non-Argentinian subjects can still be truly "Argentine." His conclusion: ...We should not be alarmed and that we should feel that our patrimony is the universe; we should essay all themes, and we cannot limit ourselves to purely Argentine subjects in order to be Argentine; for either being Argentine is an inescapable act of fate—and in that case we shall be so in all events—or being Argentine is a mere affectation, a mask. I believe that if we surrender ourselves to that voluntary dream which is artistic creation, we shall be Argentine and we shall also be good or tolerable writers.
Star Wars: The Force Awakens is a 2015 American epic space opera film produced, co-written and directed by J. J. Abrams. It is the first installment in the Star Wars sequel trilogy, following the story of Return of the Jedi (1983), and is the seventh episode of the nine-part "Skywalker saga". The film was produced by Lucasfilm and Abrams' production company Bad Robot Productions, and was distributed by Walt Disney Studios Motion Pictures. The film's ensemble cast includes Harrison Ford, Mark Hamill, Carrie Fisher, Adam Driver, Daisy Ridley, John Boyega, Oscar Isaac, Lupita Nyong'o, Andy Serkis, Domhnall Gleeson, Anthony Daniels, Peter Mayhew, and Max von Sydow. Set thirty years after Return of the Jedi, The Force Awakens follows Rey, Finn, Poe Dameron, and Han Solo's search for Luke Skywalker and their fight in the Resistance, led by General Leia Organa and veterans of the Rebel Alliance, against Kylo Ren and the First Order, a successor to the Galactic Empire.
How the Grinch Stole Christmas! is a 1966 animated television special, directed and co-produced by Chuck Jones. It is based on the 1957 children's book of the same name by Dr. Seuss, and tells the story of the Grinch, who tries to ruin Christmas for the townsfolk of Whoville below his mountain hideaway. Originally telecast in the United States on CBS on December 18, 1966, it went on to become a perennial holiday special. The special also features the voice of Boris Karloff as the Grinch and the narrator.
Big Hero 6 is a 2014 American 3D computer animated superhero film produced by Walt Disney Animation Studios and released by Walt Disney Pictures. Loosely based on the superhero team of the same name by Marvel Comics, the film is the 53rd Disney animated feature film. Directed by Don Hall and Chris Williams, the film tells the story of Hiro Hamada, a young robotics prodigy, and Baymax, his late brother's healthcare provider robot, who form a superhero team to combat a masked villain. The film features the voices of Scott Adsit, Ryan Potter, Daniel Henney, T.J. Miller, Jamie Chung, Damon Wayans Jr., Genesis Rodriguez, Alan Tudyk, James Cromwell, and Maya Rudolph.
Brave is a 2012 American computer-animated fantasy film produced by Pixar Animation Studios and released by Walt Disney Pictures. It was directed by Mark Andrews and Brenda Chapman and co-directed by Steve Purcell. The story is by Chapman, with the screenplay by Andrews, Purcell, Chapman and Irene Mecchi. The film was produced by Katherine Sarafian, with John Lasseter, Andrew Stanton, and Pete Docter as executive producers. The film's voice cast features Kelly Macdonald, Billy Connolly, Emma Thompson, Julie Walters, Robbie Coltrane, Kevin McKidd, and Craig Ferguson. Set in the Scottish Highlands, the film tells the story of a princess named Merida who defies an age-old custom, causing chaos in the kingdom by expressing the desire not to be betrothed. When her mother falls victim to a beastly curse and turns into a bear, Merida must look within herself and find the key to saving the kingdom. Merida is the first Disney Princess created by Pixar. The film is also dedicated to Steve Jobs, who died before the film's release.
A Charlie Brown Christmas is a 1965 animated television special, and is the first TV special based on the comic strip Peanuts, by Charles M. Schulz. Produced by Lee Mendelson and directed by Bill Melendez, the program made its debut on CBS on December 9, 1965. In this special, Charlie Brown finds himself depressed despite the onset of the cheerful holiday season. Lucy suggests he direct a neighborhood Christmas play, but his best efforts are ignored and mocked by his peers. After Linus tells Charlie Brown about the true meaning of Christmas, Charlie Brown cheers up, and the Peanuts gang unites to celebrate the Christmas season.
Subscription page for the monthly gwern.net newsletter. There are monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews. You can also browse the archives since December 2013.
Newsletter tag: archive of all issues back to 2013 for the gwern.net newsletter (monthly updates, which will include summaries of projects I’ve worked on that month (the same as the changelog), collations of links or discussions from my subreddit, and book/movie reviews).