heritability/​rare-variants directory


“Polygenic Risk Score As a Possible Tool for Identifying Familial Monogenic Causes of Complex Diseases”, Lu et al 2022

“Polygenic risk score as a possible tool for identifying familial monogenic causes of complex diseases”⁠, Tianyuan Lu, Vincenzo Forgetta, John Brent Richards, Celia M. T. Greenwood (2022-04-23):

Purpose: The study aimed to evaluate whether polygenic risk scores could be helpful in addition to family history for triaging individuals to undergo deep-depth diagnostic sequencing for identifying monogenic causes of complex diseases.

Methods: Among 44,550 exome-sequenced European ancestry UK Biobank participants, we identified individuals with a clinically reported or computationally predicted monogenic pathogenic variant for breast cancer, bowel cancer, heart disease, diabetes, or Alzheimer disease. We derived polygenic risk scores for these diseases. We tested whether a polygenic risk score could identify rare pathogenic variant heterozygotes among individuals with a parental disease history.

Results: Monogenic causes of complex diseases were more prevalent among individuals with a parental disease history than in the rest of the population. Polygenic risk scores showed moderate discriminative power to identify familial monogenic causes. For instance, we showed that prescreening the patients with a polygenic risk score for type 2 diabetes can prioritize individuals to undergo diagnostic sequencing for monogenic diabetes variants and reduce needs for such sequencing by up to 37%.

Conclusion: Among individuals with a family history of complex diseases, those with a low polygenic risk score are more likely to have monogenic causes of the disease and could be prioritized to undergo genetic testing.

[Keywords: complex traits and diseases, family history, genome-wide genotyping, polygenic risk score, rare variant screening]
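
The triage rule described in the conclusion can be sketched in a few lines. This is an illustration only, not the authors' pipeline: the function name `triage_by_prs`, the cutoff, and the toy cohort are all hypothetical. It assumes that, among people with a parental disease history, those whose polygenic score falls below a chosen cutoff are sent for sequencing, and it reports how much sequencing is avoided and what fraction of true carriers would still be sequenced.

```python
def triage_by_prs(scores, is_carrier, cutoff):
    """Sequence only individuals whose polygenic score is below `cutoff`.

    scores     -- polygenic risk scores (hypothetical values)
    is_carrier -- parallel list of booleans: carries a monogenic variant
    Returns (fraction of sequencing avoided, fraction of carriers retained).
    """
    if len(scores) != len(is_carrier):
        raise ValueError("scores and is_carrier must have the same length")
    selected = [s < cutoff for s in scores]
    n_total = len(scores)
    n_carriers = sum(is_carrier)
    n_selected = sum(selected)
    carriers_selected = sum(1 for sel, c in zip(selected, is_carrier) if sel and c)
    avoided = 1 - n_selected / n_total
    retained = carriers_selected / n_carriers if n_carriers else float("nan")
    return avoided, retained


# Toy cohort of 8 people with a parental disease history (made-up numbers).
scores = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.9, 1.3, 1.8]
is_carrier = [True, False, True, False, False, False, False, False]
avoided, retained = triage_by_prs(scores, is_carrier, cutoff=0.0)
print(f"sequencing avoided: {avoided:.1%}, carriers retained: {retained:.1%}")
```

In practice the cutoff would trade off the two returned quantities; the toy numbers here say nothing about the values reported in the paper.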

“The Contributions of Rare Inherited and Polygenic Risk to ASD in Multiplex Families”, Chang et al 2022

“The Contributions of Rare Inherited and Polygenic Risk to ASD in Multiplex Families”, Timothy S. Chang, Matilde Cirnigliaro, Stephanie A. Arteaga, Laura Pérez-Cano, Elizabeth K. Ruzzo, Aaron Gordon et al (2022-04-16):

Autism Spectrum Disorder (ASD) has a complex genetic architecture involving contributions from de novo and inherited variation. Few studies have been designed to address the role of rare inherited variation, or its interaction with polygenic risk in ASD.

Here, we performed whole genome sequencing of the largest cohort of multiplex families to date, consisting of 4,551 individuals in 1,004 families having 2 or more affected children with ASD.

Using this study design, we identify 7 novel risk genes supported primarily by rare inherited variation, finding support for a total of 74 genes in our cohort and a total of 152 genes after combining with other studies. Probands demonstrated an increased burden of mutations in 2 or more known risk genes (KARGs)—in 3 families both probands inherited protein-truncating variants in two KARGs. We also find that polygenic risk is over-transmitted from unaffected parents to affected children with rare inherited variants, consistent with combinatorial effects in the offspring, which may explain the reduced penetrance of these rare variants in parents. We also observe that in addition to social dysfunction, language delay is associated with ASD polygenic risk over-transmission.

These results are consistent with an additive complex genetic risk architecture of ASD involving rare and common variation and further suggest that language delay is a core biological feature of ASD.

“Integrating Whole-genome Sequencing With Multi-omic Data Reveals the Impact of Structural Variants on Gene Regulation in the Human Brain”, Vialle et al 2022

2022-vialle.pdf: “Integrating whole-genome sequencing with multi-omic data reveals the impact of structural variants on gene regulation in the human brain”, Ricardo A. Vialle, Katia Paiva Lopes, David A. Bennett, John F. Crary, Towfique Raj (2022-03-13):

Structural variants (SVs), which are genomic rearrangements of more than 50 base pairs, are an important source of genetic diversity and have been linked to many diseases. However, it remains unclear how they modulate human brain function and disease risk.

Here we report 170,996 SVs discovered using 1,760 short-read whole genomes from aged adults and individuals with Alzheimer’s disease. By applying quantitative trait locus (SV-xQTL) analyses, we quantified the impact of cis-acting SVs on histone modifications, gene expression, splicing and protein abundance in postmortem brain tissues.

More than 3,200 SVs were associated with at least one molecular phenotype. We found reproducibility of 65–99% SV-eQTLs across cohorts and brain regions. SV associations with mRNA and proteins shared the same direction of effect in more than 87% of SV-gene pairs. Mediation analysis showed ~8% of SV-eQTLs mediated by histone acetylation and ~11% by splicing. Additionally, associations of SVs with progressive supranuclear palsy identified previously known and novel SVs.

“Characterization of Arabian Peninsula Whole Exomes: Exploring High Inbreeding Features”, Ferreira et al 2022

“Characterization of Arabian Peninsula whole exomes: exploring high inbreeding features”, Joana C. Ferreira, Farida Alshamali, Luisa Pereira, Veronica Fernandes (2022-02-22):

The exome (WES) capture enriched for UTRs on 90 Arabian Peninsula (AP) populations contributed nearly 20,000 new variants out of a total of over 145,000 variants. Almost half of these variants were in UTR3, reflecting the low effort we have dedicated to cataloguing these regions, which can bear an important proportion of functional variants, as is being discovered in genome-wide association studies.

By applying several pathogenicity-prediction tools, we have demonstrated the high burden of potentially deleterious variants (especially nonsynonymous and UTR variants located in genes that have been associated mainly with neurologic disease and congenital malformations) contained in AP WES, and the burden rose as the consanguinity level (inferred as the sum of runs of homozygosity, SROH) increased. Arabians had SROH values twice those of Europeans and East Asians, and within AP, Saudi Arabia had the highest values and Oman the lowest.

We must pursue cataloguing diversity in populations with high consanguinity, as the potentially pathogenic variants are not eliminated by genetic drift as much as in less consanguineous populations.

“Genome-wide Analyses of ADHD Identify 27 Risk Loci, Refine the Genetic Architecture and Implicate Several Cognitive Domains”, Demontis et al 2022

“Genome-wide analyses of ADHD identify 27 risk loci, refine the genetic architecture and implicate several cognitive domains”, Ditte Demontis, Bragi Walters, Georgios Athanasiadis, Raymond Walters, Karen Therrien, Leila Farajzadeh et al (2022-02-16):

Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood psychiatric disorder, with a major genetic component. Here we present a GWAS meta-analysis of ADHD comprising 38,691 individuals with ADHD and 186,843 controls. We identified 27 genome-wide statistically-significant loci, which is more than twice the number previously reported. Fine-mapping risk loci highlighted 76 potential risk genes enriched in genes expressed in brain, particularly the frontal cortex, and in early brain development. Overall, ADHD was associated with several brain specific neuronal sub-types and especially midbrain dopaminergic neurons. In a subsample of 17,896 exome-sequenced individuals, we identified increased load of rare protein-truncating variants in cases for a set of risk genes enriched with likely causal common variants, suggesting implication of SORCS3 in ADHD by both common and rare variants. We found ADHD to be highly polygenic, with around seven thousand variants explaining 90% of the SNP heritability. Bivariate Gaussian mixture modeling estimated that more than 90% of ADHD influencing variants are shared with other psychiatric disorders (autism, schizophrenia and depression) and phenotypes (eg. educational attainment) when both concordant and discordant variants are considered. Additionally, we demonstrated that common variant ADHD risk was associated with impaired complex cognition such as verbal reasoning and a range of executive functions including attention.

“Reduced Reproductive Success Is Associated With Selective Constraint on Human Genes”, Gardner et al 2022

“Reduced reproductive success is associated with selective constraint on human genes”, Eugene J. Gardner, Matthew D. C. Neville, Kaitlin E. Samocha, Kieron Barclay, Martin Kolk, Mari E. K. Niemi et al (2022-02-03):

Genome-wide sequencing of human populations has revealed substantial variation among genes in the intensity of purifying selection acting on damaging genetic variants¹. While genes under the strongest selective constraint are highly enriched for associations with Mendelian disorders, most of these genes are not associated with disease and therefore the nature of the selection acting on them is not known².

Here we show that genetic variants that damage these genes are associated with markedly reduced reproductive success, primarily due to increased childlessness, with a stronger effect in males than in females. We present evidence that increased childlessness is likely mediated by genetically associated cognitive and behavioural traits, which may mean male carriers are less likely to find reproductive partners.

This reduction in reproductive success may account for 20% of purifying selection against heterozygous variants that ablate protein-coding genes. While this genetic association could only account for a very minor fraction of the overall likelihood of being childless (less than 1%), especially when compared to more influential sociodemographic factors, it may influence how genes evolve over time.

“Genetic Risk Factors Have a Substantial Impact on Healthy Life Years”, Jukarainen et al 2022

“Genetic risk factors have a substantial impact on healthy life years”, Sakari Jukarainen, Tuomo Kiiskinen, Aki S. Havulinna, Juha Karjalainen, Mattia Cordioli, Joel T. Rämö et al (2022-01-28):

The impact of genetic variation on overall disease burden has not been comprehensively evaluated. Here we introduce an approach to estimate the effect of different types of genetic risk factors on disease burden quantified through disability-adjusted life years (DALYs, “lost healthy life years”). We use genetic information from 735,748 individuals with registry-based follow-up of up to 48 years. At the individual level, rare variants had higher effects on DALYs than common variants, while common variants were more relevant for population-level disease burden. Among common variants, rs3798220 (LPA) had the strongest effect, with 1.18 DALYs attributable to carrying 1 vs 0 copies of the minor allele. Belonging to top 10% vs bottom 90% of a polygenic score for multisite chronic pain had an effect of 3.63 DALYs. Carrying a deleterious rare variant in LDLR, MYBPC3, or BRCA1/​2 had an effect of around 4.1–13.1 DALYs. The population-level disease burden attributable to some common variants is comparable to the burden from modifiable risk factors such as high sodium intake and low physical activity. Genetic risk factors can explain a sizeable number of healthy life years lost both at the individual and population level, highlighting the importance of incorporating genetic information into public health efforts.

“From Variant to Function in Human Disease Genetics”, Lappalainen & MacArthur 2022

2021-lappalainen.pdf: “From variant to function in human disease genetics”, Tuuli Lappalainen, Daniel G. MacArthur (2022-01-21):

Over the next decade, the primary challenge in human genetics will be to understand the biological mechanisms by which genetic variants influence phenotypes, including disease risk. Although the scale of this challenge is daunting, better methods for functional variant interpretation will have transformative consequences for disease diagnosis, risk prediction, and the development of new therapies. An array of new methods for characterizing variant impact at scale, using patient tissue samples as well as in vitro models, are already being applied to dissect variant mechanisms across a range of human cell types and environments. These approaches are also increasingly being deployed in clinical settings. We discuss the rationale, approaches, applications, and future outlook for characterizing the molecular and cellular effects of genetic variants.

“Life Histories of Myeloproliferative Neoplasms Inferred from Phylogenies”, Williams et al 2022

2022-williams.pdf: “Life histories of myeloproliferative neoplasms inferred from phylogenies”, Nicholas Williams, Joe Lee, Emily Mitchell, Luiza Moore, E. Joanna Baxter, James Hewinson, Kevin J. Dawson et al (2022-01-20):

Mutations in cancer-associated genes drive tumour outgrowth, but our knowledge of the timing of driver mutations and subsequent clonal dynamics is limited.

Here, using whole-genome sequencing of 1,013 clonal haematopoietic colonies from 12 patients with myeloproliferative neoplasms⁠, we identified 580,133 somatic mutations to reconstruct haematopoietic phylogenies and determine clonal histories.

Driver mutations were estimated to occur early in life, including the in utero period. JAK2V617F was estimated to have been acquired by 33 weeks of gestation to 10.8 years of age in 5 patients in whom JAK2V617F was the first event. DNMT3A mutations were acquired by 8 weeks of gestation to 7.6 years of age in 4 patients, and a PPM1D mutation was acquired by 5.8 years of age. Additional genomic events occurred before or following JAK2V617F acquisition and as independent clonal expansions⁠. Sequential driver mutation acquisition was separated by decades across life, often outcompeting ancestral clones. The mean latency between JAK2V617F acquisition and diagnosis was 30 years (range 11–54 years). Estimated historical rates of clonal expansion varied substantially (3%–190% per year), increased with additional driver mutations, and predicted latency to diagnosis.

Our study suggests that early driver mutation acquisition and life-long growth and evolution underlie adult myeloproliferative neoplasms, raising opportunities for earlier intervention and a new model for cancer development.

[Derek Lowe:

When someone is diagnosed with cancer, it’s a natural response for them to wonder what they did wrong, and how they could have avoided it…It’s clear that there are mutations in the stem cells (“driver mutations” that lead to a cancer phenotype), and for many years it appeared that these might occur late in life and not that long before diagnosis. The studies of increased leukemia risk in survivors of the Hiroshima and Nagasaki atomic bombings originally supported this view, but long-term follow-up (see that link) shows a complex situation with regard to radiation exposure, age at the time of the bombings, time elapsed since 1945, and the type of leukemia that developed. And it’s long been known that people who do not show signs of actual leukemia can harbor one or more of these driver mutations. Some of these people do indeed go on to develop MPNs, which suggests that there might be a longer “multi-hit” process that could go on for many years.

This work supports that idea. The team studies 12 MPN patients, whose tissue samples provided over a thousand different clones of malignant blood cells. Sequencing these turned up over 580,000 mutations (!), and the paper puts these into a phylogenetic framework to reconstruct the sequence of what the key mutations were and when they might have taken place. Using rates of mutation as a clock, some of them appear to go back even to before birth—the key JAK2V617F mutation, long associated with these malignancies, is estimated to have shown up anywhere from the 33rd week of gestation up to the age of 11. The DNMT3 mutation, similarly, seems to have appeared from the 8th week of gestation (!) out to about the age of 8. Additional driver mutations layer on top of these early events over the years to come—the mean latency between the JAK2 mutation and diagnosis of cancer, for example, was about 30 years.

…But in all cases, it seems clear that it takes many years for MPNs to develop—the diagnoses that are made in the clinic are capturing the end result of what is often a decades-long process of accumulated mutations and clonal expansion. This suggests that targeting therapies towards these mutated cells earlier in life, before the patients involved even have cancer at all, could be a really useful strategy. And it also suggests that this framework doesn’t apply only to blood cancers, either (it’s just easier to prove there).]

[cf. “Inequality in genetic cancer risk suggests bad genes rather than bad luck”⁠, Stensrud & Valberg 2017/​Valberg et al 2017]

“Rare Genetic Variants Correlate With Better Processing Speed”, Song et al 2022

“Rare Genetic Variants Correlate with Better Processing Speed”, Zeyuan Song, Anastasia Gurinovich, Marianne Nygaard, Jonas Mengel-From, Stacy Andersen, Stephanie Cosentino et al (2022-01-12):

We conducted a genome-wide association study (GWAS) of Digit Symbol Substitution Test (DSST) scores administered in 4,207 family members of the Long Life Family Study (LLFS). Genotype data were imputed to the HRC panel of 64,940 haplotypes resulting in ~15M genetic variants with quality score > 0.7. The results were replicated using genetic data imputed to the 1000 Genomes phase 3 reference panel from two Danish twin cohorts: the study of Middle Aged Danish Twins and the Longitudinal Study of Aging Danish Twins. The GWAS in LLFS discovered 20 rare genetic variants (minor allele frequency (MAF) < 1.0%) that reached genome-wide statistical-significance (p-value < 5×10⁻⁸). Among these, 18 variants had large protective effects on processing speed, including rs7623455, rs9821776, rs9821587, and rs78704059 on chromosome 3, which were replicated in the combined Danish twin cohort. These SNPs are located in/near two genes, THRB and RARB, which belong to the thyroid hormone receptor family, which may influence speed of metabolism and cognitive aging. The gene-level tests in LLFS confirmed that these two genes are associated with processing speed.

“Ultra-Rapid Nanopore Genome Sequencing in a Critical Care Setting”, Gorzynski et al 2022

2022-gorzynski.pdf: “Ultra-Rapid Nanopore Genome Sequencing in a Critical Care Setting”, John E. Gorzynski, Sneha D. Goenka, Kishwar Shafin, Tanner D. Jensen, Dianna G. Fisk, Megan E. Grove et al (2022-01-12):

[Using long read whole genome sequencing, we have broken the record for making the fastest genetic diagnosis—multiple times. Our fastest: 7hrs18min.

The new method published today has the potential to revolutionize diagnosing critically ill patients.

Our team…aimed to make fast/accurate genetic diagnoses using nanopore WGS optimized sample prep and loading 48 Oxford PromethION flow cells; created a pipeline to transfer data to the cloud, base call, and align in real time; optimized PEPPER-Margin-DeepVariant to quickly call variants; and the rest of the curation team customized a variant filtration schema that was not only fast, but reduced the list of variants for manual curation substantially, while still maintaining sensitivity…In some cases our average sequencing rate exceeded 1.8gb/min (a 1× genome in 1min45sec), an unprecedented speed! One case was sequenced so fast that we set a Guinness World Record for the fastest DNA sequencing technique.

We then recruited 12 critically ill patients and sequenced their genomes to ~50×. The patients ranged in age from 3 months to 57 years and had clinical presentations including neurological/seizure disorders, sudden cardiac arrests, and severe heart failure. In 5 cases we identified genetic variants (SNPs and INDELs) in genes such as RYR2, TNNT2, PCDH19, and CSNK2B that explained the patient’s clinical signs. These findings led to a definitive genetic diagnosis.

As a result, these patients received precision care weeks earlier than had they had standard genetic testing. Treatments included surgical interventions, a heart transplant, changes to their medicines, and family screening.]
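
The throughput figure quoted above can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the function name `seconds_for_coverage` is made up, and the genome size of 3.1 gigabases is an assumed approximate value for a haploid human genome, not a number taken from the thread.

```python
def seconds_for_coverage(rate_gb_per_min, coverage=1.0, genome_gb=3.1):
    """Seconds needed to generate `coverage`x of a genome at a constant rate.

    genome_gb=3.1 is an assumed approximate haploid human genome size in
    gigabases; it is not a figure taken from the text above.
    """
    if rate_gb_per_min <= 0:
        raise ValueError("rate must be positive")
    return coverage * genome_gb / rate_gb_per_min * 60


one_x = seconds_for_coverage(1.8)
print(f"1x genome: {int(one_x // 60)} min {one_x % 60:.0f} s")
```

Under that assumed genome size, a rate of 1.8 gigabases per minute gives a time in the same range as the quoted 1min45sec; the exact figure depends on the genome size used.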

“Rare Schizophrenia Risk Variant Burden Is Conserved in Diverse Human Populations”, Liu et al 2022

“Rare schizophrenia risk variant burden is conserved in diverse human populations”, Dongjing Liu, Dara Meyer, Brian Fennessy, Claudia Feng, Esther Cheng, Jessica S. Johnson, You Jeong Park et al (2022-01-03):

Schizophrenia is a chronic mental illness that is amongst the most debilitating conditions encountered in medical practice. A recent landmark schizophrenia study of the protein-coding regions of the genome identified a causal role for ten genes and a concentration of rare variant signals in evolutionarily constrained genes¹. This study, like most other large-scale human genetic studies, was mainly composed of individuals of European ancestry, and the generalizability of the findings in non-European populations is unclear. To address this gap in knowledge, we designed a custom sequencing panel based on current knowledge of the genetic architecture of schizophrenia and applied it to a new cohort of 22,135 individuals of diverse ancestries. Replicating earlier work, cases carried a significantly higher burden of rare protein-truncating variants among constrained genes (OR = 1.48, p-value = 5.4 × 10⁻⁶). In meta-analyses with existing schizophrenia datasets totaling up to 35,828 cases and 107,877 controls, this excess burden was largely consistent across five continental populations. Two genes (SRRM2 and AKAP11) were newly implicated as schizophrenia risk genes, and one gene (PCLO) was identified as a shared risk gene for schizophrenia and autism. Overall, our results lend robust support to the rare allelic spectrum of the genetic architecture of schizophrenia being conserved across diverse human populations.

“Schizophrenia-associated Somatic Copy Number Variants from 12,834 Cases Reveal Contribution to Risk and Recurrent, Isoform-specific NRXN1 Disruptions”, Maury et al 2022

“Schizophrenia-associated somatic copy number variants from 12,834 cases reveal contribution to risk and recurrent, isoform-specific NRXN1 disruptions”, Eduardo A. Maury, Maxwell A. Sherman, Giulio Genovese, Thomas G. Gilgenast, Prashanth Rajarajan, Erin Flaherty et al (2022):

While inherited and de novo copy number variants (CNV) have been implicated in the genetic architecture of schizophrenia (SCZ), the contribution of somatic CNVs (sCNVs), present in some but not all cells of the body, remains unknown.

Here we explore the role of sCNVs in SCZ by analyzing blood-derived genotype arrays from 12,834 SCZ cases and 11,648 controls.

sCNVs were more common in cases (0.91%) than in controls (0.51%, p = 2.68e-4). We observed recurrent somatic deletions of exons 1–5 of the NRXN1 gene in 5 SCZ cases. Allele-specific Hi-C maps revealed ectopic, allele-specific loops forming between a potential novel cryptic promoter and non-coding cis regulatory elements upon deletions in the 5′ region of NRXN1. We also observed recurrent intragenic deletions of ABCB11, a gene associated with anti-psychotic response, in 5 treatment-resistant SCZ cases.

Taken together our results indicate an important role of sCNVs to SCZ risk and treatment-responsiveness.

“High-impact Rare Genetic Variants in Severe Schizophrenia”, Zoghbi et al 2021

“High-impact rare genetic variants in severe schizophrenia”, Anthony W. Zoghbi, Ryan S. Dhindsa, Terry E. Goldberg, Aydan Mehralizade, Joshua E. Motelow, Xinchen Wang et al (2021-12-21):

In this study, we found that selecting individuals with extremely severe forms of schizophrenia led to a substantially improved ability to detect disease-associated rare variants. The high prevalence of rare variant risk factors in individuals with severe, extremely treatment-resistant schizophrenia suggests future clinical opportunities for risk prediction, prognostic stratification, and genetic counseling. These findings have implications for the design of future genetic studies in schizophrenia and highlight a strategy to reduce phenotypic heterogeneity and improve gene discovery efforts in other neuropsychiatric disorders.

Extreme phenotype sequencing has led to the identification of high-impact rare genetic variants for many complex disorders but has not been applied to studies of severe schizophrenia.

We sequenced 112 individuals with severe, extremely treatment-resistant schizophrenia, 218 individuals with typical schizophrenia, and 4,929 controls. We compared the burden of rare, damaging missense and loss-of-function variants between severe, extremely treatment-resistant schizophrenia, typical schizophrenia, and controls across mutation intolerant genes.

Individuals with severe, extremely treatment-resistant schizophrenia had a high burden of rare loss-of-function (odds ratio, 1.91; 95% CI, 1.39 to 2.63; p = 7.8 × 10⁻⁵) and damaging missense variants in intolerant genes (odds ratio, 2.90; 95% CI, 2.02 to 4.15; p = 3.2 × 10⁻⁹). A total of 48.2% of individuals with severe, extremely treatment-resistant schizophrenia carried at least one rare, damaging missense or loss-of-function variant in intolerant genes compared to 29.8% of typical schizophrenia individuals (odds ratio, 2.18; 95% CI, 1.33 to 3.60; p = 1.6 × 10⁻³) and 25.4% of controls (odds ratio, 2.74; 95% CI, 1.85 to 4.06; p = 2.9 × 10⁻⁷). Restricting to genes previously associated with schizophrenia risk strengthened the enrichment with 8.9% of individuals with severe, extremely treatment-resistant schizophrenia carrying a damaging missense or loss-of-function variant compared to 2.3% of typical schizophrenia (odds ratio, 5.48; 95% CI, 1.52 to 19.74; p = 0.02) and 1.6% of controls (odds ratio, 5.82; 95% CI, 3.00 to 11.28; p = 2.6 × 10⁻⁸).

These results demonstrate the power of extreme phenotype case selection in psychiatric genetics and an approach to augment schizophrenia gene discovery efforts.

[Keywords: schizophrenia, genomics, rare variants, treatment-resistant schizophrenia]
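
As a rough check on how the carrier percentages relate to the reported odds ratios, the sketch below computes an unadjusted odds ratio with a Woolf (log-normal) 95% interval from a 2×2 table. Everything here is an assumption for illustration: the function name `odds_ratio_ci` is made up, the counts are reconstructed from the rounded percentages and group sizes quoted in the abstract, and the abstract does not say which test the authors used, so the result should be expected to land near, not exactly on, the published figures.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-normal) 95% CI from a 2x2 table.

    a, b -- carriers / non-carriers in the case group
    c, d -- carriers / non-carriers in the comparison group
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    or_ = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


# Counts reconstructed from the reported percentages (an assumption):
# 48.2% of 112 severe cases ~ 54 carriers; 25.4% of 4,929 controls ~ 1,252.
or_, lo, hi = odds_ratio_ci(54, 112 - 54, 1252, 4929 - 1252)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Any remaining gap between this estimate and the reported odds ratio of 2.74 (95% CI, 1.85 to 4.06) is consistent with rounding of the percentages or a different estimation method.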

“The Origins and Functional Effects of Postzygotic Mutations throughout the Human Lifespan”, Rockweiler et al 2021

“The origins and functional effects of postzygotic mutations throughout the human lifespan”, Nicole B. Rockweiler, Avinash Ramu, Liina Nagirnaja, Wing H. Wong, Michiel J. Noordam, Casey W. Drubin et al (2021-12-21):

Postzygotic mutations (PZMs) begin to accrue in the human genome immediately after fertilization, but how and when PZMs affect development and lifetime health remains unclear. To study the origins and functional consequences of PZMs, we generated a multi-tissue atlas of PZMs from 948 donors using the final major release of the Genotype-Tissue Expression (GTEx) project. Nearly half the variation in mutation burden among tissue samples can be explained by measured technical and biological effects, while 9% can be attributed to donor-specific effects. Through phylogenetic reconstruction of PZMs, we find that their type and predicted functional impact varies during prenatal development, across tissues, and the germ cell lifecycle. Remarkably, a class of prenatal mutations was predicted to be more deleterious than any other category of genetic variation investigated and under positive selection as strong as somatic mutations in cancers. In total, the data indicate that PZMs can contribute to phenotypic variation throughout the human lifespan, and, to better understand the relationship between genotype and phenotype, we must broaden the long-held assumption of one genome per individual to multiple, dynamic genomes per individual.

“Familial Risk and Heritability of Intellectual Disability: a Population-based Cohort Study in Sweden”, Lichtenstein et al 2021

“Familial risk and heritability of intellectual disability: a population-based cohort study in Sweden”, Paul Lichtenstein, Magnus Tideman, Patrick F. Sullivan, Eva Serlachius, Henrik Larsson, Ralf Kuja-Halkola et al (2021-12-18):

Background: Intellectual disability (ID) aggregates in families, but factors affecting individual risk and heritability estimates remain unknown.

Methods: A population-based family cohort study of 4,165,785 individuals born 1973–2013 in Sweden, including 37,787 ID individuals and their relatives. The relative risks (RR) of ID with 95% confidence intervals (95% CI) were obtained from stratified Cox proportional-hazards models⁠. Relatives of ID individuals were compared to relatives of unaffected individuals. Structural equation modeling was used to estimate heritability.

Results: Relatives of ID individuals were at increased risk of ID compared to individuals with unaffected relatives. The RR of ID among relatives increased proportionally to the degree of genetic relatedness with ID probands: 256.70 (95% CI 161.30–408.53) for monozygotic twins, 16.47 (13.32–20.38) for parents, 14.88 (12.19–18.16) for children, 7.04 (4.67–10.61) for dizygotic twins, 8.38 (7.97–8.83) for full siblings, 4.56 (4.02–5.16) for maternal half-siblings, 2.90 (2.49–3.37) for paternal half-siblings, 3.03 (2.61–3.50) for nephews/nieces, 2.84 (2.45–3.29) for uncles/aunts, and 2.04 (1.91–2.20) for cousins. Lower RRs were observed for siblings of probands with chromosomal abnormalities (RR 5.53, 4.74–6.46) and more severe ID (mild RR 9.15, 8.55–9.78, moderate RR 8.13, 7.28–9.08, severe RR 6.80, 5.74–8.07, and profound RR 5.88, 4.52–7.65). Male sex of relative and maternal line of relationship with proband was related to higher risk (RR 1.33, 1.25–1.41 for brothers vs. sisters and RR 1.49, 1.34–1.68 for maternal vs. paternal half-siblings). ID was substantially heritable with 0.95 (95% CI 0.93–0.98) of the variance in liability attributed to genetic influences.

Conclusions: The risk estimates will benefit researchers, clinicians, and families in understanding the risk of ID in the family and the whole population. The higher risk of ID related to male sex and maternal lineage will be of value for planning and interpreting etiological studies in ID.

“CONGA: Copy Number Variation Genotyping in Ancient Genomes and Low-coverage Sequencing Data”, Soylev et al 2021

“CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data”, Arda Soylev, Sevim Seda Cokoglu, Dilek Koptekin, Can Alkan, Mehmet Somel (2021-12-17):

To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and to evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by (1) most published genomes being <1× coverage, and (2) ancient DNA fragments being typically <80 bps. These characteristics preclude state-of-the-art CNV detection software from being effectively applied to ancient genomes. Here we present CONGA, an algorithm tailored for genotyping deletion and duplication events in genomes with low depths of coverage. Simulations show that CONGA can genotype deletions and duplications >1 Kbps with F-scores >0.77 and >0.82, respectively, at ≥0.5×. Further, down-sampling experiments using published ancient BAM files reveal that >1 Kbps deletions could be genotyped at F-score >0.75 at ≥1× coverage. Using CONGA, we analyse deletion events at 10,018 loci in 56 ancient human genomes spanning the last 50,000 years, with coverages 0.4×–26×. We find inter-individual genetic diversity measured using deletions and SNPs to be highly correlated, suggesting that deletion frequencies broadly reflect demographic history. We also identify signatures of purifying selection on deletions, such as an excess of singletons compared to those in SNPs. CONGA paves the way for systematic studies of drift, mutation load, and adaptation in ancient and modern-day gene pools through the lens of CNVs.
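
For readers unfamiliar with the metric, the sketch below shows how an F-score is computed from genotyping outcomes. It assumes that the F-score in the abstract is the usual F1 score (the harmonic mean of precision and recall), which the abstract does not spell out; the function name `f_score` and the counts are hypothetical and are not CONGA output.

```python
def f_score(true_pos, false_pos, false_neg):
    """F1 score: harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for genotyped deletions versus a truth set.
print(round(f_score(true_pos=80, false_pos=15, false_neg=25), 3))
```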

“Fine-scale Population Structure and Demographic History of British Pakistanis”, Arciero et al 2021

“Fine-scale population structure and demographic history of British Pakistanis”, Elena Arciero, Sufyan A. Dogra, Daniel S. Malawsky, Massimo Mezzavilla, Theofanis Tsismentzoglou, Qin Qin Huang et al (2021-12-10):

Previous genetic and public health research in the Pakistani population has focused on the role of consanguinity in increasing recessive disease risk, but little is known about its recent population history or the effects of endogamy.

Here, we investigate fine-scale population structure, history and consanguinity patterns using genotype chip data from 2,200 British Pakistanis⁠.

We reveal strong recent population structure driven by the biraderi social stratification system⁠. We find that all subgroups have had low recent effective population sizes (Ne), with some showing a decrease 15‒20 generations ago that has resulted in extensive identity-by-descent sharing and homozygosity⁠, increasing the risk of recessive disorders⁠. Our results from 2 orthogonal methods (one using machine learning and the other coalescent-based) suggest that the detailed reporting of parental relatedness for mothers in the cohort under-represents the true levels of consanguinity.

These results demonstrate the impact of cultural practices on population structure and genomic diversity in Pakistanis, and have important implications for medical genetic studies.

…57% of the BiB Pakistani mothers reported that their parents were related, and 63% reported being related to their child’s father (Supplementary Data 13 and 14). As expected, a much higher fraction of the genome was homozygous (FROH) in the Pakistani mothers than the White British (mean = 0.048 versus 0.0004, 2-sided t-test p < 1 × 10⁻¹⁵).

…Our results suggest that, even in the absence of close consanguinity, increased homozygosity due to endogamy is likely to be contributing to recessive disease burden and the elevated frequency of rare homozygous knockouts in this population. To investigate the relative impact of endogamy versus consanguinity on recessive disease risk, we used exome-sequence data from 2,484 Bradford Pakistani mothers, in which we ascertained pathogenic/likely pathogenic (P/LP) variants in autosomal recessive developmental disorder genes. We then simulated intra-biraderi (endogamous) and inter-biraderi (exogamous) couples, and unions between pairs of individuals whose IBD distribution matches that of self-reported first cousins within the dataset (see ‘Methods’). We then scored each couple as being ‘at risk’ of having an affected child if both individuals were carriers of a P/LP variant in the same gene, similar to the approach in ref. 58. The results (Figure 6) indicate that intra-biraderi unions incur statistically-significantly higher risk than inter-biraderi unions (particularly for the Bains and Jatts; one-sided permutation tests p = 2 × 10⁻⁴ and p < 1 × 10⁻⁴ respectively), but first cousin unions incur more than ten-fold higher risk than intra-biraderi unions.
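The couple-scoring rule described in this excerpt (a couple is ‘at risk’ when both members carry a P/LP variant in the same gene) can be illustrated with a minimal Python sketch. This is not the authors’ pipeline: the individual IDs, gene names, and the helper functions `couple_at_risk` and `at_risk_fraction` are hypothetical, and the sketch omits how couples were simulated within or between biraderi groups as well as the permutation tests.

```python
from itertools import combinations

# Hypothetical toy data: individual ID -> set of genes in which that
# individual carries a P/LP variant (all names are made up).
carriers = {
    "A": {"GENE1", "GENE2"},
    "B": {"GENE2"},
    "C": {"GENE3"},
    "D": set(),
}

def couple_at_risk(genes_a, genes_b):
    """True if both partners carry a P/LP variant in at least one shared gene."""
    return bool(genes_a & genes_b)

def at_risk_fraction(couples, carriers):
    """Fraction of the given couples that are scored as 'at risk'."""
    couples = list(couples)
    if not couples:
        return 0.0
    hits = sum(couple_at_risk(carriers[a], carriers[b]) for a, b in couples)
    return hits / len(couples)

# Assumption for illustration only: pair every individual with every other.
all_pairs = list(combinations(sorted(carriers), 2))
print(at_risk_fraction(all_pairs, carriers))
```

In a comparison like the one in the excerpt, the same function would be applied separately to each set of simulated couples (intra-group, inter-group, first-cousin-like) and the resulting fractions compared.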

…Our findings suggest that clinicians should consider recording parents’ biraderi groups as well as close relatedness in genetic consultations. This will be particularly useful as research becomes more focused on clinical sequencing datasets such as that held by Genomics England. Recording biraderi information would enable further research into the prevalence of different diseases in different biraderi groups, the impacts of endogamy and the possible presence of disease-causing founder mutations. The results from such research will be important to inform and design targeted genomic health services for Pakistani-ancestry populations. However, great care needs to be taken to ensure this research and any application of it is carried out in a culturally sensitive way.

“The Effect of Inbreeding, Body Size and Morphology on Health in Dog Breeds”, Bannasch et al 2021

“The effect of inbreeding, body size and morphology on health in dog breeds”, Danika Bannasch, Thomas Famula, Jonas Donner, Heidi Anderson, Leena Honkanen, Kevin Batcher, Noa Safra et al (2021-12-02):

Background: Dog breeds are known for their distinctive body shape, size, coat color, head type and behaviors, features that are relatively similar across members of a breed. Unfortunately, dog breeds are also characterized by distinct predispositions to disease. We explored the relationships between inbreeding⁠, morphology and health using genotype based inbreeding estimates, body weight and insurance data for morbidity.

Results: A large dataset (227 breeds; dataset 1) of median heterozygosity values (H) was obtained through commercial DNA testing of 49,378 dogs…In order to investigate the effect of inbreeding level on health we utilized breed-based health data from Agria pet insurance…The average inbreeding based on genotype across 227 breeds was Fadj = 0.249 (95% CI 0.235–0.263).

There were statistically-significant differences in morbidity between breeds with low and high inbreeding (H = 16.49, p = 0.0004). There was also a statistically-significant difference in morbidity between brachycephalic breeds and non-brachycephalic breeds (p = 0.0048) and between functionally distinct groups of breeds (H = 14.95, p < 0.0001). Morbidity was modeled using robust regression analysis and both body weight (p < 0.0001) and inbreeding (p = 0.013) were statistically-significant (R² = 0.77).

Smaller less inbred breeds were healthier than larger more inbred breeds.

Conclusions: In this study, body size and inbreeding along with deleterious morphologies contributed to increases in necessary health care in dogs.

…The inbreeding values within dog breeds were very high, with the mean being 0.24, just below the coefficient of inbreeding obtained from breeding full siblings. The breeds with low inbreeding included recent cross breeds (Tamaskan Dog, Barbet and Australian Labradoodle) and landrace breeds (Danish-Swedish Farmdog, Mudi and Koolie), supporting the notion that high inbreeding is a result of closed stud books or small numbers of founders or both. It also demonstrates that it is possible to have consistent breed type without inbreeding.
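For reference, the full-sibling benchmark mentioned in this excerpt follows from Wright’s path-counting formula. The derivation below is a textbook sketch, not taken from the paper, and it assumes the two parents of the full siblings are themselves unrelated and non-inbred (so each common ancestor has $F_A = 0$).

```latex
% Wright's path formula: sum over common ancestors A of the two mates,
% where n_1 and n_2 are the numbers of generations from each mate to A.
F_X = \sum_{A} \left(\tfrac{1}{2}\right)^{n_1 + n_2 + 1} \left(1 + F_A\right)

% Full-sibling mating: two common ancestors (the siblings' mother and father),
% each with n_1 = n_2 = 1 and F_A = 0.
F_X = 2 \times \left(\tfrac{1}{2}\right)^{1 + 1 + 1} = \tfrac{1}{4} = 0.25
```

The reported mean of 0.24 is therefore just under the value expected for offspring of a single generation of full-sibling mating.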

Similar to another recent study, brachycephalic dogs require more veterinary care than non-brachycephalic dogs.34 In addition, we identified that FCI group 2 breeds required the highest average number of veterinary care events. This group includes the larger molossoid dog breeds which others have previously identified as having higher mortality32, 44. The primitive FCI group 5 breeds had the lowest average morbidity of all the groups, which has not been reported previously, except for the Norrbottenspitz breed.45 This may be, in part, due to the large number of primitive breeds for which there is insurance data available in our data set, while other studies may not have had health data available for these breeds.

There were interesting exceptions to the correlation of inbreeding and health. The Border terrier, Basenji, Collie, and English setter breeds have high inbreeding but low morbidity. Likewise, the Malinois, Pomeranian and Russian Tsvetnaya Bolonka (Russian Toy) have lower inbreeding and high morbidity. These example breeds are neither brachycephalic nor particularly known for extreme morphologies. In the case of healthy breeds with high inbreeding, it may be possible that these breeds have been purged of deleterious alleles as has happened with inbred mouse strains [Mouse genetics concepts and applications, Silver 1995]. In the opposite situation (lower inbreeding and high morbidity), the recorded morbidities could be high allele frequency Mendelian diseases or potentially conditions linked to phenotypes under selection in the breed. These discrepancies could also exist due to population differences between the insurance data and the inbreeding data.

…One must consider that the majority of dog breeds displayed high levels of inbreeding well above what would be considered safe for either humans or wild animal populations. The effects of inbreeding on overall fitness have been demonstrated experimentally using mice, where an overall reduction in fitness between mice with F = 0.25 compared to F = 0 was determined to be 57%.54 While this high level of inbreeding was less relevant to many captive and wild species, it is highly relevant to purebred dogs, based on the average inbreeding identified in this study. However, the rate of inbreeding between these mouse experiments and what has occurred in dog breeds is not the same and could have an effect on health. In humans, modest levels of inbreeding (3–6%) were shown to be associated with increased prevalence of late onset complex diseases 55 as well as other types of inbreeding depression.11 These findings in other species combined with the incredibly strong breed predispositions to complex diseases like cancers and autoimmune diseases highlight the potential relevance of high inbreeding in dogs to their health.

“Deletion of Loss-of-Function-Intolerant Genes and Risk of 5 Psychiatric Disorders”, Wainberg et al 2021

2021-wainberg.pdf: “Deletion of Loss-of-Function-Intolerant Genes and Risk of 5 Psychiatric Disorders”, Michael Wainberg, Daniele Merico, Guillaume Huguet, Mehdi Zarrei, Sebastien Jacquemont, Stephen W. Scherer et al (2021-12-01):

Copy number variants (CNVs) are key etiological contributors to neuropsychiatric disorders. Most psychiatric CNV studies have focused on several dozen loci, collectively comprising less than 2% of the genome, where CNVs spontaneously recur sufficiently often to have individually detectable psychiatric associations. We hypothesized that knowledge of gene function could guide the search for nonrecurrent CNVs across the remaining 98%. Specifically, probability of loss-of-function intolerance (pLI) and loss-of-function observed/​expected upper bound fraction (LOEUF), 2 gene-level metrics of variation constraint against protein-truncating variants⁠, have been reported to be uniquely associated with the cognitive consequences of CNVs. Here, we show that pLI and LOEUF are similarly associated with the psychiatric consequences of both recurrent and nonrecurrent gene deletions.

Methods: We studied 431 146 self-reported White UK Biobank participants (234 544 females) with International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) codes from linked inpatient, primary care, or death records, excluding participants with neurodevelopmental disorders (ICD-10 codes F70-F89) or who failed CNV quality control (eMethods in the Supplement). The North West Centre for Research Ethics Committee granted ethical approval to UK Biobank⁠, and informed consent was obtained from participants. We stratified genes into 8 categories based on Genome Aggregation Database’s pLI scores (low [0-<0.5], medium [0.5-<0.9], high [0.9-<0.99], or extreme [0.99–1]) and recurrence type (recurrent for genes overlapping any of 32 previously defined recurrent neuropsychiatric deletion CNV loci [eTable in the Supplement] and nonrecurrent otherwise). For each category, we used Firth logistic regression to test whether carriers of CNVs deleting any gene in the category had higher rates of anxiety disorders (F40/​F41), bipolar disorder (F31), major depressive disorder (MDD) (F32/​F33), obsessive-compulsive disorder (OCD) (F42), and schizophrenia (F20).
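The 8-way stratification described in the Methods (4 pLI bins × 2 recurrence types) can be sketched in Python as follows. This is an illustrative sketch only: the function names `pli_category` and `gene_category` and the example genes are hypothetical, and the actual analysis additionally required gene annotations, CNV calls, and Firth logistic regression, none of which are shown here.

```python
def pli_category(pli):
    """Bin a pLI score using the intervals quoted in the Methods:
    low [0, 0.5), medium [0.5, 0.9), high [0.9, 0.99), extreme [0.99, 1]."""
    if not 0.0 <= pli <= 1.0:
        raise ValueError("pLI must lie in [0, 1]")
    if pli < 0.5:
        return "low"
    if pli < 0.9:
        return "medium"
    if pli < 0.99:
        return "high"
    return "extreme"

def gene_category(pli, in_recurrent_locus):
    """Combine the pLI bin with recurrence type, giving 4 x 2 = 8 categories."""
    recurrence = "recurrent" if in_recurrent_locus else "nonrecurrent"
    return (pli_category(pli), recurrence)

# Hypothetical genes: (name, pLI score, overlaps a recurrent deletion locus?)
example_genes = [
    ("GENE_A", 0.02, False),
    ("GENE_B", 0.95, True),
    ("GENE_C", 0.995, False),
]

for name, pli, recurrent in example_genes:
    print(name, gene_category(pli, recurrent))
```

The boundaries are half-open on the right so that each score falls in exactly one bin, matching the “0.5-<0.9” style notation used in the abstract.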

To guard against incorrectly associating the consequences of one CNV category to another, we excluded participants with recurrent deletions when computing associations for nonrecurrent deletions. We also excluded participants with higher-pLI deletions when computing associations for lower-pLI deletions of the same recurrence type. To reduce false-positives, we analyzed CNV calls from 2 different computational pipelines and required them to agree that a gene was fully deleted. As a sensitivity analysis, we replaced pLI with LOEUF, using thresholds capturing similar numbers of genes (low: ≥0.5; medium: 0.4-<0.5; high: 0.3-<0.4; extreme: 0-<0.3).

Results: Nonrecurrent CNVs deleting extreme-pLI genes were found in 787 participants (0.2%). A total of 571 unique extreme-pLI genes were deleted by nonrecurrent CNVs in 1 or more participants (Table 1), including key neurotransmitter receptors, ion channels, and neurodevelopmental genes.

pLI and LOEUF scores were associated with psychopathogenicity (Figure 1). The 787 participants with nonrecurrent extreme-pLI gene deletions exhibited statistically-significantly higher rates of anxiety (odds ratio [OR], 1.45 [95% CI, 1.11–1.88]), bipolar disorder (OR, 2.78 [95% CI, 1.24–6.23]), MDD (OR, 1.44 [95% CI, 1.14–1.81]), OCD (OR, 6.75 [95% CI, 1.25–36.31]), and schizophrenia (OR, 3.79 [95% CI, 1.45–9.94]). Conversely, the 54 413 participants (12.6%) with nonrecurrent low-pLI gene deletions displayed no greater rates of any disorder: anxiety (OR, 0.99 [95% CI, 0.96–1.03]), bipolar disorder (OR, 0.99 [95% CI, 0.87–1.13]), MDD (OR, 1.00 [95% CI, 0.97–1.03]), OCD (OR, 0.95 [95% CI, 0.76–1.19]), or schizophrenia (OR, 0.96 [95% CI, 0.82–1.13]).

pLI and LOEUF were also associated with CNV psychopathogenicity. Participants with recurrent CNVs deleting extreme-pLI genes had substantially higher rates of anxiety (OR, 1.78 [95% CI, 1.27–2.51]), bipolar disorder (OR, 6.46 [95% CI, 1.49–28.02]), MDD (OR, 2.15 [95% CI, 1.60–2.88]), OCD (OR, 12.07 [95% CI, 1.45–100.77]), and schizophrenia (OR, 8.39 [95% CI, 2.89–24.37]). Conversely, participants with recurrent deletions of low-pLI genes had only modestly increased risk of anxiety (OR, 1.34 [95% CI, 1.06–1.70]) and MDD (OR, 1.25 [95% CI, 1.01–1.54]) and did not have statistically-significantly altered risk of bipolar disorder (OR, 1.39 [95% CI, 0.56–3.50]), OCD (OR, 0.36 [95% CI, 0.09–1.42]), or schizophrenia (OR, 1.41 [95% CI, 0.39–5.13]).

Discussion: While recurrent CNVs are well-known contributors to psychopathology, the current study showed that nonrecurrent CNVs are associated with psychiatric disease risk. Gene-level metrics of mutational constraint were associated with psychopathogenic CNVs, both recurrent and nonrecurrent. The 0.2% of participants with nonrecurrent deletions of extreme-pLI genes had statistically-significantly higher rates of all 5 psychiatric disorders surveyed, whereas participants with nonrecurrent deletions of only low-pLI genes (the vast majority) had no detectable increase in psychiatric disease risk. Limitations of this study include microarray-based CNV calling, incomplete phenotype ascertainment, and ignoring partial gene deletions, as well as the inability to generalize these findings to other racial and ethnic groups.

These results suggest that interpreting CNVs using mutational constraint metrics, such as pLI, may augment population-based psychiatric genomic screening programs. Our approach may ultimately help identify opportunities for early diagnosis and intervention, including personalized therapies targeting specific nonrecurrent CNVs.

“Exploring the Relationships between Autozygosity, Educational Attainment, and Cognitive Ability in a Contemporary, Trans-ancestral American Sample”, Colbert et al 2021

“Exploring the relationships between autozygosity, educational attainment, and cognitive ability in a contemporary, trans-ancestral American sample”, Sarah M. C. Colbert, Matthew C. Keller, Arpana Agrawal, Emma C. Johnson (2021-11-29):

Previous studies have found statistically-significant associations between estimated autozygosity—the proportion of an individual’s genome contained in homozygous segments due to distant inbreeding—and multiple traits, including educational attainment (EA) and cognitive ability. In one study, estimated autozygosity showed a stronger association with parental EA than the subject’s own EA. This was likely driven by parental EA’s association with mobility: more educated parents tended to migrate further from their hometown, therefore choosing more genetically diverse partners. We examined the associations between estimated autozygosity, cognitive ability, and parental EA in a contemporary sub-sample of adolescents from the Adolescent Brain and Cognitive Development Study™ (ABCD Study®) (analytic n = 6,504). We found a negative association between autozygosity and child cognitive ability consistent with previous studies, while the associations between autozygosity and parental EA were in the expected direction of effect (with greater levels of autozygosity being associated with lower EA) but the effect sizes were significantly weaker than those estimated in previous work. We also found a lower mean level of autozygosity in the ABCD sample compared to previous autozygosity studies, which may reflect overall decreasing levels of autozygosity over generations. Variation in migration and mobility patterns in the ABCD study compared to other studies may explain the pattern of associations between estimated autozygosity, EA, and cognitive ability in the current study.

“Deep Learning Enables Genetic Analysis of the Human Thoracic Aorta”, Pirruccello et al 2021

2021-pirruccello.pdf: “Deep learning enables genetic analysis of the human thoracic aorta”, James P. Pirruccello, Mark D. Chaffin, Elizabeth L. Chou, Stephen J. Fleming, Honghuang Lin, Mahan Nekoui et al (2021-11-26):

Enlargement or aneurysm of the aorta predisposes to dissection⁠, an important cause of sudden death.

We trained a deep learning U-Net model to evaluate the dimensions of the ascending and descending thoracic aorta in 4.6 million cardiac magnetic resonance images from the UK Biobank⁠.

We then conducted genome-wide association studies in 39,688 individuals, identifying 82 loci associated with ascending and 47 with descending thoracic aortic diameter, of which 14 loci overlapped. Transcriptome-wide analyses, rare-variant burden tests and human aortic single nucleus RNA sequencing prioritized genes including SVIL, which was strongly associated with descending aortic diameter. A polygenic score for ascending aortic diameter was associated with thoracic aortic aneurysm in 385,621 UK Biobank participants (hazard ratio = 1.43 per s.d., confidence interval = 1.32–1.54, p = 3.3 × 10⁻²⁰).

Our results illustrate the potential for rapidly defining quantitative traits with deep learning, an approach that can be broadly applied to biomedical images.

“Comparing Copy Number Variations in a Danish Case Cohort of Individuals With Psychiatric Disorders”, Sánchez et al 2021

2021-sanchez.pdf: “Comparing Copy Number Variations in a Danish Case Cohort of Individuals With Psychiatric Disorders”, Xabier Calle Sánchez, Dorte Helenius, Jonas Bybjerg-Grauholm, Carsten Pedersen, David M. Hougaard, Anders D. Børglum et al (2021-11-24):

Question: What are the population-based prevalence and risk of psychiatric disorders associated with pathogenic copy number variations (CNVs) and how do they compare?

Findings: In a cohort study including 86 189 individuals, increased CNV-associated risk of autism⁠, attention-deficit hyperactivity disorder⁠, schizophrenia, and major depressive disorder⁠, as well as bipolar disorder in men for deletion at 1q21.1, was observed. Population-based penetrance estimates were generally lower than those from prior studies; time-dependent analyses identified variegated disease trajectories across genomic loci, whereas deletions and duplications within each locus had similar trajectory patterns.

Meaning: The findings of this study suggest that population-based analysis substantially revises prevalence and penetrance estimates for pathogenic CNVs; precision health care needs to be tailored to the specific CNV, and to the age and gender of the affected individual.

Importance: Although the association between several recurrent genomic copy number variants (CNVs) and mental disorders has been studied for more than a decade, unbiased, population-based estimates of the prevalence, disease risks and trajectories, fertility, and mortality to contrast chromosomal abnormalities and advance precision health care are lacking.

Objective: To generate unbiased, population-based estimates of prevalence, disease risks and trajectories, fertility, and mortality of CNVs implicated in neuropsychiatric disorders.

Design, Setting, & Participants: In a population-based case-cohort study, using the Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH) 2012 database, individuals born between May 1, 1981, and December 31, 2005, and followed up until December 31, 2012, were analyzed. All individuals (n = 57 377) with attention-deficit/​hyperactivity disorder (ADHD), major depressive disorder (MDD), schizophrenia (SCZ), autism spectrum disorder (ASD), or bipolar disorder (BPD) were included, as well as 30 000 individuals randomly drawn from the database. Data analysis was conducted from 2017–07–01 to 2021–09–07.

Exposures: Copy number variants at 6 genomic loci (1q21.1, 15q11.2, 15q13.3, 16p11.2, 17p12, and 17q12).

Main Outcomes & Measures: Population-unbiased hazard ratio (HR) and survival estimates of CNV associations with the 5 ascertained psychiatric disorders, epilepsy⁠, intellectual disability, selected somatic disorders, fertility, and mortality.

Results: Participants’ age ranged from 1 to 32 years (mean, 12.0 [IQR, 6.9] years) during follow-up, and 38 662 were male (52.3%). Copy number variants broadly associated with an increased risk of autism spectrum disorder and ADHD, whereas risk estimates of SCZ for most CNVs were lower than previously reported. Comparison with previous studies suggests that the lower risk estimates are associated with a higher CNV prevalence in the general population than in control samples of most case-control studies. Statistically-significant risk of major depressive disorder (HR, 5.8; 95% CI, 1.5–22.2) and sex-specific risk of bipolar disorder (HR, 17; 95% CI, 1.5–189.3, in men only) were noted for the 1q21.1 deletion. Although CNVs at 1q21.1 and 15q13.3 were associated with increased risk across most diagnoses, the 17p12 deletion consistently conferred less risk of psychiatric disorders (HR 0.4–0.8), although none of the estimates differed statistically-significantly from the general population. Trajectory analyses noted that, although diagnostic risk profiles differed across loci, they were similar for deletions and duplications within each locus. Sex-stratified analyses suggest that pathogenicity of many CNVs may be modulated by sex.

Conclusions & Relevance: The findings of this study suggest that the iPSYCH population case cohort reveals broad disease risk for some studied CNVs and narrower risk for others, in addition to sex differential liability. This finding on genomic risk variants at the level of a population may be important for health care planning and clinical decision-making, and thus the advancement of precision health care.

“The Sequences of 150,119 Genomes in the UK Biobank”, Halldorsson et al 2021

“The sequences of 150,119 genomes in the UK biobank”, Bjarni V. Halldorsson, Hannes P. Eggertsson, Kristjan H. S. Moore, Hannes Hauswedell, Ogmundur Eiriksson et al (2021-11-17):

We describe the analysis of whole genome sequencing (WGS) of 150,119 individuals from the UK biobank (UKB). This yielded a set of high quality variants, including 585,040,410 SNPs, representing 7.0% of all possible human SNPs, and 58,707,036 indels. The large set of variants allows us to characterize selection based on sequence variation within a population through a Depletion Rank (DR) score for windows along the genome. DR analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UKB, a large British Irish cohort (XBI) and smaller African (XAF) and South Asian (XSA) cohorts. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large scale WGS studies. Using this formidable new resource, we provide several noteworthy examples of trait associations with rare variants with large effects not found previously through studies based on exome sequencing and/​or imputation.

“The Impact of Rare Germline Variants on Human Somatic Mutation Processes”, Vali-Pour et al 2021

“The impact of rare germline variants on human somatic mutation processes”, Mischan Vali-Pour, Ben Lehner, Fran Supek (2021-11-14):

Somatic mutations are an inevitable component of ageing and the most important cause of cancer. The rates and types of somatic mutation vary across individuals, but relatively few inherited influences on mutation processes are known.

We performed a comprehensive gene-based rare variant association study with diverse mutational processes, using human cancer genomes from over 11,000 individuals of European ancestry. By combining burden and variance tests, we identify 207 associations involving 15 somatic mutational phenotypes and 42 genes that replicated in an independent data set at an FDR of 1%.

We associated rare inherited deleterious variants in novel genes such as MSH3, EXO1, SETD2, and MTOR with two different forms of DNA mismatch repair deficiency, and variants in genes such as EXO1, PAXIP1, and WRN with deficiency in homologous recombination repair. In addition, we identified associations with other mutational processes, such as APEX1 with APOBEC-signature mutagenesis.

Many of the novel genes interact with each other and with known mutator genes within cellular sub-networks. Considered collectively, damaging variants in the newly-identified genes are prevalent in the population. We suggest that rare germline variation in diverse genes commonly impacts mutational processes in somatic cells.

“100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care—Preliminary Report”, Investigators 2021

“100,000 Genomes Pilot on Rare-Disease Diagnosis in Health Care—Preliminary Report”, 100k Genomes Project Pilot Investigators (2021-11-11):

Background: The U.K. 100,000 Genomes Project is in the process of investigating the role of whole genome sequencing in patients with undiagnosed rare diseases after usual care and the alignment of this research with health care implementation in the U.K. National Health Service⁠. Other parts of this project focus on patients with cancer and infection.

Methods: We conducted a pilot study involving 4,660 participants from 2,183 families, among whom 161 disorders covering a broad spectrum of rare diseases were present. We collected data on clinical features with the use of Human Phenotype Ontology terms, undertook genome sequencing, applied automated variant prioritization on the basis of applied virtual gene panels and phenotypes, and identified novel pathogenic variants through research analysis.

Results: Diagnostic yields varied among family structures and were highest in family trios (both parents and a proband) and families with larger pedigrees. Diagnostic yields were much higher for disorders likely to have a monogenic cause (35%) than for disorders likely to have a complex cause (11%). Diagnostic yields for intellectual disability, hearing disorders, and vision disorders ranged from 40 to 55%. We made genetic diagnoses in 25% of the probands. A total of 14% of the diagnoses were made by means of the combination of research and automated approaches, which was critical for cases in which we found etiologic noncoding, structural, and mitochondrial genome variants and coding variants poorly covered by exome sequencing⁠. Cohort-wide burden testing across 57,000 genomes enabled the discovery of 3 new disease genes and 19 new associations. Of the genetic diagnoses that we made, 25% had immediate ramifications for clinical decision making for the patients or their relatives.

Conclusions: Our pilot study of genome sequencing in a national health care system showed an increase in diagnostic yield across a range of rare diseases.

…However, South Asian ancestry was statistically-significantly more common among pediatric probands than among adult probands (16% vs. 4%, p < 0.001); our results indicated potential consanguinity in 43% of the 93 pediatric South Asian probands and in 1% of the other 478 pediatric probands (Table 1).

Health Care Outcomes after Diagnosis: The findings from our approach ended long diagnostic odysseys for some participants and their families (the median duration of such an odyssey was 75 months, and the median number of hospital visits was 68) (Table S1), and we speculate that they will mitigate NHS resource costs (the combined cost for 183,273 episodes of hospital care among the affected participants was £87 million [$122 million]) (Table S3). In addition, 134 of the 533 genetic diagnoses (25%) were reported by clinicians to be of immediate clinical actionability—only 11 (0.2%) were described as having no benefit. As of now, the remainder of the diagnoses are of unknown usefulness. The benefits in terms of health care included 4 diagnoses that led to a suggested change in medication, 26 that led to suggested additional surveillance of the proband or relatives, 13 that allowed for clinical trial eligibility, 59 that informed future reproductive choices, and 32 that had other benefits (Table S9).

In several specific probands, diagnoses have had important clinical actionability. In a 36-year-old man with suspected choroideremia⁠, we detected a novel CHM promoter variant causing loss of gene expression⁠,27 a diagnosis that enabled eligibility for a gene-replacement trial. A male neonate proband presented with severe infection and transient neurologic symptoms immediately after birth and died at 4 months of age with no diagnosis but with health care costs of ~£80,000 ($112,000) (Table S10). A diagnosis of transcobalamin II deficiency due to a homozygous frameshift in TCN2 was made from this study, which enabled predictive testing to be offered to the younger brother within 1 week after birth. The younger child, who received a positive result, received weekly hydroxocobalamin injections to prevent metabolic decompensation.

A 10-year-old girl was admitted to the intensive care unit with life-threatening chicken pox⁠. She had undergone a diagnostic odyssey over a period of 7 years at a total cost of £356,571 ($499,199) across 307 secondary care episodes (Table S11). We were able to diagnose CTPS1 deficiency due to a homozygous, known pathogenic splice acceptor variant. A diagnosis enabled a curative bone marrow transplantation (cost of £70,000 [$98,000]), and predictive testing in her siblings showed no additional family members to be at risk.

One proband had waited until his 6th decade of life for a genomic diagnosis of an INF2 mutation causing focal segmental glomerulosclerosis⁠. His father, brother, and uncle had all died from kidney failure⁠. He had received 2 kidney transplants, had transmitted the condition to his daughter, and was concerned about whether his 15-year-old granddaughter, who was under surveillance, was at risk. After he received his genetic diagnosis, the granddaughter was tested, found to be negative, and discharged from regular medical surveillance.

Discussion: Our findings show a substantial increase in yield of genomic diagnoses made in patients with the use of genome sequencing across a broad spectrum of rare disease. The enhanced diagnostic benefit was observed regardless of whether participants had undergone previous genetic testing (diagnostic yields were 31% among those who had undergone testing and 33% among those who had not). In 25% of those who received a genetic diagnosis, there was immediate clinical actionability…The findings from our pilot study support the case for genome sequencing in the diagnosis of certain specific rare diseases in the new NHS National Genomic Test Directory.37 In patients with specific disorders, such as intellectual disability, genome sequencing is now the first-line test in the NHS (Table S12). With a new National Genomic Medicine Service, the NHS in England is in the process of sequencing 500,000 whole genomes in rare disease and cancer in health care. We hope that our findings will assist other health systems in considering the role of genome sequencing in the care of patients with rare diseases.

“Influences of Rare Copy Number Variation on Human Complex Traits”, Hujoel et al 2021

“Influences of rare copy number variation on human complex traits”⁠, Margaux Louise Anna Hujoel, Maxwell A. Sherman, Alison R. Barton, Ronen E. Mukamel, Vijay G. Sankaran et al (2021-10-21; similar):

The human genome contains hundreds of thousands of regions exhibiting copy number variation (CNV). However, the phenotypic effects of most such polymorphisms are unknown because only larger CNVs (spanning tens of kilobases) have been ascertainable from the SNP-array data generated by large biobanks. We developed a new computational approach that leverages abundant haplotype-sharing in biobank cohorts to more sensitively detect CNVs co-inherited within extended SNP haplotypes. Applied to UK Biobank, this approach achieved 6× increased CNV detection sensitivity compared to previous analyses, accounting for ~half of all rare gene inactivation events produced by genomic structural variation. This extensive CNV call set enabled the most comprehensive analysis to date of associations between CNVs and 56 quantitative traits, identifying 269 independent associations (p < 5 × 10⁻⁸), involving 97 loci, that rigorous statistical fine-mapping analyses indicated were likely to be causally driven by CNVs. Putative target genes were identifiable for nearly half of the loci, enabling new insights into dosage-sensitivity of these genes and implicating several novel gene-trait relationships. CNVs at several loci created extended allelic series including deletions or duplications of distal enhancers that associated with much stronger phenotypic effects than SNPs within these regulatory elements. These results demonstrate the ability of haplotype-informed analysis to empower structural variant detection and provide insights into the genetic basis of human complex traits.

“Rare Variant Aggregation in 148,508 Exomes Identifies Genes Associated With Proxy Alzheimer’s Disease”, Wightman et al 2021

“Rare Variant Aggregation in 148,508 Exomes Identifies Genes Associated with Proxy Alzheimer’s Disease”, Douglas P. Wightman, Jeanne E. Savage, Christiaan A. de Leeuw, Iris E. Jansen, Danielle Posthuma (2021-10-18; similar):

We generated a proxy Alzheimer’s disease phenotype for 148,508 individuals in the UK Biobank in order to perform exome-wide rare variant aggregation analyses to identify genes associated with proxy Alzheimer’s disease. We identified four genes statistically-significantly associated with the proxy phenotype, three of which have been previously associated with clinically diagnosed Alzheimer’s disease (SORL1, TREM2, and TOMM40). We identified one gene (HEXA) which has not been previously associated with Alzheimer’s disease but is known to contribute to neurodegenerative disease. Here we show that proxy Alzheimer’s disease can capture some of the rare variant association signal for Alzheimer’s disease and can be used to highlight genes and variants of interest. The proxy phenotype allows for the utilisation of large genetic databases without clinically diagnosed Alzheimer’s disease patients to uncover variants and genes that contribute to Alzheimer’s disease.

“Integrating de Novo and Inherited Variants in over 42,607 Autism Cases Identifies Mutations in New Moderate Risk Genes”, Zhou et al 2021

“Integrating de novo and inherited variants in over 42,607 autism cases identifies mutations in new moderate risk genes”⁠, Xueya Zhou, Pamela Feliciano, Tianyun Wang, Irina Astrovskaya, Chang Shu, Jacob B. Hall, Joseph U. Obiajulu et al (2021-10-11; similar):

Despite the known heritable nature of autism spectrum disorder (ASD), studies have primarily identified risk genes with de novo variants (DNVs). To capture the full spectrum of ASD genetic risk, we performed a two-stage analysis of rare de novo and inherited coding variants in 42,607 ASD cases, including 35,130 new cases recruited online by SPARK. In the first stage, we analyzed 19,843 cases with one or both biological parents and found that known ASD or neurodevelopmental disorder (NDD) risk genes explain nearly 70% of the genetic burden conferred by DNVs. In contrast, less than 20% of the genetic risk conferred by rare inherited loss-of-function (LoF) variants is explained by known ASD/NDD genes. We selected 404 genes based on the first stage of analysis and performed a meta-analysis with an additional 22,764 cases and 236,000 population controls. We identified 60 genes with exome-wide significance (p < 2.5e-6), including five new risk genes (NAV3, ITSN1, MARK2, SCAF1, and HNRNPUL2). The association of NAV3 with ASD risk is entirely driven by rare inherited LoF variants, with an average relative risk of 4, consistent with moderate effect. ASD individuals with LoF variants in the four moderate risk genes (NAV3, ITSN1, SCAF1, and HNRNPUL2, n = 95) have less cognitive impairment compared to 129 ASD individuals with LoF variants in well-established, highly penetrant ASD risk genes (CHD8, SCN2A, ADNP, FOXP1, SHANK3) (59% vs. 88%, p = 1.9e-06). These findings will guide future gene discovery efforts and suggest that much larger numbers of ASD cases and controls are needed to identify additional genes that confer moderate risk of ASD through rare, inherited variants.
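
The exome-wide threshold quoted in this abstract (p < 2.5e-6) is what a Bonferroni correction at a family-wise α of 0.05 would give for roughly 20,000 gene-level tests. The abstract does not say how the threshold was derived, so the gene count and α in this minimal sketch are assumptions for illustration only:

```python
# Sketch: Bonferroni-corrected p-value cutoff for gene-level tests.
# ASSUMPTION: ~20,000 genes tested at a family-wise alpha of 0.05; the
# abstract reports only the resulting threshold, not its derivation.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

threshold = bonferroni_threshold(0.05, 20_000)
print(f"{threshold:.1e}")
```

Bonferroni is conservative when tests are correlated, which is one reason a study might choose a different correction; the sketch only shows where a cutoff of this size can come from.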

“How Rare and Common Risk Variation Jointly Affect Liability for Autism Spectrum Disorder”, Klei et al 2021

“How rare and common risk variation jointly affect liability for autism spectrum disorder”⁠, Lambertus Klei, Lora Lee McClain, Behrang Mahjani, Klea Panayidou, Silvia De Rubeis, Anna-Carin Säll Grahnat et al (2021-10-06; similar):

Background: Genetic studies have implicated rare and common variations in liability for autism spectrum disorder (ASD). Of the discovered risk variants, those rare in the population invariably have large impact on liability, while common variants have small effects. Yet, collectively, common risk variants account for the majority of population-level variability. How these rare and common risk variants jointly affect liability for individuals requires further study.

Methods: To explore how common and rare variants jointly affect liability, we assessed 2 cohorts of ASD families characterized for rare and common genetic variations (Simons Simplex Collection and Population-Based Autism Genetics and Environment Study). We analyzed data from 3011 affected subjects, as well as 2 cohorts of unaffected individuals characterized for common genetic variation: 3011 subjects matched for ancestry to ASD subjects and 11,950 subjects for estimating allele frequencies. We used genetic scores, which assessed the relative burden of common genetic variation affecting risk of ASD (henceforth “burden”), and determined how this burden was distributed among 3 subpopulations: ASD subjects who carry a potentially damaging variant implicated in risk of ASD (“PDV carriers”); ASD subjects who do not (“non-carriers”); and unaffected subjects who are assumed to be non-carriers.

Results: Burden harbored by ASD subjects is stochastically greater than that harbored by control subjects. For PDV carriers, their average burden is intermediate between non-carrier ASD and control subjects. Both carrier and non-carrier ASD subjects have greater burden, on average, than control subjects. The effects of common and rare variants likely combine additively to determine individual-level liability.

Limitations: Only 305 ASD subjects were known PDV carriers. This relatively small subpopulation limits this study to characterizing general patterns of burden, as opposed to effects of specific PDVs or genes. Also, a small fraction of subjects that are categorized as non-carriers could be PDV carriers.

Conclusions: Liability arising from common and rare risk variations likely combines additively to determine risk of any individual diagnosed with ASD. On average, ASD subjects carry a substantial burden of common risk variation, even if they also carry a rare PDV affecting risk.
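
The pattern reported here (carrier cases with a burden between controls and non-carrier cases) is what a simple additive liability-threshold model would predict. The simulation below is a minimal sketch of that idea, not the authors' analysis: the carrier frequency, rare-variant effect size, variance split, and threshold are all hypothetical values chosen for illustration.

```python
import random
from statistics import mean

# Sketch: additive liability-threshold model (all parameters are ASSUMPTIONS).
# liability = common-variant burden + rare-variant effect + residual noise;
# an individual is affected when liability exceeds the threshold.

def simulate(n=100_000, carrier_freq=0.02, rare_effect=2.0,
             threshold=2.33, seed=1):
    rng = random.Random(seed)
    sd = 0.5 ** 0.5  # burden and noise each contribute half the unit variance
    groups = {"control": [], "carrier_case": [], "noncarrier_case": []}
    for _ in range(n):
        burden = rng.gauss(0.0, sd)
        noise = rng.gauss(0.0, sd)
        carrier = rng.random() < carrier_freq
        liability = burden + noise + (rare_effect if carrier else 0.0)
        if liability > threshold:
            key = "carrier_case" if carrier else "noncarrier_case"
            groups[key].append(burden)
        elif not carrier:
            # Controls are treated as non-carriers, as in the study design;
            # unaffected carriers are simply dropped.
            groups["control"].append(burden)
    return {name: mean(values) for name, values in groups.items()}

means = simulate()
for group, value in means.items():
    print(f"{group}: {value:+.2f}")
```

Under these assumptions a carrier needs less common-variant burden to cross the threshold, so the average burden of carrier cases sits above that of controls but below that of non-carrier cases.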

“A General Framework for Identifying Rare Variant Combinations in Complex Disorders”, Pounraja & Girirajan 2021

“A general framework for identifying rare variant combinations in complex disorders”, Vijay Kumar Pounraja, Santhosh Girirajan (2021-10-01; similar):

Statistical challenges due to rarity and combinatorial explosion resulting from exhaustive evaluation of rare variant combinations have limited the study of oligogenic etiology for complex disorders. We present RareComb, a framework that combines a priori algorithm and statistical inference to identify specific combinations of mutated genes associated with complex phenotypes. Using RareComb on 6,189 affected individuals, we identified 718 combinations of mutated genes statistically-significantly associated with intellectual disability (ID), and carriers of these combinations showed lower IQ than expected in a replication cohort of 1,878 individuals. These combinations were enriched for nervous system genes, showed complex inheritance patterns, and were depleted in unaffected siblings. We further identified oligogenic combinations associated with multiple comorbid phenotypes, including COL28A1 and MFSD2B mutations for ID and schizophrenia. Our framework enables rare variant analysis in affected individuals lacking diagnosis based on de novo mutations, and provides a paradigm for dissecting the genetic basis of complex disorders.
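
The two ingredients named in this abstract, apriori-style pruning and a statistical test on what survives, can be sketched for gene pairs as follows. This is not RareComb's implementation: the binomial test against an independence expectation is a stand-in, the gene names and cohort are hypothetical, and no multiple-testing correction is applied.

```python
from collections import Counter
from itertools import combinations
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def enriched_pairs(carriers, min_support=2):
    """carriers: one set of mutated genes per individual.

    Apriori pruning: a pair can only reach min_support if both of its genes
    do, so pairs are counted among sufficiently frequent genes only.
    """
    n = len(carriers)
    gene_counts = Counter(g for genes in carriers for g in genes)
    frequent = {g for g, c in gene_counts.items() if c >= min_support}
    pair_counts = Counter()
    for genes in carriers:
        for pair in combinations(sorted(genes & frequent), 2):
            pair_counts[pair] += 1
    results = []
    for (a, b), observed in pair_counts.items():
        if observed < min_support:
            continue
        # Expected co-occurrence probability if the two genes were independent.
        p_expected = (gene_counts[a] / n) * (gene_counts[b] / n)
        p_value = binom_upper_tail(observed, n, p_expected)
        results.append((a, b, observed, p_value))
    return sorted(results, key=lambda r: r[3])

# Hypothetical toy cohort of 20 individuals.
cohort = ([{"G1", "G2"}] * 4 + [{"G1"}, {"G2"}]
          + [{"G3"}] * 3 + [{"G4"}] * 3 + [set()] * 8)
for a, b, observed, p_value in enriched_pairs(cohort):
    print(a, b, observed, round(p_value, 4))
```

The pruning step is what keeps the search tractable: the number of candidate combinations grows combinatorially with the number of genes, but most genes are mutated too rarely to take part in any combination that meets the support cutoff.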

“Rates of Contributory de Novo Mutation in High and Low-risk Autism Families”, Yoon et al 2021

“Rates of contributory de novo mutation in high and low-risk autism families”⁠, Seungtai Yoon, Adriana Munoz, Boris Yamrom, Yoon-ha Lee, Peter Andrews, Steven Marks, Zihua Wang, Catherine Reeves et al (2021-09-01; similar):

Autism arises in high and low-risk families. De novo mutation contributes to autism incidence in low-risk families as there is a higher incidence in the affected of the simplex families than in their unaffected siblings. But the extent of contribution in low-risk families cannot be determined solely from simplex families as they are a mixture of low and high-risk. The rate of de novo mutation in nearly pure populations of high-risk families, the multiplex families, has not previously been rigorously determined. Moreover, rates of de novo mutation have been underestimated from studies based on low resolution microarrays and whole exome sequencing.

Here we report on findings from whole genome sequence (WGS) of both simplex families from the Simons Simplex Collection (SSC) and multiplex families from the Autism Genetic Resource Exchange (AGRE). After removing the multiplex samples with excessive cell-line genetic drift, we find that the contribution of de novo mutation in multiplex is substantially smaller than the contribution in simplex. We use WGS to provide high resolution CNV profiles and to analyze more than coding regions, and revise upward the rate in simplex autism due to an excess of de novo events targeting introns.

Based on this study, we now estimate that de novo events contribute to 52–67% of cases of autism arising from low risk families, and 30–39% of cases of all autism.

“Extreme Purifying Selection against Point Mutations in the Human Genome”, Dukler et al 2021

“Extreme purifying selection against point mutations in the human genome”⁠, Noah Dukler, Mehreen R. Mughal, Ritika Ramani, Yi-Fei Huang, Adam Siepel (2021-08-23; similar):

Genome sequencing of tens of thousands of human individuals has recently enabled the measurement of large selective effects for mutations to protein-coding genes. Here we describe a new method, called ExtRaINSIGHT, for measuring similar selective effects at individual sites in noncoding as well as in coding regions of the human genome. ExtRaINSIGHT estimates the prevalence of strong purifying selection, or “ultraselection” (λs), as the fractional depletion of rare single-nucleotide variants (minor allele frequency < 0.1%) in a target set of genomic sites relative to matched sites that are putatively neutrally evolving, in a manner that controls for local variation and neighbor-dependence in mutation rate. We show using simulations that, above an appropriate threshold, λs is closely related to the average site-specific selection coefficient against heterozygous point mutations, as predicted at mutation-selection balance. Applying ExtRaINSIGHT to 71,702 whole genome sequences from gnomAD v3, we find particularly strong evidence of ultraselection in evolutionarily ancient miRNAs and neuronal protein-coding genes, as well as at splice sites. Moreover, our estimated selection coefficient against heterozygous amino-acid replacements across the genome (at 1.4%) is substantially larger than previous estimates based on smaller sample sizes. By contrast, we find weak evidence of ultraselection in other noncoding RNAs and transcription factor binding sites, and only modest evidence in ultraconserved elements and human accelerated regions. We estimate that ~0.3–0.5% of the human genome is ultraselected, with one third to one half of ultraselected sites falling in coding regions. These estimates suggest ~0.3–0.4 lethal or nearly lethal de novo mutations per potential human zygote, together with ~2 de novo mutations that are more weakly deleterious. 
Overall, our study sheds new light on the genome-wide distribution of fitness effects for new point mutations by combining deep new sequencing data sets and classical theory from population genetics⁠.
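
The core quantity in this abstract, λs as a fractional depletion of rare variants relative to matched neutral sites, can be illustrated with a naive ratio of rates. This is only a sketch: ExtRaINSIGHT itself controls for local variation and neighbor-dependence in mutation rate, which this function does not, and the counts in the example are hypothetical.

```python
# Sketch: naive fractional depletion of rare variants in a target set of sites
# relative to putatively neutral sites. ASSUMPTION: both sets share the same
# underlying mutation rate, which the real method models explicitly.

def fractional_depletion(target_rare, target_sites, neutral_rare, neutral_sites):
    """Return 1 - (rare-variant rate in target) / (rare-variant rate in neutral)."""
    if target_sites <= 0 or neutral_sites <= 0:
        raise ValueError("site counts must be positive")
    if neutral_rare <= 0:
        raise ValueError("neutral set must contain at least one rare variant")
    target_rate = target_rare / target_sites
    neutral_rate = neutral_rare / neutral_sites
    return 1.0 - target_rate / neutral_rate

# Hypothetical counts: 900 rare variants per 10,000 target sites versus
# 1,000 per 10,000 matched neutral sites.
lam = fractional_depletion(900, 10_000, 1_000, 10_000)
print(round(lam, 3))
```

A value near 0 means the target sites carry about as many rare variants as neutral sites do; a value near 1 means almost all expected rare variants are missing, which is the signature of strong purifying selection.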

“Partitioning Gene-level Contributions to Complex-trait Heritability by Allele Frequency Identifies Disease-relevant Genes”, Burch et al 2021

“Partitioning gene-level contributions to complex-trait heritability by allele frequency identifies disease-relevant genes”⁠, Kathryn S. Burch, Kangcheng Hou, Yi Ding, Yifei Wang, Steven Gazal, Huwenbo Shi, Bogdan Pasaniuc (2021-08-18; similar):

Recent works have shown that SNP-heritability—which is dominated by low-effect common variants—may not be the most relevant quantity for localizing high-effect/​critical disease genes. Here, we introduce methods to estimate the proportion of phenotypic variance explained by a given assignment of SNPs to a single gene (gene-level heritability). We partition gene-level heritability across minor allele frequency (MAF) classes to find genes whose gene-level heritability is explained exclusively by “low-frequency/​rare” variants (0.5% ≤ MAF < 1%). Applying our method to ~17K protein-coding genes and 25 quantitative traits in the UK Biobank (n = 290K), we find that, on average across traits, ~2.5% of nonzero-heritability genes have a rare-variant component, and only ~0.8% (370 gene-trait pairs) have heritability exclusively from rare variants. Of these 370 gene-trait pairs, 37% were not detected by existing gene-level association testing methods, likely because existing methods combine signal from all variants in a region irrespective of MAF class. Many of the additional genes we identify are implicated in phenotypically related Mendelian disorders or congenital developmental disorders, providing further evidence of their trait-relevance. Notably, the rare-variant component of gene-level heritability exhibits trends different from those of common-variant gene-level heritability. For example, while total gene-level heritability increases with gene length, the rare-variant component is significantly larger among shorter genes; the cumulative distributions of gene-level heritability also vary across traits and reveal differences in the relative contributions of rare/​common variants to overall gene-level polygenicity. We conclude that the proportion of gene-level heritability attributable to low-frequency/​rare variation can yield novel insights into complex-trait genetic architecture.

“Differences in the Genetic Architecture of Common and Rare Variants in Childhood, Persistent and Late-diagnosed Attention Deficit Hyperactivity Disorder”, Rajagopal et al 2021

“Differences in the genetic architecture of common and rare variants in childhood, persistent and late-diagnosed attention deficit hyperactivity disorder”⁠, Veera M. Rajagopal, Jinjie Duan, Laura Vilar-Ribó, Jakob Grove, Tetyana Zayats, J. Antoni Ramos-Quiroga et al (2021-08-08; similar):

Attention deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder, with onset in childhood (“childhood ADHD”), and around two thirds of affected individuals will continue to have ADHD symptoms in adulthood (“persistent ADHD”). Age at first diagnosis can vary, and sometimes ADHD is first diagnosed in adulthood (“late-diagnosed ADHD”).

In this study, we analyzed a large Danish population-based case-cohort generated by iPSYCH in order to identify common genetic risk loci and perform in-depth characterization of the polygenic architecture of childhood (n = 14,878), persistent (n = 1,473) and late-diagnosed ADHD (n = 6,961) alongside 38,303 controls. Additionally, the burden of rare protein-truncating variants in the three groups was evaluated in whole-exome sequencing data from a subset of the individuals (7,650 ADHD cases and 8,649 controls). We identified genome-wide statistically-significant loci associated with childhood ADHD (four loci) and late-diagnosed ADHD (one locus). In analyses of the polygenic architecture, we found higher polygenic score (PGS) of ADHD risk variants in persistent ADHD (mean PGS = 0.41) compared to childhood (mean PGS = 0.26) and late-diagnosed ADHD (mean PGS = 0.27), and we found a significantly decreased genetic correlation of late-diagnosed ADHD with inattention (rg = 0.57) compared to childhood ADHD (rg = 0.86). These results suggest that a higher ADHD polygenic risk burden is associated with persistence of symptoms, and that a later diagnosis of ADHD could be due in part to genetic factors. Additionally, childhood ADHD demonstrated both a significantly increased genetic overlap with autism compared to late-diagnosed ADHD as well as the highest burden of rare protein-truncating variants in highly constrained genes among ADHD subgroups (compared to controls: β = 0.13, p = 2.41 × 10⁻¹¹). Late-diagnosed ADHD demonstrated significantly larger genetic overlap with depression than childhood ADHD and no increased burden in rare protein-truncating variants (compared to controls: β = 0.06). Overall, our study finds genetic heterogeneity among ADHD subgroups and suggests that genetic factors influence time of first ADHD diagnosis, persistence of ADHD and comorbidity patterns in the sub-groups.

“Genetic Correlates of Phenotypic Heterogeneity in Autism”, Warrier et al 2021

“Genetic correlates of phenotypic heterogeneity in autism”⁠, Varun Warrier, Xinhe Zhang, Patrick Reed, Alexandra Havdahl, Tyler M. Moore, Freddy Cliquet, Claire S. Leblond et al (2021-08-05; similar):

The substantial phenotypic heterogeneity in autism limits our understanding of its genetic aetiology. To address this gap, we investigated genetic differences between autistic individuals (Nmax = 12,893) based on core (ie. social communication difficulties, and restricted and repetitive behaviours) and associated features of autism, co-occurring developmental disabilities (eg. language, motor, and intellectual developmental disabilities and delays), and sex.

We conducted a comprehensive factor analysis of core autism features in autistic individuals and identified six factors. Common genetic variants including autism polygenic scores (PGS) were associated with the core factors but de novo variants were not, even though the latent factor structure was similar between carriers and non-carriers of de novo variants.

We find that increasing autism PGS decreases the likelihood of co-occurring developmental disabilities in autistic individuals, which reflects both a true protective effect and additivity between rare and common variants. Furthermore, in autistic individuals without co-occurring intellectual disability (ID), autism PGS are over-inherited by autistic females compared to males. Finally, we observe higher SNP heritability for males and autistic individuals without ID, but found no robust differences in SNP heritability by the level of core autism features. Deeper phenotypic characterisation will be critical to determining how the complex underlying genetics shapes cognition, behaviour, and co-occurring conditions in autism.

“Exome Sequencing in Obsessive-compulsive Disorder Reveals a Burden of Rare Damaging Coding Variants”, Halvorsen et al 2021

2021-halvorsen.pdf: “Exome sequencing in obsessive-compulsive disorder reveals a burden of rare damaging coding variants”, Mathew Halvorsen, Jack Samuels, Ying Wang, Benjamin D. Greenberg, Abby J. Fyer, James T. McCracken, Daniel A. Geller et al (2021-06-28; similar):

Obsessive-compulsive disorder (OCD) affects 1–2% of the population, and, as with other complex neuropsychiatric disorders, it is thought that rare variation contributes to its genetic risk.

In this study, we performed exome sequencing in the largest OCD cohort to date (1,313 total cases, consisting of 587 trios, 41 quartets and 644 singletons of affected individuals) and describe contributions to disease risk from rare damaging coding variants.

In case-control analyses (n = 1,263/11,580), the most statistically-significant single-gene result was observed in SLITRK5 (odds ratio (OR) = 8.8, 95% confidence interval 3.4–22.5, p = 2.3 × 10⁻⁶). Across the exome, there was an excess of loss of function (LoF) variation specifically within genes that are LoF-intolerant (OR = 1.33, p = 0.01). In an analysis of trios, we observed an excess of de novo missense predicted damaging variants relative to controls (OR = 1.22, p = 0.02), alongside an excess of de novo LoF mutations in LoF-intolerant genes (OR = 2.55, p = 7.33 × 10⁻³).

These data support a contribution of rare coding variants to OCD genetic risk.
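
For readers unfamiliar with how an odds ratio and its confidence interval come out of a case-control carrier table, here is a minimal sketch using the standard Wald interval on the log odds ratio. The counts are hypothetical and not taken from the paper, and the paper may well have used a different test (for example an exact test), so this illustrates the quantities being reported rather than the authors' method.

```python
from math import exp, log, sqrt

# Sketch: odds ratio with a Wald 95% confidence interval from a 2x2 table.
# ASSUMPTION: all four cells are non-zero; counts below are hypothetical.

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound)."""
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("Wald interval needs all cells > 0")
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical table: 20 of 100 cases and 10 of 100 controls carry a variant.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Wide intervals such as the 3.4–22.5 reported for SLITRK5 are typical when carriers are few, because the reciprocal of a small cell count dominates the standard error.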

“Recovery of Trait Heritability from Whole Genome Sequence Data”, Wainschtein et al 2021

“Recovery of trait heritability from whole genome sequence data”⁠, Pierrick Wainschtein, Deepti Jain, Zhili Zheng, TOPMed Anthropometry Working Group, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium et al (2021-06-11; backlinks; similar):

Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data 1, but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, ~one-third to two-thirds of heritability is captured by common SNPs 2–5. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as overestimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be largely recovered from whole-genome sequence (WGS) data on 25,465 unrelated individuals of European ancestry. We assigned 33.7 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned genetic variance accordingly. The estimated heritability was 0.68 (SE 0.10) for height and 0.30 (SE 0.10) for BMI, with a range of ~0.60–0.71 for height and ~0.25–0.35 for BMI, depending on quality control and analysis strategies. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection thereon. Cumulatively variants with 0.0001 < MAF < 0.1 explained 0.47 (SE 0.07) and 0.30 (SE 0.10) of heritability for height and BMI, respectively. Our results imply that rare variants, in particular those in regions of low LD, are a major source of the still missing heritability of complex traits and disease.

“Ultra-rare, Rare, and Common Genetic Variant Analysis Converge to Implicate Negative Selection and Neuronal Processes in the Aetiology of Schizophrenia”, Akingbuwa et al 2021

“Ultra-rare, rare, and common genetic variant analysis converge to implicate negative selection and neuronal processes in the aetiology of schizophrenia”, Wonuola A. Akingbuwa, Anke R. Hammerschlag, Meike Bartels, Michel G. Nivard, Christel M. Middeldorp (2021-05-29; similar):

Both common and rare genetic variants (minor allele frequency > 1% and < 0.1% respectively) have been implicated in the aetiology of schizophrenia. In this study, we integrate single-cell gene expression data with publicly available Genome-Wide Association Study (GWAS) and exome-sequenced data in order to investigate, in parallel, the enrichment of common and (ultra-)rare variants related to schizophrenia in several functionally relevant gene sets. Four types of gene sets were constructed: (1) protein-truncating variant (PTV)-intolerant (PI) genes; (2) genes expressed in brain cell types and neurons ascertained from mouse and human brain tissue; (3) genes defined by synaptic function and location; and (4) intersection genes, i.e., PI genes that are expressed in the human and mouse brain cell gene sets. We show that common as well as (ultra-)rare schizophrenia-associated variants are overrepresented in PI genes, in excitatory neurons from the prefrontal cortex and hippocampus, medium spiny neurons, and genes enriched for synaptic processes. We also observed stronger enrichment in the intersection genes. Our findings suggest that across the allele frequency spectrum, genes and genetic variants likely to be under stringent selection, and those expressed in particular brain cell types, are involved in the same biological pathways influencing the risk for schizophrenia.

“Lack of Transgenerational Effects of Ionizing Radiation Exposure from the Chernobyl Accident”, Yeager et al 2021

2021-yeager.pdf: “Lack of transgenerational effects of ionizing radiation exposure from the Chernobyl accident”⁠, Meredith Yeager, Mitchell J. Machiela, Prachi Kothiyal, Michael Dean, Clara Bodelon, Shalabh Suman, Mingyi Wang et al (2021-04-22; similar):

Genomics of radiation-induced damage: The potential adverse effects of exposures to radioactivity from nuclear accidents can include acute consequences such as radiation sickness⁠, as well as long-term sequelae such as increased risk of cancer. There have been a few studies examining transgenerational risks of radiation exposure but the results have been inconclusive.

Morton et al 2021 analyzed papillary thyroid tumors, normal thyroid tissue, and blood from hundreds of survivors of the Chernobyl nuclear accident and compared them against those of unexposed patients. The findings offer insight into the process of radiation-induced carcinogenesis and characteristic patterns of DNA damage associated with environmental radiation exposure.

In a separate study, Yeager et al 2021 analyzed the genomes of 130 children and parents from families in which one or both parents had experienced gonadal radiation exposure related to the Chernobyl accident and the children were conceived between 1987 and 2002. Reassuringly, the authors did not find an increase in new germline mutations in this population.

Effects of radiation exposure from the Chernobyl nuclear accident remain a topic of interest.

We investigated germline de novo mutations (DNMs) in children born to parents employed as cleanup workers or exposed to occupational and environmental ionizing radiation after the accident.

Whole-genome sequencing of 130 children (born 1987–2002) and their parents did not reveal an increase in the rates, distributions, or types of DNMs relative to the results of previous studies. We find no elevation in total DNMs, regardless of cumulative preconception gonadal paternal [mean = 365 milligrays (mGy), range = 0 to 4,080 mGy] or maternal (mean = 19 mGy, range = 0 to 550 mGy) exposure to ionizing radiation.

Thus, we conclude that, over this exposure range, evidence is lacking for a substantial effect on germline DNMs in humans, suggesting minimal impact from transgenerational genetic effects.

“The Female Protective Effect against Autism Spectrum Disorder”, Wigdor et al 2021

“The female protective effect against autism spectrum disorder”⁠, Emilie M. Wigdor, Daniel J. Weiner, Jakob Grove, Jack M. Fu, Wesley K. Thompson, Caitlin E. Carey, Nikolas Baya et al (2021-04-05; similar):

Autism spectrum disorder (ASD) is diagnosed 3–4× more frequently in males than in females. Genetic studies of rare variants support a female protective effect (FPE) against ASD. However, sex differences in common, inherited genetic risk for ASD are less studied. Leveraging the nationally representative Danish iPSYCH resource, we found siblings of female ASD cases had higher rates of ASD than siblings of male ASD cases (P < 0.01). In the Simons Simplex and SPARK collections, mothers of ASD cases carried more polygenic risk for ASD than fathers of ASD cases (P = 7.0 × 10⁻⁷). Male unaffected siblings under-inherited polygenic risk (P = 0.03); female unaffected siblings did not. Further, female ASD cases without a high-impact de novo variant over-inherited nearly three-fold the polygenic risk of male cases with a high-impact de novo (P = 0.02). Our findings support a FPE against ASD that includes common, inherited genetic variation.

“Structural Variants in Chinese Population and Their Impact on Phenotypes, Diseases and Population Adaptation”, Wu et al 2021

“Structural variants in Chinese population and their impact on phenotypes, diseases and population adaptation”⁠, Zhikun Wu, Zehang Jiang, Tong Li, Chuanbo Xie, Liansheng Zhao, Jiaqi Yang, Shuai Ouyang, Yizhi Liu, Tao Li et al (2021-02-10; similar):

A complete characterization of genetic variation is a fundamental goal of human genome research. Long-read sequencing (LRS) improves the sensitivity for structural variant (SV) discovery and facilitates a better understanding of the SV spectrum in human genomes. Here, we conduct the first LRS-based SV analysis in the Chinese population.

We perform whole-genome LRS for 405 unrelated Chinese individuals, with 68 phenotypic and clinical measurements. We discover a complex landscape of 132,312 non-redundant SVs, of which 53.3% are novel. The identified SVs are of high quality, as validated by PacBio high-fidelity sequencing and PCR experiments. The total length of SVs represents ~13.2% of the human reference genome.

We annotate 1,929 loss-of-function SVs affecting the coding sequences of 1,681 genes. We discover new associations of SVs with phenotypes and diseases, such as rare deletions in HBA1/​HBA2/​HBB associated with anemia and common deletions in GHR associated with body height. Furthermore, we identify SV candidates related to human immunity that differentiate sub-populations of Chinese.

Our study reveals the complex landscape of human SVs in unprecedented detail and provides new insights into their roles contributing to phenotypes, diseases and evolution. The genotypic and phenotypic resource is freely available to the scientific community.

“Polygenic Burden Has Broader Impact on Health, Cognition, and Socioeconomic Outcomes Than Most Rare and High-risk Copy Number Variants”, Saarentaus et al 2021

“Polygenic burden has broader impact on health, cognition, and socioeconomic outcomes than most rare and high-risk copy number variants”, Elmo Christian Saarentaus, Aki Samuli Havulinna, Nina Mars, Ari Ahola-Olli, Tuomo Tapio Johannes Kiiskinen et al (2021-02-01; similar):

Copy number variants (CNVs) are associated with syndromic and severe neurological and psychiatric disorders (SNPDs), such as intellectual disability, epilepsy, schizophrenia, and bipolar disorder. Although considered high-impact, CNVs are also observed in the general population. This presents a diagnostic challenge in evaluating their clinical-significance.

To estimate the phenotypic differences between CNV carriers and non-carriers regarding general health and well-being, we compared the impact of SNPD-associated CNVs on health, cognition, and socioeconomic phenotypes to the impact of three genome-wide polygenic risk scores (PRSs) in two Finnish cohorts (FINRISK, n = 23,053 and NFBC1966, n = 4,895). The focus was on CNV carriers and PRS extremes who do not have an SNPD diagnosis.

We identified high-risk CNVs (DECIPHER CNVs, risk gene deletions, or large [>1 Mb] CNVs) in 744 study participants (2.66%), 36 (4.8%) of whom had a diagnosed SNPD. In the remaining 708 unaffected carriers, we observed lower educational attainment (EA; OR = 0.77 [95% CI 0.66–0.89]) and lower household income (OR = 0.77 [0.66–0.89]). Income-associated CNVs also lowered household income (OR = 0.50 [0.38–0.66]), and CNVs with medical consequences lowered subjective health (OR = 0.48 [0.32–0.72]). The impact of PRSs was broader. At the lowest extreme of PRS for EA, we observed lower EA (OR = 0.31 [0.26–0.37]), lower income (OR = 0.66 [0.57–0.77]), lower subjective health (OR = 0.72 [0.61–0.83]), and increased mortality (Cox’s HR = 1.55 [1.21–1.98]). PRS for intelligence had a similar impact, whereas PRS for schizophrenia did not affect these traits.
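The carrier counts in this paragraph can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that the 2.66% figure uses the combined size of the two cohorts (FINRISK and NFBC1966) as its denominator; the variable names are illustrative and not taken from the paper.

```python
# Cohort sizes as reported in the abstract.
finrisk_n = 23_053
nfbc1966_n = 4_895

# Assumption: the 2.66% carrier rate is computed over both cohorts combined.
total_n = finrisk_n + nfbc1966_n

carriers = 744            # participants with a high-risk CNV
carriers_with_snpd = 36   # carriers who also have a diagnosed SNPD

carrier_rate = carriers / total_n
snpd_rate_among_carriers = carriers_with_snpd / carriers
unaffected_carriers = carriers - carriers_with_snpd

print(f"combined n          = {total_n}")
print(f"carrier rate        = {carrier_rate:.2%}")
print(f"SNPD among carriers = {snpd_rate_among_carriers:.1%}")
print(f"unaffected carriers = {unaffected_carriers}")
```

Under that assumption, the rounded values agree with the 2.66%, 4.8%, and 708 reported above.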

We conclude that most working-age individuals who carry high-risk CNVs without an SNPD diagnosis show only a modest impact on morbidity and mortality, and a limited impact on income and educational attainment, compared to individuals at the extreme end of common genetic variation. Our findings highlight that the contribution of traditional high-risk variants such as CNVs should be analyzed in a broader genetic context, rather than evaluated in isolation.

[Keywords: bipolar disorder, depression, genetics, predictive markers, schizophrenia]

Figure 3: Health impact of high-risk CNVs and PRSs in Finnish cohorts: A: Hazard ratios in a Cox regression model for mortality in unaffected carriers of high-risk CNVs and individuals at the PRS extremes in FINRISK (n = 22,210). ID gene deletions are not pictured as there were no deaths during follow-up for carriers of this type of CNV. B: Incidence rate ratio (IRR) of high-risk CNVs and PRS extremes in a Poisson regression model of the Charlson comorbidity index (CCI) in FINRISK individuals with no SNPD (n = 22,210). The incidence of one CCI unit was more than 3.5 times higher in ID gene deletion carriers than in individuals with no high-risk CNV. C, D: Impact of CNVs and PRS outlier status on socioeconomic status and health. The odds of low SES and poor health were highest for individuals with a low PRS for IQ, and to a lesser extent for individuals at the lowest extreme of the PRS for EA (A). The odds of high SES and good health were lowest for individuals at the lowest extreme of the PRS for EA, and to a lesser extent for individuals at the lowest extreme of the PRS for IQ (B). Effects meta-analyzed using a random-effects assumption are denoted by triangles; otherwise, a fixed-effect assumption was made. The Bonferroni-adjusted p-value is denoted above the point estimate of each variant.

“Protein-coding Repeat Polymorphisms Strongly Shape Diverse Human Phenotypes”, Mukamel et al 2021

“Protein-coding repeat polymorphisms strongly shape diverse human phenotypes”, Ronen E. Mukamel, Robert E. Handsaker, Maxwell A. Sherman, Alison R. Barton, Yiming Zheng, Steven A. McCarroll et al (2021-01-20):

Hundreds of the proteins encoded in human genomes contain domains that vary in size or copy number due to variable numbers of tandem repeats (VNTRs) in protein-coding exons. VNTRs have eluded analysis by the molecular methods—SNP arrays and high-throughput sequencing—used in large-scale human genetic studies to date; thus, the relationships of VNTRs to most human phenotypes are unknown. We developed ways to estimate VNTR lengths from whole-exome sequencing data, identify the SNP haplotypes on which VNTR alleles reside, and use imputation to project these haplotypes into abundant SNP data. We analyzed 118 protein-altering VNTRs in 415,280 UK Biobank participants for association with 791 phenotypes. Analysis revealed some of the strongest associations of common variants with human phenotypes including height, hair morphology, and biomarkers of human health; for example, a VNTR encoding 13–44 copies of a 19-amino-acid repeat in the chondroitin sulfate domain of aggrecan (ACAN) associated with height variation of 3.4 centimeters (s.e. 0.3 cm). Incorporating large-effect VNTRs into analysis also made it possible to map many additional effects at the same loci: for the blood biomarker lipoprotein(a), for example, analysis of the kringle IV-2 VNTR within the LPA gene revealed that 18 coding SNPs and the VNTR in LPA explained 90% of lipoprotein(a) heritability in Europeans, enabling insights about population differences and epidemiological significance of this clinical biomarker. These results point to strong, cryptic effects of highly polymorphic common structural variants that have largely eluded molecular analyses to date.

“Exome Sequencing and Analysis of 454,787 UK Biobank Participants”, Backman et al 2021

“Exome sequencing and analysis of 454,787 UK Biobank participants”, Joshua D. Backman, Alexander H. Li, Anthony Marcketta, Dylan Sun, Joelle Mbatchou, Michael D. Kessler et al (2021):

A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing¹ to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study². We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at p ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.

“Long Read Sequencing of 3,622 Icelanders Provides Insight into the Role of Structural Variants in Human Diseases and Other Traits”, Beyter et al 2020

“Long read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits”, Doruk Beyter, Helga Ingimundardottir, Asmundur Oddsson, Hannes P. Eggertsson, Eythor Bjornsson, Hakon Jonsson et al (2020-12-14):

Long-read sequencing (LRS) promises to improve characterization of structural variants (SVs), a major source of genetic diversity. We generated LRS data on 3,622 Icelanders using Oxford Nanopore Technologies, and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions), spanning a median of 10 Mb per haploid genome. We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association with a rare (AF = 0.037%) deletion of the first exon of PCSK9. Carriers of this deletion have 0.93 mmol/​L (1.31 SD) lower LDL cholesterol levels than the population average (p-value = 7.0·10⁻²⁰). We also discovered an association with a multi-allelic SV inside a large repeat region, contained within single long reads, in an exon of ACAN. Within this repeat region we found 11 alleles that differ in the number of a 57 bp-motif repeat, and observed a linear relationship (0.016 SD per motif inserted, p = 6.2·10⁻¹⁸) between the number of repeats carried and height. These results show that SVs can be accurately characterized at population scale using long read sequence data in a genome-wide non-targeted approach and demonstrate how SVs impact phenotypes.
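Because the PCSK9 effect is reported in both raw units and standard deviations, the implied population SD of LDL cholesterol can be backed out, and the linear ACAN relationship can be evaluated for any motif difference. The sketch below does both; the function name and the 10-motif example are hypothetical illustrations, not values from the study.

```python
# PCSK9 deletion effect on LDL cholesterol, as reported in the abstract.
ldl_effect_mmol_per_l = 0.93   # magnitude of the reduction in mmol/L
ldl_effect_sd = 1.31           # the same reduction in SD units

# The ratio gives the LDL standard deviation implied by the two figures.
implied_ldl_sd = ldl_effect_mmol_per_l / ldl_effect_sd
print(f"implied LDL SD = {implied_ldl_sd:.2f} mmol/L")

# ACAN repeat: linear effect on height of 0.016 SD per motif inserted.
PER_MOTIF_EFFECT_SD = 0.016

def height_shift_sd(extra_motifs: int) -> float:
    """Predicted height difference (in SD) for a given difference in motif count."""
    return PER_MOTIF_EFFECT_SD * extra_motifs

# Hypothetical example: two alleles that differ by 10 motif copies.
print(f"shift for 10 extra motifs = {height_shift_sd(10):.2f} SD")
```

The implied SD of about 0.71 mmol/L is only as precise as the two rounded figures it is derived from.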

“A Broad Exome Study of the Genetic Architecture of Asthma Reveals Novel Patient Subgroups”, Cameron-Christie et al 2020

“A broad exome study of the genetic architecture of asthma reveals novel patient subgroups”, Sophia Cameron-Christie, Alex Mackay, Quanli Wang, Henric Olsson, Bastian Angermann, Glenda Lassi, Julia Lindgren et al (2020-12-11):

Asthma risk is a complex interplay between genetic susceptibility and environment. Despite many significantly-associated common variants, the contribution of rarer variants with potentially greater effect sizes has not been as extensively studied. We present an exome-based study adopting 24,576 cases and 120,530 controls to assess the contribution of rare protein-coding variants to the risk of early-onset or all-comer asthma.

Methods: We performed case-control analyses on three genetic units: variant-level, gene-level and pathway-level, using sequence data from the Scandinavian Asthma Genetic Study and UK Biobank participants with asthma. Cases were defined as all-comer asthma (n = 24,576) and early-onset asthma (n = 5,962). Controls were 120,530 UK Biobank participants without reported history of respiratory illness.

Results: Variant-level analyses identified statistically-significant variants at moderate-to-common allele frequency, including protein-truncating variants in FLG and IL33. Asthma risk was significantly increased not only by individual, common FLG protein-truncating variants, but also among the collection of rare-to-private FLG protein-truncating variants (p = 6.8×10⁻⁷). This signal was driven by early-onset asthma and did not correlate with circulating eosinophil levels. In contrast, a single splice variant in IL33 was significantly protective (p = 8.0×10⁻¹⁰), while the collection of remaining IL33 protein-truncating variants showed no class effect (p = 0.54). A pathway-based analysis identified that protein-truncating variants in loss-of-function intolerant genes were statistically-significantly enriched among individuals with asthma.

Conclusion: Access to the full allele frequency spectrum of protein-coding variants provides additional clarity about the potential mechanisms of action for FLG and IL33. Beyond these two significant drivers, we detected a significant enrichment of protein-truncating variants in loss-of-function intolerant genes.

“Rare Genetic Variation Underlying Human Diseases and Traits: Results from 200,000 Individuals in the UK Biobank”, Jurgens et al 2020

“Rare Genetic Variation Underlying Human Diseases and Traits: Results from 200,000 Individuals in the UK Biobank”, Sean J. Jurgens, Seung Hoan Choi, Valerie N. Morrill, Mark Chaffin, James P. Pirruccello, Jennifer L. Halford et al (2020-11-29):

Background: Many human diseases are known to have a genetic contribution. While genome-wide studies have identified many disease-associated loci, it remains challenging to elucidate causal genes. In contrast, exome sequencing provides an opportunity to identify new disease genes and large-effect variants of clinical relevance. We therefore sought to determine the contribution of rare genetic variation in a curated set of human diseases and traits using a unique resource of 200,000 individuals with exome sequencing data from the UK Biobank.

Methods and Results: We included 199,832 participants with a mean age of 68 at follow-up. Exome-wide gene-based tests were performed for 64 diseases and 23 quantitative traits using a mixed-effects model, testing rare loss-of-function and damaging missense variants. We identified 51 known and 23 novel associations with 26 diseases and traits at a false-discovery-rate of 1%. There was a striking risk associated with many Mendelian disease genes including: MYBPC3 with over a 100-fold increased odds of hypertrophic cardiomyopathy, PKD1 with a greater than 25-fold increased odds of chronic kidney disease, and BRCA2, BRCA1, ATM and PALB2 with 3 to 10-fold increased odds of breast cancer. Notable novel findings included an association between GIGYF1 and type 2 diabetes (OR 5.6, p = 5.35×10⁻⁸), elevated blood glucose, and lower insulin-like-growth-factor-1 levels. Rare variants in CCAR2 were also associated with diabetes risk (OR 13, p = 8.5×10⁻⁸), while COL9A3 was associated with cataract (OR 3.4, p = 6.7×10⁻⁸). Notable associations for blood lipids and hypercholesterolemia included NR1H3, RRBP1, GIGYF1, SCGN, APH1A, PDE3B and ANGPTL8. A number of novel genes were associated with height, including DTL, PIEZO1, SCUBE3, PAPPA and ADAMTS6, while BSN was associated with body-mass-index. We further assessed putatively pathogenic variants in known Mendelian cardiovascular disease genes and found that between 1.3 and 2.3% of the population carried likely pathogenic variants in known cardiomyopathy, arrhythmia or hypercholesterolemia genes.

Conclusions: Large-scale population sequencing identifies known and novel genes harboring high-impact variation for human traits and diseases. A number of novel findings, including GIGYF1, represent interesting potential therapeutic targets. Exome sequencing at scale can identify a meaningful proportion of the population that carries a pathogenic variant underlying cardiovascular disease.

“Discovery of Rare Variants Associated With Blood Pressure Regulation through Meta-analysis of 1.3 Million Individuals”, Surendran et al 2020

“Discovery of rare variants associated with blood pressure regulation through meta-analysis of 1.3 million individuals”, Praveen Surendran, Elena V. Feofanova, Najim Lahrouchi, Ioanna Ntalla, Savita Karthikeyan, James Cook et al (2020-11-23):

Genetic studies of blood pressure (BP) to date have mainly analyzed common variants (minor allele frequency > 0.05). In a meta-analysis of up to ~1.3 million participants, we discovered 106 new BP-associated genomic regions and 87 rare (minor allele frequency ≤ 0.01) variant BP associations (p < 5 × 10⁻⁸), of which 32 were in new BP-associated loci and 55 were independent BP-associated single-nucleotide variants within known BP-associated regions. Average effects of rare variants (44% coding) were ~8× larger than common variant effects and indicate potential candidate causal genes at new and known loci (for example, GATA5 and PLCB3). BP-associated variants (including rare and common) were enriched in regions of active chromatin in fetal tissues, potentially linking fetal development with BP regulation in later life. Multivariable Mendelian randomization suggested possible inverse effects of elevated systolic and diastolic BP on large artery stroke. Our study demonstrates the utility of rare-variant analyses for identifying candidate genes and the results highlight potential therapeutic targets.

“Mutations in Metabotropic Glutamate Receptor 1 Contribute to Natural Short Sleep Trait”, Shi et al 2020

“Mutations in Metabotropic Glutamate Receptor 1 Contribute to Natural Short Sleep Trait”, Guangsen Shi, Chen Yin, Zenghua Fan, Lijuan Xing, Yulia Mostovoy, Pui-Yan Kwok, Liza H. Ashbrook, Andrew D. Krystal et al (2020-10-15):

  • 2 independent mutations found in GRM1 cause familial natural short sleep
  • Both mGluR1 mutations have less activity than wild-type receptors in vitro
  • Both mutant mouse models have shorter sleep duration than control mice
  • Brain slices from mutant mice showed increased excitatory synaptic transmission

Sufficient and efficient sleep is crucial for our health. Natural short sleepers can sleep substantially shorter than the average population without a desire for more sleep and without any obvious negative health consequences.

In searching for genetic variants underlying the short sleep trait, we found 2 different mutations in the same gene (metabotropic glutamate receptor 1) from 2 independent natural short sleep families.

In vitro, both of the mutations exhibited loss of function in receptor-mediated signaling. In vivo, the mice carrying the individual mutations both demonstrated short sleep behavior. In brain slices, both of the mutations changed the electrical properties and increased excitatory synaptic transmission.

These results highlight the important role of metabotropic glutamate receptor 1 in modulating sleep duration.

[Keywords: mGluR1, loss-of-function, short-sleep]

“Exome Sequencing Identifies Rare Coding Variants in 10 Genes Which Confer Substantial Risk for Schizophrenia”, Singh et al 2020

“Exome sequencing identifies rare coding variants in 10 genes which confer substantial risk for schizophrenia”, Tarjinder Singh, Timothy Poterba, David Curtis, Huda Akil, Mariam Al Eissa, Jack D. Barchas, Nicholas Bass et al (2020-09-18):

By meta-analyzing the whole-exomes of 24,248 cases and 97,322 controls, we implicate ultra-rare coding variants (URVs) in ten genes as conferring substantial risk for schizophrenia (odds ratios 3–50, p < 2.14 × 10⁻⁶), and 32 genes at an FDR < 5%. These genes have the greatest expression in central nervous system neurons and have diverse molecular functions that include the formation, structure, and function of the synapse. The associations of NMDA receptor subunit GRIN2A and AMPA receptor subunit GRIA3 provide support for the dysfunction of the glutamatergic system as a mechanistic hypothesis in the pathogenesis of schizophrenia. We find statistically-significant evidence for an overlap of rare variant risk between schizophrenia, autism spectrum disorders (ASD), and severe neurodevelopmental disorders (DD/​ID), supporting a neurodevelopmental etiology for schizophrenia. We show that protein-truncating variants in GRIN2A, TRIO, and CACNA1G confer risk for schizophrenia whereas specific missense mutations in these genes confer risk for DD/​ID. Nevertheless, few of the strongly associated schizophrenia genes appear to confer risk for DD/​ID. We demonstrate that genes prioritized from common variant analyses of schizophrenia are enriched in rare variant risk, suggesting that common and rare genetic risk factors at least partially converge on the same underlying pathogenic biological processes. Even after excluding statistically-significantly associated genes, schizophrenia cases still carry a substantial excess of URVs, implying that more schizophrenia risk genes await discovery using this approach.

Figure 6: The contributions of ultra-rare PTVs [protein-truncating variants] to schizophrenia risk. A: Genetic architecture of schizophrenia. Statistically-significant genetic associations for schizophrenia from the most recent GWAS, CNV, and sequencing studies are displayed. The in-sample odds ratio is plotted against the minor allele frequency in the general population. The color of each dot corresponds to the source of the association, and the size of the dot to the odds ratio. The shaded area represents the LOESS-smoothed lines of the upper and lower bounds of the point estimates…Because schizophrenia as a trait is under strong selection³⁸⁻⁴⁰, we expect URVs of large effect to be frequently de novo or of very recent origin and to contribute to risk in only a fraction of diagnosed patients.

“Mapping Genomic Loci Prioritises Genes and Implicates Synaptic Biology in Schizophrenia”, Consortium et al 2020

“Mapping genomic loci prioritises genes and implicates synaptic biology in schizophrenia”, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Stephan Ripke, James T. R. Walters et al (2020-09-13):

Extended Data Figure 2: GWAS progress over time. The relationship of GWAS associations to sample-size is shown in this plot with selected SCZ GWAS meta-analyses of the past 11 years. The x-axis shows number of cases. The y-axis shows the number of independent loci discovered with at least one genome-wide statistically-significant index SNP in the discovery meta-analysis (eg. without replication data)…The slope of ~4 newly discovered loci per 1000 cases between 2013 and 2019 increased to a slope of ~6 with the latest sample-size increase.

Schizophrenia is a psychiatric disorder whose pathophysiology is largely unknown. It has a heritability of 60–80%, much of which is attributable to common risk alleles, suggesting genome-wide association studies can inform our understanding of aetiology. Here, in 69,369 people with schizophrenia and 236,642 controls, we report common variant associations at 270 distinct loci. Using fine-mapping and functional genomic data, we prioritise 19 genes based on protein-coding or UTR variation, and 130 genes in total as likely to explain these associations. Fine-mapped candidates were enriched for genes associated with rare disruptive coding variants in people with schizophrenia, including the glutamate receptor subunit GRIN2A and transcription factor SP4, and were also enriched for genes implicated by such variants in autism and developmental disorder. Associations were concentrated in genes expressed in CNS neurons, both excitatory and inhibitory, but not other tissues or cell types, and implicated fundamental processes related to neuronal function, particularly synaptic organisation, differentiation and transmission. We identify biological processes of pathophysiological relevance to schizophrenia, show convergence of common and rare variant associations in schizophrenia and neurodevelopmental disorders, and provide a rich resource of priority genes and variants to advance mechanistic studies.

“Novel Ultra-Rare Exonic Variants Identified in a Founder Population Implicate Cadherins in Schizophrenia”, Lencz et al 2020

“Novel Ultra-Rare Exonic Variants Identified in a Founder Population Implicate Cadherins in Schizophrenia”, Todd Lencz, Jin Yu, Raiyan Rashid Khan, Shai Carmi, Max Lam, Danny Ben-Avraham, Nir Barzilai, Susan Bressman et al (2020-09-11):

Identification of rare genetic variants associated with schizophrenia has proven challenging due to multiple sources of heterogeneity, which may be reduced in founder populations. We examined ultra-rare exonic variants in 786 patients with schizophrenia and 463 healthy comparison subjects, all drawn from the Ashkenazi Jewish population. Cases had a higher frequency of novel missense or loss of function (MisLoF) variants compared to controls. Characterizing 141 “case-only” genes (in which ≥ 3 cases in our dataset had MisLoF variants with none found in controls), we identified cadherins as a novel gene set associated with schizophrenia, including a recurrent mutation in PCDHA3. Modeling the effects of purifying selection demonstrated that deleterious ultra-rare variants are greatly over-represented in the Ashkenazi population, resulting in enhanced power for rare variant association. Identification of cell adhesion genes in the cadherin/​protocadherin family helps specify the synaptic abnormalities central to the disorder, and suggests novel potential treatment strategies.

“Whole-exome Imputation within UK Biobank Powers Rare Coding Variant Association and Fine-mapping Analyses”, Barton et al 2020

“Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses”, Alison R. Barton, Maxwell A. Sherman, Ronen E. Mukamel, Po-Ru Loh (2020-09-01):

Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total N~500K) to impute exome-wide variants at high accuracy (R² > 0.5) down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 statistically-significant associations (p < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality; 600 of the 675 variants (89%) were not present in the NHGRI-EBI GWAS Catalog. We replicated the effect directions of 28 of 28 height-associated variants genotyped in previous exome array studies, including missense variants in newly-associated collagen genes COL16A1 and COL11A2. Across all traits, 49% of associations (578/​1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified long allelic series containing up to 45 distinct likely-causal variants within the same gene (on average exhibiting 93%-concordant effect directions). In particular, 24 rare coding variants in IFRD2 independently associated with reticulocyte indices, suggesting an important role of IFRD2 in red blood cell development, and 11 rare coding variants in NPR2 (a gene previously implicated in Mendelian skeletal disorders) exhibited intermediate-to-strong effects on height (0.18–1.09 s.d.). Our results demonstrate the utility of within-cohort imputation in population-scale GWAS cohorts, provide a catalog of likely-causal, large-effect coding variant associations, and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.
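One way to see why imputing into the full cohort matters at MAF ~0.00005 is to count the expected minor-allele copies, which for diploid individuals is 2 × N × MAF. The sketch below is illustrative only: it treats "N~500K" as exactly 500,000 and ignores imputation error, so the full-cohort figure is an upper bound on the effective number of observed copies. The function name is hypothetical.

```python
def expected_minor_allele_count(n_individuals: int, maf: float) -> float:
    """Expected number of minor-allele copies among n diploid individuals."""
    return 2 * n_individuals * maf

maf = 0.00005          # lowest MAF imputed at high accuracy, per the abstract
sequenced_n = 49_960   # exome-sequenced participants
cohort_n = 500_000     # assumption: "N~500K" taken as exactly 500,000

copies_in_sequenced = expected_minor_allele_count(sequenced_n, maf)
copies_in_cohort = expected_minor_allele_count(cohort_n, maf)

print(f"expected copies among sequenced participants: {copies_in_sequenced:.1f}")
print(f"expected copies in the full cohort:           {copies_in_cohort:.1f}")
```

Roughly 5 expected copies in the sequenced subset versus roughly 50 in the full cohort is a tenfold gain in observations of the rarest variants, before accounting for imperfect imputation.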

“Exome-wide Association Studies in General and Long-lived Populations Identify Genetic Variants Related to Human Age”, Sin-Chan et al 2020

“Exome-wide association studies in general and long-lived populations identify genetic variants related to human age”, Patrick Sin-Chan, Nehal Gosalia, Chuan Gao, Cristopher V. Van Hout, Bin Ye, Anthony Marcketta, Alexander H. Li et al (2020-07-19):

Aging is characterized by degeneration in cellular and organismal functions leading to increased disease susceptibility and death. Although our understanding of aging biology in model systems has increased dramatically, large-scale sequencing studies to understand human aging are only now beginning. We applied exome sequencing and association analyses (ExWAS) to identify age-related variants in 58,470 participants of the DiscovEHR cohort. Linear Mixed Model regression analyses of age at last encounter revealed variants in genes known to be linked with clonal hematopoiesis of indeterminate potential, which are associated with myelodysplastic syndromes, as top signals in our analysis, suggestive of age-related somatic mutation accumulation in hematopoietic cells despite patients lacking clinical diagnoses. In addition to APOE, we identified rare DISP2 rs183775254 (p = 7.40×10⁻¹⁰) and ZYG11A rs74227999 (p = 2.50×10⁻⁸) variants that were negatively associated with age in both sexes combined and in females, respectively, which were replicated with directional consistency in two independent cohorts. Epigenetic mapping showed these variants are located within cell-type-specific enhancers, suggestive of important transcriptional regulatory functions. To discover variants associated with extreme age, we performed exome-sequencing on persons of Ashkenazi Jewish descent ascertained for extensive lifespans. Case-control analyses compared 525 Ashkenazi Jewish cases (males ≥ 92 years, females ≥ 95 years) to 482 controls. Our results showed that variants in APOE (rs429358, rs6857) and TMTC2 (rs7976168) passed the Bonferroni-adjusted p-value threshold, as did several nominally-associated population-specific variants. Collectively, our Age-ExWAS, the largest performed to date, confirmed and identified previously unreported candidate variants associated with human age.

“Genetic Ancestry Analysis on >93,000 Individuals Undergoing Expanded Carrier Screening Reveals Limitations of Ethnicity-based Medical Guidelines”, Kaseniit et al 2020

“Genetic ancestry analysis on >93,000 individuals undergoing expanded carrier screening reveals limitations of ethnicity-based medical guidelines”, Kristjan E. Kaseniit, Imran S. Haque, James D. Goldberg, Lee P. Shulman, Dale Muzzey (2020-06-29):

Purpose: Carrier status associates strongly with genetic ancestry, yet current carrier screening guidelines recommend testing for a limited set of conditions based on a patient’s self-reported ethnicity. Ethnicity, which can reflect both genetic ancestry and cultural factors (eg. religion), may be imperfectly known or communicated by patients. We sought to quantitatively assess the efficacy and equity with which ethnicity-based carrier screening captures recessive disease risk.

Methods: For 93,419 individuals undergoing a 96-gene expanded carrier screen (ECS), correspondence was assessed among carrier status, self-reported ethnicity, and a dual-component genetic ancestry (eg. 75% African/​25% European) calculated from sequencing data.

Results: Self-reported ethnicity was an imperfect indicator of genetic ancestry, with 9% of individuals having >50% genetic ancestry from a lineage inconsistent with self-reported ethnicity. Limitations of self-reported ethnicity led to missed carriers in at-risk populations: for 10 ECS conditions, patients with intermediate genetic ancestry backgrounds—who did not self-report the associated ethnicity—had statistically-significantly elevated carrier risk. Finally, for 7 of the 16 conditions included in current screening guidelines, most carriers were not from the population the guideline aimed to serve.

Conclusion: Substantial and disproportionate risk for recessive disease is not detected when carrier screening is based on ethnicity, leading to inequitable reproductive care.

“Genomic Analyses Implicate Noncoding de Novo Variants in Congenital Heart Disease”, Richter et al 2020

“Genomic analyses implicate noncoding de novo variants in congenital heart disease”, Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen et al (2020-06-29):

A genetic etiology is identified for 1⁄3rd of patients with congenital heart disease (CHD), with 8% of cases attributable to coding de novo variants (DNVs). To assess the contribution of noncoding DNVs to CHD, we compared genome sequences from 749 CHD probands and their parents with those from 1,611 unaffected trios. Neural network prediction of noncoding DNV transcriptional impact identified a burden of DNVs in individuals with CHD (n = 2,238 DNVs) compared to controls (n = 4,177; p = 8.7 × 10⁻⁴). Independent analyses of enhancers showed an excess of DNVs in associated genes (27 genes versus 3.7 expected, p = 1 × 10⁻⁵). We observed statistically-significant overlap between these transcription-based approaches (odds ratio (OR) = 2.5, 95% confidence interval (CI) 1.1–5.0, p = 5.4 × 10⁻³). CHD DNVs altered transcription levels in 5 of 31 enhancers assayed. Finally, we observed a DNV burden in RNA-binding-protein regulatory sites (OR = 1.13, 95% CI 1.1–1.2, p = 8.8 × 10⁻⁵). Our findings demonstrate an enrichment of potentially disruptive regulatory noncoding DNVs in a fraction of CHD at least as high as that observed for damaging coding DNVs.

“An Integrated Polygenic and Clinical Risk Tool Enhances Coronary Artery Disease Prediction”, Aguilera et al 2020

“An integrated polygenic and clinical risk tool enhances coronary artery disease prediction”, Fernando Riveros-Mckay Aguilera, Michael E. Weale, Rachel Moore, Saskia Selzam, Eva Krapohl, R. Michael Sivley et al (2020-06-03):

Background: There is considerable interest in whether genetic data can be used to improve standard cardiovascular disease risk calculators, as the latter are routinely used in clinical practice to manage preventative treatment.

Methods: This research has been conducted using the UK Biobank (UKB) resource. We developed our own polygenic risk score (PRS) for coronary artery disease (CAD), using novel and established methods to combine published genome-wide association study (GWAS) data with data from 114,196 UK Biobank individuals, also leveraging a large resource of other GWAS datasets along with functional information, to aid in the identification of causal variants, and thence define weights for > 8M genetic variants. We utilised a further 60,000 UKB individuals to develop an integrated risk tool (IRT) that combined our PRS with established risk tools (either the American Heart Association/​American College of Cardiology’s pooled cohort equations (PCE) or the UK’s QRISK3) which was then tested in an additional, independent, set of 212,563 UKB individuals. We evaluated prediction performance in individuals of European ancestry, both as a whole and stratified by age and sex.

Findings: The novel CAD PRS showed superior predictive power for CAD events, compared to other published PRSs. As an individual risk factor, it has similar predictive power to each of systolic blood pressure, HDL cholesterol, and LDL cholesterol, but is more predictive than total cholesterol and smoking history. Our novel CAD PRS is largely uncorrelated with PCE, QRISK3, and family history, and, when combined with PCE into an integrated risk tool, had superior predictive accuracy. In individuals reclassified as high risk, CAD event rates were markedly and statistically-significantly higher compared to those reclassified as low risk. Overall, 9.7% of incident CAD cases were misclassified as low risk by PCE and correctly classified as high risk by the IRT, in contrast to 3.7% misclassified by the IRT and correctly classified by PCE. The overall net reclassification improvement for the IRT was 5.7% (95% CI 4.4–7.0), but when individuals were stratified into four age-by-sex subgroups the improvement was larger for all subgroups (range 7.7%–17.3%), with best performance in younger middle-aged men aged 40–54 years (17.3%, 95% CI 13.0–21.5). Broadly similar results were found using a different risk tool (QRISK3), and also for cardiovascular disease events defined more broadly.
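The reclassification figures in this abstract can be related through the net reclassification improvement (NRI). The following is a minimal sketch, assuming the standard categorical NRI definition (the abstract does not state the exact computation used) and reading the 9.7% and 3.7% figures as the up- and down-reclassified shares of incident cases. The function name and the non-case inputs are illustrative only and do not come from the paper.

```python
def categorical_nri(case_up, case_down, noncase_up, noncase_down):
    """Categorical NRI from reclassified proportions (each between 0 and 1).

    Cases earn credit when moved up to high risk; non-cases earn credit
    when moved down to low risk.
    """
    event_nri = case_up - case_down
    nonevent_nri = noncase_down - noncase_up
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# Case shares taken from the abstract: 9.7% correctly moved up by the IRT,
# 3.7% incorrectly moved down. The non-case shares are NOT reported in the
# abstract; the equal values below are placeholders for illustration only.
event, nonevent, total = categorical_nri(0.097, 0.037, 0.020, 0.020)
print(f"case component: {event:.1%}")
```

Under this assumed definition the case component is about 6.0 percentage points. Whether the reported overall figure of 5.7% decomposes this way depends on how the paper computed it.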

Interpretation: An integrated risk tool that includes polygenic risk outperforms current clinical risk stratification tools, and offers greater opportunity for early interventions. Given the plummeting costs of genetic tests, future iterations of CAD risk tools would be enhanced with the addition of a person’s polygenic risk.

“The Burden of Rare Protein-truncating Genetic Variants on Human Lifespan”, Liu et al 2020

“The burden of rare protein-truncating genetic variants on human lifespan”, Jimmy Z. Liu, Chia-Yen Chen, Ellen A. Tsai, Christopher D. Whelan, David Sexton, Sally John, Heiko Runz et al (2020-06-03):

Genetic predisposition is believed to contribute substantially to the age at which we die. Genome-wide association studies (GWAS) have implicated more than 20 genetic loci in phenotypes related to human lifespan [1]. However, little is known about how lifespan is impacted by gene loss-of-function. Through whole-exome sequencing of 238,239 UK Biobank participants, we assessed the relevance of protein-truncating variant (PTV) gene burden on individual and parental survival. We identified exome-wide (p < 2.5e-6) statistically-significant associations between BRCA2, BRCA1, TET2, PPM1D, LDLR, EML2 and DEDD2 PTV-burden with human lifespan. Gene and gene-set PTV-burden phenome-wide association studies (PheWAS) further highlighted the roles of these genes in cancer and cardiovascular disease as relevant for overall survival. The overlap between PTV-burden and prior GWAS results was modest, underscoring the value of sequencing in well-powered cohorts to complement GWAS for identifying loci associated with complex traits and disease.

“Whole-genome Sequencing of Rare Disease Patients in a National Healthcare System”, Ouwehand et al 2020

“Whole-genome sequencing of rare disease patients in a national healthcare system”, Willem H. Ouwehand, on behalf of the NIHR BioResource, the 100,000 Genomes Project (2020-02-18):

Most patients with rare diseases do not receive a molecular diagnosis, and the aetiological variants and mediating genes for more than half of such disorders remain to be discovered. We implemented whole-genome sequencing (WGS) in a national healthcare system to streamline diagnosis and to discover unknown aetiological variants in the coding and non-coding regions of the genome.

In a pilot study for the 100,000 Genomes Project, we generated WGS data for 13,037 participants, of whom 9,802 had a rare disease, and provided a genetic diagnosis to 1,138 of the 7,065 patients with detailed phenotypic data. We identified 95 Mendelian associations between genes and rare diseases, of which 11 have been discovered since 2015 and at least 79 are confirmed aetiological.

Using WGS of UK Biobank [1], we showed that rare alleles can explain the presence of some individuals in the tails of a quantitative red blood cell (RBC) trait. Finally, we reported 4 novel non-coding variants which cause disease through the disruption of transcription of ARPC1B, GATA1, LRBA and MPL. Our study demonstrates a synergy by using WGS for diagnosis and aetiological discovery in routine healthcare.

“Rare Genetic Variants Associated With Sudden Cardiac Death in Adults”, Khera et al 2019

2019-khera.pdf: “Rare Genetic Variants Associated With Sudden Cardiac Death in Adults”, Amit V. Khera, Heather Mason-Suares, Deanna Brockman, Minxian Wang, Martin J. VanDenburgh, Ozlem Senol-Cosar et al (2019-11-18):

Background: Sudden cardiac death occurs in ~220,000 U.S. adults annually, the majority of whom have no prior symptoms or cardiovascular diagnosis. Rare pathogenic DNA variants in any of 49 genes can pre-dispose to 4 important causes of sudden cardiac death: cardiomyopathy, coronary artery disease, inherited arrhythmia syndrome, and aortopathy or aortic dissection.

Objectives: This study assessed the prevalence of rare pathogenic variants in sudden cardiac death cases versus controls, and the prevalence and clinical importance of such mutations in an asymptomatic adult population.

Methods: The authors performed whole-exome sequencing in a case-control cohort of 600 adult-onset sudden cardiac death cases and 600 matched controls from 106,098 participants of 6 prospective cohort studies. Observed DNA sequence variants in any of 49 genes with known association to cardiovascular disease were classified as pathogenic or likely pathogenic by a clinical laboratory geneticist blinded to case status. In an independent population of 4,525 asymptomatic adult participants of a prospective cohort study, the authors performed whole-genome sequencing and determined the prevalence of pathogenic or likely pathogenic variants and prospective association with cardiovascular death.

Results: Among the 1,200 sudden cardiac death cases and controls, the authors identified 5,178 genetic variants and classified 14 as pathogenic or likely pathogenic. These 14 variants were present in 15 individuals, all of whom had experienced sudden cardiac death—corresponding to a pathogenic variant prevalence of 2.5% in cases and 0% in controls (p < 0.0001). Among the 4,525 participants of the prospective cohort study, 41 (0.9%) carried a pathogenic or likely pathogenic variant and these individuals had 3.24-fold higher risk of cardiovascular death over a median follow-up of 14.3 years (p = 0.02).
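The case-control arithmetic in these results can be checked with a short sketch. It assumes a two-sided Fisher exact test on the 2×2 carrier table; the abstract does not name the test actually used, so this is an illustration of the comparison rather than a reproduction of the authors' analysis. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Counts from the abstract: 15 carriers among 600 cases, 0 among 600 controls.
carriers_cases, cases = 15, 600
carriers_controls, controls = 0, 600
prevalence = carriers_cases / cases
p_value = fisher_exact_two_sided(carriers_cases, cases - carriers_cases,
                                 carriers_controls, controls - carriers_controls)
print(f"prevalence in cases: {prevalence:.1%}, p = {p_value:.1e}")
```

Under this assumed test the p-value falls below 0.0001, in line with the reported threshold, and 15/600 reproduces the stated 2.5% prevalence.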

Conclusions: Gene sequencing identifies a pathogenic or likely pathogenic variant in a small but potentially important subset of adults experiencing sudden cardiac death; these variants are present in ~1% of asymptomatic adults.

“Mutant Neuropeptide S Receptor Reduces Sleep Duration With Preserved Memory Consolidation”, Xing et al 2019

2019-xing.pdf: “Mutant neuropeptide S receptor reduces sleep duration with preserved memory consolidation”, Lijuan Xing, Guangsen Shi, Yulia Mostovoy, Nicholas W. Gentry, Zenghua Fan, Thomas B. Mcmahon, Pui-Yan Kwok et al (2019-10-16):

Sleep is a crucial physiological process for our survival and cognitive performance, yet the factors controlling human sleep regulation remain poorly understood.

Here, we identified a missense mutation in a G protein-coupled neuropeptide S receptor 1 (NPSR1) that is associated with a natural short sleep phenotype in humans. Mice carrying the homologous mutation exhibited less sleep time despite increased sleep pressure. These animals were also resistant to contextual memory deficits associated with sleep deprivation. In vivo, the mutant receptors showed increased sensitivity to neuropeptide S exogenous activation.

These results suggest that the NPS/​NPSR1 pathway might play a critical role in regulating human sleep duration and in the link between sleep homeostasis and memory consolidation.

“Germline Burden of Rare Damaging Variants Negatively Affects Human Healthspan and Lifespan”, Shindyapina et al 2019

“Germline burden of rare damaging variants negatively affects human healthspan and lifespan”, Anastasia V. Shindyapina, Aleksandr A. Zenin, Andrei E. Tarkhov, Peter O. Fedichev, Vadim N. Gladyshev et al (2019-10-13):

Genome-wide association studies often explore links between particular genes and phenotypes of interest. Known genetic variants, however, are responsible for only a small fraction of human lifespan variation evident from genetic twin studies. To account for the missing longevity variance, we hypothesized that the cumulative effect of deleterious variants may affect human longevity. Here, we report that the burden of rarest protein-truncating variants (PTVs) negatively impacts both human healthspan and lifespan in two large independent cohorts. Longer-living subjects have both fewer rarest PTVs and less damaging PTVs. In contrast, we show that the burden of frequent PTVs and rare non-PTVs is less deleterious, lacking association with longevity. The combined effect of rare PTVs is similar to that of known variants associated with longer lifespan and accounts for 1–2 years of lifespan variability. We further find that somatic accumulation of PTVs accounts for a minute fraction of mortality and morbidity acceleration and hence provides little support for its causal role in aging. Thus, damaging mutations, germline and somatic, can only contribute to aging as a result of higher-order effects including interactions of multiple forms of damage.

“The Human-Specific BOLA2 Duplication Modifies Iron Homeostasis and Anemia Predisposition in Chromosome 16p11.2 Autism Individuals”, Giannuzzi et al 2019

“The Human-Specific BOLA2 Duplication Modifies Iron Homeostasis and Anemia Predisposition in Chromosome 16p11.2 Autism Individuals”, Giuliana Giannuzzi, Paul J. Schmidt, Eleonora Porcu, Gilles Willemin, Katherine M. Munson, Xander Nuttle et al (2019):

Human-specific duplications at chromosome 16p11.2 mediate recurrent pathogenic 600 Kbp BP4-BP5 copy-number variations, which are among the most common genetic causes of autism. These copy-number polymorphic duplications are under positive selection and include three to eight copies of BOLA2, a gene involved in the maturation of cytosolic iron-sulfur proteins. To investigate the potential advantage provided by the rapid expansion of BOLA2, we assessed hematological traits and anemia prevalence in 379,385 controls and individuals who have lost or gained copies of BOLA2: 89 chromosome 16p11.2 BP4-BP5 deletion carriers and 56 reciprocal duplication carriers in the UK Biobank. We found that the 16p11.2 deletion is associated with anemia (18/89 carriers, 20%, p = 4e-7, OR = 5), particularly iron-deficiency anemia. We observed similar enrichments in two clinical 16p11.2 deletion cohorts, which included 6/63 (10%) and 7/20 (35%) unrelated individuals with anemia, microcytosis, low serum iron, or low blood hemoglobin. Upon stratification by BOLA2 copy number, our data showed an association between low BOLA2 dosage and the above phenotypes (8/15 individuals with three copies, 53%, p = 1e-4). In parallel, we analyzed hematological traits in mice carrying the 16p11.2 orthologous deletion or duplication, as well as Bola2+/- and Bola2-/- animals. The Bola2-deficient mice and the mice carrying the deletion showed early evidence of iron deficiency, including a mild decrease in hemoglobin, lower plasma iron, microcytosis, and an increased red blood cell zinc-protoporphyrin-to-heme ratio. Our results indicate that BOLA2 participates in iron homeostasis in vivo, and its expansion has a potential adaptive role in protecting against iron deficiency.

“A Rare Mutation of β1-Adrenergic Receptor Affects Sleep/Wake Behaviors”, Shi et al 2019

“A Rare Mutation of β1-Adrenergic Receptor Affects Sleep/Wake Behaviors”, Guangsen Shi, Lijuan Xing, David Wu, Bula J. Bhattacharyya, Christopher R. Jones, Thomas McMahon, S. Y Christin Chong et al (2019):

Sleep is crucial for our survival, and many diseases are linked to long-term poor sleep quality. Before we can use sleep to enhance our health and performance and alleviate diseases associated with poor sleep, a greater understanding of sleep regulation is necessary. We have identified a mutation in the β1-adrenergic receptor gene in humans who require fewer hours of sleep than most. In vitro, this mutation leads to decreased protein stability and dampened signaling in response to agonist treatment. In vivo, the mice carrying the same mutation demonstrated short sleep behavior. We found that this receptor is highly expressed in the dorsal pons and that these ADRB1+ neurons are active during rapid eye movement (REM) sleep and wakefulness. Activating these neurons can lead to wakefulness, and the activity of these neurons is affected by the mutation. These results highlight the important role of β1-adrenergic receptors in sleep/​wake regulation.

“Phenome-wide Burden of Copy-Number Variation in the UK Biobank”, Aguirre et al 2019

“Phenome-wide Burden of Copy-Number Variation in the UK Biobank”, Matthew Aguirre, Manuel A. Rivas, James Priest (2019):

Copy-number variations (CNVs) represent a significant proportion of the genetic differences between individuals and many CNVs associate causally with syndromic disease and clinical outcomes. Here, we characterize the landscape of copy-number variation and their phenome-wide effects in a sample of 472,228 array-genotyped individuals from the UK Biobank. In addition to population-level selection effects against genic loci conferring high mortality, we describe genetic burden from potentially pathogenic and previously uncharacterized CNV loci across more than 3,000 quantitative and dichotomous traits, with separate analyses for common and rare classes of variation.

Specifically, we highlight the effects of CNVs at two well-known syndromic loci 16p11.2 and 22q11.2, previously uncharacterized variation at 9p23, and several genic associations in the context of acute coronary artery disease and high body mass index. Our data constitute a deeply contextualized portrait of population-wide burden of copy-number variation, as well as a series of dosage-mediated genic associations across the medical phenome.

“Schizophrenia Risk Conferred by Protein-coding de Novo Mutations”, Howrigan et al 2018

“Schizophrenia risk conferred by protein-coding de novo mutations”, Daniel P. Howrigan, Samuel A. Rose, Kaitlin E. Samocha, Menachem Fromer, Felecia Cerrato, Wei J. Chen et al (2018-12-13):

Protein-coding de novo mutations (DNMs) in the form of single nucleotide changes and short insertions/​deletions are significant genetic risk factors for autism, intellectual disability, developmental delay, and epileptic encephalopathy. In contrast, the burden of DNMs has thus far only had a modest documented impact on schizophrenia (SCZ) risk. Here, we analyze whole-exome sequence from 1,695 SCZ affected parent-offspring trios from Taiwan along with DNMs from 1,077 published SCZ trios to better understand the contribution of coding DNMs to SCZ risk. Among 2,772 SCZ affected probands, the increased burden of DNMs is modest. Gene set analyses show that the modest increase in risk from DNMs in SCZ probands is concentrated in genes that are either highly brain expressed, under strong evolutionary constraint, and/​or overlap with genes identified as DNM risk factors in other neurodevelopmental disorders. No single gene meets the criteria for genome-wide statistical-significance, but we identify 16 genes that are recurrently hit by a protein-truncating DNM, which is a 3.15× higher rate than mutation model expectation of 5.1 genes (permuted 95% CI = 1–10 genes, permuted p = 3e-5). Overall, DNMs explain only a small fraction of SCZ risk, and this risk is polygenic in nature suggesting that coding variation across many different genes will be a risk factor for SCZ in the population.

“Quantifying the Effects of 16p11.2 Copy Number Variants on Brain Structure: A Multisite Genetic-First Study”, Martin-Brevet et al 2018

“Quantifying the Effects of 16p11.2 Copy Number Variants on Brain Structure: A Multisite Genetic-First Study”, Sandra Martin-Brevet, Borja Rodríguez-Herreros, Jared A. Nielsen, Clara Moreau, Claudia Modenato, Anne M. Maillard et al (2018-08-15):

Background: 16p11.2 breakpoint 4 to 5 copy number variants (CNVs) increase the risk for developing autism spectrum disorder⁠, schizophrenia, and language and cognitive impairment. In this multisite study, we aimed to quantify the effect of 16p11.2 CNVs on brain structure.

Methods: Using voxel-based and surface-based brain morphometric methods, we analyzed structural magnetic resonance imaging collected at 7 sites from 78 individuals with a deletion, 71 individuals with a duplication, and 212 individuals without a CNV.

Results: Beyond the 16p11.2-related mirror effect on global brain morphometry, we observe regional mirror differences in the insula (deletion > control > duplication). Other regions are preferentially affected by either the deletion or the duplication: the calcarine cortex and transverse temporal gyrus (deletion > control; Cohen’s d > 1), the superior and middle temporal gyri (deletion < control; Cohen’s d < −1), and the caudate and hippocampus (control > duplication; −0.5 > Cohen’s d > −1). Measures of cognition, language, and social responsiveness and the presence of psychiatric diagnoses do not influence these results.

Conclusions: The global and regional effects on brain morphometry due to 16p11.2 CNVs generalize across site, computational method, age, and sex. Effect sizes on neuroimaging and cognitive traits are comparable. Findings partially overlap with results of meta-analyses performed across psychiatric disorders. However, the lack of correlation between morphometric and clinical measures suggests that CNV-associated brain changes contribute to clinical manifestations but require additional factors for the development of the disorder. These findings highlight the power of genetic risk factors as a complement to studying groups defined by behavioral criteria.

[Keywords: 16p11.2, autism spectrum disorder, copy number variant, genetics, imaging, neurodevelopmental disorders]

“Common Genetic Variants Contribute to Risk of Rare Severe Neurodevelopmental Disorders”, Niemi et al 2018

“Common genetic variants contribute to risk of rare severe neurodevelopmental disorders”, Mari E. K. Niemi, Hilary C. Martin, Daniel L. Rice, Giuseppe Gallone, Scott Gordon, Martin Kelemen, Kerrie McAloney et al (2018-05-04):

There are thousands of rare human disorders caused by a single deleterious, protein-coding genetic variant [1]. However, patients with the same genetic defect can have different clinical presentation [2–4], and some individuals carrying known disease-causing variants can appear unaffected [5]. What explains these differences? Here, we show in a cohort of 6,987 children with heterogeneous severe neurodevelopmental disorders expected to be almost entirely monogenic that 7.7% of variance in risk is attributable to inherited common genetic variation. We replicated this genome-wide common variant burden by showing that it is over-transmitted from parents to children in an independent sample of 728 trios from the same cohort. Our common variant signal is significantly positively correlated with genetic predisposition to fewer years of schooling, decreased intelligence, and risk of schizophrenia. We found that common variant risk was not significantly different between individuals with and without a known protein-coding diagnostic variant, suggesting that common variant risk is not confined to patients without a monogenic diagnosis. In addition, previously published common variant scores for autism, height, birth weight, and intracranial volume were all correlated with those traits within our cohort, suggesting that phenotypic expression in individuals with monogenic disorders is affected by the same variants as the general population. Our results demonstrate that common genetic variation affects both overall risk and clinical presentation in disorders typically considered to be monogenic.

“Frequency and Distribution of 152 Genetic Disease Variants in over 100,000 Mixed Breed and Purebred Dogs”, Donner et al 2018

“Frequency and distribution of 152 genetic disease variants in over 100,000 mixed breed and purebred dogs”, Jonas Donner, Heidi Anderson, Stephen Davison, Angela M. Hughes, Julia Bouirmane, Johan Lindqvist, Katherine M. Lytle et al (2018-04-11):

Knowledge on the genetic epidemiology of disorders in the dog population has implications for both veterinary medicine and sustainable breeding. Limited data on frequencies of genetic disease variants across breeds exists, and the disease heritage of mixed breed dogs remains poorly explored to date. Advances in genetic screening technologies now enable comprehensive investigations of the canine disease heritage, and generate health-related big data that can be turned into action.

We pursued population screening of genetic variants implicated in Mendelian disorders in the largest canine study sample examined to date by examining over 83,000 mixed breed and 18,000 purebred dogs representing 330 breeds for 152 known variants using a custom-designed beadchip microarray. We further announce the creation of MyBreedData, an online updated inherited disorder prevalence resource with its foundation in the generated data.

We identified the most prevalent, and rare, disease susceptibility variants across the general dog population while providing the first extensive snapshot of the mixed breed disease heritage. Approximately two in five dogs carried at least one copy of a tested disease variant. Most disease variants are shared by both mixed breeds and purebreds, while breed-specificity or line-specificity of others is strongly suggested. Mixed breed dogs were more likely to carry a common recessive disease, whereas purebreds were more likely to be genetically affected with one, providing DNA-based evidence for hybrid vigor. We discovered genetic presence of 22 disease variants in at least one additional breed in which they were previously undescribed. Some mutations likely manifest similarly independently of breed background; however, we emphasize the need for follow up investigations in each case and provide a suggested validation protocol for broader consideration. In conclusion, our study provides unique insight into genetic epidemiology of canine disease risk variants, and their relevance for veterinary medicine, breeding programs and animal welfare.

Author summary:

Like any human, dogs may suffer from or pass on a variety of inherited disorders. Knowledge of how likely a typical dog is to carry an inherited disorder in its genome, and which disorders are the most common and relevant ones across dog breeds, is valuable for both veterinary care and breeding of healthy dogs.

We have explored the largest global dog study sample collected to date, consisting of more than 100,000 mixed breed and purebred dogs, to advance research on this subject. We found that mixed breed dogs and purebred dogs potentially suffer from many of the same inherited disorders, and that around two in five dogs carried at least one of the conditions that we screened for. A dog carrying an inherited disorder is not a “bad dog”, but we humans responsible for breeding selections do need to make sustainable decisions avoiding inbreeding, ie. mating of dogs that are close relatives. The disease prevalence information we generated during this study is made available online ([now defunct?]), as a free tool for breed and kennel clubs, breeders, as well as the veterinary and scientific community.

“Relationships between Estimated Autozygosity and Complex Traits in the UK Biobank”, Johnson et al 2018

“Relationships between estimated autozygosity and complex traits in the UK Biobank”, Emma C. Johnson, Luke M. Evans, Matthew C. Keller (2018-03-29):

Inbreeding increases the risk of certain Mendelian disorders in humans but may also reduce fitness through its effects on complex traits and diseases. Such inbreeding depression is thought to occur due to increased homozygosity at causal variants that are recessive with respect to fitness. Until recently it has been difficult to amass large enough sample sizes to investigate the effects of inbreeding depression on complex traits using genome-wide single nucleotide polymorphism (SNP) data in population-based samples. Further, it is difficult to infer causation in analyses that relate degree of inbreeding to complex traits because confounding variables (eg. education) may influence both the likelihood for parents to outbreed and offspring trait values. The present study used runs of homozygosity in genome-wide SNP data in up to 400,000 individuals in the UK Biobank to estimate the proportion of the autosome that exists in autozygous tracts: stretches of the genome which are identical due to a shared common ancestor. After multiple testing corrections and controlling for possible sociodemographic confounders, we found significant relationships in the predicted direction between estimated autozygosity and three of the 26 traits we investigated: age at first sexual intercourse, fluid intelligence, and forced expiratory volume in 1 second. Our findings for fluid intelligence and forced expiratory volume corroborate those of several published studies while the finding for age at first sexual intercourse was novel. These results may suggest that these traits have been associated with Darwinian fitness over evolutionary time, although there are other possible explanations for these associations that cannot be eliminated. Some of the autozygosity-trait relationships were attenuated after controlling for background sociodemographic characteristics, suggesting that care needs to be taken in the design and interpretation of ROH studies in order to glean reliable information about the genetic architecture and evolutionary history of complex traits.
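The autozygosity estimate described in this abstract, the proportion of the autosome lying in runs of homozygosity (often written F_ROH), reduces to a simple ratio once ROH segments have been called. The sketch below assumes the segments are already available as start/end coordinates. The autosome length constant, the function name, and the example segments are illustrative assumptions, not values from the paper, and the ROH-calling step itself is not shown.

```python
# Approximate autosomal genome length in base pairs. This constant is an
# assumption for illustration, not a value taken from the paper.
AUTOSOME_LENGTH_BP = 2_880_000_000


def f_roh(roh_segments, autosome_length_bp=AUTOSOME_LENGTH_BP):
    """Fraction of the autosome inside runs of homozygosity.

    roh_segments: iterable of (start_bp, end_bp) pairs, one per called ROH,
    assumed to be non-overlapping.
    """
    total_roh = sum(end - start for start, end in roh_segments)
    return total_roh / autosome_length_bp


# Hypothetical ROH calls for one individual (made-up coordinates).
segments = [(10_000_000, 14_500_000), (52_000_000, 55_300_000), (1_200_000, 3_400_000)]
print(f"F_ROH = {f_roh(segments):.4f}")
```

In an analysis of this kind, the per-individual value would then enter a regression against each trait alongside the sociodemographic covariates the abstract mentions.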

Author summary:

Inbreeding is well known to increase the risk of rare, monogenic diseases, and there has been some evidence that it also affects complex traits, such as cognition and educational attainment. However, difficulties can arise when inferring causation in these types of analyses because of the potential for confounding variables (eg. socioeconomic status) to bias the observed relationships between distant inbreeding and complex traits. In this investigation, we used single-nucleotide polymorphism data in a very large (N > 400,000) sample of seemingly outbred individuals to quantify the degree to which distant inbreeding is associated with 26 complex traits. We found robust evidence that distant inbreeding is inversely associated with fluid intelligence and a measure of lung function, and is positively associated with age at first sex, while other trait associations with inbreeding were attenuated after controlling for background sociodemographic characteristics. Our findings are consistent with evolutionary predictions that fluid intelligence, lung function, and age at first sex have been under selection pressures over time; however, they also suggest that confounding variables must be accounted for in order to reliably interpret results from these types of analyses.

“Measuring and Estimating the Effect Sizes of Copy Number Variants on General Intelligence in Community-Based Samples”, Huguet et al 2018

“Measuring and Estimating the Effect Sizes of Copy Number Variants on General Intelligence in Community-Based Samples”, Guillaume Huguet, Catherine Schramm, Elise Douard, Lai Jiang, Aurélie Labbe, Frédérique Tihy, Géraldine Mathonnet et al (2018):

Importance: Copy number variants (CNVs) classified as pathogenic are identified in 10% to 15% of patients referred for neurodevelopmental disorders. However, their effect sizes on cognitive traits measured as a continuum remain mostly unknown because most of them are too rare to be studied individually using association studies.

Objective: To measure and estimate the effect sizes of recurrent and nonrecurrent CNVs on IQ.

Design, Setting, and Participants: This study identified all CNVs that were 50 kilobases (kb) or larger in 2 general population cohorts (the IMAGEN project and the Saguenay Youth Study) with measures of IQ. Linear regressions, including functional annotations of genes included in CNVs, were used to identify features to explain their association with IQ. Validation was performed using intraclass correlation that compared IQ estimated by the model with empirical data.

Main Outcomes and Measures: Performance IQ (PIQ), verbal IQ (VIQ), and frequency of de novo CNV events.

Results: The study included 2,090 European adolescents from the IMAGEN study and 1,983 children and parents from the Saguenay Youth Study. Of these, genotyping was performed on 1,804 individuals from IMAGEN and 977 adolescents, 445 mothers, and 448 fathers (484 families) from the Saguenay Youth Study. We observed 4,928 autosomal CNVs larger than 50 kb across both cohorts. For rare deletions, size, number of genes, and exons affect IQ, and each deleted gene is associated with a mean (SE) decrease in PIQ of 0.67 (0.19) points (p = 6 × 10⁻⁴); this is not so for rare duplications and frequent CNVs. Among 10 functional annotations, haploinsufficiency scores best explain the association of any deletions with PIQ with a mean (SE) decrease of 2.74 (0.68) points per unit of the probability of being loss-of-function intolerant (p = 8 × 10⁻⁵). Results are consistent across cohorts and unaffected by sensitivity analyses removing pathogenic CNVs. There is a 0.75 concordance (95% CI, 0.39–0.91) between the effect size on IQ estimated by our model and IQ loss calculated in previous studies of 15 recurrent CNVs. There is a close association between effect size on IQ and the frequency at which deletions occur de novo (odds ratio, 0.86; 95% CI, 0.84–0.87; p = 2.7 × 10⁻⁸⁸). There is a 0.76 concordance (95% CI, 0.41–0.91) between de novo frequency estimated by the model and calculated using data from the DECIPHER database.
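The two linear effect sizes reported here can be turned into a rough calculator. This is a sketch only: it assumes that the "per unit of the probability of being loss-of-function intolerant" coefficient applies to the sum of pLI scores over the deleted genes, which the abstract does not state explicitly. The function names and the example pLI values are hypothetical.

```python
PIQ_POINTS_PER_PLI_UNIT = 2.74  # mean decrease per pLI unit (from the abstract)
PIQ_POINTS_PER_GENE = 0.67      # mean decrease per deleted gene (from the abstract)


def piq_loss_from_pli(pli_scores):
    """Estimated PIQ decrease, assuming pLI is summed over the deleted genes."""
    return PIQ_POINTS_PER_PLI_UNIT * sum(pli_scores)


def piq_loss_from_gene_count(n_genes):
    """Estimated PIQ decrease using only the number of deleted genes."""
    return PIQ_POINTS_PER_GENE * n_genes


# Hypothetical rare deletion spanning three genes with made-up pLI values.
deleted_gene_pli = [0.98, 0.10, 0.02]
loss_pli = piq_loss_from_pli(deleted_gene_pli)
loss_count = piq_loss_from_gene_count(len(deleted_gene_pli))
print(f"pLI-based: {loss_pli:.2f} points, gene-count-based: {loss_count:.2f} points")
```

The two estimates differ because the pLI-based version weights each gene by its intolerance to loss of function, which is the annotation the abstract reports as explaining the association best.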

Conclusions and Relevance: Models trained on nonpathogenic deletions in the general population reliably estimate the effect size of pathogenic deletions and suggest omnigenic associations of haploinsufficiency with IQ. This represents a new framework to study variants too rare to perform individual association studies and can help estimate the cognitive effect of undocumented deletions in the neurodevelopmental clinic.

“Medical Consequences of Pathogenic CNVs in Adults: Analysis of the UK Biobank”, Crawford et al 2018

“Medical consequences of pathogenic CNVs in adults: analysis of the UK Biobank”, Karen Crawford, Matthew Bracher-Smith, David Owen, Kimberley M. Kendall, Elliott Rees, Antonio F. Pardiñas et al (2018):

Background: Genomic CNVs increase the risk for early-onset neurodevelopmental disorders, but their impact on medical outcomes in later life is still poorly understood. The UK Biobank allows us to study the medical consequences of CNVs in middle and old age in half a million well-phenotyped adults.

Methods: We analysed all Biobank participants for the presence of 54 CNVs associated with genomic disorders or clinical phenotypes, including their reciprocal deletions or duplications. After array quality control and exclusion of first-degree relatives, we compared 381,452 participants of white British or Irish origin who carried no CNVs with carriers of each of the 54 CNVs (ranging from 5 to 2,843 persons). We used logistic regression analysis to estimate the risk of developing 58 common medical phenotypes (3,132 comparisons).

Results and conclusions: Many of the CNVs have profound effects on medical health and mortality, even in people who have largely escaped early neurodevelopmental outcomes. 46 CNV-phenotype associations were statistically-significant at a false discovery rate threshold of 0.1, all in the direction of increased risk. Known medical consequences of CNVs were confirmed, but most identified associations are novel. Deletions at 16p11.2 and 16p12.1 had the largest numbers of statistically-significantly associated phenotypes (seven each). Diabetes, hypertension, obesity and renal failure were affected by the highest numbers of CNVs.

Our work should inform clinicians in planning and managing the medical care of CNV carriers.
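With 54 CNVs tested against 58 phenotypes (54 × 58 = 3,132 comparisons), some multiple-testing control is needed, and the abstract reports a false discovery rate threshold of 0.1. The abstract does not say which FDR procedure was used; the sketch below assumes the common Benjamini-Hochberg step-up procedure purely for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return a list of booleans, True where the hypothesis is rejected.

    Benjamini-Hochberg step-up: find the largest rank k such that the
    k-th smallest p-value is <= (k / m) * fdr, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Toy p-values (invented), not results from the study.
print(benjamini_hochberg([0.001, 0.02, 0.04, 0.5, 0.9], fdr=0.1))
```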

“Singleton Variants Dominate the Genetic Architecture of Human Gene Expression”, Hernandez et al 2017

“Singleton Variants Dominate the Genetic Architecture of Human Gene Expression”, Ryan D. Hernandez, Lawrence H. Uricchio, Kevin Hartman, Chun Ye, Andrew Dahl, Noah Zaitlen (2017-11-14):

The vast majority of human mutations have minor allele frequencies (MAF) under 1%, with the plurality observed only once (ie. “singletons”). While Mendelian diseases are predominantly caused by rare alleles, their role in complex phenotypes remains largely unknown. We develop and rigorously validate an approach to jointly estimate the contribution of alleles with different frequencies, including singletons, to phenotypic variation. We apply our approach to transcriptional regulation, an intermediate between genetic variation and complex disease. Using whole genome DNA and RNA sequencing data from 360 European individuals, we find that singletons alone contribute ~23% of all cis-heritability across genes (dwarfing the contributions of other frequencies). We then integrate external estimates of global MAF from worldwide samples to improve our inference, and find that average cis-heritability is 15.3%. Strikingly, 50.9% of cis-heritability is contributed by globally rare variants (MAF<0.1%), implicating purifying selection as a pervasive force shaping the regulatory architecture of most human genes.

One Sentence Summary

The vast majority of variants so far discovered in humans are rare, and together they have a substantial impact on gene regulation.

“Quantification of Frequency-dependent Genetic Architectures and Action of Negative Selection in 25 UK Biobank Traits”, Schoech et al 2017

“Quantification of frequency-dependent genetic architectures and action of negative selection in 25 UK Biobank traits”, Armin P. Schoech, Daniel Jordan, Po-Ru Loh, Steven Gazal, Luke O’Connor, Daniel J. Balick, Pier F. Palamara et al (2017-09-13):

Understanding the role of rare variants is important in elucidating the genetic basis of human diseases and complex traits. It is widely believed that negative selection can cause rare variants to have larger per-allele effect sizes than common variants. Here, we develop a method to estimate the minor allele frequency (MAF) dependence of SNP effect sizes. We use a model in which per-allele effect sizes have variance proportional to [p(1−p)]^α, where p is the MAF and negative values of α imply larger effect sizes for rare variants. We estimate α by maximizing its profile likelihood in a linear mixed model framework using imputed genotypes, including rare variants (MAF > 0.07%). We applied this method to 25 UK Biobank diseases and complex traits (n = 113,851). All traits produced negative α estimates with 20 significantly negative, implying larger rare variant effect sizes. The inferred best-fit distribution of true α values across traits had mean −0.38 (s.e. 0.02) and standard deviation 0.08 (s.e. 0.03), with statistically-significant heterogeneity across traits (p = 0.0014). Despite larger rare variant effect sizes, we show that for most traits analyzed, rare variants (MAF < 1%) explain less than 10% of total SNP-heritability. Using evolutionary modeling and forward simulations, we validated the α model of MAF-dependent trait effects and estimated the level of coupling between fitness effects and trait effects. Based on this analysis an average genome-wide negative selection coefficient on the order of 10⁻⁴ or stronger is necessary to explain the α values that we inferred.
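To see why larger per-allele effects for rare variants are compatible with rare variants explaining little heritability, it helps to write the model out. A minimal sketch, assuming Hardy-Weinberg genotype variance 2p(1−p) and using the reported mean α of −0.38. The two example frequencies are arbitrary choices made here.

```python
def per_allele_effect_variance(maf, alpha):
    """Relative variance of per-allele effect sizes: [p(1-p)] ** alpha."""
    return (maf * (1.0 - maf)) ** alpha

def per_snp_heritability_weight(maf, alpha):
    """Relative expected variance explained by one SNP.

    Genotype variance under Hardy-Weinberg is 2p(1-p), so the expected
    contribution scales as 2 * [p(1-p)] ** (1 + alpha).
    """
    return 2.0 * maf * (1.0 - maf) * per_allele_effect_variance(maf, alpha)

alpha = -0.38                # mean estimate quoted in the abstract
rare, common = 0.001, 0.3    # illustrative MAFs (assumed)

effect_ratio = (per_allele_effect_variance(rare, alpha)
                / per_allele_effect_variance(common, alpha))
h2_ratio = (per_snp_heritability_weight(rare, alpha)
            / per_snp_heritability_weight(common, alpha))
print(f"rare/common per-allele effect variance: {effect_ratio:.2f}")
print(f"rare/common per-SNP variance explained: {h2_ratio:.3f}")
```

For any α between −1 and 0, the first ratio exceeds 1 while the second stays below 1: a rare SNP has a larger expected effect per allele but still explains less variance than a common one, because so few people carry it.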

“A Genome-wide Association Study for Extremely High Intelligence”, Zabaneh et al 2017

“A genome-wide association study for extremely high intelligence”, D. Zabaneh, E. Krapohl, H. A. Gaspar, C. Curtis, S. H. Lee, H. Patel, S. Newhouse, H. M. Wu, M. A. Simpson et al (2017-07-04):

We used a case-control genome-wide association (GWA) design with cases consisting of 1238 individuals from the top 0.0003 (~170 mean IQ) of the population distribution of intelligence and 8172 unselected population-based controls. The single-nucleotide polymorphism heritability for the extreme IQ trait was 0.33 (0.02), which is the highest so far for a cognitive phenotype, and statistically-significant genome-wide genetic correlations of 0.78 were observed with educational attainment and 0.86 with population IQ. Three variants in locus ADAM12 achieved genome-wide statistical-significance, although they did not replicate with published GWA analyses of normal-range IQ or educational attainment. A genome-wide polygenic score constructed from the GWA results accounted for 1.6% of the variance of intelligence in the normal range in an unselected sample of 3414 individuals, which is comparable to the variance explained by GWA studies of intelligence with substantially larger sample sizes. The gene family plexins, members of which are mutated in several monogenic neurodevelopmental disorders, was statistically-significantly enriched for associations with high IQ. This study shows the utility of extreme trait selection for genetic study of intelligence and suggests that extremely high intelligence is continuous genetically with normal-range intelligence in the population.

“The Surprising Implications of Familial Association in Disease Risk”, Valberg et al 2017

“The surprising implications of familial association in disease risk”, Morten Valberg, Mats Julius Stensrud, Odd O. Aalen (2017-06-14):

Background: A wide range of diseases show some degree of clustering in families; family history is therefore an important aspect for clinicians when making risk predictions. Familial aggregation is often quantified in terms of a familial relative risk (FRR), and although at first glance this measure may seem simple and intuitive as an average risk prediction, its implications are not straightforward.

Methods: We use two statistical models for the distribution of disease risk in a population: a dichotomous risk model that gives an intuitive understanding of the implication of a given FRR, and a continuous risk model that facilitates a more detailed computation of the inequalities in disease risk. Published estimates of FRRs are used to produce Lorenz curves and Gini indices that quantify the inequalities in risk for a range of diseases.

Results: We demonstrate that even a moderate familial association in disease risk implies a very large difference in risk between individuals in the population. We give examples of diseases for which this is likely to be true, and we further demonstrate the relationship between the point estimates of FRRs and the distribution of risk in the population.

Conclusions: The variation in risk for several severe diseases may be larger than the variation in income in many countries. The implications of familial risk estimates should be recognized by epidemiologists and clinicians.
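The link between a familial relative risk and the spread of individual risk can be illustrated with a two-point ("dichotomous") risk model. The sketch below makes a strong simplifying assumption that is not taken from the paper: relatives share exactly the same risk and fall ill independently given that risk. Under that assumption FRR = E[r²] / E[r]². The parameter values are invented.

```python
def familial_relative_risk(high_risk_fraction, risk_low, risk_high):
    """FRR for a two-point risk distribution fully shared within families.

    P(relative affected | index case affected) = E[r^2] / E[r], and dividing
    by the population risk E[r] gives FRR = E[r^2] / E[r]^2.
    """
    q = high_risk_fraction
    mean_risk = (1 - q) * risk_low + q * risk_high
    mean_squared_risk = (1 - q) * risk_low ** 2 + q * risk_high ** 2
    return mean_squared_risk / mean_risk ** 2

# Hypothetical population: 10% of families carry a 6-fold higher risk.
frr = familial_relative_risk(high_risk_fraction=0.1, risk_low=0.01, risk_high=0.06)
print(f"FRR = {frr:.2f}")
```

In this toy setting an FRR of only 2 already requires a six-fold difference in risk between the two groups, which is the kind of gap between a modest-looking FRR and a large underlying inequality that the abstract describes.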

“Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum”, Ganna et al 2017

“Quantifying the impact of rare and ultra-rare coding variation across the phenotypic spectrum”, Andrea Ganna, F. Kyle Satterstrom, Seyedeh M. Zekavat, Indraniel Das, Mitja I. Kurki, Claire Churchhouse et al (2017-06-09):

Protein truncating variants (PTVs) are likely to modify gene function and have been linked to hundreds of Mendelian disorders. However, the impact of PTVs on complex traits has been limited by the available sample size of whole-exome sequencing studies (WES). Here we assemble WES data from 100,304 individuals to quantify the impact of rare PTVs on 13 quantitative traits and 10 diseases. We focus on those PTVs that occur in PTV-intolerant (PI) genes, as these are more likely to be pathogenic. Carriers of at least one PI-PTV were found to have an increased risk of autism, schizophrenia, bipolar disorder, intellectual disability and ADHD (p-value (p) range: 5×10⁻³ to 9×10⁻¹²). In controls, without these disorders, we found that this burden associated with increased risk of mental, behavioral and neurodevelopmental disorders as captured by electronic health record information. Furthermore, carriers of PI-PTVs tended to be shorter (p = 2×10⁻⁵), have fewer years of education (p = 2×10⁻⁴) and be younger (p = 2×10⁻⁷); the latter observation possibly reflecting reduced survival or study participation. While other gene-sets derived from in vivo experiments did not show any associations with PTV-burden, gene sets implicated in GWAS of cardiovascular-related traits and inflammatory bowel disease showed a significant PTV-burden with corresponding traits, mainly driven by established genes involved in familial forms of these disorders. We leveraged population health registries from 14,117 individuals to study the phenome-wide impact of PI-PTVs and identified an increase in the number of hospital visits among PI-PTV carriers. In conclusion, we provide the most thorough investigation to date of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk.

“Genomic Analysis of Family Data Reveals Additional Genetic Effects on Intelligence and Personality”, Hill et al 2017

“Genomic analysis of family data reveals additional genetic effects on intelligence and personality”, W. David Hill, Ruben C. Arslan, Charley Xia, Michelle Luciano, Carmen Amador, Pau Navarro, Caroline Hayward et al (2017-06-05):

Pedigree-based analyses of intelligence have reported that genetic differences account for 50–80% of the phenotypic variation. For personality traits these effects are smaller, with 34–48% of the variance being explained by genetic differences. However, molecular genetic studies using unrelated individuals typically report a heritability estimate of around 30% for intelligence and between 0% and 15% for personality variables. Pedigree-based estimates and molecular genetic estimates may differ because current genotyping platforms are poor at tagging causal variants, variants with low minor allele frequency, copy number variants, and structural variants. Using ~20 000 individuals in the Generation Scotland family cohort genotyped for ~700 000 single nucleotide polymorphisms (SNPs), we exploit the high levels of linkage disequilibrium (LD) found in members of the same family to quantify the total effect of genetic variants that are not tagged in GWASs of unrelated individuals. In our models, genetic variants in low LD with genotyped SNPs explain over half of the genetic variance in intelligence, education, and neuroticism. By capturing these additional genetic effects our models closely approximate the heritability estimates from twin studies for intelligence and education, but not for neuroticism and extraversion. We then replicated our finding using imputed molecular genetic data from unrelated individuals to show that ~50% of differences in intelligence, and ~40% of the differences in education, can be explained by genetic effects when a larger number of rare SNPs are included. From an evolutionary genetic perspective, a substantial contribution of rare genetic variants to individual differences in intelligence and education is consistent with mutation-selection balance.

“Prevalence and Architecture of de Novo Mutations in Developmental Disorders”, McRae et al 2017

2017-mcrae.pdf: “Prevalence and architecture of de novo mutations in developmental disorders”, Jeremy F. McRae, Stephen Clayton, Tomas W. Fitzgerald, Joanna Kaplanis, Elena Prigmore, Diana Rajan, Alejandro Sifrim et al (2017-01-25):

The genomes of individuals with severe, undiagnosed developmental disorders are enriched in damaging de novo mutations (DNMs) in developmentally important genes. Here we have sequenced the exomes of 4,293 families containing individuals with developmental disorders, and meta-analysed these data with data from another 3,287 individuals with similar disorders. We show that the most important factors influencing the diagnostic yield of DNMs are the sex of the affected individual, the relatedness of their parents, whether close relatives are affected and the parental ages. We identified 94 genes enriched in damaging DNMs, including 14 that previously lacked compelling evidence of involvement in developmental disorders. We have also characterized the phenotypic diversity among these disorders. We estimate that 42% of our cohort carry pathogenic DNMs in coding sequences; ~half of these DNMs disrupt gene function and the remainder result in altered protein function. We estimate that developmental disorders caused by DNMs have an average prevalence of 1 in 213 to 1 in 448 births, depending on parental age. Given current global demographics, this equates to almost 400,000 children born per year.

“Inequality in Genetic Cancer Risk Suggests Bad Genes rather than Bad Luck”, Stensrud & Valberg 2017

“Inequality in genetic cancer risk suggests bad genes rather than bad luck”, Mats Julius Stensrud, Morten Valberg (2017):

Heritability is often estimated by decomposing the variance of a trait into genetic and other factors. Interpreting such variance decompositions, however, is not straightforward. In particular, there is an ongoing debate on the importance of genetic factors in cancer development, even though heritability estimates exist. Here we show that heritability estimates contain information on the distribution of absolute risk due to genetic differences. The approach relies on the assumptions underlying the conventional heritability of liability models. We also suggest a model unrelated to heritability estimates. By applying these strategies, we describe the distribution of absolute genetic risk for 15 common cancers. We highlight the considerable inequality in genetic risk of cancer using different metrics, eg. the Gini Index and quantile ratios which are frequently used in economics. For all these cancers, the estimated inequality in genetic risk is larger than the inequality in income in the USA.
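The step from a heritability estimate to a distribution of absolute genetic risk can be sketched with the standard liability-threshold model. The code below is an illustration under that model only: the heritability and prevalence values are arbitrary, the quantile grid is a numerical shortcut chosen here, and none of it reproduces the authors' calculations.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def genetic_risks(h2, prevalence, n=2001):
    """Absolute disease risk at n evenly spaced quantiles of genetic liability.

    Liability = G + E with G ~ N(0, h2) and E ~ N(0, 1 - h2); disease occurs
    when liability exceeds the threshold set by the prevalence. Requires
    0 < h2 < 1 and 0 < prevalence < 1.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    genetic_sd = math.sqrt(h2)
    residual_sd = math.sqrt(1.0 - h2)
    risks = []
    for i in range(n):
        quantile = (i + 0.5) / n
        g = genetic_sd * STD_NORMAL.inv_cdf(quantile)
        risks.append(1.0 - STD_NORMAL.cdf((threshold - g) / residual_sd))
    return risks

def gini(values):
    """Gini index of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1.0) / n

# Illustrative inputs (assumed): a cancer with 1% lifetime risk.
for h2 in (0.1, 0.3, 0.5):
    print(h2, round(gini(genetic_risks(h2, prevalence=0.01)), 3))
```

The Gini index rises with heritability because a larger genetic variance pushes more of the total risk into the upper tail of genetic liability.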

“Polygenic Transmission Disequilibrium Confirms That Common and Rare Variation Act Additively to Create Risk for Autism Spectrum Disorders”, Weiner et al 2016

“Polygenic transmission disequilibrium confirms that common and rare variation act additively to create risk for autism spectrum disorders”, Daniel J. Weiner, Emilie M. Wigdor, Stephan Ripke, Raymond K. Walters, Jack A. Kosmicki, Jakob Grove et al (2016-11-23):

Autism spectrum disorder (ASD) risk is influenced by both common polygenic and de novo variation. The purpose of this analysis was to clarify the influence of common polygenic risk for ASDs and to identify subgroups of cases, including those with strong acting de novo variants, in which different types of polygenic risk are relevant. To do so, we extend the transmission disequilibrium approach to encompass polygenic risk scores, and introduce the polygenic transmission disequilibrium test. Using data from more than 6,400 children with ASDs and 15,000 of their family members, we show that polygenic risk for ASDs, schizophrenia, and greater educational attainment is over transmitted to children with ASDs in two independent samples, but not to their unaffected siblings. These findings hold independent of proband IQ. We find that common polygenic variation contributes additively to risk in ASD cases that carry a very strong acting de novo variant. Lastly, we find evidence that elements of polygenic risk are independent and differ in their relationship with proband phenotype. These results confirm that ASDs’ genetic influences are highly additive and suggest that they create risk through at least partially distinct etiologic pathways.
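The idea behind a polygenic transmission disequilibrium test is that, absent any association, a child's polygenic score should on average equal the mean of the parents' scores. A minimal sketch of that logic follows, as commonly described: the child's deviation from the mid-parent score is standardized by the SD of mid-parent scores and tested against zero with a one-sample t-statistic. The trio scores are invented, and details of the published test may differ.

```python
import math
from statistics import mean, stdev

def ptdt(trios):
    """trios: list of (child, mother, father) polygenic scores.

    Returns (mean standardized deviation, one-sample t-statistic).
    """
    mid_parents = [(mother + father) / 2.0 for _, mother, father in trios]
    sd_mid_parent = stdev(mid_parents)
    deviations = [(child - mid) / sd_mid_parent
                  for (child, _, _), mid in zip(trios, mid_parents)]
    mean_deviation = mean(deviations)
    standard_error = stdev(deviations) / math.sqrt(len(deviations))
    return mean_deviation, mean_deviation / standard_error

# Hypothetical scores for four affected children and their parents.
toy_trios = [(1.0, 0.0, 0.0), (1.5, 1.0, 1.0), (3.0, 2.0, 2.0), (3.5, 3.0, 3.0)]
mean_dev, t_stat = ptdt(toy_trios)
print(mean_dev, t_stat)
```

A positive mean deviation means affected children inherited more polygenic risk than expected from their parents; running the same test on unaffected siblings gives the within-family control mentioned in the abstract.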

“A Prospective Study of Sudden Cardiac Death among Children and Young Adults”, Bagnall et al 2016

2016-bagnall.pdf: “A Prospective Study of Sudden Cardiac Death among Children and Young Adults”, Richard D. Bagnall, Robert G. Weintraub, Jodie Ingles, Johan Duflou, Laura Yeates, Lien Lam, Andrew M. Davis et al (2016-06-23):

Background: Sudden cardiac death among children and young adults is a devastating event. We performed a prospective, population-based, clinical and genetic study of sudden cardiac death among children and young adults.

Methods: We prospectively collected clinical, demographic, and autopsy information on all cases of sudden cardiac death among children and young adults 1 to 35 years of age in Australia and New Zealand from 2010 through 2012. In cases that had no cause identified after a comprehensive autopsy that included toxicologic and histologic studies (unexplained sudden cardiac death), at least 59 cardiac genes were analyzed for a clinically relevant cardiac gene mutation.

Results: A total of 490 cases of sudden cardiac death were identified. The annual incidence was 1.3 cases per 100,000 persons 1 to 35 years of age; 72% of the cases involved boys or young men. Persons 31 to 35 years of age had the highest incidence of sudden cardiac death (3.2 cases per 100,000 persons per year), and persons 16 to 20 years of age had the highest incidence of unexplained sudden cardiac death (0.8 cases per 100,000 persons per year). The most common explained causes of sudden cardiac death were coronary artery disease (24% of cases) and inherited cardiomyopathies (16% of cases). Unexplained sudden cardiac death (40% of cases) was the predominant finding among persons in all age groups, except for those 31 to 35 years of age, for whom coronary artery disease was the most common finding. Younger age and death at night were independently associated with unexplained sudden cardiac death as compared with explained sudden cardiac death. A clinically relevant cardiac gene mutation was identified in 31 of 113 cases (27%) of unexplained sudden cardiac death in which genetic testing was performed. During follow-up, a clinical diagnosis of an inherited cardiovascular disease was identified in 13% of the families in which an unexplained sudden cardiac death occurred.

Conclusions: The addition of genetic testing to autopsy investigation substantially increased the identification of a possible cause of sudden cardiac death among children and young adults.

“Cognitive Performance Among Carriers of Pathogenic Copy Number Variants: Analysis of 152,000 UK Biobank Subjects”, Kendall et al 2016

2016-kendall.pdf: “Cognitive Performance Among Carriers of Pathogenic Copy Number Variants: Analysis of 152,000 UK Biobank Subjects”⁠, Kimberley M. Kendall, Elliott Rees, Valentina Escott-Price, Mark Einon, Rhys Thomas, Jonathan Hewitt et al (2016-01-01)

“Family-Specific Variants and the Limits of Human Genetics”, Shirts et al 2016

“Family-Specific Variants and the Limits of Human Genetics”, Brian H. Shirts, Colin C. Pritchard, Tom Walsh (2016):

Every single-nucleotide change compatible with life is present in the human population today. Understanding these rare human variants defines an extraordinary challenge for genetics and medicine. The new clinical practice of sequencing many genes for hereditary cancer risk has illustrated the utility of clinical next-generation sequencing in adults, identifying more medically actionable variants than single-gene testing. However, it has also revealed a linear relationship between the length of DNA evaluated and the number of rare ‘variants of uncertain significance’ reported. We propose that careful approaches to phenotype-genotype inference, distinguishing between diagnostic and screening intent, in conjunction with expanded use of family-scale genetics studies as a source of information on family-specific variants, will reduce variants of uncertain significance reported to patients.

“The Contribution of de Novo Coding Mutations to Autism Spectrum Disorder”, Iossifov et al 2014

“The contribution of de novo coding mutations to autism spectrum disorder”, Ivan Iossifov, Brian J. O’Roak, Stephan J. Sanders, Michael Ronemus, Niklas Krumm, Dan Levy, Holly A. Stessman et al (2014):

Whole exome sequencing has proven to be a powerful tool for understanding the genetic architecture of human disease. Here we apply it to more than 2,500 simplex families, each having a child with an autistic spectrum disorder. By comparing affected to unaffected siblings, we show that 13% of de novo missense mutations and 43% of de novo likely gene-disrupting (LGD) mutations contribute to 12% and 9% of diagnoses, respectively. Including copy number variants, coding de novo mutations contribute to about 30% of all simplex and 45% of female diagnoses. Almost all LGD mutations occur opposite wild-type alleles. LGD targets in affected females significantly overlap the targets in males of lower intelligence quotient (IQ), but neither overlaps significantly with targets in males of higher IQ. We estimate that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation. LGD targets in the joint class overlap with published targets for intellectual disability and schizophrenia, and are enriched for chromatin modifiers, FMRP-associated genes and embryonically expressed genes. Most of the significance for the latter comes from affected females.

“A Novel BHLHE41 Variant Is Associated With Short Sleep and Resistance to Sleep Deprivation in Humans”, Pellegrino et al 2014

“A novel BHLHE41 variant is associated with short sleep and resistance to sleep deprivation in humans”, Renata Pellegrino, Ibrahim Halil Kavakli, Namni Goel, Christopher J. Cardinale, David F. Dinges, Samuel T. Kuna et al (2014):

Study Objectives: Earlier work described a mutation in DEC2, also known as BHLHE41 (basic helix-loop-helix family member e41), as causal in a family of short sleepers, who needed just 6 h sleep per night. We evaluated whether there were other variants of this gene in two well-phenotyped cohorts.

Design: Sequencing of the BHLHE41 gene, electroencephalographic data, and delta power analysis and functional studies using cell-based luciferase.

Results: We identified new variants of the BHLHE41 gene in two cohorts who had either acute sleep deprivation (n = 200) or chronic partial sleep deprivation (n = 217). One variant, Y362H, at another location in the same exon occurred in one twin in a dizygotic twin pair and was associated with reduced sleep duration, less recovery sleep following sleep deprivation, and fewer performance lapses during sleep deprivation than the homozygous twin. Both twins had almost identical amounts of non rapid eye movement (NREM) sleep. This variant reduced the ability of BHLHE41 to suppress CLOCK/​BMAL1 and NPAS2/​BMAL1 transactivation in vitro. Another variant in the same exome had no effect on sleep or response to sleep deprivation and no effect on CLOCK/​BMAL1 transactivation. Random mutagenesis identified a number of other variants of BHLHE41 that affect its function.

Conclusions: There are a number of mutations of BHLHE41. Mutations reduce total sleep while maintaining NREM sleep and provide resistance to the effects of sleep loss. Mutations that affect sleep also modify the normal inhibition by BHLHE41 of CLOCK/​BMAL1 transactivation. Thus, clock mechanisms are likely involved in setting sleep length and the magnitude of sleep homeostasis.

Citation: Pellegrino R, Kavakli IH, Goel N, Cardinale CJ, Dinges DF, Kuna ST, Maislin G, Van Dongen HP, Tufik S, Hogenesch JB, Hakonarson H, Pack AI. A novel BHLHE41 variant is associated with short sleep and resistance to sleep deprivation in humans. SLEEP 2014;37(8):1327–1336.

“The Incidence of Leukemia, Lymphoma and Multiple Myeloma among Atomic Bomb Survivors: 1950-2001”, Hsu et al 2013

“The incidence of leukemia, lymphoma and multiple myeloma among atomic bomb survivors: 1950-2001”, Wan-Ling Hsu, Dale L. Preston, Midori Soda, Hiromi Sugiyama, Sachiyo Funamoto, Kazunori Kodama, Akiro Kimura et al (2013):

A marked increase in leukemia risks was the first and most striking late effect of radiation exposure seen among the Hiroshima and Nagasaki atomic bomb survivors.

This article presents analyses of radiation effects on leukemia, lymphoma and multiple myeloma incidence in the Life Span Study cohort of atomic bomb survivors updated 14 years since the last comprehensive report on these malignancies. These analyses make use of tumor-registry and leukemia-registry based incidence data on 113,011 cohort members with 3.6 million person-years of follow-up from late 1950 through the end of 2001. In addition to a detailed analysis of the excess risk for all leukemias other than chronic lymphocytic leukemia or adult T-cell leukemia (neither of which appear to be radiation-related), we present results for the major hematopoietic malignancy types: acute lymphoblastic leukemia, chronic lymphocytic leukemia, acute myeloid leukemia, chronic myeloid leukemia, adult T-cell leukemia, Hodgkin and non-Hodgkin lymphoma and multiple myeloma. Poisson regression methods were used to characterize the shape of the radiation dose-response relationship and, to the extent the data allowed, to investigate variation in the excess risks with gender, attained age, exposure age and time since exposure. In contrast to the previous report that focused on describing excess absolute rates, we considered both excess absolute rate (EAR) and excess relative risk (ERR) models and found that ERR models can often provide equivalent and sometimes more parsimonious descriptions of the excess risk than EAR models.

The leukemia results indicated that there was a nonlinear dose response for leukemias other than chronic lymphocytic leukemia or adult T-cell leukemia, which varied markedly with time and age at exposure, with much of the evidence for this nonlinearity arising from the acute myeloid leukemia risks. Although the leukemia excess risks generally declined with attained age or time since exposure, there was evidence that the radiation-associated excess leukemia risks, especially for acute myeloid leukemia, had persisted throughout the follow-up period out to 55 years after the bombings. As in earlier analyses, there was a weak suggestion of a radiation dose response for non-Hodgkin lymphoma among men, with no indication of such an effect among women. There was no evidence of radiation-associated excess risks for either Hodgkin lymphoma or multiple myeloma.
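The two model families named above, excess relative risk (ERR) and excess absolute rate (EAR), differ only in how the dose term combines with the baseline rate. The sketch below illustrates the two functional forms with a linear-quadratic dose term. The coefficients are invented, and the modification of risk by age, sex and time since exposure that the study investigates is left out.

```python
def rate_under_err_model(baseline_rate, dose, beta_linear, beta_quadratic=0.0):
    """ERR form: the dose term multiplies the baseline rate."""
    excess_relative_risk = beta_linear * dose + beta_quadratic * dose ** 2
    return baseline_rate * (1.0 + excess_relative_risk)

def rate_under_ear_model(baseline_rate, dose, beta_linear, beta_quadratic=0.0):
    """EAR form: the dose term is added to the baseline rate."""
    excess_absolute_rate = beta_linear * dose + beta_quadratic * dose ** 2
    return baseline_rate + excess_absolute_rate

# Hypothetical numbers: baseline of 10 cases per 100,000 person-years, dose 1 Gy.
print(rate_under_err_model(10.0, 1.0, beta_linear=0.5))
print(rate_under_ear_model(10.0, 1.0, beta_linear=2.0, beta_quadratic=1.0))
```

Because baseline leukemia rates change strongly with age, an ERR model can describe the same data with fewer age-dependent terms than an EAR model, which is one sense in which it may be more parsimonious.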

“Heritability of Performance Deficit Accumulation during Acute Sleep Deprivation in Twins”, Kuna et al 2012

“Heritability of performance deficit accumulation during acute sleep deprivation in twins”, Samuel T. Kuna, Greg Maislin, Frances M. Pack, Bethany Staley, Robert Hachadoorian, Emil F. Coccaro, Allan I. Pack et al (2012):

Study Objectives: To determine if the large and highly reproducible interindividual differences in rates of performance deficit accumulation during sleep deprivation, as determined by the number of lapses on a sustained reaction time test, the Psychomotor Vigilance Task (PVT), arise from a heritable trait.

Design: Prospective, observational cohort study.

Setting: Academic medical center.

Participants: There were 59 monozygotic (mean age 29.2 ± 6.8 [SD] yr; 15 male and 44 female pairs) and 41 dizygotic (mean age 26.6 ± 7.6 yr; 15 male and 26 female pairs) same-sex twin pairs with a normal polysomnogram.

Interventions: Thirty-eight hr of monitored, continuous sleep deprivation.

Measurements and Results: Patients performed the 10-min PVT every 2 hr during the sleep deprivation protocol. The primary outcome was change from baseline in square root transformed total lapses (response time ≥ 500 ms) per trial. Patient-specific linear rates of performance deficit accumulation were separated from circadian effects using multiple linear regression. Using the classic approach to assess heritability, the intraclass correlation coefficients for accumulating deficits resulted in a broad sense heritability (h²) estimate of 0.834. The mean within-pair and among-pair heritability estimates determined by analysis of variance-based methods was 0.715. When variance components of mixed-effect multilevel models were estimated by maximum likelihood estimation and used to determine the proportions of phenotypic variance explained by genetic and nongenetic factors, 51.1% (standard error = 8.4%, p < 0.0001) of twin variance was attributed to combined additive and dominance genetic effects.

Conclusion: Genetic factors explain a large fraction of interindividual variance among rates of performance deficit accumulations on PVT during sleep deprivation.
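A heritability estimate built from twin intraclass correlations can be sketched as follows. The abstract does not spell out its "classic approach"; the code assumes Falconer's formula, h² = 2(r_MZ − r_DZ), with one-way ANOVA intraclass correlations, which is one common reading. The twin data are made up.

```python
def intraclass_correlation(pairs):
    """One-way ANOVA intraclass correlation for a list of (twin1, twin2) values."""
    n = len(pairs)
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    # Between-pair mean square (two members per pair, n - 1 degrees of freedom).
    ms_between = 2 * sum(((a + b) / 2 - grand_mean) ** 2 for a, b in pairs) / (n - 1)
    # Within-pair mean square (one degree of freedom per pair).
    ms_within = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    return (ms_between - ms_within) / (ms_between + ms_within)

def falconer_heritability(mz_pairs, dz_pairs):
    """Falconer's estimate: twice the MZ-DZ difference in intraclass correlation."""
    return 2 * (intraclass_correlation(mz_pairs) - intraclass_correlation(dz_pairs))

# Hypothetical rates of performance deficit accumulation per twin.
mz = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
dz = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
print(falconer_heritability(mz, dz))
```

The estimate relies on the usual twin-design assumptions, chiefly that monozygotic and dizygotic pairs share their environments to the same degree.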

“CNVs: Harbingers of a Rare Variant Revolution in Psychiatric Genetics”, Malhotra & Sebat 2012

“CNVs: harbingers of a rare variant revolution in psychiatric genetics”, Dheeraj Malhotra, Jonathan Sebat (2012):

The genetic bases of neuropsychiatric disorders are beginning to yield to scientific inquiry. Genome-wide studies of copy number variation (CNV) have given rise to a new understanding of disease etiology, bringing rare variants to the forefront. A proportion of risk for schizophrenia, bipolar disorder, and autism can be explained by rare mutations. Such alleles arise by de novo mutation in the individual or in recent ancestry. Alleles can have specific effects on behavioral and neuroanatomical traits; however, expressivity is variable, particularly for neuropsychiatric phenotypes. Knowledge from CNV studies reflects the nature of rare alleles in general and will serve as a guide as we move forward into a new era of whole-genome sequencing.

“Common Variants Show Predicted Polygenic Effects on Height in the Tails of the Distribution, Except in Extremely Short Individuals”, Chan et al 2011

“Common Variants Show Predicted Polygenic Effects on Height in the Tails of the Distribution, Except in Extremely Short Individuals”, Yingleong Chan, Oddgeir L. Holmen, Andrew Dauber, Lars Vatten, Aki S. Havulinna, Frank Skorpen, Kirsti Kvaløy et al (2011-11-13):

Common genetic variants have been shown to explain a fraction of the inherited variation for many common diseases and quantitative traits, including height, a classic polygenic trait. The extent to which common variation determines the phenotype of highly heritable traits such as height is uncertain, as is the extent to which common variation is relevant to individuals with more extreme phenotypes. To address these questions, we studied 1,214 individuals from the top and bottom extremes of the height distribution (tallest and shortest ~1.5%), drawn from ~78,000 individuals from the HUNT and FINRISK cohorts. We found that common variants still influence height at the extremes of the distribution: common variants (49⁄141) were nominally associated with height in the expected direction more often than is expected by chance (p < 5×10⁻²⁸), and the odds ratios in the extreme samples were consistent with the effects estimated previously in population-based data. To examine more closely whether the common variants have the expected effects, we calculated a weighted allele score (WAS), which is a weighted prediction of height for each individual based on the previously estimated effect sizes of the common variants in the overall population. The average WAS is consistent with expectation in the tall individuals, but was not as extreme as expected in the shortest individuals (p < 0.006), indicating that some of the short stature is explained by factors other than common genetic variation. The discrepancy was more pronounced (p < 10⁻⁶) in the most extreme individuals (height < 0.25 percentile). The results at the extreme short tails are consistent with a large number of models incorporating either rare genetic non-additive or rare non-genetic factors that decrease height. We conclude that common genetic variants are associated with height at the extremes as well as across the population, but that additional factors become more prominent at the shorter extreme.

Author Summary: Although there are many loci in the human genome that have been discovered to be statistically-significantly associated with height, it is unclear if these loci have similar effects in extremely tall and short individuals. Here, we examine hundreds of extremely tall and short individuals in two population-based cohorts to see if these known height determining loci are as predictive as expected in these individuals. We found that these loci are generally as predictive of height as expected in these individuals but that they begin to be less predictive in the most extremely short individuals. We showed that this result is consistent with models that not only include the common variants but also multiple low frequency genetic variants that substantially decrease height. However, this result is also consistent with non-additive genetic effects or rare non-genetic factors that substantially decrease height. This finding suggests the possibility of a major role of low frequency variants, particularly in individuals with extreme phenotypes, and has implications on whole-genome or whole-exome sequencing efforts to discover rare genetic variation associated with complex traits.
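The weighted allele score (WAS) described in the abstract above can be illustrated with a minimal sketch. It assumes the score is a plain sum of per-variant effect sizes multiplied by the number of height-increasing alleles carried; the function name, the effect sizes, and the genotypes are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def weighted_allele_score(dosages: Sequence[float], effects: Sequence[float]) -> float:
    """Sum over variants of (allele count 0/1/2) x (previously estimated effect size)."""
    if len(dosages) != len(effects):
        raise ValueError("dosages and effects must have the same length")
    return sum(d * b for d, b in zip(dosages, effects))


# Hypothetical per-allele effect sizes for three variants (illustrative only).
effects = [0.5, 0.25, 1.0]

# Hypothetical genotypes: counts of the height-increasing allele at each variant.
tall_person = [2, 2, 1]
short_person = [0, 1, 0]

tall_score = weighted_allele_score(tall_person, effects)
short_score = weighted_allele_score(short_person, effects)
print(tall_score, short_score)
```

Under this assumption, comparing the average score observed in an extreme group with the score expected from population-based effect sizes is the kind of check the abstract reports for the tallest and shortest individuals.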

“Population-based Carrier Screening for Cystic Fibrosis in Victoria: The First 3 Years Experience”, Massie et al 2009

2009-massie.pdf: “Population-based carrier screening for cystic fibrosis in Victoria: The first 3 years experience”, John Massie, Vicki Petrou, Robyn Forbes, Lisette Curnow, Liane Ioannou, Desiree Dusart, Agnes Bankier et al (2009-09-24):

Background: Cystic fibrosis (CF) is the most common inherited, life-shortening condition affecting Australian children. The carrier frequency is 1⁄25 and most babies with CF are born to parents with no family history. Carrier testing is possible before a couple has an affected infant.

Aims: To report the outcomes of a carrier screening program for CF.

Method: Carrier screening was offered to women and couples planning a pregnancy, or in early pregnancy, through obstetricians and general practitioners in Victoria, Australia. Samples were collected by cheek swab and posted to the laboratory. 12 CFTR gene mutations were tested. Carriers were offered genetic counselling and partner testing. Carrier couples were offered prenatal testing by chorionic villous sampling (CVS) if pregnant. The number of people tested, carriers detected and pregnancy outcomes were recorded from January 2006 to December 2008.

Results: A total of 3,200 individuals were screened (3,000 females). 106 carriers were identified (1⁄30, 95% confidence interval 1⁄25–1⁄36). All carrier partners were screened, and 9 carrier couples identified (total carriers, 115). 96 individuals (83%) were carriers of the p.508del mutation. Of the 9 carrier couples, 6 were pregnant at the time of screening (5 natural conception and 1 in vitro fertilisation) and all had CVS (mean gestation 12.5 weeks). 2 fetuses were affected, 3 were carriers and 1 was not a carrier. Termination of pregnancy was undertaken for the affected fetuses.

Conclusion: Carrier screening for CF by obstetricians and general practitioners by cheek swab sample can be successfully undertaken prior to pregnancy or in the early stages of pregnancy.
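The carrier-frequency figures in the Results above can be rechecked with a short sketch. The abstract does not say how its confidence interval was computed; the Wilson score interval used here is an assumption, and the function and variable names are hypothetical.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion k/n (z = 1.96 for ~95%)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


screened, carriers = 3200, 106           # figures from the abstract
lo, hi = wilson_interval(carriers, screened)

# Express frequencies as "1 in N", as the abstract does.
print(f"point estimate: 1 in {screened / carriers:.1f}")
print(f"approx. 95% interval: 1 in {1 / hi:.1f} to 1 in {1 / lo:.1f}")

# Carriers found among partners bring the total to 115; 96 carried the common mutation.
total_carriers = carriers + 9
common_share = 96 / total_carriers
print(f"share with the common mutation: {common_share:.0%}")
```

Under the Wilson assumption the interval comes out close to the reported 1⁄25–1⁄36; other interval methods would give slightly different bounds.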

“An Expressed fgf4 Retrogene Is Associated With Breed-defining Chondrodysplasia in Domestic Dogs”, Parker et al 2009

“An expressed fgf4 retrogene is associated with breed-defining chondrodysplasia in domestic dogs”, Heidi G. Parker, Bridgett M. VonHoldt, Pascale Quignon, Elliott H. Margulies, Stephanie Shao, Mosher et al (2009-08-21):

Retrotransposition of processed mRNAs is a common source of novel sequence acquired during the evolution of genomes. Although the vast majority of retroposed gene copies, or retrogenes, rapidly accumulate debilitating mutations that disrupt the reading frame, a small percentage become new genes that encode functional proteins. By using a multibreed association analysis in the domestic dog, we demonstrate that expression of a recently acquired retrogene encoding fibroblast growth factor 4 (fgf4) is strongly associated with chondrodysplasia, a short-legged phenotype that defines at least 19 dog breeds including dachshund, corgi, and basset hound. These results illustrate the important role of a single evolutionary event in constraining and directing phenotypic diversity in the domestic dog.

“The Transcriptional Repressor DEC2 Regulates Sleep Length in Mammals”, He et al 2009

“The transcriptional repressor DEC2 regulates sleep length in mammals”, Ying He, Christopher R. Jones, Nobuhiro Fujiki, Ying Xu, Bin Guo, Jimmy L. Holder, Moritz J. Rossner, Nishino et al (2009):

Sleep deprivation can impair human health and performance. Habitual total sleep time and homeostatic sleep response to sleep deprivation are quantitative traits in humans. Genetic loci for these traits have been identified in model organisms, but none of these potential animal models have a corresponding human genotype and phenotype.

We have identified a mutation in a transcriptional repressor (hDEC2-P385R) that is associated with a human short sleep phenotype. Activity profiles and sleep recordings of transgenic mice carrying this mutation showed increased vigilance time and less sleep time than control mice in a zeitgeber time-dependent and sleep deprivation-dependent manner.

These mice represent a model of human sleep homeostasis that provides an opportunity to probe the effect of sleep on human physical and mental health.

“Genetic Enhancement of Cognition in a Kindred With Cone-rod Dystrophy due to RIMS1 Mutation”, Sisodiya et al 2007

“Genetic enhancement of cognition in a kindred with cone-rod dystrophy due to RIMS1 mutation”, Sanjay M. Sisodiya, Pamela J. Thompson, Anna Need, Sarah E. Harris, Michael E. Weale, Susan E. Wilkie et al (2007):

Background: The genetic basis of variation in human cognitive abilities is poorly understood. RIMS1 encodes a synapse active-zone protein with important roles in the maintenance of normal synaptic function: mice lacking this protein have greatly reduced learning ability and memory function.

Objective: An established paradigm examining the structural and functional effects of mutations in genes expressed in the eye and the brain was used to study a kindred with an inherited retinal dystrophy due to RIMS1 mutation.

Materials and Methods: Neuropsychological tests and high-resolution MRI brain scanning were undertaken in the kindred. In a population cohort, neuropsychological scores were associated with common variation in RIMS1. Additionally, RIMS1 was sequenced in top-scoring individuals. Evolution of RIMS1 was assessed, and its expression in developing human brain was studied.

Results: Affected individuals showed significantly enhanced cognitive abilities across a range of domains. Analysis suggests that factors other than RIMS1 mutation were unlikely to explain enhanced cognition. No association with common variation and verbal IQ was found in the population cohort, and no other mutations in RIMS1 were detected in the highest scoring individuals from this cohort. RIMS1 protein is expressed in developing human brain, but RIMS1 does not seem to have been subjected to accelerated evolution in man.

Conclusions: A possible role for RIMS1 in the enhancement of cognitive function at least in this kindred is suggested. Although further work is clearly required to explore these findings before a role for RIMS1 in human cognition can be formally accepted, the findings suggest that genetic mutation may enhance human cognition in some cases.

“Self-management of Fatal Familial Insomnia. Part 2: Case Report”, Schenkein & Montagna 2006

“Self-management of fatal familial insomnia. Part 2: case report”, Joyce Schenkein, Pasquale Montagna (2006):

Context: Fatal familial insomnia (FFI) is a genetically transmitted neurodegenerative prion disease that incurs great suffering and has neither a treatment nor cure. The clinical literature is devoid of management plans (other than palliative). Part 1 of this article reviews the sparse literature about FFI, including case descriptions. Part 2 describes the efforts of one patient (with the rapid-course Met-Met subtype) who contended with his devastating symptoms and improved the quality of his life.

Design: Interventions were based on the premise that some symptoms may be secondary to insomnia and not a direct result of the disease itself. Strategies (derived by trial and error) were devised to induce sleep and increase alertness. Interventions included vitamin supplementation, narcoleptics, anesthesia, stimulants, sensory deprivation, exercise, light entrainment, growth hormone, and electroconvulsive therapy (ECT).

Results: The patient exceeded the average survival time by nearly 1 year, and during this time (when most patients are totally incapacitated), he was able to write a book and to successfully drive hundreds of miles.

Conclusion: Methods to induce sleep may extend and enhance life during the disease course, although they do not prevent death. It is hoped that some of his methods will inspire further clinical studies.

Threshold model § Liability threshold model


Fatal insomnia


100,000 Genomes Project