2011-villani.pdf: “Heritability and Characteristics of Catnip Response in Two Domestic Cat Populations”, (2011; ):
The domestic cat response to catnip is unique in nature as it represents a repeatable, recognizable behavioral response to an olfactory stimulus that appears to have little evolutionary importance. There is clear variation in response between individual cats, and this has been attributed to genetic factors in the past. These factors are explored in this study using behavioral observation after presentation of catnip to cats in two different research colonies with different environmental and genetic backgrounds. The response trait is defined and Gibbs sampling methods are used to explore a mixed model for the trait to determine genetic effects. Heritabilities obtained in the two colonies for the most important response behaviors, the head-over roll and cheek rub, were 0.511 and 0.794 using spray and dried catnip respectively. No clear Mendelian mode of inheritance was ascertained in either colony. The variation in response behaviors and intensity seen in the two colonies reflects the complex nature of expression of the catnip response, but there is a clear genetic influence on the feline predisposition to responding.
1979-schmidt.pdf: “Impact of valid selection procedures on work-force productivity”, (1979-01-01; ):
Used decision theoretic equations to estimate the impact of the Programmer Aptitude Test (PAT) on productivity if used to select new computer programmers for 1 yr in the federal government and the national economy. A newly developed technique was used to estimate the standard deviation of the dollar value of employee job performance, which in the past has been the most difficult and expensive item of required information. For the federal government and the US economy separately, results are presented for different selection ratios and for different assumed values for the validity of previously used selection procedures. The impact of the PAT on programmer productivity was substantial for all combinations of assumptions. Results support the conclusion that hundreds of millions of dollars in increased productivity could be realized by increasing the validity of selection decisions in this occupation. Similarities between computer programmers and other occupations are discussed. It is concluded that the impact of valid selection procedures on work-force productivity is considerably greater than most personnel psychologists have believed.
1996-lubinski-2.pdf: “Seeing The Forest From The Trees: When Predicting The Behavior Or Status Of Groups, Correlate Means”, (1996; ):
When measures of individual differences are used to predict group performance, the reporting of correlations computed on samples of individuals invites misinterpretation and dismissal of the data. In contrast, if regression equations are used in which the required correlations are computed on bivariate means, as are the distribution statistics, it is difficult to underappreciate or lightly dismiss the utility of psychological predictors.
Given sufficient sample size and linearity of regression, this technique produces cross-validated regression equations that forecast criterion means with almost perfect accuracy. This level of accuracy is provided by correlations approaching unity between bivariate samples of predictor and criterion means, and this holds true regardless of the magnitude of the “simple” correlation (e.g., rxy = 0.20 or rxy = 0.80).
We illustrate this technique empirically using a measure of general intelligence as the predictor and other measures of individual differences and socioeconomic status as criteria. In addition to theoretical applications pertaining to group trends, this methodology also has implications for applied problems aimed at developing policy in numerous fields.
…To summarize, psychological variables generating modest correlations frequently are discounted by those who focus on the magnitude of unaccounted-for criterion variance, large standard errors, and frequent false positive and false negative errors in predicting individuals. Dismissal of modest correlations (and the utility of their regressions) by professionals based on this psychometric-statistical reasoning has spread to administrators, journalists, and legislative policy makers. Some examples of this have been compiled by Dawes (1979, 1988) and Linn (1982). They range from squaring a correlation of 0.345 (i.e., r² = 0.12) and concluding that for 88% of students, “An SAT score will predict their grade rank no more accurately than a pair of dice” (cf. Linn, 1982, p. 280) to evaluating the differential utility of two correlations 0.20 and 0.40 (based on different procedures for selecting graduate students) as “twice of nothing is nothing” (cf. Dawes, 1979, p. 580).
…Tests are used, however, in ways other than the prediction of individuals or of a specific outcome for Johnny or Jane. And policy decisions based on tests frequently have broader implications for individuals beyond those directly involved in the assessment and selection context (see the discussion later in this article). For example, selection of personnel in education, business, industry, and the military focuses on the criterion performance of groups of applicants whose scores on selection instruments differ. Selection psychologists have long made use of modest predictive correlations when the ratio of applicants to openings becomes large. The relation of utility to size of correlation, relative to the selection ratio and base rate for success (if one ignores the test scores), is incorporated in the well-known Taylor-Russell (1939) tables. These tables are examples of how psychological tests have convincingly revealed economic and societal benefits (Hartigan & Wigdor 1989), even when a correlation of modest size remains at center stage. For example, given a base rate of 30% for adequate performance and a predictive validity coefficient of 0.30 within the applicant population, selecting the top 20% on the predictor test will result in 46% of hires ultimately achieving adequate performance (a 16% gain over base rate). To be sure, the prediction for individuals within any group is not strong—about 9% of the variance in job performance. Yet, when training is expensive or time-consuming, this can result in huge savings. For analyses of groups composed of anonymous persons, however, there is a more unequivocal way of illustrating the importance of modest correlations than even the Taylor-Russell tables provide.
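The Taylor-Russell figure quoted above (30% base rate, top 20% selected, validity 0.30 → ~46% adequate performers among hires) can be checked under the standard bivariate-normal assumption. A minimal sketch using only the standard library, integrating the conditional success probability over the selected region:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def Phi_inv(p):
    """Inverse CDF by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def taylor_russell(base_rate, selection_ratio, validity):
    """Expected proportion of successes among those selected, assuming a
    bivariate-normal predictor/criterion with correlation `validity`."""
    zx = Phi_inv(1 - selection_ratio)   # predictor cutoff
    zy = Phi_inv(1 - base_rate)         # criterion cutoff for 'adequate'
    s = math.sqrt(1 - validity ** 2)
    # P(X > zx, Y > zy) = integral over x > zx of phi(x)*P(Y > zy | X = x),
    # evaluated by the trapezoid rule.
    n, top = 4000, 10.0
    h = (top - zx) / n
    joint = 0.0
    for i in range(n + 1):
        x = zx + i * h
        w = 0.5 if i in (0, n) else 1.0
        joint += w * phi(x) * (1 - Phi((zy - validity * x) / s))
    joint *= h
    return joint / selection_ratio

print(round(taylor_russell(0.30, 0.20, 0.30), 2))  # ≈ 0.46, as in the text
```

Setting `validity=0` recovers the 30% base rate, making the gain attributable entirely to the modest correlation.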
Rationale for an Alternative Approach: Applied psychologists discovered decades ago that it is more advantageous to report correlations between a continuous predictor and a dichotomous criterion graphically rather than as a number that varies between zero and one. For example, the correlation (point biserial) of about 0.40 with the pass-fail pilot training criterion and an ability-stanine predictor looks quite impressive when graphed in the manner of Figure 1a. In contrast, in Figure 1b, a scatter plot of a correlation of 0.40 between two continuous measures looks at first glance like the pattern of birdshot on a target. It takes close scrutiny to perceive that the pattern in Figure 1b is not quite circular for the small correlation. Figure 1a communicates the information more effectively than Figure 1b. When the data on the predictive validity of the pilot ability-stanine were presented in the form of Figure 1a (rather than, say, as a scatter plot of a correlation of 0.40; Figure 1b), general officers in recruitment, training, logistics, and operations immediately grasped the importance of the data for their problems. Because the Army Air Forces were an attractive career choice, there were many more applicants for pilot training than could be accommodated and selection was required…A small gain on a criterion for a unit of gain on the predictor, as long as it is predicted with near-perfect accuracy, can have high utility.
“Crunch: Building a better apple”, (2011-11-14):
Profile of the development & launch of the SweeTango apple, a successor to Honeycrisp (via a hybridization with Zestar), developed by the University of Minnesota apple breeding program, which has been running since 1878 and created 27 notable apples (earning Honeycrisp its role as the state fruit).
Breeding programs like that are part of why Americans have historically shifted from consuming hard cider (made with inedible wild-types) to ‘eating apples’, but progress was set back by a drastic decrease in variety to the McIntosh/Golden Delicious/Red Delicious triumvirate—Red Delicious degrading rapidly in quality. The apple revolution began in the 1970s when Granny Smith proved US consumers would buy a better apple, and was followed by the Fuji, Braeburn, and Gala.
How does one breed a new apple? Apples do not breed true and every offspring is wildly different. Apple breeders use brute force and brutally stringent screening: an acre of thousands of saplings will be grown, and in fall, the breeders will walk the rows, grab 1 apple, chew it briefly, spit it out, and mark the tree if good. Any sapling which is marked several years in a row (a few score out of thousands) survives; clones of it will be transplanted elsewhere for further testing, and evaluated similarly for another decade. Only if a new apple tree & its clones pass all these tests will it even be considered for commercialization.
“I’d like to give a tree a couple chances, but I just don’t have the mouth time for that”, Bedford explained. “So it’s one strike and you’re out. With all these new trees coming on each year, you won’t have space unless you thin out the duds.” He sprayed another tree trunk with the mark of death. “But it is kind of nerve-racking, because you want to give the tree a chance to do its best. No one wants to be known as the guy who killed the next Honeycrisp.”
The winner of such a process must be both brilliant and lucky, and Honeycrisp was both. But UMinn breeders watched with dismay as they felt the released Honeycrisp saplings were mistreated or poorly-raised by careless commercial growers, and decided the next apple, SweeTango, would be a “club apple”: it would be fully patented & controlled, and sold only to select apple growers who would be required to follow stringent rules.
The “club apple” business model has proven to be its own revolution by internalizing the costs & benefits, incentivizing the creation of a dizzying variety of new apples reaching the American grocery market every year.
In those early days, the company, just like almost everybody else in Washington, primarily produced Red Delicious apples, plus a few Goldens and Grannies—familiar workhorse varieties that anybody was allowed to grow. Back then, the state apple commission advertised its wares with a poster of a stoplight: one apple each in red, green, and yellow. Today, across more than 4,000 acres of McDougall apple trees, you won’t find a single Red; every year, you’ll also find fewer acres of the apples that McDougall calls “core varieties”, the more modern open-access standards such as Gala and Fuji. Instead, McDougall is betting on what he calls “value-added apples”: Ambrosias, whose rights he licensed from a Canadian company; Envy, Jazz, and Pacific Rose, whose intellectual properties are owned by the New Zealand giant Enzafruit; and a brand-new variety, commercially available for the first time this year and available only to Washington-state growers: the Cosmic Crisp.
…The Cosmic Crisp is debuting in grocery stores after this fall’s harvest, and in the nervous lead-up to the launch, everyone from nursery operators to marketers wanted me to understand the crazy scope of the thing: the scale of the plantings, the speed with which mountains of commercially untested fruit would be arriving on the market, the size of the capital risk. People kept saying things like “unprecedented”, “on steroids”, “off the friggin’ charts”, and “the largest launch of a single produce item in American history.”
McDougall took me to the highest part of his orchard, where we could look down at all its hundreds of very expensively trellised and irrigated acres (he estimated the costs to plant each individual acre at $60,000 to $65,000, plus another $12,000 in operating costs each year), their neat, thin lines of trees like the stitching over so many quilt squares. “If you’re a farmer, you’re a riverboat gambler anyway”, McDougall said. “But Cosmic Crisp—woo!” I thought of the warning of one former fruit-industry journalist that, with so much on the line, the enormous launch would have to go flawlessly: “It’s gotta be like the new iPhone.”
…Though Washington State University owns the WA 38 patent, the breeding program has received funding from the apple industry, so it was agreed, over some objections by people who worried that quality would be diluted, that the variety should be universally and exclusively available to Washington growers. (Growers of Cosmic Crisp pay royalties both on every tree they buy and on every box they sell, money that will fund future breeding projects as well as the shared marketing campaign.) The apple tested so well that WSU, in collaboration with commercial nurseries, began producing apple saplings as fast as possible; the plan was to start with 300,000 trees, but growers requested 4 million, leading to a lottery for divvying up the first available trees. Within three years, the industry had sunk 13 million of them, plus more than half a billion dollars, into the ground. Proprietary Variety Management expects that the number of Cosmic Crisp apples on the market will grow by millions of boxes every year, outpacing Pink Lady and Honeycrisp within about five years of its launch.
2010-kean.pdf: “Besting Johnny Appleseed: With a few tricks, and a lot of patience, fruit geneticists are undoing the work of an American legend”, (2010-04-16; ):
[Review of modern apple breeding techniques: genome sequencing enables selecting on seeds rather than trees by predicting taste & robustness, saving years of delay; this also allows avoiding the ‘GMO’ stigma by crossbreeding (quickly moving genes into new apple trees without direct genetic editing using genomic selection), such as a “fast-flowering gene” to accelerate maturation during evaluation but then select it out for the final tree; the creation of “The Gauntlet”, a greenhouse deliberately stocked with as many pathogens as possible, provides a stress test to weed out weak saplings as quickly as possible; and buds can be cryogenically preserved to cut down storage costs by more than an order of magnitude.]
Until recently, geneticists, their skills honed on Arabidopsis and other quick-breeding flora, avoided fruit-tree research like a blight. Of the 11,000 U.S. field tests on transgenic plants between 1987 and 2004, just 1% focused on fruit trees. That’s partly because of the slow pace. Whereas vegetables like corn might produce two harvests each summer, apple trees need eons—around 5 years—to produce their first fruit, most of which will be disregarded as ugly, bitter, or squishy. But everything in apple breeding is about to change. An Italian team plans to publish the decoded apple genome this summer, and scientists are starting to single out complex genetic markers for taste and heartiness. In some cases the scientists even plan, by inserting genes from other species, to eliminate the barren juvenile stage and push fruit trees to mature rapidly, greatly reducing generation times.
2014-turkheimer.pdf: “Behavior Genetic Research Methods: Testing Quasi-Causal Hypotheses Using Multivariate Twin Data”, Eric Turkheimer, K. Paige Harden
Animal cloning has gained popularity as a method to produce genetically identical animals or superior animals for research or industrial uses. However, the long-standing question of whether a cloned animal undergoes an accelerated aging process is yet to be answered. As a step towards answering this question, we compared longevity and health of Snuppy, the world’s first cloned dog, and its somatic cell donor, Tai, a male Afghan hound. Briefly, both Snuppy and Tai were generally healthy until both developed cancer to which they succumbed at the ages of 10 and 12 years, respectively. The longevity of both the donor and the cloned dog was close to the median lifespan of Afghan hounds which is reported to be 11.9 years. Here, we report creation of 4 clones using adipose-derived mesenchymal stem cells from Snuppy as donor cells. Clinical and molecular follow-up of these reclones over their lives will provide us with a unique opportunity to study the health and longevity of cloned animals compared with their cell donors.
2014-choi.pdf: “Behavioral Analysis of Cloned Puppies Derived from an Elite Drug-Detection Dog”, Jin Choi, Ji Hyun Lee, Hyun Ju Oh, Min Jung Kim, Geon A. Kim, Eun Jung Park, Young Kwang Jo, Sang Im Lee, Do Gyo Hong, Byeong Chun Lee
2007-maejima.pdf: “Traits and genotypes may predict the successful training of drug detection dogs”, (2007; ):
In Japan, approximately 30% of dogs that enter training programs to become drug detection dogs successfully complete training. To clarify factors related to the aptitude of drug detection dogs and develop an assessment tool, we evaluated genotypes and behavioural traits of 197 candidate dogs. The behavioural traits were evaluated within 2 weeks from the start of training and included general activity, obedience training, concentration, affection demand, aggression toward dogs, anxiety, and interest in target. Principal components analysis of these ratings yielded two components: Desire for Work and Distractibility. Desire for Work was statistically-significantly related to successful completion of training (p < 0.001). Since 93.3% of dogs that passed training and 53.3% of the dogs that failed training had Desire for Work scores of 45 or higher, we will be able to reject about half of inappropriate dogs before 3 months of training by adopting this cut-off point. We also surveyed eight polymorphic regions of four genes that have been related to human personality dimensions. Genotypes were not related to whether dogs passed, but there was a weak relationship between Distractibility and a 5HTT haplotype (p < 0.05).
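The screening arithmetic implied by the abstract can be made explicit (the ~30% pass rate and the two retention percentages are from the abstract; the derived figures below are our own arithmetic):

```python
# Proportions reported for the "Desire for Work >= 45" cutoff:
base_pass = 0.30          # ~30% of entrants complete training
keep_if_pass = 0.933      # 93.3% of eventual passers score >= 45
keep_if_fail = 0.533      # 53.3% of eventual failers score >= 45

rejected_failers = 1 - keep_if_fail                       # "about half" of inappropriate dogs
retained = base_pass * keep_if_pass + (1 - base_pass) * keep_if_fail
pass_rate_after = base_pass * keep_if_pass / retained     # pass rate among dogs kept

print(round(rejected_failers, 3))   # 0.467
print(round(retained, 3))           # 0.653 of candidates continue training
print(round(pass_rate_after, 3))    # 0.429: pass rate rises from 30% to ~43%
```

So the cutoff nearly halves the number of dogs entering the expensive 3-month training while sacrificing only ~7% of eventual passers.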
1997-weiss.pdf: “Service dog selection tests: Effectiveness for dogs from animal shelters”, Emily Weiss, Gary Greenberg
2018-oh.pdf: “The promise of dog cloning”, (2018-01-01; ):
Dog cloning as a concept is no longer infeasible. Starting with Snuppy, the first cloned dog in the world, somatic cell nuclear transfer (SCNT) has been continuously developed and used for diverse purposes. In this article we summarise the current method for SCNT, the normality of cloned dogs and the application of dog cloning not only for personal reasons, but also for public purposes.
2019-05-06-theexpresstribune-80percentofsouthkoreassnifferdogsarecloned.html: “Amid animal cruelty debate, 80% of South Korea’s sniffer dogs are cloned”, (2019-05-06; ):
Some 80% of active sniffer dogs deployed by South Korea’s quarantine agency are cloned, data showed Monday, as activists express their concerns over potential animal abuse. According to the Animal and Plant Quarantine Agency, 42 of its 51 sniffer dogs were cloned from parent animals as of April, indicating such cloned detection dogs are already making substantial contributions to the country’s quarantine activities. The number of cloned dogs first outpaced their naturally born counterparts in 2014, the agency said. Of the active cloned dogs, 39 are currently deployed at Incheon International Airport, the country’s main gateway.
Deploying cloned dogs can save time and money over training naturally born puppies as they maintain the outstanding traits of their parents, whose capabilities have already been verified in the field, according to experts. While the average cost of raising one detection dog is over 100 million won (US$85,600), it is less than half that when utilising cloned puppies, they said.
2010-sinn.pdf: “Personality and performance in military working dogs: Reliability and predictive validity of behavioral tests”, (2010-10-01; ):
Quantification and description of individual differences in behavior, or personality differences, is now well-established in the working dog literature. What is less well-known is the predictive relationship between particular dog behavioral traits (if any) and important working outcomes.
Here we evaluate the validity of a dog behavioral test instrument given to military working dogs (MWDs) from the 341st Training Squadron, USA Department of Defense (DoD); the test instrument has been used historically to select dogs to be trained for deployment.
A 15-item instrument was applied on three separate occasions prior to training in patrol and detection tasks, after which dogs were given patrol-only, detection-only, or dual-certification status. On average, inter-rater reliability for all 15 items was high (mean = 0.77), but within this overall pattern, some behavioral items showed lower inter-rater reliability at some time points (<0.40). Test-retest reliability for most (but not all) single item behaviors was strong (>0.50) across shorter test intervals, but decreased with increasing test interval (<0.40). Principal components analysis revealed four underlying dimensions that summarized test behavior, termed here ‘object focus’, ‘sharpness’, ‘human focus’, and ‘search focus’. These four aggregate behavioral traits also had the same pattern of short-term, but not long-term test-retest reliability as that observed for single item behaviors.
Prediction of certification outcomes using an independent test data set revealed that certification outcomes could not be predicted by breed, sex, or early test behaviors. However, prediction was improved by models that included two aggregate behavioral trait scores and three single-item behaviors measured at the final test period, with 1-unit increases in these scores resulting in 1.7–2.8× increased odds of successful dual-certification and patrol-only certification outcomes. No improvements to odor-detection certification outcomes were made by any model. While only modest model improvements in prediction error were made by using behavioral parameters (2–7%), model predictions were based only on data from dogs that had successfully completed all three test periods, and therefore did not include data from dogs that were rejected during testing or training due to behavioral or medical reasons.
Thus, future improvements to predictive models may be more substantial using independent predictors with less restriction of range. Reports of the reliability and validity estimates of behavioral instruments currently used to select MWDs are scarce, and we discuss these results in terms of improving the efficiency by which working dog programs may select dogs for patrol and odor-detection duties using behavioral pre-screening instruments.
[Keywords: military dog, personality, reliability, predictive validity, behavioral instrument]
1997-wilsson.pdf: “The use of a behaviour test for the selection of dogs for service and breeding, I: Method of testing and evaluating test results in the adult dog, demands on different kinds of service dogs, sex and breed differences”, Erik Wilsson, Per-Erik Sundgren
Variation across dog breeds presents a unique opportunity for investigating the evolution and biological basis of complex behavioral traits. We integrated behavioral data from more than 17,000 dogs from 101 breeds with breed-averaged genotypic data (n = 5,697 dogs) from over 100,000 loci in the dog genome. Across 14 traits, we found that breed differences in behavior are highly heritable, and that clustering of breeds based on behavior accurately recapitulates genetic relationships. We identify 131 single nucleotide polymorphisms associated with breed differences in behavior, which are found in genes that are highly expressed in the brain and enriched for neurobiological functions and developmental processes. Our results provide insight into the heritability and genetic architecture of complex behavioral traits, and suggest that dogs provide a powerful model for these questions.
2012-duffy.pdf: “Predictive validity of a method for evaluating temperament in young guide and service dogs”, Deborah L. Duffy, James A. Serpell
“Dog Behavior Co-Varies with Height, Bodyweight and Skull Shape”, (2013-10-14):
Dogs offer unique opportunities to study correlations between morphology and behavior because skull shapes and body shapes are so diverse among breeds. Several studies have shown relationships between canine cephalic index (CI: the ratio of skull width to skull length) and neural architecture. Data on the CI of adult, show-quality dogs (six males and six females) were sourced in Australia along with existing data on the breeds’ height and bodyweight, and related to data on 36 behavioral traits of companion dogs (n = 8,301) of various common breeds (n = 49) collected internationally using the Canine Behavioral Assessment and Research Questionnaire (C-BARQ). Stepwise backward elimination regressions revealed that, across the breeds, 33 behavioral traits, all but one of which are undesirable in companion animals, correlated with either height alone (n = 14), bodyweight alone (n = 5), CI alone (n = 3), bodyweight-and-skull-shape combined (n = 2), height-and-skull-shape combined (n = 3) or height-and-bodyweight combined (n = 6). For example, breed average height showed strongly statistically-significant inverse relationships (p < 0.001) with mounting persons or objects, touch sensitivity, urination when left alone, dog-directed fear, separation-related problems, non-social fear, defecation when left alone, owner-directed aggression, begging for food, urine marking and attachment/attention-seeking, while bodyweight showed strongly statistically-significant inverse relationships (p < 0.001) with excitability and being reported as hyperactive. Apart from trainability, all regression coefficients with height were negative, indicating that, across the breeds, behavior becomes more problematic as height decreases. Allogrooming increased strongly (p < 0.001) with CI and inversely with height. CI alone showed a strong positive relationship with self-grooming (p < 0.001) but a negative relationship with chasing (p = 0.020).
The current study demonstrates how aspects of skull shape (and therefore brain shape), bodyweight and height co-vary with behavior. The biological basis for, and statistical-significance of, these associations remain to be determined.
2019-horschler.pdf: “Absolute brain size predicts dog breed differences in executive function”, (2019-01-03; ):
Large-scale phylogenetic studies of animal cognition have revealed robust links between absolute brain volume and species differences in executive function. However, past comparative samples have been composed largely of primates, which are characterized by evolutionarily derived neural scaling rules. Therefore, it is currently unknown whether positive associations between brain volume and executive function reflect a broad-scale evolutionary phenomenon, or alternatively, a unique consequence of primate brain evolution. Domestic dogs provide a powerful opportunity for investigating this question due to their close genetic relatedness, but vast intraspecific variation. Using citizen science data on more than 7000 purebred dogs from 74 breeds, and controlling for genetic relatedness between breeds, we identify strong relationships between estimated absolute brain weight and breed differences in cognition. Specifically, larger-brained breeds performed statistically-significantly better on measures of short-term memory and self-control. However, the relationships between estimated brain weight and other cognitive measures varied widely, supporting domain-specific accounts of cognitive evolution. Our results suggest that evolutionary increases in brain size are positively associated with taxonomic differences in executive function, even in the absence of primate-like neuroanatomy. These findings also suggest that variation between dog breeds may present a powerful model for investigating correlated changes in neuroanatomy and cognition among closely related taxa.
2015-hradecka.pdf: “Heritability of behavioural traits in domestic dogs: A meta-analysis”, Lenka Hradecká, Luděk Bartoš, Ivona Svobodová, James Sales
1957-shockley.pdf: “On the Statistics of Individual Variations of Productivity in Research Laboratories”, (1957; ):
It is well-known that some workers in scientific research laboratories are enormously more creative than others. If the number of scientific publications is used as a measure of productivity, it is found that some individuals create new science at a rate at least 50 times greater than others. Thus differences in rates of scientific production are much bigger than differences in the rates of performing simpler acts, such as the rate of running the mile, or the number of words a man can speak per minute. On the basis of statistical studies of rates of publication, it is found that it is more appropriate to consider not simply the rate of publication but its logarithm. The logarithm appears to have a normal distribution over the population of typical research laboratories. The existence of a “log-normal distribution” suggests that the logarithm of the rate of production is a manifestation of some fairly fundamental mental attribute. The great variation in rate of production from one individual to another can be explained on the basis of simplified models of the mental processes concerned. The common feature in the models is that a large number of factors are involved so that small changes in each, all in the same direction, may result in a very large [multiplicative] change in output. For example, the number of ideas a scientist can bring into awareness at one time may control his ability to make an invention and his rate of invention may increase very rapidly with this number.
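Shockley's multiplicative model is easy to simulate (the number of factors and their per-factor spread below are illustrative assumptions, not his estimates): multiplying several independently varying factors yields a log-normal output distribution with a huge top-to-bottom ratio, even though no single factor varies dramatically.

```python
import math
import random

random.seed(0)

def productivity(n_factors=8, sigma=0.4):
    """Output as the *product* of many independent factors (idea fluency,
    judgment, persistence, ...): modest per-factor variation compounds."""
    return math.prod(random.lognormvariate(0, sigma) for _ in range(n_factors))

pop = sorted(productivity() for _ in range(100_000))
p05, p50, p95 = pop[5_000], pop[50_000], pop[95_000]
print(f"5th pct {p05:.2f} | median {p50:.2f} | 95th pct {p95:.2f}")
print(f"95th/5th ratio: {p95 / p05:.0f}x")  # a spread of tens of times
```

Because the log of a product is a sum, the central limit theorem makes log-output approximately normal, matching the log-normal publication rates Shockley observed.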
1982-goddard.pdf: “Genetic and environmental factors affecting the suitability of dogs as Guide Dogs for the Blind”, M. E. Goddard, R. G. Beilharz
“Theory of Index Selection”, (1997-08-04):
While Chapters 28 and 29 present the basic theory for multivariate response, how, in practice, does one perform artificial selection on multiple traits? One of the commonest schemes is to construct some sort of index, wherein the investigator assigns (either explicitly or implicitly) a weighting scheme to each trait, creating a univariate character that becomes the target of selection. For example, if z is the vector of character values measured in an individual, the most common index is a linear combination I = ∑ᵢbᵢzᵢ = bᵀz, and most of our discussion focuses on such linear indices. We start with a general review of the theory of selection on a linear index and then cover in great detail the Smith-Hazel index (the index giving the largest expected response in a specified linear combination of characters) and its extensions. We also discuss a number of other indices for different purposes, such as restricted (constraining changes in specified traits) and desired-gains (specifying how the components, rather than the index, will evolve) indices. We conclude our discussion of index selection by considering how to best handle nonlinear indices. We finish the chapter by examining the other approach for selecting on multiple traits, namely choosing traits sequentially. Tandem selection, focusing on a single trait each generation (where the focal trait changes over generations) is one such approach, while the other is to select different traits at different times within the life span of single individuals (independent culling and multistage index selection).
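A minimal sketch of the Smith-Hazel weights (the toy two-trait covariances below are our own; the formula b = P⁻¹Ga, with P the phenotypic covariance matrix, G the genetic covariance matrix, and a the vector of economic weights, is the standard result):

```python
def inv2(m):
    """Inverse of a 2×2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def smith_hazel(P, G, a):
    """Smith-Hazel index weights b = P^{-1} G a."""
    return matvec(inv2(P), matvec(G, a))

# Hypothetical two-trait example with equal economic weights:
P = [[1.0, 0.3], [0.3, 1.0]]    # phenotypic covariance
G = [[0.4, 0.1], [0.1, 0.25]]   # genetic covariance
b = smith_hazel(P, G, [1.0, 1.0])
print([round(bi, 3) for bi in b])  # [0.434, 0.22]
```

Selecting on I = b₁z₁ + b₂z₂ with these weights maximizes the expected response in the economic aggregate a₁g₁ + a₂g₂; note the weights down-weight the second trait because of its lower genetic variance.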
“Applications of Index Selection”, (1997-08-04):
The first topic, which consists of the bulk of this chapter, is using index selection to improve a single trait. One can have a number of measures of the same trait in either relatives of a focal individual or as multiple measures of the same trait in a single individual, or both. How does one best use this information? We start by developing the general theory for using an index to improve the response in a single trait (which follows as a simplification of the Smith-Hazel index). We then apply these results to several important cases—a general analysis when either phenotypic or genotypic correlations are zero, improving response using repeated measurements of a character over time, and using information from relatives to improve response with a special focus on combined selection (the optimal weighting of individual and family information, proving many of the details first presented in Chapter 17). As we will see in Chapter 35, the mixed-model power of BLUP provides a better solution to many of these problems, but index selection is both historically important as well as providing clean analytic results. In contrast to the first topic, the final three are essentially independent of each other and we try to present them as such (so that the reader can simply turn to the section of interest without regard to previous material in this chapter). They include selection on a ratio, selection on sex-specific and sexually-dimorphic traits, and finally selection on the environmental variance σ²E when it shows heritable variation (expanding upon results from Chapter 13).
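The repeated-measurements case has a clean closed form worth illustrating (standard repeatability model; the h² and t values below are illustrative): the accuracy of the mean of n records as a predictor of breeding value is √(n·h² / (1 + (n−1)t)), where h² is the heritability and t ≥ h² is the repeatability.

```python
import math

def accuracy(n, h2, t):
    """Accuracy of the mean of n repeated records as a predictor of
    breeding value, with heritability h2 and repeatability t (t >= h2)."""
    return math.sqrt(n * h2 / (1 + (n - 1) * t))

# Diminishing returns: accuracy is bounded above by sqrt(h2 / t).
for n in (1, 2, 5, 10, 100):
    print(n, round(accuracy(n, h2=0.25, t=0.5), 3))
print("limit:", round(math.sqrt(0.25 / 0.5), 3))
```

Extra records help most at small n; the permanent-environment component (the gap between t and h²) caps how much averaging can ever recover, which is why information from relatives (or BLUP) becomes attractive beyond a few records.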
2016-oh.pdf: “Propagation of elite rescue dogs by somatic cell nuclear transfer”, Hyun Ju Oh, Jin Choi, Min Jung Kim, Geon A. Kim, Young Kwang Jo, Yoo Bin Choi, Byeong Chun Lee
1999-lemish-wardogs.pdf: “War Dogs: A History of Loyalty and Heroism”, Michael G. Lemish
2015-polderman.pdf: “Meta-analysis of the heritability of human traits based on fifty years of twin studies”, (2015-05-18; ):
Despite a century of research on complex traits in humans, the relative importance and specific nature of the influences of genes and environment on human traits remain controversial. We report a meta-analysis of twin correlations and reported variance components for 17,804 traits from 2,748 publications including 14,558,903 partly dependent twin pairs, virtually all published twin studies of complex traits. Estimates of heritability cluster strongly within functional domains, and across all traits the reported heritability is 49%. For a majority (69%) of traits, the observed twin correlations are consistent with a simple and parsimonious model where twin resemblance is solely due to additive genetic variation. The data are inconsistent with substantial influences from shared environment or non-additive genetic variation. This study provides the most comprehensive analysis of the causes of individual differences in human traits thus far and will guide future gene-mapping efforts. All the results can be visualized using the MaTCH webtool.
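The finding that twin correlations fit a solely additive model can be made concrete with Falconer's classical formulas, which back out variance components from MZ and DZ twin correlations (the correlations below are illustrative, not values from the paper):

```python
# Falconer's formulas under the classical ACE twin model:
#   h2 (additive genetic) ~= 2 * (rMZ - rDZ)
#   c2 (shared environment) ~= 2 * rDZ - rMZ
#   e2 (unique environment/residual) = 1 - rMZ
def falconer(r_mz, r_dz):
    h2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return h2, c2, e2

# "Twin resemblance solely due to additive genetic variation" corresponds to
# rMZ ~= 2 * rDZ, which makes c2 ~= 0:
h2, c2, e2 = falconer(0.50, 0.25)  # h2=0.5, c2=0.0, e2=0.5
print(h2, c2, e2)
```

The paper's headline figure (reported heritability ~49% across all traits, with ~69% of traits consistent with c² ≈ 0) is the aggregate version of this calculation over 17,804 traits.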
1997-wilsson.pdf: “The use of a behaviour test for selection of dogs for service and breeding. II. Heritability for tested parameters and effect of selection based on service dog characteristics”, Erik Wilsson, Per-Erik Sundgren
1985-mackenzie.pdf: “Heritability estimate for temperament scores in German shepherd dogs and its genetic correlation with hip dysplasia”, Stephen A. Mackenzie, Elizabeth A. B. Oltenacu, Eldin Leighton
1996-karjalainen.pdf: “Environmental effects and genetic parameters for measurements of hunting performance in the Finnish Spitz”, L. Karjalainen, M. Ojala, V. Vilva
2017-vandenberg.pdf: “Genetics of dog behavior”, Linda van den Berg
1986-mackenzie.pdf: “Canine Behavioral Genetics - A Review”, Stephen A. Mackenzie, E. A. B. Oltenacu, K. A. Houpt
1995-willis.pdf: “Genetic aspects of dog behaviour with particular reference to working ability”, M. B. Willis
2001-houpt.pdf: “Genetics of Behaviour”, Katherine A. Houpt, M. B. Willis
2003-takeuchi.pdf: “Behavior genetics”, Yukari Takeuchi, Katherine A. Houpt
2003-vandenberg.pdf: “Behavior genetics of canine aggression: behavioral phenotyping of golden retrievers by means of an aggression test”
2006-vandenberg.pdf: “Phenotyping of Aggressive Behavior in Golden Retriever Dogs with a Questionnaire”, L. van den Berg, M. B. H. Schilder, H. de Vries, P. A. J. Leegwater, B. A. van Oost
2007-liinamo.pdf: “Genetic variation in aggression–related traits in Golden Retriever dogs”, (2007-04-01; ):
In this study, heritabilities of several measures of aggression were estimated in a group of 325 Golden Retrievers, using the Restricted Maximum Likelihood method. The studied measures were obtained either through owner opinions or by using the Canine Behavioural Assessment and Research Questionnaire (CBARQ). The aim of the study was to determine which of the aggression measures showed sufficient genetic variation to be useful as phenotypes for future molecular genetic studies on aggression in this population.
The most reliable heritability estimates seemed to be those for simple dog owner impressions of human-directed and dog-directed aggression, with heritability estimates of 0.77 (S.E. 0.09) and 0.81 (S.E. 0.09), respectively. In addition, several CBARQ-derived measures related to human-directed aggression showed clear genetic differences between the dogs. The correlation between the estimated breeding values for owner impressions on human-directed and dog-directed aggression was relatively low. The low correlation suggests that these two traits have a partially different genetic background. They will therefore have to be treated as separate traits in further genetic studies.
[Keywords: dogs, aggressive behaviour, questionnaire, heritability, estimated breeding values]
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ~2,000, ~3,700 and ~9,500 SNPs explained ~21%, ~24% and ~29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
“Accurate Genomic Prediction Of Human Height”, (2017-10-07):
We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction. The variance captured for height is comparable to the estimated SNP heritability from GCTA (GREML) analysis, and seems to be close to its asymptotic value (i.e., as sample size goes to infinity), suggesting that we have captured most of the heritability for the SNPs used. Thus, our results resolve the common portion of the “missing heritability” problem—i.e., the gap between prediction R-squared and heritability. The ~20k activated SNPs in our height predictor reveal the genetic architecture of human height, at least for common SNPs. Our primary dataset is the UK Biobank cohort, comprised of almost 500k individual genotypes with multiple phenotypes. We also use other datasets and SNPs found in earlier GWAS for out-of-sample validation of our results.
Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional enrichments to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 16 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N = 365K) and samples of other European ancestries as validation data (avg N = 22K), to minimize confounding. LDpred-funct attained a +27% relative improvement in prediction accuracy (avg prediction R2 = 0.173; highest R2 = 0.417 for height) compared to existing methods that do not incorporate functional information, consistent with simulations. For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N = 1107K; higher heritability in the UK Biobank cohort) increased prediction R2 to 0.429. Our results show that modeling functional enrichment substantially improves polygenic prediction accuracy, bringing polygenic prediction of complex traits closer to clinical utility.
“Recovery of trait heritability from whole genome sequence data”, (2019-03-25):
Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data, but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be fully recovered from whole-genome sequence (WGS) data on 21,620 unrelated individuals of European ancestry. We assigned 47.1 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned variation accordingly. The estimated heritability was 0.79 (SE 0.09) for height and 0.40 (SE 0.09) for BMI, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein altering variants, consistent with negative selection thereon. Cumulatively, variants in the MAF range of 0.0001 to 0.1 explained 0.54 (SE 0.05) and 0.51 (SE 0.11) of heritability for height and BMI, respectively. Our results imply that the still missing heritability of complex traits and disease is accounted for by rare variants, in particular those in regions of low LD.
2015-gianola.pdf: “One Hundred Years of Statistical Developments in Animal Breeding”, (2014-11-03; ):
Statistical methodology has played a key role in scientific animal breeding. Approximately one hundred years of statistical developments in animal breeding are reviewed. Some of the scientific foundations of the field are discussed, and many milestones are examined from historical and critical perspectives. The review concludes with a discussion of some future challenges and opportunities arising from the massive amount of data generated by livestock, plant, and human genome projects.
“Mixed model association for biobank-scale data sets”, (2018-01-04):
Biobank-based genome-wide association studies are enabling exciting insights in complex trait genetics, but much uncertainty remains over best practices for optimizing statistical power and computational efficiency while controlling confounders. Here, we introduce a much faster version of our BOLT-LMM Bayesian mixed model association method—capable of running analyses of the full cohort in a few days on a single compute node—and show that it produces highly powered, robust test statistics when run on all 459K European samples (retaining related individuals). When used to conduct a GWAS for height in UK Biobank, BOLT-LMM achieved power equivalent to linear regression on 650K samples—a 93% increase in effective sample size versus the common practice of analyzing unrelated British samples using linear regression (UK Biobank documentation; Bycroft et al bioRxiv). Across a broader set of 23 highly heritable traits, the total number of independent loci detected increased from 5,839 to 10,759, an 84% increase. We recommend the use of BOLT-LMM (retaining related individuals) for biobank-scale analyses, and we have publicly released BOLT-LMM summary association statistics for the 23 traits analyzed as a resource for all researchers.
Polygenic risk scores are emerging as a potentially powerful tool to predict future phenotypes of target individuals, typically using unrelated individuals, thereby devaluing information from relatives. Here, for 50 traits from the UK Biobank data, we show that a design of 5,000 individuals with first-degree relatives of target individuals can achieve a prediction accuracy similar to that of around 220,000 unrelated individuals (mean prediction accuracy = 0.26 vs. 0.24, mean fold-change = 1.06 (95% CI: 0.99–1.13), p = 0.08), despite a 44-fold difference in sample size. For lifestyle traits, the prediction accuracy with 5,000 individuals including first-degree relatives of target individuals is statistically-significantly higher than that with 220,000 unrelated individuals (mean prediction accuracy = 0.22 vs. 0.16, mean fold-change = 1.40 (95% CI: 1.17–1.62), p = 0.025). Our findings suggest that polygenic prediction integrating family information may help to accelerate precision health and clinical intervention.
…We demonstrated that the polygenic prediction utilising close relatives between reference and target samples outperformed the analyses with unrelated individuals only by using the small-scale design. Compared with the analyses with second-degree or third-degree relatives, or unrelated individuals, a higher prediction accuracy was observed from the analysis with first-degree relatives, which was because of a lower value of Me that required fewer independent parameters to be estimated. Moreover, this higher prediction accuracy was also probably due to the fact that close relatives share some unknown (unmodeled) factors in addition to additive genetic effects, which may be dominance, gene-by-family interaction and familial environmental effects. It was also shown that the analyses with second-degree and third-degree relatives outperformed the analysis with unrelated individuals, although they were less efficient in improving the prediction accuracy compared to first-degree relatives.
The approach of including close relatives will be most useful in applications where accuracy matters more than delineating between causal genetic effects and other effects. It is known that family-based heritability estimates can be inflated if nonadditive genetic effects or common environmental effects shared between close relatives are confounded with additive genetic effects, which can be considered biased according to the concept of narrow-sense heritability that includes the additive genetic effects only. However, this bias should not be an issue when predicting the future phenotypes of a target sample (i.e., a newborn baby) because such nonadditive genetic and common environmental effects can be a valuable source to improve the prediction accuracy. Indeed, family history has been widely used as a biomarker to predict disease risk, and it can also be used to increase the power to identify causal variants in GWAS. We consider that our method is a more systematic approach to utilise information of family history as well as within-family segregation.
Common genetic variants have been shown to explain a fraction of the inherited variation for many common diseases and quantitative traits, including height, a classic polygenic trait. The extent to which common variation determines the phenotype of highly heritable traits such as height is uncertain, as is the extent to which common variation is relevant to individuals with more extreme phenotypes. To address these questions, we studied 1,214 individuals from the top and bottom extremes of the height distribution (tallest and shortest ~1.5%), drawn from ~78,000 individuals from the HUNT and FINRISK cohorts. We found that common variants still influence height at the extremes of the distribution: common variants (49⁄141) were nominally associated with height in the expected direction more often than is expected by chance (p < 5×10−28), and the odds ratios in the extreme samples were consistent with the effects estimated previously in population-based data. To examine more closely whether the common variants have the expected effects, we calculated a weighted allele score (WAS), which is a weighted prediction of height for each individual based on the previously estimated effects of the common variants in the overall population. The average WAS is consistent with expectation in the tall individuals, but was not as extreme as expected in the shortest individuals (p < 0.006), indicating that some of the short stature is explained by factors other than common genetic variation. The discrepancy was more pronounced (p < 10−6) in the most extreme individuals (height < 0.25 percentile). The results at the extreme short tails are consistent with a large number of models incorporating either rare genetic non-additive or rare non-genetic factors that decrease height. We conclude that common genetic variants are associated with height at the extremes as well as across the population, but that additional factors become more prominent at the shorter extreme.
Author Summary: Although there are many loci in the human genome that have been discovered to be statistically-significantly associated with height, it is unclear if these loci have similar effects in extremely tall and short individuals. Here, we examine hundreds of extremely tall and short individuals in two population-based cohorts to see if these known height determining loci are as predictive as expected in these individuals. We found that these loci are generally as predictive of height as expected in these individuals but that they begin to be less predictive in the most extremely short individuals. We showed that this result is consistent with models that not only include the common variants but also multiple low frequency genetic variants that substantially decrease height. However, this result is also consistent with non-additive genetic effects or rare non-genetic factors that substantially decrease height. This finding suggests the possibility of a major role of low frequency variants, particularly in individuals with extreme phenotypes, and has implications on whole-genome or whole-exome sequencing efforts to discover rare genetic variation associated with complex traits.
2013-liu.pdf: “Common DNA variants predict tall stature in Europeans”, Fan Liu, A. Emile J. Hendriks, Arwin Ralf, Annemieke M. Boot, Emelie Benyi, Lars Sävendahl, Ben A. Oostra, Cornelia van Duijn, Albert Hofman, Fernando Rivadeneira, André G. Uitterlinden, Stenvert L. S. Drop, Manfred Kayser
Pharmacogenetics (PGx) has the potential to personalize pharmaceutical treatments. Many relevant gene-drug associations have been discovered, but PGx guided treatment needs to be cost-effective as well as clinically beneficial to be incorporated into standard healthcare. Progress in this area can be assessed by reviewing economic evaluations to determine the cost-effectiveness of PGx testing versus standard treatment. We performed a review of economic evaluations for PGx associations listed in the US Food and Drug Administration (FDA) Table of Pharmacogenomic Biomarkers in Drug Labeling (http://www.fda.gov/Drugs/ScienceResearch/ResearchAreas/Pharmacogenetics/ucm083378.htm). We determined the proportion of evaluations that found PGx guided treatment to be cost-effective or dominant over the alternative strategies, and we estimated the impact on this proportion of removing the cost of genetic testing. Of the 130 PGx associations in the FDA table, 44 economic evaluations, relating to 10 drugs, were identified. Of these evaluations, 57% drew conclusions in favour of PGx testing, of which 30% were cost-effective and 27% were dominant (cost-saving). If genetic information was freely available, 75% of economic evaluations would support PGx guided treatment, of which 25% would be cost-effective and 50% would be dominant. Thus, PGx guided treatment can be a cost-effective and even cost-saving strategy. Having genetic information readily available in the clinical health record is a realistic future prospect, and would make more genetic tests economically worthwhile. However, few drugs with PGx associations have been studied and more economic evaluations are needed to underpin the uptake of genetic testing in clinical practice.
2017-mcrae.pdf: “Prevalence and architecture of de novo mutations in developmental disorders”, (2017-01-25; ):
The genomes of individuals with severe, undiagnosed developmental disorders are enriched in damaging de novo mutations (DNMs) in developmentally important genes. Here we have sequenced the exomes of 4,293 families containing individuals with developmental disorders, and meta-analysed these data with data from another 3,287 individuals with similar disorders. We show that the most important factors influencing the diagnostic yield of DNMs are the sex of the affected individual, the relatedness of their parents, whether close relatives are affected and the parental ages. We identified 94 genes enriched in damaging DNMs, including 14 that previously lacked compelling evidence of involvement in developmental disorders. We have also characterized the phenotypic diversity among these disorders. We estimate that 42% of our cohort carry pathogenic DNMs in coding sequences; approximately half of these DNMs disrupt gene function and the remainder result in altered protein function. We estimate that developmental disorders caused by DNMs have an average prevalence of 1 in 213 to 1 in 448 births, depending on parental age. Given current global demographics, this equates to almost 400,000 children born per year.
2018-torkamani.pdf: “The personal and clinical utility of polygenic risk scores”, Ali Torkamani, Nathan E. Wineinger, Eric J. Topol
2018-khera.pdf: “Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations”, Amit V. Khera, Mark Chaffin, Krishna G. Aragam, Mary E. Haas, Carolina Roselli, Seung Hoan Choi, Pradeep Natarajan, Eric S. Lander, Steven A. Lubitz, Patrick T. Ellinor, Sekar Kathiresan
Background: Coronary artery disease (CAD) has substantial heritability and a polygenic architecture; however, genomic risk scores have not yet leveraged the totality of genetic information available nor been externally tested at population-scale to show potential utility in primary prevention.
Methods: Using a meta-analytic approach to combine large-scale genome-wide and targeted genetic association data, we developed a new genomic risk score for CAD (metaGRS), consisting of 1.7 million genetic variants. We externally tested metaGRS, individually and in combination with available conventional risk factors, in 22,242 CAD cases and 460,387 non-cases from UK Biobank.
In UK Biobank, a standard deviation increase in metaGRS had a hazard ratio (HR) of 1.71 (95% CI 1.68–1.73) for CAD, greater than any other externally tested genetic risk score. Individuals in the top 20% of the metaGRS distribution had an HR of 4.17 (95% CI 3.97–4.38) compared with those in the bottom 20%. The metaGRS had a higher C-index (C = 0.623, 95% CI 0.615–0.631) for incident CAD than any of four conventional factors (smoking, diabetes, hypertension, and body mass index), and addition of the metaGRS to a model of conventional risk factors increased the C-index by 3.7%. In individuals on lipid-lowering or anti-hypertensive medications at recruitment, the metaGRS hazard for incident CAD was significantly but only partially attenuated, with an HR of 2.83 (95% CI 2.61–3.07) between the top and bottom 20% of the metaGRS distribution.
Recent genetic association studies have yielded enough information to meaningfully stratify individuals using the metaGRS for CAD risk in both early and later life, thus enabling targeted primary intervention in combination with conventional risk factors. The metaGRS effect was partially attenuated by lipid and blood pressure-lowering medication, however other prevention strategies will be required to fully benefit from earlier genomic risk stratification.
National Health and Medical Research Council of Australia, British Heart Foundation, Australian Heart Foundation.
Recent genome-wide association studies in stroke have enabled the generation of genomic risk scores (GRS), but the predictive power of these has been modest in comparison to established stroke risk factors. Here, using a meta-scoring approach, we developed a metaGRS for ischaemic stroke (IS) and analysed this score in the UK Biobank (n = 395,393; 3,075 IS events by age 75). The metaGRS hazard ratio for IS (1.26, 95% CI 1.22–1.31 per standard deviation increase of the score) doubled that of previous GRS, enabling the identification of a subset of individuals at monogenic levels of risk: individuals in the top 0.25% of metaGRS had a three-fold increased risk of IS. The metaGRS was similarly or more predictive when compared to established risk factors, such as family history, blood pressure, and smoking status. For participants within accepted guideline levels for established stroke risk factors, we found substantial variation in incident stroke rates across genomic risk backgrounds. We further estimated combinations of reductions needed in modifiable risk factors for individuals with different levels of genomic risk and suggest that, for individuals with high metaGRS, achieving currently recommended risk factor levels may be insufficient to mitigate risk.
Background: Polygenic risk scores (PRSs) have shown promise in predicting susceptibility to common diseases. However, the extent to which PRSs and clinical risk factors act jointly and identify high-risk individuals for early onset of disease is unknown.
Methods: We used large-scale biobank data (the FinnGen study; n = 135,300), with up to 46 years of prospective follow-up, and the FINRISK study with standardized clinical risk factor measurements to build genome-wide PRSs with >6M variants for coronary heart disease (CHD), type 2 diabetes (T2D), atrial fibrillation (AF), and breast and prostate cancer. We evaluated their associations with first disease events, age at disease onset, and impact together with routinely used clinical risk scores for predicting future disease.
Results: Compared to the 20–80th percentiles, a PRS in the top 2.5% translated into hazard ratios (HRs) for incident disease ranging from 2.03 to 4.28 (p-values 1.96×10−59 to <1.00×10−100), and the bottom 2.5% into HRs ranging from 0.20 to 0.61. The estimated difference in age at disease onset between the top and bottom 2.5% of PRSs was 6 to 13 years. Among early-onset cases, 21.3–32.9% had a PRS in the highest decile in CHD and AF.
Conclusion: The properties of the PRSs were similar in all five diseases. PRSs identified a considerable proportion of early-onset cases, and for all ages the performance of PRSs was comparable to established clinical risk scores. These findings warrant further clinical studies on application of polygenic risk information for stratified screening or for guiding lifestyle and preventive medical interventions.
In the last decade the scientific community witnessed a large increase in GWAS sample size, in the availability of large biobanks, and in the improvements of statistical methods to model genome features. This has paved the way for the development of new predictive medicine tools that use genomic data to estimate disease risk. One of these tools is the polygenic risk score (PRS), a metric that estimates the genetic risk of an individual to develop a disease, based on a combination of a large number of genetic variants.
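In the standard additive formulation, the "combination of a large number of genetic variants" is just a weighted sum of risk-allele counts, PRSᵢ = Σⱼ βⱼ·xᵢⱼ. A toy sketch (genotypes and effect sizes invented for illustration; real PRSs use up to millions of variants plus LD-aware reweighting of the GWAS effect sizes):

```python
import numpy as np

# Toy polygenic risk score: 3 individuals x 4 variants.
# Genotypes are coded as risk-allele counts (0/1/2); betas are
# hypothetical per-allele effect sizes from a GWAS.
genotypes = np.array([[0, 1, 2, 1],
                      [2, 0, 1, 0],
                      [1, 1, 1, 2]])
betas = np.array([0.12, -0.05, 0.30, 0.08])

# PRS_i = sum_j beta_j * x_ij — one score per individual.
prs = genotypes @ betas

# Scores are typically standardized within a reference population and
# reported as percentiles or standard-deviation units.
z = (prs - prs.mean()) / prs.std()
print(prs, z)
```

The abstracts that follow differ mainly in how the weights β are derived (meta-analysis, functional priors, machine learning) and in how the resulting score is calibrated against clinical risk factors.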
Using the largest prospective genotyped cohort available to date, the UK Biobank, we built a new PRS for Coronary Artery Disease (CAD) and assessed its predictive performances along with two additional PRSs for Breast Cancer (BC) and Prostate Cancer (PC). When compared with previously published PRSs, the newly developed PRS for CAD displayed a higher AUC and positive predictive value. PRSs were able to stratify disease risks from 1.34% to 25.7% (CAD in men), from 0.26% to 8.62% (CAD in women), from 1.6% to 24.6% (BC), and from 1.4% to 24.3% (PC) in the lowest and highest percentiles, respectively. Additionally, the three PRSs were able to identify the 5% of the population with a relative risk for the diseases at least 3 times higher than the average.
Family history is a well recognised risk factor of CAD, BC, and PC and it is currently used to identify individuals at high risk of developing the diseases. We show that individuals with family history can have completely different disease risks based on PRS stratification: from 2.1% to 33% (CAD in men), from 0.56% to 10% (CAD in women), from 2.3% to 35.8% (BC), and from 1.0% to 34.0% (PC) in the lowest and highest percentiles, respectively. Additionally, the PRSs demonstrated higher predictive performance (AUCs (including age) CAD: 0.81, PC: 0.80, and BC: 0.68) than family history (AUCs (including age) CAD: 0.79, PC: 0.73, and BC: 0.61) in predicting the onset of diseases.
Hyperlipidemia is well known to be associated with higher CAD risk, but a predictive performance comparison between each lipoprotein and the CAD PRS has never been assessed. The CAD PRS shows higher discrimination capacity and odds ratio per standard deviation than LDL, HDL, total cholesterol-HDL ratio, ApoA, ApoB, ApoB-ApoA ratio, and Lipoprotein(a). Comparing the empirical risk distribution between the PRS and each lipoprotein, we show that lipoprotein thresholds, currently used in clinical practice, identify a population equal to or smaller than what can be identified with the PRS at the same CAD risk threshold. Moreover, there is no correlation (max ρ: 0.137) between the PRS and each lipoprotein, indicating that the PRS captures a different component of CAD etiology and identifies different people at high risk than those identified by lipoproteins, demonstrating it to be an invaluable tool in CAD prevention.
One of the major impairments to the use of PRSs in clinical practice is the computational complexity needed to calculate per-individual PRSs. Deep bioinformatics expertise is required to run the entire pipeline, from imputing genomic data, through quality control, to result visualisation. For these reasons we developed a Software as a Service (SaaS) for genomic risk prediction of complex diseases. The SaaS is fully automated, GDPR compliant and has been certified as a CE marked medical device. We made the SaaS available for research purposes. Researchers willing to use the SaaS can contact email@example.com.
Background: There is considerable interest in whether genetic data can be used to improve standard cardiovascular disease risk calculators, as the latter are routinely used in clinical practice to manage preventative treatment.
Methods: This research has been conducted using the UK Biobank (UKB) resource. We developed our own polygenic risk score (PRS) for coronary artery disease (CAD), using novel and established methods to combine published genome-wide association study (GWAS) data with data from 114,196 individuals, also leveraging a large resource of other datasets along with functional information, to aid in the identification of causal variants, and thence define weights for >8M genetic variants. We utilised a further 60,000 UKB individuals to develop an integrated risk tool (IRT) that combined our PRS with established risk tools (either the American Heart Association/American College of Cardiology’s pooled cohort equations (PCE) or the UK’s QRISK3), which was then tested in an additional, independent, set of 212,563 UKB individuals. We evaluated prediction performance in individuals of European ancestry, both as a whole and stratified by age and sex.
Findings: The novel CAD PRS showed superior predictive power for CAD events, compared to other published PRSs. As an individual risk factor, it has similar predictive power to each of systolic blood pressure, HDL cholesterol, and LDL cholesterol, but is more predictive than total cholesterol and smoking history. Our novel CAD PRS is largely uncorrelated with PCE, QRISK3, and family history, and, when combined with PCE into an integrated risk tool, had superior predictive accuracy. In individuals reclassified as high risk, CAD event rates were markedly and statistically-significantly higher compared to those reclassified as low risk. Overall, 9.7% of incident CAD cases were misclassified as low risk by PCE and correctly classified as high risk by the IRT, in contrast to 3.7% misclassified by the IRT and correctly classified by PCE. The overall net reclassification improvement for the IRT was 5.7% (95% CI 4.4–7.0), but when individuals were stratified into four age-by-sex subgroups the improvement was larger for all subgroups (range 7.7%–17.3%), with best performance in younger middle-aged men aged 40–54 (17.3%, 95% CI 13.0–21.5). Broadly similar results were found using a different risk tool (QRISK3), and also for cardiovascular disease events defined more broadly.
Interpretation: An integrated risk tool that includes polygenic risk outperforms current, clinical risk stratification tools, and offers greater opportunity for early interventions. Given the plummeting costs of genetic tests, future iterations of CAD risk tools would be enhanced with the addition of a person’s polygenic risk.