Statistics Seminar

Amy Williams, Cornell University
Empowering genetic analyses within family and admixed datasets through refined modeling approaches

Wednesday, September 28, 2016 - 4:15pm
Biotech G01

Available human genetic datasets range in relatedness from sets of close relatives that are members of the same family, to admixed individuals whose ancestry descends from multiple populations, and to individuals from highly diverged populations. While many analyses can be applied to all datasets, family data enable deeper insight into the samples’ genetic heritage. In particular, we leverage differences in the recombination patterns between males and females to infer the parent-of-origin (father or mother) of variants that a set of siblings received, obtaining a high accuracy of >91%. We also analyze the unique features of admixed individuals, in which the population of origin of chromosomal regions varies within each individual. Many methods exist to infer this local (i.e., locus-specific) ancestry, but at present the most accurate methods require the use of panels of unadmixed individuals to model the genetics of the ancestral populations. Collecting unadmixed panels is not always feasible, and we therefore outline an approach that leverages information from within the admixed samples themselves in order to model the ancestral populations using data that directly descend from the populations of interest. We show that our approach has comparable accuracy to a state-of-the-art method when panels are available and that it is also effective when panels are unavailable.