Lecture 24: Association Genetics November 20, 2015.
Lecture 24: Association Genetics
November 20, 2015
Last Time
Coalescence and human origins
Human origins: Neanderthals and Denisovans
Coalescent simulations and hybridization
Adaptive significance of ancient introgression
Today
Quantitative traits
Genetic basis
Heritability
Linking phenotype to genotype
QTL analysis introduction
Limitations of QTL
Association genetics
[Figure: Mendelian trait vs. quantitative trait. Ten individuals (1-10) with genotypes 11, 12, or 22 at one locus (Allele A1, Allele A2), and a Height axis running from 16 to 88. Courtesy of Glenn Howe]
Hartl and Clark 2007
3 loci, 2 additive alleles
Uppercase alleles contribute 1 unit to phenotype (e.g., shade of color)
Hartl, D. 1987. A primer of Population Genetics.
Quantitative traits are polygenic
[Figure: Height distribution of students at Connecticut Agricultural College, 1914; horizontal axis from 50 to 85]
As the number of loci controlling a trait increases, the distribution of trait values in a population becomes bell-shaped
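A minimal sketch of why more loci give a bell-shaped distribution, under assumptions that are not stated on the slide: loci are unlinked, each has two alleles at equal frequency, effects are purely additive (each uppercase allele adds 1 unit), and mating is random. The function names are illustrative only. Under these assumptions the phenotype is the count of uppercase alleles, which follows a binomial distribution.

```python
from math import comb


def phenotype_distribution(n_loci, p_upper=0.5):
    """Exact probabilities for 0..2*n_loci uppercase alleles in a diploid.

    Assumes unlinked, purely additive loci and random mating, so the
    count of uppercase alleles is Binomial(2 * n_loci, p_upper).
    """
    n_alleles = 2 * n_loci  # two allele copies per locus
    return [
        comb(n_alleles, k) * p_upper**k * (1 - p_upper) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    ]


def text_histogram(n_loci, width=40):
    """Render the distribution as rows of '#' scaled to the tallest class."""
    probs = phenotype_distribution(n_loci)
    peak = max(probs)
    rows = []
    for k, prob in enumerate(probs):
        bar = "#" * round(width * prob / peak)
        rows.append(f"{k:3d} {prob:.4f} {bar}")
    return "\n".join(rows)


if __name__ == "__main__":
    for n in (1, 3, 10):
        print(f"{n} loci")
        print(text_histogram(n))
```

With 3 loci this gives seven phenotype classes in the ratio 1:6:15:20:15:6:1; with 10 loci the 21 classes already look close to a normal curve.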
Influence of Environment on Human Height
By Country: Height vs GDP (1925-1949), Baten 2006
1914: Mean = 67 ± 2.7 in.
1996: Mean = 70 ± 3 in.
Schilling et al. 2002. Amer. Stat. 56: 223-229
Phenotype = Genotype + Environment
The phenotype is the outward manifestation of the genotype
$\sigma^2_P = \sigma^2_G + \sigma^2_E$
Courtesy of Glenn Howe
Types of genetic variance ($\sigma^2_G$)
Additive ($\sigma^2_A$): effects of individual alleles; main cause of resemblance between relatives
Dominance ($\sigma^2_D$): effects of allele interactions within a locus
Interaction ($\sigma^2_I$): effects of interactions among loci (epistasis)
Dominance and interaction variance are non-additive
$\sigma^2_G = \sigma^2_A + \sigma^2_D + \sigma^2_I$
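As a rough illustration of the additive and dominance components, the sketch below uses the standard single-locus model, which is an assumption added here and not part of the slides: genotypic values are +a, d, and -a for A1A1, A1A2, and A2A2, allele A1 has frequency p, and genotypes are in Hardy-Weinberg proportions. With only one locus there is no interaction (epistatic) variance. The function names are hypothetical.

```python
def single_locus_variances(p, a, d):
    """Additive and dominance variance for one biallelic locus.

    p: frequency of allele A1; a: half the difference between homozygotes;
    d: deviation of the heterozygote from the homozygote midpoint.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)  # average effect of an allele substitution
    var_additive = 2 * p * q * alpha**2
    var_dominance = (2 * p * q * d) ** 2
    return var_additive, var_dominance


def total_genetic_variance(p, a, d):
    """Variance of genotypic values computed directly from genotype frequencies."""
    q = 1.0 - p
    freqs = (p * p, 2 * p * q, q * q)  # A1A1, A1A2, A2A2
    values = (a, d, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


if __name__ == "__main__":
    va, vd = single_locus_variances(p=0.3, a=2.0, d=1.0)
    vg = total_genetic_variance(p=0.3, a=2.0, d=1.0)
    print(f"additive={va:.4f} dominance={vd:.4f} sum={va + vd:.4f} total={vg:.4f}")
```

The two components sum to the total computed directly from the genotype frequencies, and the dominance term vanishes when d = 0.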
Heritability
Phenotype vs Genotype
Var(phenotype) = Var(genotype) + Var(environment)
Heritability: Var(genotype) / Var(phenotype)
Two types of heritability
Broad-Sense Heritability includes all genetic effects: dominance, epistasis, and additivity
− For example, the degree to which clones or monozygotic twins have the same phenotype
Narrow-Sense Heritability includes only additive effects
− For example, degree to which offspring resemble their parents
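The two definitions can be written as a small calculation. This is only a sketch: the variance values in the example are made up for illustration, and it assumes genotype and environment are independent, so that their variances simply add.

```python
def heritabilities(var_additive, var_dominance, var_interaction, var_environment):
    """Return (broad-sense, narrow-sense) heritability from variance components."""
    var_genotype = var_additive + var_dominance + var_interaction
    var_phenotype = var_genotype + var_environment
    broad_sense = var_genotype / var_phenotype    # all genetic effects
    narrow_sense = var_additive / var_phenotype   # additive effects only
    return broad_sense, narrow_sense


if __name__ == "__main__":
    # Hypothetical variance components, chosen only to make the arithmetic easy.
    broad, narrow = heritabilities(4.0, 1.0, 1.0, 4.0)
    print(f"broad-sense = {broad:.2f}, narrow-sense = {narrow:.2f}")
```

Narrow-sense heritability can never exceed broad-sense heritability, because the additive variance is one part of the total genetic variance.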
Heritability (continued)
Characteristic of a trait measured in a particular population in a particular environment
Best estimated in experiments (controlled environments)
Estimated from resemblance between relatives
The higher the heritability, the better the prediction of genotype from phenotype (and vice versa)
[Figure: phenotype (P) plotted against genotype (G) for h² = 0.1, h² = 0.5, and h² = 0.9. Source: http://psych.colorado.edu/~carey/hgss/hgssapplets/heritability/heritability1/heritability1.html]
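A small simulation sketch of how heritability controls the link between genotype and phenotype. The model is an assumption made for illustration: genotypic values and environmental deviations are independent normal draws, scaled so that the phenotypic variance is 1 and the genetic variance equals the chosen heritability. Under that model the squared correlation between G and P should be close to the heritability.

```python
import random
from math import sqrt


def simulate(h2, n=20000, seed=0):
    """Draw n genotypic values G and phenotypes P = G + E with Var(G)/Var(P) = h2."""
    rng = random.Random(seed)
    genotype = [rng.gauss(0.0, sqrt(h2)) for _ in range(n)]
    phenotype = [g + rng.gauss(0.0, sqrt(1.0 - h2)) for g in genotype]
    return genotype, phenotype


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    for h2 in (0.1, 0.5, 0.9):
        g, p = simulate(h2)
        print(f"h2 = {h2}: squared correlation of G and P = {pearson_r(g, p) ** 2:.3f}")
```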
Identifying Genes Underlying Quantitative Traits
Many individual loci are responsible for quantitative traits, even those with high heritability
Identification of these loci is a major goal of breeding programs
Allows mechanistic understanding of adaptive variation
Methods usually rely on correlations between molecular marker polymorphisms and phenotypes
Quantitative Trait Locus Mapping
[Figure: QTL mapping cross. Parent 1 (ABC/ABC) x Parent 2 (abc/abc) gives F1 (ABC/abc); F1 x F1 gives progeny carrying recombinant chromosomes (e.g., ABc, aBc, Abc). HEIGHT is plotted against GENOTYPE at marker B (BB, Bb, bb). Modified from D. Neale]
Quantitative Trait Locus Analysis
Step 1: Make a controlled cross to create a large family (or a collection of families)
Parents should differ for phenotypes of interest
Segregation of trait in the progeny
Step 2: Create a genetic map
Large number of markers genotyped for all progeny
Step 3: Measure phenotypes
Need phenotypes with high heritability
Step 1: Construct Pedigree
Cross two individuals with contrasting characteristics
Create population with segregating traits
Ideally: inbred parents crossed to produce F1s, which are intercrossed to produce F2s
Recombinant Inbred Lines created by repeated intercrossing
Allows precise phenotyping, isolation of allelic effects
Grisel 2000 Alcohol Research & Health 24:169
Step 2: Construct Genetic Map
Number of recombinations between markers is a function of map distance
Gives overview of structure of entire genome
Anonymous markers are cheap and efficient: AFLP, Genotyping by Sequencing
Codominant markers much more informative: SSR, SNP
Genotyping by Sequencing gives best of both worlds: cheap, abundant, codominant markers!
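One common way to express the relationship between recombination and map distance is Haldane's mapping function. The slides do not say which mapping function is used, so treat this as one illustrative choice; it assumes crossovers occur independently (no interference), and other functions such as Kosambi's relax that assumption.

```python
from math import exp, log


def haldane_recombination_fraction(map_distance_cm):
    """Expected recombination fraction for a map distance in centimorgans."""
    morgans = map_distance_cm / 100.0
    return 0.5 * (1.0 - exp(-2.0 * morgans))


def haldane_map_distance(recombination_fraction):
    """Map distance in centimorgans for an observed recombination fraction."""
    if not 0.0 <= recombination_fraction < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * log(1.0 - 2.0 * recombination_fraction)


if __name__ == "__main__":
    for cm in (1, 10, 50, 200):
        r = haldane_recombination_fraction(cm)
        print(f"{cm:4d} cM -> recombination fraction {r:.4f}")
```

For short distances the recombination fraction is nearly equal to the map distance (1 cM is about 1% recombinants), while for distant markers it levels off at 0.5, the value for unlinked loci.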
Step 3: Determine Phenotypes of Offspring
Phenotype must be segregating in pedigree
Must differentiate genotype and environment effects
How?
Works best with phenotypes with high heritability
[Figure: three panels labeled 0.1, 0.5, and 0.9]
Step 4: Detect Associations between Markers and Phenotypes
Single-marker associations are simplest
Simple ANOVA, correcting for multiple comparisons
Log likelihood ratio: LOD (Log10 of odds)
If QTL is between two markers, situation more complex
Recombination between QTL and markers (genotype doesn't predict phenotype)
'Ghost' QTL due to adjacent QTL
Use interval mapping or composite interval mapping
Simultaneously consider pairs of loci across the genome
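A minimal sketch of the single-marker test, using made-up heights for six progeny. It compares a model with one mean per marker genotype against a model with a single overall mean, and reports both the one-way ANOVA F statistic and a LOD score. The LOD formula used here, (n/2) * log10(RSS_null / RSS_marker), assumes normally distributed residuals with equal variance in every genotype class; interval mapping and composite interval mapping are not covered by this sketch.

```python
from math import log10


def single_marker_test(genotypes, phenotypes):
    """Return (F statistic, LOD score) for one marker.

    genotypes: marker genotype label per individual (e.g., "BB", "Bb", "bb").
    phenotypes: trait value per individual, in the same order.
    Requires some residual variation within genotype classes.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group trait values by marker genotype.
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)

    # Residual sum of squares around each genotype-class mean.
    rss_marker = 0.0
    for values in groups.values():
        class_mean = sum(values) / len(values)
        rss_marker += sum((y - class_mean) ** 2 for y in values)

    k = len(groups)
    f_stat = ((rss_null - rss_marker) / (k - 1)) / (rss_marker / (n - k))
    lod = (n / 2.0) * log10(rss_null / rss_marker)
    return f_stat, lod


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    marker = ["bb", "bb", "Bb", "Bb", "BB", "BB"]
    height = [4.0, 6.0, 7.0, 9.0, 10.0, 12.0]
    f_stat, lod = single_marker_test(marker, height)
    print(f"F = {f_stat:.2f}, LOD = {lod:.3f}")
```

In a genome scan this test is repeated at every marker, which is why the significance threshold has to be corrected for multiple comparisons.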
Step 5: Identify underlying molecular mechanisms
QTG: Quantitative Trait Gene
QTN: Quantitative Trait Nucleotide
[Figure: chromosome with genetic markers flanking a QTL. Adapted from Richard Mott, Wellcome Trust Centre for Human Genetics]
QTL Limitations
Huge regions of the genome underlie QTL, usually containing hundreds of genes
How to distinguish among candidates?
Biased toward detection of large-effect loci
Need very large pedigrees to do this properly
Limited genetic base: QTL may only apply to the two individuals in the cross!
Genotype x Environment interactions rampant: some QTL only appear in certain environments
Linkage Disequilibrium and Quantitative Trait Mapping
Linkage and quantitative trait locus (QTL) analysis
Need a pedigree and moderate number of molecular markers
Very large regions of chromosomes represented by markers
Association Studies with Natural Populations
No pedigree required
Need large numbers of genetic markers
Small chromosomal segments can be localized
Many more markers are required than in traditional QTL analysis
Cardon and Bell 2001, Nat. Rev. Genet. 2: 91-99
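The contrast between the two approaches comes down to how much recombination has had a chance to break up marker-trait associations. A standard population-genetic result, added here as background and not taken from the slides, is that linkage disequilibrium D between two loci decays by a factor of (1 - r) each generation under random mating, where r is the recombination fraction. The sketch below applies that formula; the starting value and generation counts are arbitrary.

```python
def ld_after(d0, recombination_fraction, generations):
    """Linkage disequilibrium remaining after some generations of random mating."""
    return d0 * (1.0 - recombination_fraction) ** generations


if __name__ == "__main__":
    d0 = 0.25  # arbitrary starting disequilibrium
    for r in (0.5, 0.01, 0.001):
        # 2 generations resembles an F2 pedigree; 1000 stands in for a natural population.
        print(f"r = {r}: D after 2 gen = {ld_after(d0, r, 2):.4f}, "
              f"after 1000 gen = {ld_after(d0, r, 1000):.6f}")
```

Within a two-generation pedigree, markers far from the causal locus still carry signal, so few markers are needed but resolution is coarse. After many generations only very tightly linked markers remain associated, which gives fine resolution but requires dense marker coverage.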
Association Mapping
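[Figure: association mapping. A mutation (*) arises on an ancestral chromosome carrying marker alleles T and G (*TG). Recombination through evolutionary history produces the present-day chromosomes in a natural population (*TG, *TA, CG, CA). HEIGHT is plotted against GENOTYPE (CC, CT, TT). Slide courtesy of Dave Neale]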
Next-Generation Sequencing and Whole Genome Scans
The $1000 genome is here
Current cost with Illumina HiSeq X10 is about $1000 for 30X depth
Tens of thousands of human genomes have now been sequenced at low depth
Can detect most polymorphisms with frequency >0.01
True whole genome association studies now possible at a very large scale
Direct to Consumer Genomics: 23andMe and other genotyping services
http://www.1000genomes.org/
Commercial Services for Human Genome-Wide SNP Characterization
NATURE|Vol 437|27 October 2005
Assay 1.2 million “tag SNPs” scattered across genome using Illumina BeadArray technology
Ancestry analyses and disease/behavioral susceptibility
Identifying genetic mechanisms of simple vs. complex diseases
Simple (Mendelian) diseases: Caused by a single major gene
High heritability; often can be recognized in pedigrees
Example: Huntington’s, Achondroplasia, Cystic fibrosis, Sickle Cell Anemia
Tools: Linkage analysis, positional cloning
Over 2900 disease-causing genes have been identified thus far: Human Gene Mutation Database: www.hgmd.cf.ac.uk
Complex (non-Mendelian) diseases: Caused by the interaction between environmental factors and multiple genes with minor effects
Interactions between genes, low heritability
Example: Heart disease, Type II diabetes, Cancer, Asthma
Tools: Association mapping, SNPs
Over 35,000 SNP associations have been identified thus far: http://www.snpedia.com
Slide adapted from Kermit Ritland
Complicating factor: Trait Heterogeneity
Same phenotype has multiple genetic mechanisms underlying it
Slide adapted from Kermit Ritland
Case-Control Example: Diabetes
Knowler et al. (1988) collected data on 4920 individuals from Pima and Papago Native American populations in the Southwestern United States
High rate of Type II diabetes in these populations
Found significant associations with an Immunoglobulin G marker (Gm)
Does this indicate underlying mechanisms of disease?
Knowler et al. (1988) Am. J. Hum. Genet. 43: 520
Case-control test for association (case = diabetic, control = not diabetic)
Question: Is the Gm haplotype associated with risk of Type 2 diabetes?
Gm Haplotype    Type 2 Diabetes present    Type 2 Diabetes absent    Total
present         8 (a)                      29 (b)                    37
absent          92 (c)                     71 (d)                    163
Total           100                        100                       200
(1) Test for an association
$\chi^2_1 = \frac{(ad - bc)^2 N}{(a+c)(b+d)(a+b)(c+d)}$
$= \frac{[(8 \times 71) - (29 \times 92)]^2 (200)}{(100)(100)(37)(163)}$
$= 14.62$
(2) Chi-square is significant. Therefore presence of the Gm haplotype seems to confer reduced occurrence of diabetes
Slide adapted from Kermit Ritland
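The chi-square calculation for this 2x2 table can be checked with a few lines of code. The cell labels follow the table layout (a = 8 and b = 29 for Gm present, c = 92 and d = 71 for Gm absent). The p-value helper is an addition for illustration; it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 df, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + c) * (b + d) * (a + b) * (c + d))


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2.0))


if __name__ == "__main__":
    chi2 = chi_square_2x2(a=8, b=29, c=92, d=71)
    print(f"chi-square = {chi2:.2f}, p = {p_value_1df(chi2):.5f}")
```

The statistic rounds to 14.62, well above 3.84, the critical value for 1 degree of freedom at the 0.05 level.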
Case-control test for association (continued)
Question: Is the Gm haplotype actually associated with risk of Type 2 diabetes?
The real story: Stratify by American Indian heritage
0 = little or no Indian heritage; 8 = complete Indian heritage
Index of Indian heritage    Gm haplotype    Percent with diabetes
0                           Present         17.8
0                           Absent          19.9
4                           Present         28.3
4                           Absent          28.8
8                           Present         35.9
8                           Absent          39.3
Conclusion: The Gm haplotype is NOT a risk factor for Type 2 diabetes, but is a marker of American Indian heritage
Slide adapted from Kermit Ritland
Population structure and spurious association
Assume populations are historically isolated
One has higher disease frequency by chance
Unlinked loci are differentiated between populations also
Unlinked loci show disease association when populations are lumped together
[Figure: a population with low disease frequency and a population with high disease frequency, separated by a gene flow barrier; alleles at a neutral locus and alleles causing susceptibility to disease both differ in frequency between the two populations]
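A small numerical sketch of this effect, with made-up numbers. Two hypothetical populations of 1000 individuals differ in both the frequency of a neutral marker and the frequency of disease, and within each population the marker and the disease are completely independent. Expected counts are used in place of random sampling so the result is exact.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 df) for a 2x2 table with cells a, b / c, d."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + c) * (b + d) * (a + b) * (c + d))


def expected_table(size, marker_freq, disease_freq):
    """Expected 2x2 counts when marker and disease are independent.

    Returns (marker+ diseased, marker+ healthy, marker- diseased, marker- healthy).
    """
    return (
        size * marker_freq * disease_freq,
        size * marker_freq * (1 - disease_freq),
        size * (1 - marker_freq) * disease_freq,
        size * (1 - marker_freq) * (1 - disease_freq),
    )


# Hypothetical populations: values chosen only for illustration.
low_disease_pop = expected_table(1000, marker_freq=0.1, disease_freq=0.1)
high_disease_pop = expected_table(1000, marker_freq=0.7, disease_freq=0.4)
pooled = tuple(x + y for x, y in zip(low_disease_pop, high_disease_pop))

if __name__ == "__main__":
    print(f"within low-disease population:  {chi_square_2x2(*low_disease_pop):.2f}")
    print(f"within high-disease population: {chi_square_2x2(*high_disease_pop):.2f}")
    print(f"populations lumped together:    {chi_square_2x2(*pooled):.2f}")
```

Within each population the statistic is essentially zero, but the pooled table gives a chi-square of about 90, even though the marker has no effect on disease in either population. Stratifying by population, as in the Gm example, removes the spurious signal.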
Association Study Limitations
Population structure: differences between cases and controls
Genetic heterogeneity underlying trait
Random error/false positives
Inadequate genome coverage
Poorly-estimated linkage disequilibrium