Population Genetics: Chapter 3 Epidemiology 217 January 16, 2011.


Outline

Allele Frequency Estimation

Hardy-Weinberg equilibrium (HWE)

HWE Game

Population Substructure


Allele Frequency

Diploid, autosomal locus with 2 alleles: A and a

Allele frequency is the fraction:

(No. of copies of a particular allele) / (No. of all alleles in the population)

[Figure: bar chart of A and C allele frequencies for rs1036819 (longevity SNP) in the CEU, CHB, JPT, and YRI populations. Bar values as extracted: 0.86/0.14, 0.64/0.36, 0.53/0.47, 0.93/0.08.]


Allele (Gamete) Frequency

Let p = Freq(A), the frequency of the dominant allele

Let q = Freq(a), the frequency of the recessive allele

Then p + q = 1


Genotype Frequency

p^2 = frequency of homozygous dominant genotype

q^2 = frequency of homozygous recessive genotype

2pq = frequency of heterozygous genotype

Then p^2 + 2pq + q^2 = 1
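The genotype frequencies above can be sketched in a few lines of Python. The function name and the example value p = 0.6 are illustrative assumptions, not part of the slides.

```python
def hwe_genotype_frequencies(p):
    """Expected (AA, Aa, aa) genotype frequencies for frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p                      # because p + q = 1
    return p * p, 2.0 * p * q, q * q


# Hypothetical allele frequency chosen only for illustration.
freq_AA, freq_Aa, freq_aa = hwe_genotype_frequencies(0.6)
print(freq_AA, freq_Aa, freq_aa)
```

Whatever value of p is supplied, the three returned frequencies sum to 1.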


Estimating Allele Frequencies from Genotype Frequencies

Genotypes:  AA    Aa    aa
Frequency:  p^2   2pq   q^2

Frequency of A allele = p^2 + ½ (2pq)

Frequency of a allele = q^2 + ½ (2pq)


Ex. Calculation: Allele Frequencies

Assume N=200 in each of two populations

Pop 1: 90 AA, 40 Aa, 70 aa (N=200)
Pop 2: 45 AA, 130 Aa, 25 aa (N=200)

In Pop 1:
p = 90/200 + ½ (40/200) = 0.45 + 0.10 = 0.55
q = 70/200 + ½ (40/200) = 0.35 + 0.10 = 0.45

In Pop 2:
p = 45/200 + ½ (130/200) = 0.225 + 0.325 = 0.55
q = 25/200 + ½ (130/200) = 0.125 + 0.325 = 0.45
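The same calculation can be sketched in Python by counting alleles directly: each homozygote carries two copies and each heterozygote one. The function name is an illustrative assumption.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) for alleles A and a from genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one individual is required")
    # Every individual carries 2 alleles, so there are 2n alleles in total.
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1.0 - p


pop1 = allele_frequencies(90, 40, 70)    # Pop 1 from the slide
pop2 = allele_frequencies(45, 130, 25)   # Pop 2 from the slide
print(pop1, pop2)
```

Both calls give p = 0.55 and q = 0.45 (up to floating-point rounding), although the genotype counts differ markedly.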


Take home points

p + q = 1 (sum of the allele frequencies = 1)

p^2 + 2pq + q^2 = 1 (sum of the genotype frequencies = 1)

Two populations with markedly different genotype frequencies can have the same allele frequencies


Hardy-Weinberg

The Hardy–Weinberg principle states that both allele and genotype frequencies in a population remain constant (that is, they are in equilibrium) from generation to generation unless specific disturbing influences are introduced.

p^2 + 2pq + q^2 = 1


Hardy-Weinberg Assumptions

Allele frequencies do not vary IF:

Large population
Random mating
No in or out migration
No isolated groups within the population
No mutation
No selection (no allele is advantageous)


Test of Hardy-Weinberg Equilibrium

1. Calculate observed allele & genotype frequencies

Observed genotype counts (N = 150)

100 GG
30 AG
20 AA

Genotype frequencies

GG = 100/150 = 0.67
AG = 30/150 = 0.20
AA = 20/150 = 0.13

Allele counts

G alleles = 100*2 + 30 = 230
A alleles = 20*2 + 30 = 70
Total alleles = 300

Allele frequencies

G afq (p) = 230/300 = 0.77
A afq (q) = 1 - p = 0.23


Test of Hardy-Weinberg Equilibrium

2. Calculate expected genotype frequencies based on HW: p^2 + 2pq + q^2 = 1

p^2 (GG) = 0.77 * 0.77 = 0.59

2pq (AG) = 2 * 0.77 * 0.23 = 0.35

q^2 (AA) = 0.23 * 0.23 = 0.05


Test of Hardy-Weinberg Equilibrium

3. Compare expected genotype frequencies to observed frequencies

      expected   observed
GG    0.59       0.67
AG    0.35       0.20
AA    0.05       0.13

Chi-square test = Σ(observed - expected)^2/expected, summed over the three genotype counts (N = 150)

= 29.17 with 1 degree of freedom

P-value = 6.6 x 10^-8, so the genotypes are out of H-W equilibrium
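The three steps of the test can be sketched in Python using only the standard library. The function name is an illustrative assumption. The chi-square statistic is computed on genotype counts with unrounded allele frequencies, and the P-value uses the identity that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x/2)).

```python
import math


def hwe_chi_square(n_GG, n_AG, n_AA):
    """Chi-square test of HWE for a biallelic locus (1 degree of freedom)."""
    n = n_GG + n_AG + n_AA
    p = (2 * n_GG + n_AG) / (2 * n)   # frequency of the G allele
    q = 1.0 - p                       # frequency of the A allele
    if p in (0.0, 1.0):
        raise ValueError("locus is monomorphic; the test is undefined")
    observed = (n_GG, n_AG, n_AA)
    expected = (n * p * p, 2 * n * p * q, n * q * q)   # expected counts under HWE
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = hwe_chi_square(100, 30, 20)   # counts from the slide
print(round(chi2, 2), p_value)
```

With the slide's counts this reproduces a statistic of about 29.17 and a P-value of about 6.6 x 10^-8.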


HWE can be easily expanded to account for any number of alleles at a locus

3 allele case (p_1, p_2, p_3)

Allele frequencies: p_1 + p_2 + p_3 = 1

Genotype frequencies: p_1^2 + p_2^2 + p_3^2 + 2 p_1 p_2 + 2 p_1 p_3 + 2 p_2 p_3 = 1

4 allele case (p_1, p_2, p_3, p_4)

Allele frequencies: p_1 + p_2 + p_3 + p_4 = 1

Genotype frequencies: p_1^2 + p_2^2 + p_3^2 + p_4^2 + 2 p_1 p_2 + 2 p_1 p_3 + 2 p_1 p_4 + 2 p_2 p_3 + 2 p_2 p_4 + 2 p_3 p_4 = 1
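The multi-allele expansion can be sketched for any number of alleles: each homozygote has frequency p_i^2 and each heterozygote 2 p_i p_j. The function name and the example frequencies are illustrative assumptions, and alleles are indexed from 0 in the code.

```python
from itertools import combinations_with_replacement


def hwe_multiallelic(freqs):
    """Map each unordered genotype (i, j) to its expected HWE frequency."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        if i == j:
            genotypes[(i, j)] = freqs[i] ** 2            # homozygote
        else:
            genotypes[(i, j)] = 2 * freqs[i] * freqs[j]  # heterozygote
    return genotypes


# Hypothetical 4-allele locus chosen only for illustration.
four = hwe_multiallelic([0.4, 0.3, 0.2, 0.1])
print(len(four), sum(four.values()))
```

With 4 alleles there are 10 genotypes (4 homozygous and 6 heterozygous), and their frequencies sum to 1.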


Application of Hardy-Weinberg Equilibrium

For genetic association studies:

Used as QC measure to assess the accuracy of the genotyping method

Expect SNPs to be in HWE among control populations (ethnic-specific)

Violations of HWE could indicate genotyping errors or bias in data


HWE Game

1. Everyone receives ~5 pairs of cards

2. Two allele model: Red (R allele) & Black (B allele)

3. Random Mating: Exchange one card from each pair with another person (keep cards face down)

4. Determine genotype frequency: RR, RB, BB

5. Determine allele frequency: R, B
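The game can be approximated by a small simulation. This is a simplified sketch, not the classroom procedure itself: instead of modelling each exchange between players, it shuffles all cards and pairs them at random. The function name, the number of players, the starting red-card frequency, and the seed are all assumptions.

```python
import random
from collections import Counter


def play_hwe_game(n_players=30, pairs_per_player=5, freq_red=0.5, seed=217):
    """Simulate one round of random mating with red (R) and black (B) cards."""
    rng = random.Random(seed)
    n_cards = 2 * n_players * pairs_per_player
    n_red = round(n_cards * freq_red)
    deck = ["R"] * n_red + ["B"] * (n_cards - n_red)
    rng.shuffle(deck)   # random mating: cards pair up at random
    labels = {2: "RR", 1: "RB", 0: "BB"}
    counts = Counter(
        labels[deck[i:i + 2].count("R")] for i in range(0, n_cards, 2)
    )
    n_pairs = n_cards // 2
    genotype_freqs = {g: counts[g] / n_pairs for g in ("RR", "RB", "BB")}
    # Allele frequency recovered from genotype frequencies.
    freq_r = genotype_freqs["RR"] + 0.5 * genotype_freqs["RB"]
    return genotype_freqs, {"R": freq_r, "B": 1.0 - freq_r}


genotype_freqs, allele_freqs = play_hwe_game()
print(genotype_freqs, allele_freqs)
```

The allele frequencies are unchanged by the reshuffling, while the genotype frequencies vary from run to run around the values p^2, 2pq, and q^2.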


Population Stratification

Population stratification is a form of confounding in genetic studies where a gene under study shows marked variation in allele frequency across subgroups of a population and these subgroups differ in their baseline risk of disease


Population Stratification: Confounding

[Diagram, two panels. Classical confounding: Exposure of Interest, True Risk Factor, Disease. Population stratification: Genotype of Interest, Ethnicity, True Risk Factor, Disease.]

Wacholder, JNCI, 2000


Population Stratification: Gm3;5,13,14 in admixed sample of Native Americans of the Pima and Papago tribes

Study Population: 4,290 Pima and Papago Indians

Genetic Variant: Gm3;5,13,14 haplotype (Gm system of human immunoglobulin G)

Outcome: Type 2 diabetes

Question: Is the Gm3;5,13,14 haplotype associated with Type 2 diabetes?

Knowler, AJHG, 1988


Population Stratification: Gm3;5,13,14 in admixed sample of Native Americans of the Pima and Papago tribes

Unadjusted for ethnic background: OR = 0.27 (95% CI 0.18-0.40)

Gm3;5,13,14 haplotype   Cases     Controls
+                       7.80%     29.00%
-                       92.20%    71.00%

Population                                 Gm3;5,13,14 +   Gm3;5,13,14 -   NIDDM prevalence
Full heritage American Indian population   ~1%             ~99%            ~40%
Caucasian population                       ~66%            ~34%            ~15%


Population Stratification: Gm3;5,13,14 in admixed sample of Native Americans of the Pima and Papago tribes


Adjusted for ethnic background: OR = 0.83 (95% CI 0.58-1.18)

Index of Indian heritage   Gm3;5,13,14 haplotype   % Diabetes
0                          65.8%                   18.5%
4                          42.1%                   28.5%
8                          1.6%                    39.2%
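How stratification alone can produce an apparent protective effect can be sketched with the approximate haplotype frequencies and NIDDM prevalences quoted for the two ancestral populations. This is an illustration under stated assumptions, not a reanalysis of the study: haplotype and disease are assumed independent within each group, and the two groups are assumed to be mixed in equal, hypothetical sizes of 1000.

```python
def two_by_two(hap_freq, disease_prev, n):
    """Expected 2x2 cell counts when haplotype and disease are independent."""
    a = n * hap_freq * disease_prev                # haplotype +, diseased
    b = n * hap_freq * (1 - disease_prev)          # haplotype +, not diseased
    c = n * (1 - hap_freq) * disease_prev          # haplotype -, diseased
    d = n * (1 - hap_freq) * (1 - disease_prev)    # haplotype -, not diseased
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds ratio of disease for haplotype carriers vs. non-carriers."""
    return (a * d) / (b * c)


# Approximate values from the slides; the group size of 1000 is hypothetical.
indian = two_by_two(hap_freq=0.01, disease_prev=0.40, n=1000)
caucasian = two_by_two(hap_freq=0.66, disease_prev=0.15, n=1000)
pooled = tuple(x + y for x, y in zip(indian, caucasian))

or_indian = odds_ratio(*indian)
or_caucasian = odds_ratio(*caucasian)
or_pooled = odds_ratio(*pooled)
print(or_indian, or_caucasian, or_pooled)
```

Within each group the odds ratio is 1 by construction, yet the pooled odds ratio is about 0.36, because the group with the higher haplotype frequency also has the lower disease prevalence.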


Ancestry Informative Markers

Polymorphisms with known allele frequency differences across ancestral groups

Useful in estimating ancestry in admixed individuals

Example: Duffy locus (encodes the Duffy blood group antigen)

One allele has ~100% frequency in sub-Saharan Africans, in contrast to other groups

Protects against P. vivax (malaria)
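One way ancestry informative markers might be used to estimate an individual's ancestry is a simple maximum-likelihood grid search, sketched below. Everything specific here is an assumption for illustration: the function name, the two-ancestral-population model, the marker frequencies, and the genotypes are hypothetical, and markers are treated as independent.

```python
import math


def estimate_ancestry(genotypes, freqs_a, freqs_b, steps=100):
    """Grid-search estimate of the fraction m of ancestry from population A.

    genotypes: copies (0, 1, or 2) of the reference allele at each marker.
    freqs_a, freqs_b: reference allele frequency in populations A and B.
    """
    best_m, best_ll = 0.0, -math.inf
    for k in range(steps + 1):
        m = k / steps
        ll = 0.0
        for g, pa, pb in zip(genotypes, freqs_a, freqs_b):
            f = m * pa + (1 - m) * pb            # expected allele frequency
            f = min(max(f, 1e-9), 1 - 1e-9)      # keep the logarithms finite
            # Binomial log-likelihood, omitting the term that is constant in m.
            ll += g * math.log(f) + (2 - g) * math.log(1 - f)
        if ll > best_ll:
            best_m, best_ll = m, ll
    return best_m


# Hypothetical marker frequencies and genotypes.
freqs_a = [0.95, 0.90, 0.85]
freqs_b = [0.05, 0.10, 0.15]
m_all = estimate_ancestry([2, 2, 2], freqs_a, freqs_b)
m_half = estimate_ancestry([1, 1, 1], freqs_a, freqs_b)
m_none = estimate_ancestry([0, 0, 0], freqs_a, freqs_b)
print(m_all, m_half, m_none)
```

Markers with large frequency differences between the ancestral groups make the likelihood most sensitive to m, which is why such markers are the informative ones.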


Example AIM: Duffy locus

[Figure: bar chart of A and G allele frequencies for rs1814778 (Duffy locus) in U.S. Hispanics, Mayan, U.S. Whites, England, Germany, U.S. Blacks, Central Africa, and Nigeria. Bar values as extracted: 0.15/0.85, 0.99/0.01, 0.77/0.24, 1.00, 1.00, 0.53/0.47, 1.00, 1.00.]

Source: http://www.ncbi.nlm.nih.gov/projects/SNP


Population Inbreeding

Population inbreeding occurs when mating between close relatives is preferred, or when geographic isolation limits the choice of mates in a population. This causes deviations from HWE in the form of a deficit of heterozygotes.


How to quantify the amount of inbreeding in a population?

Inbreeding coefficient, F

The probability that a random individual in the population inherits two copies of the same allele from a common ancestor

F ranges from 0 to 1:

F is low in random mating populations

F is close to 1 in self-fertilizing populations (e.g. plants)
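A common way to estimate F from genotype data, not given on the slide and used here as an assumption, is the relative deficit of heterozygotes: F = 1 - H_obs / H_exp, where H_exp = 2pq is the heterozygosity expected under HWE. A minimal sketch, with an illustrative function name:

```python
def inbreeding_coefficient(n_AA, n_Aa, n_aa):
    """Estimate F as the relative deficit of heterozygotes vs. HWE."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    h_exp = 2 * p * (1 - p)           # expected heterozygosity under HWE
    if h_exp == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    h_obs = n_Aa / n                  # observed heterozygosity
    # The estimate can be negative when heterozygotes are in excess.
    return 1.0 - h_obs / h_exp


f_hat = inbreeding_coefficient(100, 30, 20)   # counts from the HWE test slides
f_hwe = inbreeding_coefficient(49, 42, 9)     # hypothetical counts in HWE proportions
print(f_hat, f_hwe)
```

Applied to the genotype counts from the HWE test example (100, 30, 20), the estimate is about 0.44, consistent with the strong heterozygote deficit seen there; counts in exact HWE proportions give an estimate of 0.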


Kinship & Reproduction: Icelandic couples

[Figure panels: # of children, # of children that reproduce, # of grandchildren, mean lifespan of children]

Helgason, Science, 2008