Population Genetics: Chapter 3 Epidemiology 217 January 16, 2011.

Outline

Allele Frequency Estimation

Hardy-Weinberg equilibrium (HWE)

HWE Game

Population Substructure

Allele Frequency

Diploid, autosomal locus with 2 alleles: A and a

Allele frequency is the fraction:

(No. of copies of a particular allele) / (No. of all alleles in the population)

[Figure: rs1036819 (longevity SNP), frequencies of the A and C alleles in the CEU, CHB, JPT, and YRI populations; x-axis from 0.00 to 1.00. Bar values as listed: 0.86/0.14, 0.64/0.36, 0.53/0.47, 0.93/0.08.]

Allele (Gamete) Frequency

Let p = Freq(A), the frequency of the dominant allele

Let q = Freq(a), the frequency of the recessive allele

Then, p + q = 1

Genotype Frequency

p^2 = frequency of homozygous dominant genotype

q^2 = frequency of homozygous recessive genotype

2pq = frequency of heterozygous genotype

Then, p^2 + 2pq + q^2 = 1

Estimating Allele Frequencies from Genotype Frequencies

Genotypes:   AA    Aa    aa
Frequency:   p^2   2pq   q^2

Frequency of A allele = p^2 + ½ (2pq)

Frequency of a allele = q^2 + ½ (2pq)

Ex. Calculation: Allele Frequencies

Assume N = 200 in each of two populations

Pop 1: 90 AA, 40 Aa, 70 aa (N = 200)
Pop 2: 45 AA, 130 Aa, 25 aa (N = 200)

In Pop 1:
p = 90/200 + ½ (40/200) = 0.45 + 0.10 = 0.55
q = 70/200 + ½ (40/200) = 0.35 + 0.10 = 0.45

In Pop 2:
p = 45/200 + ½ (130/200) = 0.225 + 0.325 = 0.55
q = 25/200 + ½ (130/200) = 0.125 + 0.325 = 0.45
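The calculation above can be reproduced with a short Python sketch. The function name `allele_frequencies` is only illustrative; it counts allele copies directly (two per homozygote, one per heterozygote), which is equivalent to the genotype-frequency formula used on the slide.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) for a biallelic autosomal locus from genotype counts."""
    n = n_AA + n_Aa + n_aa               # number of diploid individuals
    p = (2 * n_AA + n_Aa) / (2 * n)      # each AA carries two A copies, each Aa one
    q = (2 * n_aa + n_Aa) / (2 * n)      # each aa carries two a copies, each Aa one
    return p, q

# Genotype counts from the two example populations
pop1 = allele_frequencies(90, 40, 70)
pop2 = allele_frequencies(45, 130, 25)
print(pop1, pop2)
```

Both calls give the same (p, q) pair even though the genotype counts differ.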

Take home points

p + q = 1 (sum of the allele frequencies = 1)

p^2 + 2pq + q^2 = 1 (sum of the genotype frequencies = 1)

Two populations with markedly different genotype frequencies can have the same allele frequencies

Hardy-Weinberg

The Hardy-Weinberg principle states that both allele and genotype frequencies in a population remain constant (that is, they are in equilibrium) from generation to generation unless specific disturbing influences are introduced.

p^2 + 2pq + q^2 = 1

Hardy-Weinberg Assumptions

Allele frequencies do not vary IF:

Large population

Random mating

No migration in or out

No isolated groups within the population

No mutation

No selection (no allele is advantageous)

Test of Hardy-Weinberg Equilibrium

Observed genotype counts (N = 150):
100 GG
30 AG
20 AA

1. Calculate observed allele & genotype frequencies

Genotype frequencies
GG = 100/150 = 0.67
AG = 30/150 = 0.20
AA = 20/150 = 0.13

Allele frequencies
G alleles = 100*2 + 30 = 230
A alleles = 20*2 + 30 = 70
Total alleles = 300
G afq (p) = 230/300 = 0.77
A afq (q) = 1 - p = 0.23

2. Calculate expected genotype frequencies based on HW: p^2 + 2pq + q^2 = 1

p^2 (GG) = 0.77 * 0.77 = 0.59

2pq (AG) = 2 * 0.77 * 0.23 = 0.35

q^2 (AA) = 0.23 * 0.23 = 0.05

3. Compare expected genotype frequencies to observed frequencies

Genotype   Expected   Observed
GG         0.59       0.67
AG         0.35       0.20
AA         0.05       0.13

Chi-square test = Σ (observed - expected)^2 / expected, computed on genotype counts (N = 150)

= 29.17 with 1 degree of freedom

p = 6.6 x 10^-8, so the sample is out of H-W equilibrium
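The three steps of the test can be sketched in Python using only the standard library. The function name `hwe_chi_square` is illustrative, and the p-value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x/2)); a statistics package would normally be used instead.

```python
from math import erfc, sqrt

def hwe_chi_square(n_GG, n_AG, n_AA):
    """Chi-square goodness-of-fit test against Hardy-Weinberg proportions."""
    n = n_GG + n_AG + n_AA
    p = (2 * n_GG + n_AG) / (2 * n)          # observed G allele frequency
    q = 1 - p                                # observed A allele frequency
    observed = [n_GG, n_AG, n_AA]
    # Expected genotype counts under HWE: p^2, 2pq, q^2 times sample size
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))           # upper tail, 1 degree of freedom
    return p, chi2, p_value

p_G, chi2, p_value = hwe_chi_square(100, 30, 20)
print(round(p_G, 2), round(chi2, 2), p_value)
```

The statistic is computed on counts rather than on the rounded frequencies in the table, which is how the value 29.17 is obtained.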

HWE can be easily expanded to account for any number of alleles at a locus

3 allele case (p1, p2, p3)

Allele frequencies: p1 + p2 + p3 = 1

Genotype frequencies: p1^2 + p2^2 + p3^2 + 2p1p2 + 2p1p3 + 2p2p3 = 1

4 allele case (p1, p2, p3, p4)

Allele frequencies: p1 + p2 + p3 + p4 = 1

Genotype frequencies: p1^2 + p2^2 + p3^2 + p4^2 + 2p1p2 + 2p1p3 + 2p1p4 + 2p2p3 + 2p2p4 + 2p3p4 = 1
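The multi-allele expansion follows one pattern: squared terms for homozygotes and 2·pi·pj for each heterozygote. A minimal Python sketch is below; the function name and the four example frequencies are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement

def hwe_genotype_frequencies(allele_freqs):
    """Expected HWE genotype frequencies for any number of alleles.

    allele_freqs maps allele name -> frequency; frequencies must sum to 1.
    """
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    expected = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        if a == b:
            expected[(a, b)] = allele_freqs[a] ** 2                  # homozygote: pi^2
        else:
            expected[(a, b)] = 2 * allele_freqs[a] * allele_freqs[b]  # heterozygote: 2*pi*pj
    return expected

# Hypothetical 4-allele locus
four = hwe_genotype_frequencies({"A1": 0.4, "A2": 0.3, "A3": 0.2, "A4": 0.1})
print(len(four), sum(four.values()))
```

With four alleles there are ten genotypes (four homozygotes and six heterozygotes), and their expected frequencies sum to 1.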

Application of Hardy-Weinberg Equilibrium

For genetic association studies:

Used as QC measure to assess the accuracy of the genotyping method

Expect SNPs to be in HWE among control populations (ethnic-specific)

Violations of HWE could indicate genotyping errors or bias in data

HWE Game

1. Everyone receives ~5 pairs of cards

2. Two allele model: Red (R allele) & Black (B allele)

3. Random Mating: Exchange one card from each pair with another person (keep cards face down)

4. Determine genotype frequency: RR, RB, BB

5. Determine allele frequency: R, B
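The game can also be simulated. The sketch below is a simplification, not the classroom procedure itself: instead of each person swapping one card per pair, it assumes all cards are shuffled together and re-paired at random, which is one way to model a round of random mating. The function name, card counts, and seed are hypothetical.

```python
import random
from collections import Counter

def play_hwe_game(n_red, n_black, seed=0):
    """Shuffle all cards and deal them into random pairs (genotypes)."""
    cards = ["R"] * n_red + ["B"] * n_black
    if len(cards) % 2:
        raise ValueError("need an even number of cards")
    rng = random.Random(seed)                # seeded so a run can be repeated
    rng.shuffle(cards)
    # Each consecutive pair of cards is one genotype: RR, RB, or BB
    pairs = [cards[i:i + 2] for i in range(0, len(cards), 2)]
    genotypes = Counter("".join(sorted(pair, reverse=True)) for pair in pairs)
    n_pairs = len(pairs)
    # Allele frequency of R recovered from the genotype counts
    freq_R = (2 * genotypes["RR"] + genotypes["RB"]) / (2 * n_pairs)
    return genotypes, freq_R

genotypes, freq_R = play_hwe_game(n_red=60, n_black=40)
p = 60 / 100
expected = {"RR": p * p, "RB": 2 * p * (1 - p), "BB": (1 - p) ** 2}
observed = {g: genotypes[g] / 50 for g in expected}
print(observed, expected, freq_R)
```

The allele frequency is unchanged by the shuffle, while the observed genotype frequencies scatter around the HWE expectations, more tightly as the number of pairs grows.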

Population Stratification

Population stratification is a form of confounding in genetic studies where a gene under study shows marked variation in allele frequency across subgroups of a population and these subgroups differ in their baseline risk of disease.

Population Stratification: Confounding

[Diagram: confounding. General case: Exposure of Interest, True Risk Factor, Disease. Genetic case: Genotype of Interest, Ethnicity, True Risk Factor, Disease.]

Wacholder, JNCI, 2000

Population Stratification: Gm3;5,13,14 in admixed sample of Native Americans of the Pima and Papago tribes

Study Population: 4,290 Pima and Papago Indians

Genetic Variant: Gm3;5,13,14 haplotype (Gm system of human immunoglobulin G)

Outcome: Type 2 diabetes

Question: Is the Gm3;5,13,14 haplotype associated with Type 2 diabetes?

Knowler, AJHG, 1988

Unadjusted for ethnic background: OR = 0.27 (95% CI 0.18-0.40)

Gm3;5,13,14 haplotype   Cases    Controls
+                       7.80%    29.00%
-                       92.20%   71.00%

Full heritage American Indian population
Gm3;5,13,14:  + ~1%   - ~99%
NIDDM prevalence ~40%

Caucasian population
Gm3;5,13,14:  + ~66%   - ~34%
NIDDM prevalence ~15%

Adjusted for ethnic background: OR = 0.83 (95% CI 0.58-1.18)

Index of Indian heritage   Gm3;5,13,14 haplotype   % Diabetes
0                          65.8%                   18.5%
4                          42.1%                   28.5%
8                          1.6%                    39.2%
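How pooling two groups can manufacture an association is easy to see numerically. The Python sketch below uses the approximate haplotype frequencies and NIDDM prevalences quoted for the two parental populations, and adds two assumptions that are not from the study: the groups are mixed 50/50 (1,000 people each), and within each group the haplotype has no effect on disease. It is an illustration of confounding, not a reanalysis of the published data.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = hap+ diseased, b = hap+ healthy,
    c = hap- diseased, d = hap- healthy."""
    return (a * d) / (b * c)

def stratum_table(n, hap_freq, prevalence):
    """Expected cell counts when haplotype and disease are independent."""
    a = n * hap_freq * prevalence
    b = n * hap_freq * (1 - prevalence)
    c = n * (1 - hap_freq) * prevalence
    d = n * (1 - hap_freq) * (1 - prevalence)
    return a, b, c, d

# Assumed equal-sized strata; frequencies are the approximate slide values
full_heritage = stratum_table(1000, hap_freq=0.01, prevalence=0.40)
caucasian = stratum_table(1000, hap_freq=0.66, prevalence=0.15)
pooled = tuple(x + y for x, y in zip(full_heritage, caucasian))

or_full = odds_ratio(*full_heritage)   # no association within the stratum
or_cauc = odds_ratio(*caucasian)       # no association within the stratum
or_crude = odds_ratio(*pooled)         # association appears after pooling
print(or_full, or_cauc, round(or_crude, 2))
```

Under these assumptions both stratum-specific odds ratios are 1, yet the pooled odds ratio is well below 1, so the haplotype appears protective only because it marks ancestry.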

Ancestry Informative Markers

Polymorphisms with known allele frequency differences across ancestral groups

Useful in estimating ancestry in admixed individuals

Example: Duffy locus (codes for a blood group)

Allele frequency ~100% in sub-Saharan Africans vs. other groups

Protects against P. vivax (malaria)

Example AIM: Duffy locus

[Figure: rs1814778 (Duffy locus), frequencies of the A and G alleles in U.S. Hispanics, Mayan, U.S. Whites, England, Germany, U.S. Blacks, Central Africa, and Nigeria; x-axis from 0.00 to 1.00. Bar values as listed: 0.15/0.85, 0.99/0.01, 0.77/0.24, 1.00, 1.00, 0.53/0.47, 1.00, 1.00.]

http://www.ncbi.nlm.nih.gov/projects/SNP

Population Inbreeding

Population inbreeding occurs when close relatives mate preferentially or when a population is geographically isolated. It causes deviations from HWE by producing a deficit of heterozygotes.

How to quantify the amount of inbreeding in a population?

Inbreeding coefficient, F

The probability that a random individual in the population inherits two copies of the same allele from a common ancestor (identical by descent)

F ranges from 0 to 1:

F is low in random mating populations

F is close to 1 in self-fertilizing populations (e.g., plants)
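The link between F and the heterozygote deficit can be made concrete. The sketch below uses the standard textbook relations, which are not stated on the slides: with inbreeding coefficient F the genotype frequencies become p^2 + Fpq, 2pq(1 - F), and q^2 + Fpq, so F can be estimated as 1 - (observed heterozygosity / expected heterozygosity). Function names and example values are hypothetical.

```python
def genotype_freqs_with_inbreeding(p, F):
    """Genotype frequencies at a biallelic locus given inbreeding coefficient F."""
    q = 1 - p
    return {
        "AA": p * p + F * p * q,     # homozygote excess
        "Aa": 2 * p * q * (1 - F),   # heterozygote deficit
        "aa": q * q + F * p * q,     # homozygote excess
    }

def inbreeding_coefficient(observed_het, p):
    """Estimate F from observed heterozygosity and allele frequency p."""
    expected_het = 2 * p * (1 - p)   # HWE heterozygosity
    return 1 - observed_het / expected_het

# Hypothetical example: p = 0.5 and F = 0.25
freqs = genotype_freqs_with_inbreeding(0.5, 0.25)
F_hat = inbreeding_coefficient(freqs["Aa"], 0.5)
print(freqs, F_hat)
```

Setting F = 0 recovers the HWE proportions, and F = 1 removes heterozygotes entirely, matching the two ends of the range described above.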

Helgason, Science, 2008

Kinship & Reproduction: Icelandic couples

[Figure panels: # of children; # of children that reproduce; # of grandchildren; mean lifespan of children]