Evolution and conservation genetics

Neutral model of evolution

What governs heterozygosity levels?

Neutral model of drift and mutation
• Single population
• Constant size
• Drift occurs at rate 1/2N per generation
• Mutation creates new or alternative alleles and prevents fixation of alleles

What model of mutation does a gene locus follow under the neutral model?

Infinite Alleles Model

Stepwise-Mutation Model

Average protein contains about 300 amino acids (900 nucleotides)

Mutations always occur to new alleles

Finite population size (drift): how is loss of alleles due to drift balanced by new mutations?

Infinite Alleles Model (IAM)(Crow-Kimura Model)

$4^{900} \approx 10^{542}$ possible alleles for a gene of 900 nucleotides

Do allozymes really fall under a mutation-drift process?

What is the equilibrium heterozygosity predicted by IAM?

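$$F_t = \left[\frac{1}{2N_e} + \left(1 - \frac{1}{2N_e}\right)F_{t-1}\right](1-\mu)^2$$

F = probability that two alleles are both copies of the same ancestral allele (identical by descent, IBD)

$1/(2N_e)$: probability that two alleles are copies of the same allele in the previous generation (newly IBD), before mutation is considered

$(1 - 1/(2N_e))\,F_{t-1}$: probability that the two alleles are copies of different alleles in the previous generation that were already IBD

$(1-\mu)^2$: both alleles do not mutate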

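At equilibrium then… $F_t = F_{t-1} = F$, and dropping the small terms in $\mu^2$ and $\mu/N_e$:

$$F = \frac{1}{4N_e\mu + 1}$$

But we have two measures of homozygosity; both measure the same thing and thus equal each other:

$$\sum_i p_i^2 = F = \frac{1}{4N_e\mu + 1}$$

If H = 1 - F, then what is H at a mutation-drift equilibrium? Can you derive this?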
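A numerical check can stand in for the algebra. The sketch below assumes the infinite-alleles recursion F_t = [1/(2Ne) + (1 - 1/(2Ne)) F_{t-1}] (1 - μ)^2 and iterates it until it settles; the values Ne = 1000 and μ = 1e-4 and the function names are illustrative assumptions, not values from the slides.

```python
def iterate_f(n_e, mu, generations, f0=0.0):
    """Iterate F_t = [1/(2Ne) + (1 - 1/(2Ne)) * F_{t-1}] * (1 - mu)^2."""
    f = f0
    for _ in range(generations):
        f = (1 / (2 * n_e) + (1 - 1 / (2 * n_e)) * f) * (1 - mu) ** 2
    return f


def equilibrium_f(n_e, mu):
    """Approximate equilibrium value F = 1 / (4*Ne*mu + 1)."""
    return 1 / (4 * n_e * mu + 1)


# Hypothetical parameter values, chosen only for illustration.
f_iterated = iterate_f(n_e=1000, mu=1e-4, generations=20000)
f_approx = equilibrium_f(n_e=1000, mu=1e-4)
print(f_iterated, f_approx)
```

The two numbers should agree closely but not exactly, because the closed form ignores terms in μ² and μ/Ne.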

Heterozygosity at a mutation-drift equilibrium, given an IAM, is…

$$H = \frac{4N_e\mu}{4N_e\mu + 1}$$

[Figure: equilibrium heterozygosity (y-axis, 0.0 to 1.2) against population size (x-axis, 1e+1 to 1e+7), one curve each for μ = 0.001, μ = 10^-5, and μ = 10^-7]

When mutation rates are high and population size is held constant: higher equilibrium heterozygosity.

When mutation rates are held constant then as population size increases: higher equilibrium heterozygosity
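Both trends can be read off a small table. The sketch below assumes the infinite-alleles equilibrium H = 4Neμ / (4Neμ + 1) and evaluates it for the three mutation rates in the figure legend; the population sizes in the loop and the name `h_iam` are illustrative assumptions.

```python
def h_iam(n_e, mu):
    """Equilibrium heterozygosity under the infinite alleles model."""
    theta = 4 * n_e * mu
    return theta / (theta + 1)


# Mutation rates from the figure legend; population sizes are illustrative.
for mu in (1e-3, 1e-5, 1e-7):
    row = [round(h_iam(n, mu), 3) for n in (10, 1000, 100000, 10000000)]
    print(mu, row)
```

Each row rises from left to right (larger population, higher H), and rows with higher μ sit above rows with lower μ.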

Stepwise-mutation model (SMM)(Ohta and Kimura)

Generated by slipped-strand mispairing; mutations occur only to adjacent allelic states (one repeat more or one repeat fewer).

Mutation can produce alleles already present in the population.

Expect the equilibrium level of heterozygosity under SMM to be lower than that under IAM.

$$H = 1 - \frac{1}{\sqrt{1 + 8N_e\mu}}$$
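To compare the two models side by side, the sketch below assumes the equilibrium forms H = 4Neμ / (4Neμ + 1) for the IAM and H = 1 - 1/sqrt(1 + 8Neμ) for the SMM. The parameter values and function names are illustrative assumptions.

```python
import math


def h_iam(n_e, mu):
    """Equilibrium heterozygosity, infinite alleles model."""
    return 4 * n_e * mu / (4 * n_e * mu + 1)


def h_smm(n_e, mu):
    """Equilibrium heterozygosity, stepwise mutation model."""
    return 1 - 1 / math.sqrt(1 + 8 * n_e * mu)


# Illustrative values only: same Ne and mu fed to both models.
for n_e in (100, 1000, 10000):
    print(n_e, round(h_iam(n_e, 1e-3), 3), round(h_smm(n_e, 1e-3), 3))
```

For any positive Neμ the SMM value is the smaller of the two, because stepwise mutation can recreate alleles that already exist.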

Genetic diversity and population size

What is the effect of “finite” population size on gene frequencies

The various ways to mathematically study it

Effective population size

Drift defined

Random changes of gene frequencies among generations

More important with
• Small population sizes
• Fluctuation in population size
• Low selection and migration
• Long time periods

A simple simulation of drift: “replicated outcomes” (mean frequency is dotted)
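A simulation of this kind can be sketched in a few lines. The version below is a minimal Wright-Fisher sketch, not the code behind the figure: each generation draws 2N allele copies at random from the previous generation. The population size, starting frequency, number of replicates, and seed are all illustrative assumptions.

```python
import random


def simulate_drift(n, p0, generations, rng):
    """Allele-frequency trajectory for one population of n diploid individuals."""
    copies = 2 * n
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Binomial sampling: each of the 2N copies is the focal allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)  # fixed seed so the run is repeatable
replicates = [simulate_drift(n=16, p0=0.5, generations=20, rng=rng) for _ in range(8)]
mean_final = sum(t[-1] for t in replicates) / len(replicates)
print([round(t[-1], 2) for t in replicates], round(mean_final, 2))
```

Individual replicates wander apart while their mean stays comparatively close to the starting frequency, which is the pattern the dotted line in the figure summarizes.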

Buri’s (1956) classic genetic drift experiment showing the number of wildtype versus neutral mutant alleles in populations of 16 Drosophila followed through time: “gene frequency distribution”

Generalized effect of drift

• Allele frequencies do not change (much) on the landscape scale
• Within populations, drift decreases genetic variance
• Between populations, drift increases genetic variance

Consider the following to simply illustrate the principle: in a Buri-like experiment on 4 lines of n=4 hermaphrodite snails, the frequency of an albinism allele was as follows at generations 2 and 6.

[Figure: two panels, Generation 2 and Generation 6; x-axis 0 to 8 in each panel. Generation 2: variance = 2.67. Generation 6: variance = 21.33]
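The per-line counts behind the figure did not survive in this transcript. The sketch below therefore uses hypothetical counts of the albinism allele (out of 2n = 8 copies per line), chosen only because they give the two sample variances quoted for the panels; they are not the original data.

```python
import statistics

# Hypothetical allele counts in the 4 lines (NOT the original data).
gen2_counts = [2, 4, 4, 6]
gen6_counts = [0, 0, 8, 8]

mean2 = statistics.mean(gen2_counts)
mean6 = statistics.mean(gen6_counts)
var2 = float(statistics.variance(gen2_counts))  # sample variance among lines
var6 = float(statistics.variance(gen6_counts))

print(mean2, round(var2, 2))
print(mean6, round(var6, 2))
```

With these assumed counts the mean across lines is the same in both generations while the variance among lines grows, which is the general effect of drift described above.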

Observed vs. expected changes of mean and variance of gene frequency

Loss of heterozygosity due to drift

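Heterozygosity in Buri’s populations of 16 flies declined about as fast as expected for an ideal population of size nine (the effective size)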

Effective population size

Governs random change of gene frequency, p

Depends on several factors
• All those that reduce the size of the breeding population

Ne = number of individuals in an ideal population which has the same magnitude of genetic drift as the actual population.

Wright-Fisher model

Assume that the number of offspring is distributed as a Poisson variable with mean = 2 and variance = 2

In this case, Ne = N

No selection, Random mating, random number of offspring

Factors reducing N to Ne

• Only adults of reproductive age count
• Sex ratio
• Variation in size over time
• Variation in offspring number
• Inbreeding (self-fertilization)

Factors reducing N to Ne -- 1

Ne usually less than census population size

Non-breeding individuals do not contribute juveniles

• “bachelor males”
• post-reproductives


different number of breeding individuals in the two sexes – one sex represented by a small number of breeding individuals

example:

Captive bred animals – only one male used for breeding

Different numbers of males and females

Unequal sex ratio

Analogous to having two different population sizes

The effective population size is strongly influenced by the rarer of the two sexes.
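The formula for this slide is missing from the transcript. The sketch below assumes the standard textbook expression Ne = 4·Nm·Nf / (Nm + Nf), where Nm and Nf are the numbers of breeding males and females; the function name and the example numbers are illustrative.

```python
def ne_sex_ratio(n_males, n_females):
    """Effective size with unequal numbers of breeding males and females."""
    return 4 * n_males * n_females / (n_males + n_females)


# Illustrative cases: an even sex ratio versus a single breeding male.
print(ne_sex_ratio(50, 50))
print(ne_sex_ratio(1, 99))
```

With one breeding male, Ne stays below 4 however many females are added, which is why the rarer sex dominates.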


Variation in number of offspring produced by different individuals

Ne smaller when offspring numbers are more unequal

Ne can be larger when variation in offspring number is reduced

V is the variance of reproductive success

What is upper limit for effective population size?

$$N_e = \frac{4N - 2}{V + 2}$$
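The question about the upper limit can be explored numerically. The sketch below assumes the expression Ne = (4N - 2) / (V + 2), with V the variance in offspring number; N = 100 and the function name are illustrative assumptions.

```python
def ne_offspring_variance(n, v):
    """Effective size given census size n and variance v in offspring number."""
    return (4 * n - 2) / (v + 2)


# Illustrative census size; V = 2 is the Poisson (Wright-Fisher) case.
for v in (0, 2, 10):
    print(v, ne_offspring_variance(100, v))
```

With V = 2 the result is essentially N, and with V = 0 (every individual leaves exactly two offspring) it reaches 2N - 1, so Ne can be almost twice the census size.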


• Variation of population size in different generations

• Consider the effect on loss of variation caused by the specific population sizes in generations 1, 2, 3, ..., t.

harmonic mean: occasional severe reductions in population size will predominate over long stretches of stable large population size in reducing variability

N=1000, 10, 1000
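The example sizes can be worked through directly. The sketch below assumes the usual definition of long-term Ne as the harmonic mean of the per-generation sizes, 1/Ne = (1/t) Σ 1/N_i, and applies it to the three generations listed.

```python
import statistics

sizes = [1000, 10, 1000]  # population size in three successive generations

ne_harmonic = statistics.harmonic_mean(sizes)
arithmetic = statistics.mean(sizes)

print(round(ne_harmonic, 1), arithmetic)
```

The harmonic mean is roughly 29, far below the arithmetic mean of 670: the single generation at N = 10 dominates.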


Self-fertilization causes increases of homozygosity (most extreme form of close inbreeding, or mating between relatives)

f = fraction of loci in which both alleles are copies of an immediate ancestor

Ne = N / (1+f)

Effective size in continuous populations

What if there is one population, mating occurs between nearby individuals, and progeny are dispersed a short distance?

“Neighborhood size” (Wright 1943): number of individuals within which 95% of the alleles derive from the previous generation (twice the standard deviation of gene flow in one direction, … don’t worry about the formula…)

Mainly applied to plants, Ne= 500-1000; why?

Estimation of effective population size

Demographic data (variance of number of offspring, variation of population size): direct… but usually difficult to obtain

Can use genetic data:
• reconstruct parentage of current population (paternity analysis, in a few weeks)
• temporal changes of gene frequency (need to separate from sampling variance)
• heterozygote excess, between few parents (only applicable to very small populations)

Heterozygosity vs. allele number as indicators of variation

rarer genes lost faster than predicted by heterozygosity model!

Predicted reduction of     Observed reduction of
H = (1/N)                  Na = (8-n)/8
______________________________________________
0.001                      0.000
0.005                      0.024
0.050                      0.518
0.100                      0.664
0.500                      0.831

Rarer alleles are lost in bigger bottlenecks

Bottlenecks and founding effects

These are special cases of genetic drift

Especially important in conservation genetics

The Founder effect

• New populations often started by small numbers of migrants (analogous to bottleneck)
• Carry only a fraction of the genetic variability of the parental population
• New populations tend to differ randomly both from the parent population and from each other, tend to be “inbred”
• Applies to:
   • Invasive species
   • Island colonists
• Examples…
   • Amish of Lancaster Co., PA (Ellis-van Creveld syndrome)
   • Pirates of Pitcairn Isle

The Cheetah bottleneck

• 15,000 to 20,000 cats in the wild
• All sampled cheetah share the same allozymes (Cohn 1996): homozygosity of 100%, population 0% polymorphic
• For genes mediating immune response, foreign skin is recognized as their own
• Why? Two bottlenecks: 10,000 years ago and another in the last two centuries
• Work of Stephen J. O’Brien and collaborators (Cat genome project)

Solid line: N=2

Dotted line: N=10

Intrinsic rate of growth affects H after a bottleneck

Loss of alleles mainly depends on bottleneck size, not rate of growth following bottleneck

Genetics 144: 2001-2014 (December, 1996)

Heterozygosity excess: difference between the observed heterozygosity and the heterozygosity expected from the observed number of alleles.

Journal of Heredity, 1998

Data from real populations

Inference of colonization history: the Northern elephant seal

• Formerly ranged from Mexico to California
• Hunted and collected to death
• Few survivors on Isla Guadalupe, Mexico (10-100?)
• Currently 200,000, many in Central/Southern Calif.
• How small was the bottlenecked population?

Attempted reconstruction of the bottleneck of Northern Elephant Seals

Currently, two mitochondrial DNA haplotypes have frequency 0.27, 0.73, giving He=0.40

Museum sample of pre-bottleneck samples gave He=0.80

Use Ht = H0 (1 - 1/(2Ne1)) (1 - 1/(2Ne2)) … (1 - 1/(2Net))
• This allows Ne to increase following the bottleneck (1922-1960)
• Rate of increase about 1.7 per generation
• Allows population to grow from 15 to 200,000 in 38 years

One-generation bottleneck of 15 gives H0 = .80, H1 = .59, H2 = .50, H3 = .45… to H = .40 very shortly
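The calculation can be sketched as a running product. The code below applies the diploid form Ht = H0 Π (1 - 1/(2Ne_i)) with Ne growing by a fixed factor each generation, and also computes He from the two haplotype frequencies. It makes no attempt to reproduce the H1, H2, H3 values quoted on the slide, which imply a steeper loss than this form gives for Ne = 15, presumably because mitochondrial DNA has a smaller effective size. The function names and the choice of five generations are assumptions.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele (or haplotype) frequencies."""
    return 1 - sum(p * p for p in freqs)


def growing_ne(ne0, rate, generations):
    """Effective sizes for successive generations, growing by `rate` each generation."""
    return [ne0 * rate ** t for t in range(generations)]


def heterozygosity_through_time(h0, ne_series):
    """Apply H_t = H_{t-1} * (1 - 1/(2*Ne_t)) for each Ne in the series."""
    h = h0
    trajectory = [h]
    for ne in ne_series:
        h *= 1 - 1 / (2 * ne)
        trajectory.append(h)
    return trajectory


h_now = expected_heterozygosity([0.27, 0.73])
trajectory = heterozygosity_through_time(0.80, growing_ne(15, 1.7, 5))
print(round(h_now, 3), [round(h, 3) for h in trajectory])
```

Because Ne grows quickly, almost all of the loss happens in the first few generations after the bottleneck, after which H levels off.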

But microsatellites don’t show such a reduction of diversity, why?

Inbreeding due to small population size

– Has predictable consequences for allele frequencies and genotype frequencies:

• Increases the frequency of homozygous genotypes

– Similar in effect to:
• Genetic drift
• Variation in population size over time
• Skewed sex ratios, etc.

– Two “kinds” of inbreeding:
• nonrandom: self-fertilization
• random

Random inbreeding

Populations enter a positive feedback loop

– Inbreeding depression increases, population size decreases

– Effect of drift increases: deleterious mutations become fixed

– As deleterious mutations become fixed, inbreeding depression increases

–Maybe the population dies!

Mutational meltdown?

Among-population gene diversity

Within populations (so far)

Between populations

Genetic variation in space and time in populations

• Genetic structure of populations and frequency of alleles varies in space or time

• Space:

Allele frequency clines in the blue mussel.

Variation across time: temporal variation in a prairie vole (Microtus ochrogaster) esterase gene.

Measuring Genetic Differentiation: Fst

Fst = normalized variance in allele frequencies among populations
Fst = Var(p)/[p*(1-p*)], where Var(p) is the variance in the frequency of allele p among populations and p* is the observed mean allele frequency across populations

Or Fst = the relative reduction in gene diversity in a single population compared to pooling all populations
Fst = (Ht - Hs)/Ht, where Ht is the expected heterozygosity for a pooled sample of alleles and Hs is the average expected heterozygosity within each sub-population

[Figure: Population 1, all AA individuals, P(A) = 1, P(a) = 0; Population 2, all aa individuals, P(A) = 0, P(a) = 1]

Population 1: $H_s = 1 - \sum p_i^2 = 1 - (1^2 + 0^2) = 0$
Population 2: $H_s = 0$

Mean $H_s$ = 0

Average gene frequencies are pA* = 0.5, pa* = 0.5

Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.5^2 + 0.5^2) = 0.5$

Fst = (Ht-Hs)/Ht = (0.5-0)/0.5 = 1

[Figure: Population 1 and Population 2, all Aa individuals; P(A) = 0.5, P(a) = 0.5 in each]

Population 1: $H_s = 1 - \sum p_i^2 = 1 - (0.5^2 + 0.5^2) = 0.5$
Population 2: $H_s = 0.5$

Mean $H_s$ = 0.5

Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.5^2 + 0.5^2) = 0.5$

Fst = (Ht-Hs)/Ht = (0.5-0.5)/0.5 = 0


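[Figure: Population 1 (pA = 0.8, pa = 0.2) and Population 2 (pA = 0.5, pa = 0.5), individuals shown as AA, Aa, and aa genotypes]

Population 1: $H_s = 1 - \sum p_i^2 = 1 - (0.8^2 + 0.2^2) = 0.32$
Population 2: $H_s = 0.5$

Mean $H_s$ = 0.41

Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.65^2 + 0.35^2) = 0.455$

Fst = (Ht-Hs)/Ht = (0.455-0.41)/0.455 = 0.0989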
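The three worked examples can be checked with a short script. The sketch below computes Fst = (Ht - Hs)/Ht from per-population allele frequencies, weighting populations equally, and also from the variance form Var(p)/[p*(1 - p*)]; the function names are illustrative assumptions.

```python
def expected_h(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def fst(populations):
    """Fst = (Ht - Hs) / Ht for a list of per-population allele-frequency lists."""
    k = len(populations)
    hs = sum(expected_h(f) for f in populations) / k
    n_alleles = len(populations[0])
    mean_freqs = [sum(f[i] for f in populations) / k for i in range(n_alleles)]
    ht = expected_h(mean_freqs)
    return (ht - hs) / ht


def fst_from_variance(p_values):
    """Fst = Var(p) / [p*(1 - p*)] for one allele at a two-allele locus."""
    k = len(p_values)
    p_bar = sum(p_values) / k
    var_p = sum((p - p_bar) ** 2 for p in p_values) / k
    return var_p / (p_bar * (1 - p_bar))


print(fst([[1, 0], [0, 1]]))                    # fixed for different alleles
print(fst([[0.5, 0.5], [0.5, 0.5]]))            # identical frequencies
print(round(fst([[0.8, 0.2], [0.5, 0.5]]), 4))  # intermediate case
print(round(fst_from_variance([0.8, 0.5]), 4))  # same case, variance form
```

The last two lines give the same value, which illustrates that the two definitions of Fst agree for a two-allele locus.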


Summary: Wright’s FST

Reduction of heterozygosity compared to random mating

Fst = inbreeding in subpopulation due to differences from other subpopulations

FST = (HT - HS) / HT = Var(p)/[p(1-p)]

Wright’s F statistics

Separate components of genetic variation into a hierarchy. How much genetic variation is contained in
• a subpopulation compared to region
• a region compared to total
• a subpopulation compared to total

Partition of Wright’s F

In general sense, F is the probability that two alleles share a common ancestor (identity by descent)

• Total F = Fit (individual-total)
• Local F = Fis (individual-subpopulation)
• Regional F = Fst (subpopulation-total)

Fit = Fis + (1 - Fis) Fst

If it ain’t locally inbred, then maybe it is regionally

Fundamental concept; can be defined for any number of levels
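A small numerical sketch of the partition Fit = Fis + (1 - Fis) Fst follows. It also checks the equivalent product form (1 - Fit) = (1 - Fis)(1 - Fst), which is the same identity rearranged; the input values and function name are illustrative assumptions.

```python
def f_it(f_is, f_st):
    """Total F from local (Fis) and regional (Fst) components."""
    return f_is + (1 - f_is) * f_st


# Illustrative values only.
total = f_it(f_is=0.2, f_st=0.1)
print(round(total, 3), round((1 - 0.2) * (1 - 0.1), 3))
```

With no local inbreeding (Fis = 0), the total reduces to Fst alone, which matches the remark about regional inbreeding.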

Stepwise mutation model (for SSRs=simple sequence repeats=microsatellites)

Mutation is a progressive change so fragments that migrate similar distances have had few mutations.

In the case of SSRs, mutation is assumed to change the number of repeats, increasing or decreasing step by step.

The square of the difference in the number of repeats between 2 microsatellites is proportional to the time of divergence from a common ancestor.

Partitioning variation of SSRs: Rst - differentiation based on variance in allele sizes between populations (Slatkin 1995)

Microsatellite analog of Fst that explicitly takes into account mutational differences among alleles

Rst = (S - Sw)/S, where S is the average squared difference in size of all alleles and Sw is the average sum of squares of the differences in allele sizes within each population

Analogous to Fst = (Ht-Hs)/Ht

Assumes step-wise mutation model and weights differences between alleles by size (= # repeats) differences
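The definition Rst = (S - Sw)/S can be sketched directly from allele sizes. The code below is a simplified reading of that definition, taking S as the mean squared size difference over all pairs of alleles pooled across populations and Sw as the average of the same quantity within each population. It is not Slatkin's published estimator with its sample-size corrections, and the allele sizes are hypothetical.

```python
from itertools import combinations


def mean_sq_diff(alleles):
    """Mean squared difference in repeat number over all pairs of alleles."""
    pairs = list(combinations(alleles, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def rst(populations):
    """Rst = (S - Sw) / S from lists of allele sizes, one list per population."""
    pooled = [a for pop in populations for a in pop]
    s = mean_sq_diff(pooled)
    s_w = sum(mean_sq_diff(pop) for pop in populations) / len(populations)
    return (s - s_w) / s


# Hypothetical repeat counts for two populations.
print(rst([[10, 10], [12, 12]]))
print(round(rst([[10, 11], [14, 15]]), 3))
```

Populations fixed for different allele sizes give Rst = 1, and the value drops as within-population size variation grows relative to the total.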

Issues with the use of the marker type (Hedrick. 1999. Evolution 53:313-318)

High level of variation constrains maximum value of Fst that is possible
• Max Fst < 1 - Hs, or the observed level of homozygosity

Complicates interpretation of significance of Fst values

Biological significance of statistically significant but small values of Fst (e.g. 0.01) from microsatellite data

Genetic distance

Measures the genetic “difference” between populations; alternative to variance partitioning

Proportional to the time of separation from a common ancestor
• Between-population distance increases with time
• Due to genetic drift, mutation

Four major models:
• Mutation to infinite alleles: isozymes, sometimes microsatellites
• Stepwise mutation: microsatellites
• Genetic drift causes random changes of gene frequencies
• Mutation in the nucleotide sequence

Expected homozygosities within and between populations

Two populations, "x" and "y"
• $J_x$ = probability that two alleles from population x are the same (expected homozygosity) = $\sum_i p_{ix}^2$
• $J_y$ likewise defined for population y
• $J_{xy}$ = probability that two alleles chosen from different populations x and y are the same = $\sum_i p_{ix} p_{iy}$

Nei's gene identity: $I = J_{xy}/\sqrt{J_x J_y}$
• Analogous to a correlation coefficient
• With multiple loci, take average of Jx, Jy, Jxy over loci

Nei’s genetic distance: $D = -\ln(I)$
• Increases linearly with time under infinite allele mutation model
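For a single locus, the identity and distance can be computed in a few lines. The sketch below takes allele frequencies for two populations and returns I = Jxy / sqrt(Jx·Jy) and D = -ln(I); with several loci, Jx, Jy, and Jxy would be averaged over loci before forming I, which this sketch does not do. The frequencies and function names are illustrative assumptions.

```python
import math


def nei_identity(px, py):
    """Nei's gene identity for one locus from two allele-frequency lists."""
    jx = sum(p * p for p in px)
    jy = sum(p * p for p in py)
    jxy = sum(a * b for a, b in zip(px, py))
    return jxy / math.sqrt(jx * jy)


def nei_distance(px, py):
    """Nei's genetic distance D = -ln(I)."""
    return -math.log(nei_identity(px, py))


# Illustrative allele frequencies for one locus with two alleles.
print(round(nei_identity([0.8, 0.2], [0.8, 0.2]), 3))
print(round(nei_distance([1.0, 0.0], [0.5, 0.5]), 3))
```

Identical frequencies give I = 1 and D = 0; D grows as the populations share fewer alleles.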

Genetic distance: infinite allele model of Nei

Genetic distance: stepwise mutation model

Based on squared difference of mean allele size

• $\mu_x$ = mean allele size for population x
• $\mu_y$ = mean allele size for population y
• $(\delta\mu)^2 = (\mu_x - \mu_y)^2$
• Take average over multiple loci
• Increases linearly with time with stepwise mutation
• Highly dependent on allele size distribution

Often Nei’s infinite allele model better
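The stepwise-model distance is simple enough to sketch. The code below computes the squared difference between mean allele sizes at each locus and averages over loci; the repeat counts and the function name are hypothetical.

```python
def delta_mu_squared(loci_x, loci_y):
    """Average over loci of the squared difference in mean allele size.

    loci_x, loci_y: one list of allele sizes (repeat counts) per locus.
    """
    total = 0.0
    for alleles_x, alleles_y in zip(loci_x, loci_y):
        mean_x = sum(alleles_x) / len(alleles_x)
        mean_y = sum(alleles_y) / len(alleles_y)
        total += (mean_x - mean_y) ** 2
    return total / len(loci_x)


# Hypothetical repeat counts at two loci in populations x and y.
pop_x = [[10, 12], [5, 5]]
pop_y = [[14, 16], [5, 5]]
print(delta_mu_squared(pop_x, pop_y))
```

Only the means enter the calculation, so one locus with an unusual allele size distribution can dominate the average.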

Kermode bear microsatellite study

SSRs transferred from grizzly bears to black bears (11 loci)

hairs "trapped"

DNA extracted from hair roots

Hypothesis: are Kermode bear populations genetically distinct? Is the pattern of "neutral variation" different from the MC1R (coat color gene) pattern?

Populations sampled

Genetic distances and Nm between populations

Relationship of populations based upon pairwise genetic divergence (previous table); gene frequencies of white phase given in parentheses

[Figure: tree of sampled populations, labeled with white-phase gene frequencies (0.56), (0.05), (0.08), (0.00), (0.02), (.013), (0.05), (0.04), (0.21), (0.10), (0.00), (0.33)]

Kermode populations are not closely related to each other, some suggestion of complex interrelations

PNAS December 2, 2008
