Evolution and conservation genetics
Neutral model of evolution
What governs heterozygosity levels?
Neutral model of drift and mutation
• Single population
• Constant size
• Drift occurs at rate 1/(2N) per generation
• Mutation creates new or alternative alleles and prevents fixation of alleles
What model of mutation does a gene locus follow under the neutral model?
Infinite Alleles Model
Stepwise-Mutation Model
Average protein contains about 300 amino acids (900 nucleotides)
Mutations always occur to new alleles
Finite population size (drift)
How is loss of alleles due to drift balanced by new mutations?
Infinite Alleles Model (IAM)(Crow-Kimura Model)
$4^{900} \approx 10^{542}$ possible sequences
Do allozymes really fall under a mutation-drift process?
What is the equilibrium heterozygosity predicted by IAM?
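$$F_t = \frac{1}{2N_e}(1-\mu)^2 + \left(1 - \frac{1}{2N_e}\right)(1-\mu)^2 F_{t-1}$$
• $\frac{1}{2N_e}$: probability that two alleles are IBD. No mutation.
• $\left(1 - \frac{1}{2N_e}\right)F_{t-1}$: probability that you are not identical by descent and neither allele has mutated
• $(1-\mu)^2$: both alleles do not mutate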
F = probability that two alleles are both copies of the same ancestral allele (identical by descent)
At equilibrium then…
$$F_t = \frac{1}{4N_e\mu + 1}$$
$$\sum_i p_i^2 = F_t = \frac{1}{4N_e\mu + 1}$$
But we have two measures of homozygosity; both measure the same thing and thus equal each other
If H=1-F, then what is H at a mutation drift equilibrium?
$F_t = F_{t-1}$
Can you derive this?
[Figure: heterozygosity (y-axis, 0.0 to 1.2) against population size (x-axis, log scale, 1e+1 to 1e+7)]
Heterozygosity at a mutation drift equilibrium, given an IAM is…
$$H = \frac{4N_e\mu}{4N_e\mu + 1}$$
Curves for $\mu = 0.001$, $\mu = 10^{-5}$, and $\mu = 10^{-7}$
When mutation rates are high and population size is held constant: higher equilibrium heterozygosity.
When mutation rates are held constant then as population size increases: higher equilibrium heterozygosity
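The two statements above can be checked numerically. This is a minimal sketch, assuming the IAM equilibrium H = 4Neμ/(4Neμ + 1) and the recursion F_t = [1/(2Ne) + (1 - 1/(2Ne))F_{t-1}](1 - μ)^2; the function names and the example parameter values are hypothetical, chosen only for illustration.

```python
def iam_equilibrium_h(ne, mu):
    """Equilibrium heterozygosity under the infinite alleles model."""
    theta = 4 * ne * mu
    return theta / (theta + 1)


def iterate_f(ne, mu, generations, f0=0.0):
    """Iterate F_t = [1/(2Ne) + (1 - 1/(2Ne)) * F_{t-1}] * (1 - mu)^2."""
    f = f0
    for _ in range(generations):
        f = (1 / (2 * ne) + (1 - 1 / (2 * ne)) * f) * (1 - mu) ** 2
    return f


if __name__ == "__main__":
    # Rows: mutation rates from the figure legend. Columns: hypothetical Ne values.
    for mu in (1e-3, 1e-5, 1e-7):
        row = [round(iam_equilibrium_h(ne, mu), 4) for ne in (10, 1_000, 100_000)]
        print(mu, row)
    # After many generations the recursion sits close to the closed form.
    print(1 - iterate_f(100, 1e-3, 5_000), iam_equilibrium_h(100, 1e-3))
```

The closed form drops terms of order μ and μ/N, so the iterated value agrees with it only approximately.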
Stepwise-mutation model (SMM)(Ohta and Kimura)
Generated by slipped strand mispairing, mutations occur only at adjacent sites.
Mutation can produce alleles already present in the population.
Expect the equilibrium level of heterozygosity under SMM to be lower than that under IAM.
$$H = 1 - \frac{1}{\sqrt{8N_e\mu + 1}}$$
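A quick numerical comparison of the two models, as a sketch only. It assumes the Ohta-Kimura SMM result H = 1 - 1/sqrt(8Neμ + 1) and the IAM result H = 4Neμ/(4Neμ + 1); the function names and parameter values are hypothetical.

```python
import math


def h_iam(ne, mu):
    """Equilibrium heterozygosity, infinite alleles model."""
    return 4 * ne * mu / (4 * ne * mu + 1)


def h_smm(ne, mu):
    """Equilibrium heterozygosity, stepwise-mutation model."""
    return 1 - 1 / math.sqrt(8 * ne * mu + 1)


if __name__ == "__main__":
    mu = 1e-4  # hypothetical mutation rate
    for ne in (100, 2_500, 100_000):
        print(ne, round(h_iam(ne, mu), 4), round(h_smm(ne, mu), 4))
```

For any positive Neμ the SMM value is the smaller of the two, because back-mutation to existing allele states hides some of the variation.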
Genetic diversity and population size
What is the effect of “finite” population size on gene frequencies
The various ways to mathematically study it
Effective population size
Drift defined
Random changes of gene frequencies among generations
More important with
• Small population sizes
• Fluctuation in population size
• Low selection and migration
• Long time periods
A simple simulation of drift: “replicated outcomes” (mean frequency is dotted)
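A drift simulation of this kind can be sketched in a few lines. This is not the code behind the slide's figure; it assumes Wright-Fisher sampling (each of the 2N gene copies in the next generation is drawn at random from the current frequency), and the population size, starting frequency and number of replicates are hypothetical.

```python
import random


def simulate_drift(n_diploid, p0, generations, rng):
    """Return one allele-frequency trajectory under Wright-Fisher sampling."""
    copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Draw each gene copy of the next generation from the current frequency.
        count = sum(1 for _copy in range(copies) if rng.random() < p)
        p = count / copies
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    replicates = [simulate_drift(16, 0.5, 20, rng) for _ in range(8)]
    finals = [rep[-1] for rep in replicates]
    print("final frequencies:", finals)
    print("mean final frequency:", sum(finals) / len(finals))
```

Individual replicates wander and may fix or lose the allele, while the mean across replicates stays near the starting frequency.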
Buri’s (1956) classic genetic drift experiment showing the number of wildtype versus neutral mutant alleles in populations of 16 Drosophila followed through time: “gene frequency distribution”
Generalized effect of drift
• Allele frequencies do not change (much) on the landscape scale
• Within populations, drift decreases genetic variance
• Between populations, drift increases genetic variance
Consider the following to simply illustrate the principle: In a Buri-like experiment on 4 lines of n=4 hermaphrodite snails, the frequency of an albinism allele was as follows at generations 2 and 6.
[Figure: two panels, Generation 2 and Generation 6; x-axis 0 to 8. Generation 2: variance = 2.67. Generation 6: variance = 21.33.]
Observed vs. expected changes of mean and variance of gene frequency
Loss of heterozygosity due to drift
Buri used a population of size nine
Effective population size
• Governs random change of gene frequency, p
• Depends on several factors: all those that reduce the size of the breeding population
Ne = number of individuals in an ideal population which has the same magnitude of genetic drift as the actual population.
Wright-Fisher model
Assume that the number of offspring is distributed as a Poisson variable with mean = 2 and variance = 2
In this case, Ne = N
No selection, Random mating, random number of offspring
Factors reducing N to Ne
• Only adults of reproductive age count
• Sex ratio
• Variation in size over time
• Variation in offspring number
• Inbreeding (self-fertilization)
Factors reducing N to Ne -- 1
Ne usually less than census population size
Non-breeding individuals do not contribute juveniles
“bachelor males”, post-reproductives
Factors reducing N to Ne -- 2
different number of breeding individuals in the two sexes – one sex represented by a small number of breeding individuals
example:
Captive bred animals – only one male used for breeding
Different numbers of males and females
Unequal sex ratio
Analogous to having two different population sizes
The effective population size is strongly influenced by the rarer of the two sexes.
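The pull of the rarer sex can be made concrete with the standard unequal-sex-ratio result Ne = 4·Nm·Nf/(Nm + Nf). That formula is not written out in this transcript, so treat it as an assumption of the sketch; the function name and the example numbers are hypothetical.

```python
def ne_sex_ratio(n_males, n_females):
    """Effective size with unequal numbers of breeding males and females."""
    return 4 * n_males * n_females / (n_males + n_females)


if __name__ == "__main__":
    print(ne_sex_ratio(50, 50))  # equal numbers of each sex
    print(ne_sex_ratio(1, 99))   # a single breeding male, as in the captive example
```

With one breeding male, Ne stays below 4 however many females are added.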
Factors reducing N to Ne -- 3
Variation in number of offspring produced by different individuals
Ne smaller when offspring numbers are more unequal
Ne can be larger when variation in offspring number is reduced
V is the variance of reproductive success
What is upper limit for effective population size?
$$N_e = \frac{4N - 2}{V + 2}$$
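A small sketch of how the variance in offspring number changes Ne, assuming the relation Ne = (4N - 2)/(V + 2) with V the variance of reproductive success. The function name and the population size of 100 are hypothetical.

```python
def ne_offspring_variance(n, v):
    """Effective size given census size n and variance v in offspring number."""
    return (4 * n - 2) / (v + 2)


if __name__ == "__main__":
    n = 100  # hypothetical census size
    for v in (0, 2, 10):
        print(v, ne_offspring_variance(n, v))
```

With Poisson variance (V = 2) Ne is close to N. With no variance at all (V = 0), which is approached when family sizes are equalized, Ne reaches 2N - 1, which is the upper limit under this formula.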
Factors reducing N to Ne -- 4
• Variation of population size in different generations
• Consider the effect on loss of variation caused by the specific population sizes in generations 1, 2, 3, ..., t.
harmonic mean: occasional severe reductions in population size will predominate over long stretches of stable large population size in reducing variability
N=1000, 10, 1000
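The harmonic mean for the example N = 1000, 10, 1000 can be worked out directly. A minimal sketch; the function name is hypothetical.

```python
def harmonic_mean_ne(sizes):
    """Harmonic mean of per-generation population sizes."""
    return len(sizes) / sum(1 / n for n in sizes)


if __name__ == "__main__":
    sizes = [1000, 10, 1000]
    print("harmonic mean:", harmonic_mean_ne(sizes))
    print("arithmetic mean:", sum(sizes) / len(sizes))
```

The single generation at N = 10 pulls the harmonic mean down to about 29, far below the arithmetic mean of 670.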
Factors reducing N to Ne -- 5
Self-fertilization causes increases of homozygosity (most extreme form of close inbreeding, or mating between relatives)
f = fraction of loci in which both alleles are copies of an immediate ancestor
Ne = N / (1+f)
Effective size in continuous populations
What if there is one population, mating occurs between nearby individuals, and progeny are dispersed a short distance?
“Neighborhood size” (Wright 1943): number of individuals within which 95% of the alleles derive from the previous generation (twice the standard deviation of gene flow in one direction… don’t worry about the formula…)
Mainly applied to plants, Ne= 500-1000; why?
Estimation of effective population size
• Demographic data (variance of number of offspring, variation of population size): direct… but usually difficult to obtain
• Can use genetic data:
  – reconstruct parentage of current population (paternity analysis, in a few weeks)
  – temporal changes of gene frequency (need to separate from sampling variance)
  – heterozygote excess, between few parents (only applicable to very small populations)
Heterozygosity vs. allele number as indicators of variation
rarer genes lost faster than predicted by heterozygosity model!
Predicted reduction of H = (1/N)    Observed reduction of Na = (8-n)/8
0.001                               0.000
0.005                               0.024
0.050                               0.518
0.100                               0.664
0.500                               0.831
Rarer alleles are lost in bigger bottlenecks
Bottlenecks and founding effects
These are special cases of genetic drift
Especially important in conservation genetics
The Founder effect
• New populations often started by small numbers of migrants (analogous to bottleneck)
• Carry only a fraction of the genetic variability of the parental population
• New populations tend to differ randomly both from the parent population and from each other, tend to be “inbred”
• Applies to:
  • Invasive species
  • Island colonists
• Examples…
  • Amish of Lancaster Co., PA (Ellis-van Creveld syndrome)
  • Pirates of Pitcairn Isle
The Cheetah bottleneck
• 15,000 to 20,000 cats in the wild
• All sampled cheetahs share the same allozymes (Cohn 1996): homozygosity of 100%, population 0% polymorphic
• For genes mediating immune response, foreign skin is recognized as their own
• Why? Two bottlenecks: 10,000 years ago and another in the last two centuries
• Work of Stephen J. O’Brien and collaborators (Cat genome project)
Solid line: N=2
Dotted line: N=10
Intrinsic rate of growth affects H after a bottleneck
Loss of alleles mainly depends on bottleneck size, not rate of growth following bottleneck
Genetics 144: 2001-2014 (December, 1996)
Heterozygosity excess: difference between the observed heterozygosity and the heterozygosity expected from the observed number of alleles.
Journal of Heredity, 1998
Data from real populations
Inference of colonization history: the Northern elephant seal
• Formerly ranged from Mexico-California
• Hunted and collected to death
• Few survivors on Isla Guadalupe, Mexico (10-100?)
• Currently 200,000, many in Central/Southern Calif.
• How small was the bottlenecked population?
Attempted reconstruction of the bottleneck of Northern Elephant Seals
Currently, two mitochondrial DNA haplotypes have frequency 0.27, 0.73, giving He=0.40
Museum sample of pre-bottleneck samples gave He=0.80
Use $H_t = H_0 \left(1 - \frac{1}{2N_{e1}}\right)\left(1 - \frac{1}{2N_{e2}}\right)\cdots\left(1 - \frac{1}{2N_{et}}\right)$
• This allows Ne to increase following the bottleneck (1922-1960)
• Rate of increase about 1.7 per generation
• Allows population to grow from 15 to 200,000 in 38 years
One-generation bottleneck of 15 gives H0 = .80, H1 = .59, H2 = .50, H3 = .45… to H = .40 very shortly
But microsatellites don’t show such a reduction of diversity, why?
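The reconstruction for the elephant seal mtDNA data can be sketched as a running product, Ht = H0·(1 - 1/(2Ne1))·…·(1 - 1/(2Net)), with the population growing by a factor of 1.7 per generation from 15. The transcript does not say which effective size was used for the mitochondrial marker. The sketch below assumes Ne = N/8 in each generation, chosen only because it approximately reproduces the quoted values (.59, .50, .45, then about .40); that choice and the function names are assumptions, not the authors' method.

```python
def heterozygosity_trajectory(h0, ne_per_generation):
    """Apply H_t = H_{t-1} * (1 - 1/(2*Ne_t)) for each generation in turn."""
    h = h0
    out = [h]
    for ne in ne_per_generation:
        h *= 1 - 1 / (2 * ne)
        out.append(h)
    return out


def growing_population(n0, growth, generations):
    """Census size in each generation, starting from the bottleneck size n0."""
    return [n0 * growth ** t for t in range(generations)]


if __name__ == "__main__":
    census = growing_population(15, 1.7, 10)
    # ASSUMPTION: Ne = N / 8 for the mtDNA marker (not stated in the source).
    ne = [n / 8 for n in census]
    for t, h in enumerate(heterozygosity_trajectory(0.80, ne)):
        print(t, round(h, 3))
```

Most of the loss happens in the first few generations, while the population is still small; after that the rapid growth makes further loss negligible.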
Inbreeding due to small population size
– Has predictable consequences for allele frequencies and genotype frequencies:
• Increases the frequency of homozygous genotypes
– Similar in effect to:
• Genetic drift
• Variation in population size over time
• Skewed sex ratios, etc.
– Two “kinds” of inbreeding:
• nonrandom – self-fertilization
• random
Random inbreeding
Populations enter a positive feedback loop
– Inbreeding depression increases, population size decreases
– Effect of drift increases: deleterious mutations become fixed
– As deleterious mutations become fixed, inbreeding depression increases
–Maybe the population dies!
Mutational meltdown?
Among-population gene diversity
Within populations (so far)
Between populations
Genetic variation in space and time in populations
• Genetic structure of populations and frequency of alleles varies in space or time
• Space:
Allele frequency clines in the blue mussel.
Variation across time: temporal variation in a prairie vole (Microtus ochrogaster) esterase gene.
Measuring Genetic Differentiation: Fst
Fst = normalized variance in allele frequencies among populations
Fst = Var(p)/[p*(1-p*)], where Var(p) is the variance in the frequencies of allele p among populations and p* is the observed mean allele frequency across populations
Or Fst = the relative reduction in gene diversity in a single population compared to pooling all populations
Fst = (Ht - Hs)/Ht, where Ht is the expected heterozygosity for a pooled sample of alleles and Hs is the average expected heterozygosity within each sub-population
Population 1: all AA; P(A) = 1, P(a) = 0
Population 2: all aa; P(A) = 0, P(a) = 1
Average gene frequencies are pA* = 0.5, pa* = 0.5
$H_s = 1 - \sum p_i^2 = 1 - (1^2 + 0^2) = 0$ in population 1; $H_s = 0$ in population 2
Mean Hs = 0
Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.5^2 + 0.5^2) = 0.5$
Fst = (Ht-Hs)/Ht = (0.5-0)/0.5 = 1
Population 1: all Aa; P(A) = 0.5, P(a) = 0.5
Population 2: all Aa; P(A) = 0.5, P(a) = 0.5
Average gene frequencies are pA* = 0.5, pa* = 0.5
$H_s = 1 - \sum p_i^2 = 1 - (0.5^2 + 0.5^2) = 0.5$ in population 1; $H_s = 0.5$ in population 2
Mean Hs = 0.5
Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.5^2 + 0.5^2) = 0.5$
Fst = (Ht-Hs)/Ht = (0.5-0.5)/0.5 = 0
Population 1: pA = 0.8, pa = 0.2
Population 2: pA = 0.5, pa = 0.5
Average gene frequencies are pA* = 0.65, pa* = 0.35
$H_s = 1 - \sum p_i^2 = 1 - (0.8^2 + 0.2^2) = 0.32$ in population 1; $H_s = 0.5$ in population 2
Mean Hs = 0.41
Expected $H_t = 1 - \sum p_i^{*2} = 1 - (0.65^2 + 0.35^2) = 0.455$
Fst = (Ht-Hs)/Ht = (0.455-0.41)/0.455 = 0.0989
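The three worked examples can be reproduced with a short function. This is a sketch that weights every population equally, as the examples do; the function names are hypothetical, and real estimators usually also correct for sample size.

```python
def expected_h(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def fst(populations):
    """Fst = (Ht - Hs) / Ht from per-population allele-frequency lists."""
    n_pops = len(populations)
    n_alleles = len(populations[0])
    hs = sum(expected_h(pop) for pop in populations) / n_pops
    mean_freqs = [sum(pop[i] for pop in populations) / n_pops
                  for i in range(n_alleles)]
    ht = expected_h(mean_freqs)
    return (ht - hs) / ht


if __name__ == "__main__":
    print(fst([[1.0, 0.0], [0.0, 1.0]]))  # fixed for different alleles
    print(fst([[0.5, 0.5], [0.5, 0.5]]))  # identical frequencies
    print(fst([[0.8, 0.2], [0.5, 0.5]]))  # partial differentiation
```

For two alleles the same numbers come out of Var(p)/[p*(1-p*)]: in the third example Var(p) = 0.0225 and p*(1-p*) = 0.2275, which again gives 0.0989.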
Summary: Wright’s FST
Reduction of heterozygosity compared to random mating
Fst = inbreeding in subpopulation due to differences from other subpopulations
FST = (HT - HS) / HT = Var(p)/[p(1-p)]
Wright’s F statistics
Separate components of genetic variation into a hierarchy:
How much genetic variation is contained in
• a subpopulation compared to region
• a region compared to total
• a subpopulation compared to total
Partition of Wright’s F
In general sense, F is the probability that two alleles share a common ancestor (identity by descent)
• Total F = Fit (individual-total)
• Local F = Fis (individual-subpopulation)
• Regional F = Fst (subpopulation-total)
Fit = Fis + (1 - Fis) Fst
If it ain’t locally inbred, then maybe it is regionally
Fundamental concept; can be defined for any number of levels
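The partition Fit = Fis + (1 - Fis)·Fst is equivalent to (1 - Fit) = (1 - Fis)(1 - Fst), which is the form that extends to more levels. A minimal sketch with hypothetical values:

```python
def f_it(f_is, f_st):
    """Total inbreeding coefficient from its local and regional components."""
    return f_is + (1 - f_is) * f_st


if __name__ == "__main__":
    f_is, f_st = 0.1, 0.2  # hypothetical values
    total = f_it(f_is, f_st)
    print(total)
    # Same partition written as a product of the "non-inbred" fractions.
    print(1 - (1 - f_is) * (1 - f_st))
```

Writing each level as a factor (1 - F) is one way to see why the hierarchy can be defined for any number of levels.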
Stepwise mutation model (for SSRs=simple sequence repeats=microsatellites)
Mutation is a progressive change so fragments that migrate similar distances have had few mutations.
In the case of SSRs, mutation is assumed to change the number of repeats, increasing or decreasing step by step.
The square of the difference in the number of repeats between 2 microsatellites is proportional to the time of divergence from a common ancestor.
Partitioning variation of SSRs: Rst - differentiation based on variance in allele sizes between populations (Slatkin 1995)
Microsatellite analog of Fst that explicitly takes into account mutational differences among alleles
Rst = (S - Sw)/S, where S is the average squared difference in size of all alleles and Sw is the average sum of squares of the differences in allele sizes within each population
Analogous to Fst = (Ht-Hs)/Ht
Assumes step-wise mutation model and weights differences between alleles by size (= # repeats) differences
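A literal reading of Rst = (S - Sw)/S can be sketched as follows. It takes S as the mean squared size difference over all pairs of alleles in the pooled sample and Sw as the average of the same quantity within each population. The published estimator may include sample-size corrections not shown here, so treat this as an illustration only; the function names and the repeat counts are hypothetical.

```python
from itertools import combinations


def mean_sq_diff(sizes):
    """Mean squared difference in repeat number over all pairs of alleles."""
    pairs = list(combinations(sizes, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def rst(populations):
    """Rst = (S - Sw) / S from lists of allele sizes, one list per population."""
    pooled = [size for pop in populations for size in pop]
    s_total = mean_sq_diff(pooled)
    s_within = sum(mean_sq_diff(pop) for pop in populations) / len(populations)
    return (s_total - s_within) / s_total


if __name__ == "__main__":
    # Hypothetical repeat counts: two populations with very different allele sizes.
    print(rst([[10, 10, 11, 11], [20, 20, 21, 21]]))
    # Two populations with the same allele sizes.
    print(rst([[10, 11, 12], [10, 11, 12]]))
```

Each population needs at least two sampled alleles, and the pooled sample must contain some size variation, or the divisions fail. With identical populations this simple version comes out slightly negative rather than zero.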
Issues with the use of the marker type (Hedrick. 1999. Evolution 53:313-318)
High level of variation constrains maximum value of Fst that is possible
Max Fst < 1 - Hs, or the observed level of homozygosity
Complicates interpretation of significance of Fst values
Biological significance of statistically significant but small values of Fst (e.g. 0.01) from microsatellite data
Genetic distance
Measures the genetic “difference” between populations; alternative to variance partitioning
Proportional to the time of separation from a common ancestor
Between-population distance increases with time, due to genetic drift, mutation
Four major models:
• Mutation to infinite alleles: isozymes, sometimes microsatellites
• Stepwise mutation: microsatellites
• Genetic drift causes random changes of gene frequencies
• Mutation in the nucleotide sequence
Expected homozygosities within and between populations
• Two populations, "x" and "y"
• Jx = probability that two alleles from population x are the same (expected homozygosity) = $\sum_i p_{ix}^2$
• Jy likewise defined for population y
• Jxy = probability that two alleles chosen from different populations x and y are the same = $\sum_i p_{ix} p_{iy}$
Nei's gene identity: $I = J_{xy}/\sqrt{J_x J_y}$
• Analogous to a correlation coefficient
• With multiple loci, take average of Jx, Jy, Jxy over loci
Nei’s genetic distance: D = -ln(I)
• Increases linearly with time under infinite allele mutation model
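Nei's identity and distance can be computed directly from allele frequencies. A minimal sketch, assuming Jx, Jy and Jxy are sums of squared or cross-multiplied allele frequencies averaged over loci before taking the ratio; the function name and example frequencies are hypothetical, and no sample-size correction is applied.

```python
import math


def nei_distance(loci_x, loci_y):
    """Nei's D = -ln(Jxy / sqrt(Jx * Jy)), with J values averaged over loci.

    loci_x and loci_y hold one allele-frequency list per locus, with alleles
    listed in the same order in both populations.
    """
    n_loci = len(loci_x)
    jx = sum(sum(p * p for p in locus) for locus in loci_x) / n_loci
    jy = sum(sum(p * p for p in locus) for locus in loci_y) / n_loci
    jxy = sum(sum(a * b for a, b in zip(lx, ly))
              for lx, ly in zip(loci_x, loci_y)) / n_loci
    if jxy == 0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


if __name__ == "__main__":
    pop_x = [[0.8, 0.2]]  # one locus, two alleles
    pop_y = [[0.5, 0.5]]
    print(nei_distance(pop_x, pop_y))
    print(nei_distance(pop_x, pop_x))  # identical populations
```

Identical populations give a distance of zero (up to floating-point rounding), and populations sharing no alleles give an infinite distance.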
Genetic distance: infinite allele model of Nei
Genetic distance: stepwise mutation model
Based on squared difference of mean allele size
$\mu_x$ = mean for population x, $\mu_y$ = mean for population y
$(\delta\mu)^2 = (\mu_x - \mu_y)^2$
• Take average over multiple loci
• Increases linearly with time with stepwise mutation
• Highly dependent on allele size distribution
Often Nei’s infinite allele model better
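The stepwise-mutation distance is simple to compute: for each locus, square the difference between the mean allele sizes of the two populations, then average over loci. A minimal sketch; the function name and the repeat counts are hypothetical.

```python
def delta_mu_squared(loci_x, loci_y):
    """Average over loci of (mean allele size in x - mean allele size in y)^2.

    loci_x and loci_y hold one list of sampled allele sizes (repeat counts)
    per locus, with loci in the same order in both populations.
    """
    total = 0.0
    for sizes_x, sizes_y in zip(loci_x, loci_y):
        mean_x = sum(sizes_x) / len(sizes_x)
        mean_y = sum(sizes_y) / len(sizes_y)
        total += (mean_x - mean_y) ** 2
    return total / len(loci_x)


if __name__ == "__main__":
    # Two hypothetical loci sampled in populations x and y.
    pop_x = [[10, 11, 12], [20, 20, 22]]
    pop_y = [[14, 15, 16], [20, 21, 21]]
    print(delta_mu_squared(pop_x, pop_y))
```

Because only the means enter, a single locus with a large size difference can dominate the average, which is one way the measure depends strongly on the allele size distribution.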
Kermode bear microsatellite study
SSRs transferred from grizzly bears to black bears (11 loci)
hairs "trapped"
DNA extracted from hair roots
Hypothesis: are Kermode bear populations genetically distinct?
Is the pattern of "neutral variation" different than the MC1R (coat color gene) pattern?
Populations sampled
Genetic distances and Nm between populations
Relationship of populations based upon pairwise genetic divergence (previous table); gene frequencies of white phase given in parenthesis
[Figure: white-phase gene frequencies by population: 0.56, 0.05, 0.08, 0.00, 0.02, .013, 0.05, 0.04, 0.21, 0.10, 0.00, 0.33]
Kermode populations are not closely related to each other, some suggestion of complex interrelations
PNAS December 2, 2008