What should we study - UFSCarevolucao/TGE/Lect02.pdf · Microsoft PowerPoint - Lect 02 Genetic...
What should we study?
• Levels of genetic variability - intrapopulational
• Population structure - interpopulational
• Geographic distribution of genetic diversity
• Taxonomic uncertainties – taxonomic and systematic studies
• Number of species – taxonomic and ecological approaches
Intrapopulational measures
Why Genetic Diversity
• Genetic diversity is important because it is the raw material on which selection can act, and thus species can respond to selective pressure.
• The majority of low-frequency alleles exist in the heterozygous state, where, if they are deleterious, their effects may be fully or partially masked.
Why Genetic Diversity
• Genetic diversity also plays a role in determining IUCN categories.
• The lower the genetic diversity, the higher the perceived risk of threat.
Measuring Genetic Diversity
• Measures of genetic diversity depend on the data analyzed.
• One set of measures focuses on heterozygosity and is based on diploid, co-dominant markers.
• Another set of measures focuses on allelic information and/or unphased diploid data.
• Some indices are implemented in Arlequin.
Measures of Genetic diversity
Molecular Markers
• Sequence data
• Single Nucleotide Polymorphism (SNP) data
• Microsatellite data
• Allozyme data
• Amplified Fragment Length Polymorphism (AFLP) data
• Random Amplified Polymorphic DNA (RAPD) data
• Hybridization data
• Chromosomal pattern data
Sequence data
• Differences in haplotypes are due to point mutations (transitions or transversions), insertions, or deletions.
• In diploid organisms, differences are also due to recombination.
• Molecular models of evolution dealing with point mutations are very well studied.
Microsatellite data
[Figure: strand slippage and misalignment between the template strand and the growing strand, giving +1 repeat or -1 repeat.]
• Differences in haplotypes are due to unequal crossing over or to slippage during strand replication.
• This class of markers is co-dominant, i.e. heterozygous and both homozygous classes of individuals can be distinguished.
• Fast rate of molecular evolution.
• Models of molecular evolution are not well known.
Allozyme data
• Properties of allozyme data are very similar to those of microsatellite data.
RFLP data
• Differences in haplotypes are due to point mutations (transitions or transversions), insertions, or deletions.
• In diploid organisms, differences are also due to recombination.
• This class of markers is dominant, i.e. heterozygous and homozygous dominant individuals cannot be distinguished.
Chromosomal data
Best Markers
• Theoretically, the best markers are sequence markers, provided that:
• there is sufficient variation (sufficient sequence length);
• the differences can be phased.
• We also have the best models of molecular evolution for these markers.
Haplotypes
Sample 1 AAAAA
Sample 2 AAAAA
Sample 3 AGAAA
Sample 4 AGAAA
Sample 5 AGAAG
Sample 6 AGAAG
Sample 7 GGAAA
Sample 8 GGAAA
Sample 9 GGGAA
Sample 10 GGGAA
Sample 11 GGGGA
Sample 12 GGGGA
Measuring Genetic Diversity
Sample 1 AGAACTTCTG
Sample 2 AGAACTTCTG
Sample 3 AGAACTTCTG
Sample 4 AAAA TTTTTG
Sample 5 AAAA TTTTTG
Sample 6 AAAATCTTTG
Number of segregating sites (S): the number of polymorphic positions in the alignment. Under the infinite-site model, this equals the total number of mutations observed in the dataset.
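As a minimal sketch, the count can be obtained by scanning the six-sample alignment column by column. The function name `segregating_sites` is illustrative, and the stray space in samples 4 and 5 is assumed to be an extraction artifact and is dropped.

```python
# Six-sample alignment from the slide. Assumption: the stray space in
# samples 4 and 5 is an extraction artifact, so it is removed here.
ALIGNMENT = [
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AAAATTTTTG",
    "AAAATTTTTG",
    "AAAATCTTTG",
]


def segregating_sites(seqs):
    """Return the 0-based positions where more than one base is observed."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


sites = segregating_sites(ALIGNMENT)
print(len(sites), sites)  # prints: 4 [1, 4, 5, 7]
```

Under these assumptions S = 4 for this sample.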
Gene diversity: equivalent to expected heterozygosity for diploid data. It is defined as the probability that two randomly selected sequences in the sample are different.
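The definition can be sketched in a few lines, assuming the commonly used sample-size-corrected estimator H = n/(n-1) × (1 - Σ pᵢ²), where pᵢ is the frequency of haplotype i and n the sample size. The slide's own formula was not preserved, so this form is an assumption. The alignment is the six-sample example, with the stray space in samples 4 and 5 dropped.

```python
from collections import Counter

# Six-sample alignment from the slide (stray space in samples 4-5 dropped,
# on the assumption that it is an extraction artifact).
ALIGNMENT = [
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AAAATTTTTG",
    "AAAATTTTTG",
    "AAAATCTTTG",
]


def gene_diversity(seqs):
    """Probability that two sequences drawn without replacement differ."""
    n = len(seqs)
    counts = Counter(seqs)  # haplotype -> number of copies in the sample
    sum_p_squared = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p_squared)


h = gene_diversity(ALIGNMENT)
print(h)
```

With haplotype counts 3, 2 and 1, 11 of the 15 possible pairs differ, so the estimate is 11/15.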
Mean number of pairwise differences: the mean number of differences between all pairs of haplotypes in the sample.
d = mutational difference, p = allele frequency, k = number of alleles, n = sample size
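A sketch using the symbols in the legend, assuming the estimator has the form n/(n-1) × Σᵢ Σⱼ pᵢ pⱼ dᵢⱼ over the k haplotypes. The slide's formula was not preserved, so this form is an assumption. It is numerically the same as averaging the differences over all pairs of sampled sequences. The helper names are illustrative.

```python
from collections import Counter

# Six-sample alignment from the slide (stray space in samples 4-5 dropped,
# on the assumption that it is an extraction artifact).
ALIGNMENT = [
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AAAATTTTTG",
    "AAAATTTTTG",
    "AAAATCTTTG",
]


def hamming(a, b):
    """d: number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_differences(seqs):
    n = len(seqs)
    # p: frequency of each of the k distinct haplotypes
    p = {hap: count / n for hap, count in Counter(seqs).items()}
    weighted = sum(p[a] * p[b] * hamming(a, b) for a in p for b in p)
    return n / (n - 1) * weighted


pi_hat = mean_pairwise_differences(ALIGNMENT)
print(pi_hat)
```

For this sample the 15 pairs differ at 32 positions in total, giving 32/15 differences per pair.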
Nucleotide diversity: the probability that two randomly chosen homologous sites are different.
d = mutational difference, p = allele frequency, k = number of alleles, L = number of loci
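One way to sketch this, assuming that nucleotide diversity is the mean number of pairwise differences divided by the number of sites L (here, the alignment length). The function name is illustrative, and the stray space in samples 4 and 5 is assumed to be an extraction artifact.

```python
from itertools import combinations

# Six-sample alignment from the slide (stray space in samples 4-5 dropped).
ALIGNMENT = [
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AAAATTTTTG",
    "AAAATTTTTG",
    "AAAATCTTTG",
]


def nucleotide_diversity(seqs):
    """Average per-site difference over all pairs of sampled sequences."""
    length = len(seqs[0])  # L: number of sites in the alignment
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total_diffs / len(pairs) / length


print(nucleotide_diversity(ALIGNMENT))
```

With L = 10 sites, this is the per-pair value 32/15 scaled down to 32/150 per site.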
Measuring Genetic Diversity
• Theta (θ) = 4Nµ, 4Nm, or 4N(µ+m), depending on whether new variants enter the population by mutation (µ), migration (m), or both.
• For haploid markers, θ = 2Nµ, 2Nm, or 2N(µ+m).
• The all-important population genetic parameter.
• It is based on the number of alleles or the number of different nucleotides in a given sample.
• It quantifies the genetic diversity of a given population.
Theta (θ) Hom
• Estimated from the expected homozygosity (Zouros, 1979; Chakraborty and Weiss, 1991) in a population at equilibrium between drift and mutation.
• Sensitive to small sample and allele sizes
• For microsatellite data
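A hedged sketch of the idea: at drift-mutation equilibrium the expected homozygosity is 1/(1+θ) under the infinite-allele model and 1/√(1+2θ) under the stepwise mutation model, and either relationship can be inverted to give θ from an observed expected heterozygosity H. The slide does not say which model or bias correction is used, so the two functions below are plain inversions with illustrative names.

```python
def theta_hom_iam(h):
    """Infinite-allele model: 1 - H = 1 / (1 + theta)."""
    if not 0.0 <= h < 1.0:
        raise ValueError("H must lie in [0, 1)")
    return h / (1.0 - h)


def theta_hom_smm(h):
    """Stepwise mutation model: 1 - H = 1 / sqrt(1 + 2 * theta)."""
    if not 0.0 <= h < 1.0:
        raise ValueError("H must lie in [0, 1)")
    return 0.5 * (1.0 / (1.0 - h) ** 2 - 1.0)


# Example with a hypothetical expected heterozygosity of 0.5
print(theta_hom_iam(0.5), theta_hom_smm(0.5))
```

Both estimates grow without bound as H approaches 1, which is one reason small samples with few alleles give unstable values.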
Theta (θ) S
• Estimated from the infinite-site equilibrium relationship (Watterson, 1975) between the number of segregating sites (S), the sample size (n) and θ for a sample of non-recombining DNA.
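The relationship can be sketched as θ_S = S / a₁, with a₁ = Σ 1/i for i = 1 … n-1 (Watterson's estimator). The function name is illustrative. The example values S = 4 and n = 6 correspond to the six-sample alignment, assuming the stray space in samples 4 and 5 is an extraction artifact.

```python
def watterson_theta(num_segregating_sites, sample_size):
    """Watterson's estimator: theta_S = S / sum_{i=1}^{n-1} 1/i."""
    if sample_size < 2:
        raise ValueError("need at least two sequences")
    a1 = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating_sites / a1


# S = 4 segregating sites in a sample of n = 6 sequences
print(watterson_theta(4, 6))
```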
Theta (θ) k
• Estimated from the infinite-allele equilibrium relationship (Ewens, 1972) between the expected number of alleles (k), the sample size (n) and θ.
• 95% confidence limits are calculated as [equation not preserved in the transcript; its annotations read: Stirling number (expansion factor of a factorial), falling factorial]
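For the point estimate, the Ewens relationship E(k) = Σ θ/(θ+i), for i = 0 … n-1, has no closed-form inverse, so θ_k is usually found numerically. The following is a minimal sketch using bisection. The function names, the bracketing strategy and the iteration count are illustrative choices, and the confidence limits are not covered.

```python
def expected_alleles(theta, n):
    """Expected number of alleles in a sample of size n (Ewens, 1972)."""
    return sum(theta / (theta + i) for i in range(n))


def theta_k(k, n, iterations=200):
    """Solve expected_alleles(theta, n) = k for theta by bisection."""
    if not 1 < k < n:
        raise ValueError("need 1 < k < n for a finite positive estimate")
    lo, hi = 1e-9, 1.0
    while expected_alleles(hi, n) < k:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):  # E(k) increases monotonically with theta
        mid = 0.5 * (lo + hi)
        if expected_alleles(mid, n) < k:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Example: k = 3 distinct haplotypes observed in n = 6 sequences
estimate = theta_k(3, 6)
print(estimate, expected_alleles(estimate, 6))
```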
Theta (θ) π̂
• Estimated from the infinite-site equilibrium relationship (Tajima, 1983) between the mean number of pairwise differences (π̂) and theta (θ).
Why so many θ measures
• Not all methods are suitable for all types of data.
• Ultimately, all methods should result in the same estimates of theta.
• Differences in estimates can be interpreted as violations of assumptions, and each method is sensitive to different assumptions.
Tajima’s D
• Tajima’s (1989) D test quantifies the discordance between the estimates of theta from the number of segregating sites and from the average pairwise sequence divergence.
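A minimal sketch of the statistic, following the standard formula D = (π̂ - S/a₁) / √(e₁S + e₂S(S-1)) with Tajima's (1989) constants. The function name is illustrative. The input is the six-sample alignment, assuming the stray space in samples 4 and 5 is an extraction artifact. Significance testing is not covered.

```python
import math
from itertools import combinations

# Six-sample alignment from the slide (stray space in samples 4-5 dropped).
ALIGNMENT = [
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AGAACTTCTG",
    "AAAATTTTTG",
    "AAAATTTTTG",
    "AAAATCTTTG",
]


def tajimas_d(seqs):
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))  # segregating sites
    if s == 0:
        raise ValueError("D is undefined when there are no segregating sites")
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: theta from pairwise differences minus theta from S
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


print(tajimas_d(ALIGNMENT))
```

A positive value means the pairwise-difference estimate exceeds the segregating-sites estimate; a negative value means the reverse.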
Fu’s Fs
• Fu’s (1997) Fs is based on the probability of observing a certain number of haplotypes given a particular value of θ.
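As a hedged sketch, Fs can be computed as ln(S′/(1 - S′)), where S′ is the probability of observing at least the observed number of haplotypes under the Ewens sampling distribution, with θ set to the mean number of pairwise differences. The function names are illustrative. The example inputs (k = 3, n = 6, θ = 32/15) correspond to the six-sample alignment under the assumption that the stray space in samples 4 and 5 is an artifact.

```python
import math


def stirling_first_unsigned(n):
    """Row n of the unsigned Stirling numbers of the first kind, k = 0..n."""
    row = [1]  # row for n = 0
    for m in range(n):
        new = [0] * (len(row) + 1)
        for k, value in enumerate(row):
            new[k] += m * value   # c(m+1, k) gets m * c(m, k)
            new[k + 1] += value   # c(m+1, k+1) gets c(m, k)
        row = new
    return row


def fu_fs(k_obs, n, theta):
    """Fs = ln(S' / (1 - S')), with S' = P(K >= k_obs | theta)."""
    stirling = stirling_first_unsigned(n)
    rising = math.prod(theta + i for i in range(n))  # theta(theta+1)...(theta+n-1)
    prob = [stirling[k] * theta ** k / rising for k in range(n + 1)]
    s_prime = sum(prob[k_obs:])
    # Undefined for k_obs = 1, where S' = 1 and the ratio diverges
    return math.log(s_prime / (1.0 - s_prime))


print(fu_fs(3, 6, 32 / 15))
```

Large negative values indicate more haplotypes than expected for the given θ. The float arithmetic here is adequate for small samples only.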
Differences in θ measures
• Have selective interpretations.
• Have demographic interpretations.