
Molecular variation

Joe Felsenstein

GENOME 453, Autumn 2013

Molecular variation – p.1/33


Views of genetic variation before 1966

The Classical view (Hermann Joseph Muller): Most loci will be homozygous for the “wild-type allele”, but a few mutants will exist.

The Balancing Selection view (Theodosius Dobzhansky): Most loci will be polymorphic due to balancing selection with strong selection.


Gel electrophoresis

[Figure: gel electrophoresis apparatus. Labels: slots; gel of potato starch or polyacrylamide; tank of buffer solution; wick; power supply (+).]


After running the current

[Figure: the same apparatus after the current has been run. Labels: slots; gel of potato starch or polyacrylamide; tank of buffer solution; wick; power supply (+).]


Making one locus visible by staining

[Figure: the stained gel, with lanes labeled by genotype: AA, AA, Aa, aa, Aa, AA. Other labels: slots; gel of potato starch or polyacrylamide; tank of buffer solution; wick; power supply (+).]


A monomeric enzyme

[Figure: a monomeric enzyme in genotypes AA and Aa, and the corresponding pattern on the gel.]


A dimeric enzyme

[Figure: a dimeric enzyme in genotypes AA and Aa, and the corresponding pattern on the gel.]


Polymorphism on a gel


Lewontin and Hubby’s 1966 work

Richard Lewontin, about 1980

Lewontin, R. C. and J. L. Hubby. 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54: 595-609.


Measures of variability with multiple loci

Lewontin and Hubby (Genetics, 1966) suggested two measures of variability: polymorphism and heterozygosity.

Polymorphism is the fraction of all loci at which the most common allele has a frequency below 0.95 (i.e. all the rarer alleles together add up to more than 0.05).

Heterozygosity is the estimated fraction of all individuals who are heterozygous at a random locus.
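The polymorphism measure can be sketched in a few lines of Python. The allele frequencies below are made up purely for illustration and are not from Lewontin and Hubby's data; the function names are likewise hypothetical.

```python
def is_polymorphic(allele_freqs, threshold=0.95):
    """A locus counts as polymorphic if its most common allele is below the threshold."""
    return max(allele_freqs) < threshold


def polymorphism(loci, threshold=0.95):
    """Fraction of loci that are polymorphic under the 0.95 criterion."""
    return sum(is_polymorphic(freqs, threshold) for freqs in loci) / len(loci)


# Hypothetical allele frequencies at four loci (illustration only).
example_loci = [
    [1.0],            # monomorphic
    [0.97, 0.03],     # most common allele >= 0.95, so not counted
    [0.6, 0.4],       # polymorphic
    [0.5, 0.3, 0.2],  # polymorphic
]

print(polymorphism(example_loci))  # 2 of the 4 loci qualify
```

The 0.95 cutoff is kept as a parameter because it is a convention rather than a property of the data.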


Computing the average heterozygosity

If $p_i$ is the frequency in the sample of allele $i$ at a locus, then the heterozygosity for that locus is estimated by taking the sum of squares of the gene frequencies (thus estimating the homozygosity) and subtracting from 1:

$$H = 1 - \sum_{i}^{\text{alleles}} p_i^2$$


An example:

          allele frequencies
locus 1   1
locus 2   1
locus 3   0.8    0.2
locus 4   0.94   0.04   0.02

The heterozygosities are calculated as:

locus 1   $1 - 1^2 = 0$
locus 2   $1 - 1^2 = 0$
locus 3   $1 - (0.8^2 + 0.2^2) = 0.32$
locus 4   $1 - (0.94^2 + 0.04^2 + 0.02^2) = 0.1144$

The average heterozygosity in this example is

$$H = (0 + 0 + 0.32 + 0.1144) / 4 = 0.1086$$
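The arithmetic of this example can be checked with a short Python sketch. The allele frequencies are the ones given on the slide; the function and variable names are illustrative choices.

```python
def heterozygosity(allele_freqs):
    """Estimate heterozygosity at one locus as 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


# Allele frequencies from the example above.
loci = {
    "locus 1": [1.0],
    "locus 2": [1.0],
    "locus 3": [0.8, 0.2],
    "locus 4": [0.94, 0.04, 0.02],
}

per_locus = {name: heterozygosity(freqs) for name, freqs in loci.items()}
average_h = sum(per_locus.values()) / len(per_locus)

for name, h in per_locus.items():
    print(f"{name}: H = {h:.4f}")
print(f"average H = {average_h:.4f}")
```

Monomorphic loci contribute zero but still count in the denominator, which is why the average is well below the value at the two variable loci.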


Amounts of heterozygosity


Kimura’s neutral mutation theory

Motoo Kimura and family, 1966

Tomoko Ohta, recently

Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217: 624-626.

Kimura, M., and T. Ohta. 1971. Protein polymorphism as a phase of molecular evolution. Nature 229: 467-469.


Neutral mutation theory

[Figure: diagram with labels numbered 0 to 16.]

Assume: population size $N$, rate $u$ of neutral mutations, all different.

Heterozygosity at any point is expected to be

$$\frac{4Nu}{4Nu + 1}$$

Crow and Kimura, 1964; Lewontin and Hubby, 1966; Kimura, 1968; King and Jukes, 1969; Kimura and Ohta, 1971
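A small simulation can make these assumptions concrete. The sketch below is not from the lecture: it assumes Wright-Fisher sampling of 2N gene copies each generation (which the slide does not spell out), gives every mutant a brand-new allele label, and uses arbitrary illustrative parameter values.

```python
import random
from collections import Counter


def simulate_infinite_alleles(N, u, generations, seed=0):
    """Simulate 2N gene copies under drift plus neutral mutation to new alleles."""
    rng = random.Random(seed)
    n_copies = 2 * N
    population = [0] * n_copies  # start with every copy carrying allele 0
    next_allele = 1              # label for the next new mutant allele
    for _ in range(generations):
        offspring = []
        for _ in range(n_copies):
            # Each new copy picks a parent copy at random (assumed Wright-Fisher sampling).
            allele = population[rng.randrange(n_copies)]
            # With probability u it mutates to an allele never seen before.
            if rng.random() < u:
                allele = next_allele
                next_allele += 1
            offspring.append(allele)
        population = offspring
    return population


def sample_heterozygosity(population):
    """1 minus the sum of squared allele frequencies among the gene copies."""
    n = len(population)
    counts = Counter(population)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


N, u = 50, 0.01  # illustrative values only
final_population = simulate_infinite_alleles(N, u, generations=500)
observed = sample_heterozygosity(final_population)
expected = 4 * N * u / (4 * N * u + 1)
print(f"observed in this run: {observed:.3f}, expected on average: {expected:.3f}")
```

A single run fluctuates widely around the expectation, so agreement should only be looked for in the average over many runs or many loci.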


Crow and Kimura’s theoretical calculation

James F. Crow, about 1990

Kimura, M., and J. F. Crow. 1964. The number of alleles that can be maintained in a finite population. Genetics 49: 725-738.


Expected heterozygosity with neutral mutation

In a random-mating diploid population of size $N$ with neutral mutation, a fraction $F$ of the pairs of copies will be homozygous. Suppose all mutations create completely new alleles, and the rate of these neutral mutations is $u$.

[Figure: two copies now (homozygosity $F'$) traced back to the population one generation ago (homozygosity $F$). They come from the same copy $1/(2N)$ of the time, and from different copies $1 - 1/(2N)$ of the time.]

To be identical, both copies must not be new mutants, and the probability of this is $(1-u)^2$, so

$$F' = (1-u)^2 \left[ \frac{1}{2N} \times 1 + \left(1 - \frac{1}{2N}\right) F \right]$$

So if we have settled down to an equilibrium level of heterozygosity, $F' = F$, so that

$$F = (1-u)^2 \left[ \frac{1}{2N} \times 1 + \left(1 - \frac{1}{2N}\right) F \right]$$

which is easily solved to give

$$F = \frac{(1-u)^2 \, \frac{1}{2N}}{1 - (1-u)^2 \left(1 - \frac{1}{2N}\right)}$$

or to good approximation:

$$F \approx \frac{1}{1 + 4Nu}$$

so the heterozygosity is:

$$1 - F \approx \frac{4Nu}{1 + 4Nu}$$
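The recursion for $F$ can be iterated numerically to check the equilibrium solution and the quality of the approximation. This is a minimal sketch; the values of N and u are arbitrary illustrative choices, and the function names are hypothetical.

```python
def next_homozygosity(F, N, u):
    """One generation of F' = (1-u)^2 [ 1/(2N) + (1 - 1/(2N)) F ]."""
    same_copy = 1.0 / (2 * N)  # chance both copies come from the same parental copy
    return (1 - u) ** 2 * (same_copy + (1 - same_copy) * F)


def equilibrium_exact(N, u):
    """Exact solution of the recursion with F' = F."""
    same_copy = 1.0 / (2 * N)
    return (1 - u) ** 2 * same_copy / (1 - (1 - u) ** 2 * (1 - same_copy))


def equilibrium_approx(N, u):
    """Approximation F = 1 / (1 + 4Nu), good when u and 1/(2N) are small."""
    return 1.0 / (1 + 4 * N * u)


N, u = 1000, 1e-4  # illustrative values only

# Start from a completely heterozygous population and iterate to equilibrium.
F = 0.0
for _ in range(100_000):
    F = next_homozygosity(F, N, u)

print(f"iterated F  = {F:.6f}")
print(f"exact F     = {equilibrium_exact(N, u):.6f}")
print(f"approx F    = {equilibrium_approx(N, u):.6f}")
print(f"approx 1-F  = {1 - equilibrium_approx(N, u):.6f}")
```

The iteration approaches equilibrium slowly, at a rate of roughly $2u + 1/(2N)$ per generation, which is why many generations are used here.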


Heterozygosity in marine invertebrates (Valentine, 1975)

species                      common name          samples/locus   no. loci   Het. (%)
Asterias vulgaris            Northern sea star    19-27           26          1.1
Cancer magister              Dungeness crab       54              29          1.4
Asterias forbesi             Common sea star      19-72           27          2.1
Lyothyrella notorcadensis    brachiopod           78              34          3.9
Homarus americanus           lobster              290             37          3.9
Crangon negricata            shrimp               30              30          4.9
Limulus polyphemus           horseshoe crab       64              25          5.7
Euphausia superba            Antarctic krill      124             36          5.7
Upogebia pugettensis         blue mud shrimp      40              34          6.5
Callianassa californiensis   ghost shrimp         35              38          8.2
Phoronopsis viridis          horseshoe worm       120             39          9.4
Crassostrea virginica        Eastern oyster       200             32         12.0
Euphausia mucronata          small krill          50              28         14.1
Asteroidea (4 spp.)          deep sea stars       31              24         16.4
Frielea halli                brachiopod           45              18         16.9
Ophiomusium lymani           large brittlestar    257             15         17.0
Euphausia distinguenda       tropical krill       110             30         21.5
Tridacna maxima              giant clam           120             37         21.6


An interesting case: Limulus polyphemus

[Images: Carboniferous (300 mya); Jurassic (155 mya); today]

Selander, R.K., S.Y. Yang, R.C. Lewontin, W.E. Johnson. 1970. Genetic variation in the horseshoe crab (Limulus polyphemus), a phylogenetic “relic.” Evolution 24: 402-414.


An interesting case

Northern elephant seal (Mirounga angustirostris); Southern elephant seal (Mirounga leonina)

Northern elephant seal: Population in 1890’s: 2-10? Population today: 150,000 or so (“911? help! there’s a monster dying on my beach”)

Bonnell, M.L., and R.K. Selander. 1974. Elephant seals: genetic variation and near extinction. Science 184: 908-909.


A “population cage" for Drosophila


Yamazaki’s population cage experiment

Yamazaki, T. 1971. Measurement of fitness at the esterase-5 locus in Drosophila pseudoobscura. Genetics 67: 579-603.


Explaining Electrophoretic Polymorphisms

Can do it either way:

By neutral mutation: If $H = 0.15$ and $N_e = 1{,}000{,}000$, we need $4N_e\mu = 0.176$ to predict this, which implies $\mu = 4.4 \times 10^{-8}$. So we can explain the level of variation by a neutral mechanism.

By selection: To be effective in a population with $N_e = 1{,}000{,}000$, selection would need to be big enough that $4N_e s > 1$, so $s > 1/4{,}000{,}000$, which is quite small and impossible to detect in laboratory settings.
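Both numbers follow from inverting $H = 4N_e\mu / (4N_e\mu + 1)$ and from the condition $4N_e s > 1$. A short Python sketch of the arithmetic, using the values quoted above (variable names are illustrative):

```python
H = 0.15        # observed heterozygosity
Ne = 1_000_000  # effective population size

# Invert H = theta / (theta + 1), where theta = 4 * Ne * mu.
theta = H / (1.0 - H)
mu = theta / (4 * Ne)

# Selection is effective only when 4 * Ne * s exceeds 1.
s_min = 1.0 / (4 * Ne)

print(f"4*Ne*mu = {theta:.3f}")
print(f"mu      = {mu:.2e}")
print(f"s must exceed {s_min:.1e}")
```

The same observed heterozygosity is therefore compatible with a plausible neutral mutation rate and with selection coefficients far too small to measure directly.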


DNA sequencing reveals a similar picture

Marty Kreitman

Kreitman, M. 1983. Nucleotide polymorphism at the alcohol-dehydrogenase locus of Drosophila melanogaster. Nature 304: 412-417.


Kreitman’s sample of 11 ADH gene sequences, front end


Kreitman’s sample of 11 ADH gene sequences, tail end


SeattleSNPs data (Nickerson lab)

Matrix Metalloproteinase 3 SNP data


Variation from the 1000 Genomes project

[Figure: Supplementary Figure S7 of Mu et al. (2011). Vertical axis: Diversity, 0.0 to 1.6. Horizontal axis: Annotation, with labels -25 kb, -2.5 kb, -250 bp, TSS, 5′ UTR, first CDS, first intron, middle CDS, other intron, last CDS, 3′ UTR, TES, 2.5 kb, 25 kb. Legend: SNP diversity (× 1,000); 1 bp indel diversity (× 1,000); average SNP diversity (× 1,000); average 1 bp indel diversity (× 10,000); TF motif (m) or miRNA binding target (t); SNP diversity 95% confidence interval contour; indel diversity 95% confidence interval contour.]

From paper: Mu, X.J., Z.J. Lu, Y. Kong, H.Y. Lam, M.B. Gerstein. 2011. Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project. Nucleic Acids Research 39(16): 7058-7076.
