Genomic Conflict and DNA Sequence Variation

Marcy K. Uyenoyama, Department of Biology, Duke University


Page 1: Genomic Conflict and  DNA Sequence Variation

Marcy K. Uyenoyama

Department of Biology

Duke University

Genomic Conflict and DNA Sequence Variation

Page 2: Genomic Conflict and  DNA Sequence Variation

Overview

• Population genetics
  – Historically model-rich
  – Present need: model-based interpretation of observed patterns of genomic variation
  – What are hallmarks of each model?

• Self-incompatibility systems in plants
  – Recognizing genomic conflict due to sexual antagonism

Page 3: Genomic Conflict and  DNA Sequence Variation

Canonical models

• Neutral evolution
  – Pure neutrality: distribution of offspring number is independent of any trait in parent
  – Demographic history: deme founding, gene flow
  – Purifying selection: maintain functioning state against random deleterious mutations

• Selection
  – Balancing selection: maintenance of different forms
  – Selective sweeps: substitution of most fit for less fit

Page 4: Genomic Conflict and  DNA Sequence Variation

Hallmarks of evolution

• How do we know it when we see it?
  – Patterns evident in genome variation

• Model selection
  – Choosing among a small number of canonical models for any particular system

Page 5: Genomic Conflict and  DNA Sequence Variation

A random sample of genes

Ancestral sequence

Sample

Observed

Page 6: Genomic Conflict and  DNA Sequence Variation

Site frequency spectrum

Allele and mutation spectra

[Figure: histogram of the number of mutations (y-axis, 0 to 7) against multiplicity (x-axis, ticks 1 2 3 4 5 6 17)]

a = {a1 = 6, a3 = 1, a5 = 1, a6 = 1}, for ai the number of alleles with multiplicity i

Page 7: Genomic Conflict and  DNA Sequence Variation

The neutral coalescent

• Sample the root from the stationary distribution of P, the mutation transition matrix, and bifurcate

• After an interval $t \sim \exp(\lambda_1 + \lambda_2)$, choose a lineage at random
  – Replace it by two identical copies with probability $\lambda_1 / (\lambda_1 + \lambda_2)$
  – Mutate it according to P with probability $\lambda_2 / (\lambda_1 + \lambda_2)$

Page 8: Genomic Conflict and  DNA Sequence Variation

Evolutionary rates

• Events on level k
  – Bifurcation at rate $\binom{k}{2}/N = k(k-1)/(2N)$
  – Mutation at rate $ku$

• Population parameters: ratios of rates
  – Next event is a bifurcation/coalescence with probability

$$\frac{k(k-1)/(2N)}{k(k-1)/(2N) + ku} = \frac{k-1}{k-1+2Nu} \to \frac{k-1}{k-1+\theta}, \quad \text{for } \theta = \lim_{1/N \to 0,\ u \to 0} 2Nu$$
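The forward description of the neutral coalescent (sample a root, bifurcate, then repeatedly pick a lineage and either split or mutate it) can be sketched as a short simulation. This is a minimal illustration, not the speaker's code. It assumes a bifurcation probability of (k − 1)/(k − 1 + θ) on level k, a finite set of allelic states with a row-stochastic mutation matrix `P`, and a stopping rule (stop at the split that would create lineage n + 1) that the slides do not state. The names `simulate_sample`, `P`, and `stationary` are hypothetical.

```python
import random


def simulate_sample(n, theta, P, stationary, rng):
    """Grow a genealogy forward from the root until it holds n lineages.

    P[s] is the row of mutation probabilities out of state s and
    `stationary` is the distribution the root is drawn from.
    """
    states = list(range(len(P)))
    root = rng.choices(states, weights=stationary)[0]
    lineages = [root, root]  # the root bifurcates immediately
    while True:
        k = len(lineages)
        target = rng.randrange(k)  # choose a lineage at random
        if rng.random() < (k - 1) / (k - 1 + theta):
            # Assumption: stop at the bifurcation that would create lineage n + 1.
            if k == n:
                return lineages
            lineages.append(lineages[target])  # two identical copies
        else:
            # Mutate the chosen lineage according to P.
            lineages[target] = rng.choices(states, weights=P[lineages[target]])[0]


# Hypothetical two-state mutation matrix with a uniform stationary distribution.
P = [[0.9, 0.1], [0.1, 0.9]]
stationary = [0.5, 0.5]
sample = simulate_sample(10, 1.0, P, stationary, random.Random(1))
```

With θ = 0 no mutation ever occurs, so every sampled gene carries the root state.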

Page 9: Genomic Conflict and  DNA Sequence Variation

Site frequency spectrum

Allele and mutation spectra

[Figure: histogram of the number of mutations (y-axis, 0 to 7) against multiplicity (x-axis, ticks 1 2 3 4 5 6 17)]

a = {a1 = 6, a3 = 1, a5 = 1, a6 = 1}, for ai the number of alleles with multiplicity i

Page 10: Genomic Conflict and  DNA Sequence Variation

Infinite-alleles model

• Mutation
  – Novel allelic types formed at rate u per gene per generation

• Reproduction
  – Frequency of allele i in the parental population: pi
  – Multinomial sampling of N genes to form the offspring

• To find: probability of the sample of n genes (n1, n2, …, nk) or (a1, a2, …, an), for
  – k the number of distinct haplotypes (alleles)
  – ni the number of replicates of allele i
  – ai the number of alleles with i replicates
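The two sample descriptions, allele counts (n1, …, nk) and the spectrum (a1, …, an), carry the same information. A minimal sketch of the conversion follows; the function name `allele_spectrum` and the example counts are illustrative. The counts were chosen only so that the result matches the spectrum a = {a1 = 6, a3 = 1, a5 = 1, a6 = 1} quoted earlier in the talk.

```python
from collections import Counter


def allele_spectrum(counts):
    """Convert allele counts (n1, ..., nk) into the spectrum (a1, ..., an).

    Entry i - 1 of the result is a_i, the number of alleles that appear
    exactly i times in the sample.
    """
    n = sum(counts)
    multiplicities = Counter(counts)
    return [multiplicities.get(i, 0) for i in range(1, n + 1)]


# Illustrative sample of n = 20 genes carrying k = 9 distinct alleles.
counts = [6, 5, 3, 1, 1, 1, 1, 1, 1]
a = allele_spectrum(counts)
```

Here `a[0]` is a1, `a[2]` is a3, and so on; the weighted sum of i·ai recovers the sample size n.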

Page 11: Genomic Conflict and  DNA Sequence Variation

Ewens sampling formula

$$p(\mathbf{a}) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{i=1}^{n} \left(\frac{\theta}{i}\right)^{a_i} \frac{1}{a_i!}$$

a = (a1, a2, …, an), for ai the number of alleles represented by i replicates in a sample of size n

θ = 2Nu, for N the effective number of genes and u the per-locus, per-generation rate of mutation

Ewens (1972, Theoretical Population Biology)
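As a sanity check on the Ewens sampling formula, the sketch below evaluates p(a) with exact rational arithmetic and sums it over every possible spectrum for a small sample, which should give exactly 1. The helper names (`esf_probability`, `partitions`, `spectrum_of`) are illustrative and not from the talk.

```python
from fractions import Fraction
from math import factorial


def esf_probability(a, theta):
    """Ewens sampling formula; a[i - 1] is the number of alleles with multiplicity i."""
    n = sum(i * ai for i, ai in enumerate(a, start=1))
    rising = 1
    for j in range(n):
        rising *= theta + j  # theta (theta + 1) ... (theta + n - 1)
    prob = Fraction(factorial(n)) / rising
    for i, ai in enumerate(a, start=1):
        prob *= Fraction(theta, i) ** ai / factorial(ai)
    return prob


def partitions(n, largest=None):
    """Yield every integer partition of n as a non-increasing tuple."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def spectrum_of(partition, n):
    """Turn allele counts into the spectrum (a1, ..., an)."""
    return [partition.count(i) for i in range(1, n + 1)]


n, theta = 6, Fraction(3, 2)
total = sum(esf_probability(spectrum_of(p, n), theta) for p in partitions(n))
```

Using `Fraction` for θ keeps the sum exact, so `total` can be compared with 1 without a tolerance.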

Page 12: Genomic Conflict and  DNA Sequence Variation

Site frequency spectrum

Allele and mutation spectra

[Figure: histogram of the number of mutations (y-axis, 0 to 7) against multiplicity (x-axis, ticks 1 2 3 4 5 6 17)]

a = {a1 = 6, a3 = 1, a5 = 1, a6 = 1}, for ai the number of alleles with multiplicity i

Page 13: Genomic Conflict and  DNA Sequence Variation

Population genomics

http://www.arabidopsis.org

• About 750 accessions isolated from natural populations worldwide
• Summary statistics for sample of 19 entire genomes

Page 14: Genomic Conflict and  DNA Sequence Variation

Arabidopsis SNP spectra

Kim et al. (2007, Nature Genetics 39: 1151)

Site frequency spectra differ among functional classes

[Figure: site frequency spectra by functional class; x-axis: minor allele counts, 2 to 8]

Page 15: Genomic Conflict and  DNA Sequence Variation

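ESF conditioned on two alleles

• Biallelic sample of size m

$$P(K = 2 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=1}^{m-1} \frac{\theta}{j}$$

• Multiplicities i and (m – i)

$$P(a_i = 1, a_{m-i} = 1 \mid K = 2, m) = \frac{1/i + 1/(m-i)}{\sum_{j=1}^{m-1} 1/j} \quad \text{for } i < m/2$$

$$P(a_{m/2} = 2 \mid K = 2, m) = \frac{2/m}{\sum_{j=1}^{m-1} 1/j}$$

independent of θ!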
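The claim that the two-allele conditional distribution does not depend on θ can be checked directly. The sketch below computes the Ewens probabilities of all two-allele configurations of a sample of size m, normalises them, and compares the result with the closed form (1/i + 1/(m − i)) / Σ 1/j, or (2/m) / Σ 1/j when i = m/2. Function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def esf_two_allele(i, m, theta):
    """Ewens probability of a sample of size m with multiplicities i and m - i."""
    rising = 1
    for j in range(m):
        rising *= theta + j
    lead = Fraction(factorial(m)) / rising
    if 2 * i == m:
        # Two alleles of equal multiplicity: a_{m/2} = 2.
        return lead * Fraction(theta, i) ** 2 / 2
    return lead * Fraction(theta, i) * Fraction(theta, m - i)


def conditional_two_allele(m, theta):
    """Distribution of the minor multiplicity i given exactly two alleles."""
    joint = {i: esf_two_allele(i, m, theta) for i in range(1, m // 2 + 1)}
    total = sum(joint.values())
    return {i: p / total for i, p in joint.items()}


def closed_form(m):
    """The theta-free expression for the same conditional distribution."""
    harmonic = sum(Fraction(1, j) for j in range(1, m))
    out = {}
    for i in range(1, m // 2 + 1):
        if 2 * i == m:
            out[i] = Fraction(2, m) / harmonic
        else:
            out[i] = (Fraction(1, i) + Fraction(1, m - i)) / harmonic
    return out


low = conditional_two_allele(8, Fraction(1, 3))
high = conditional_two_allele(8, Fraction(5))
```

Both `low` and `high` coincide with `closed_form(8)`, because the factors of θ cancel in the ratio.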

Page 16: Genomic Conflict and  DNA Sequence Variation

Ewens sampling formula

$$p(\mathbf{a}) = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)} \prod_{i=1}^{n} \left(\frac{\theta}{i}\right)^{a_i} \frac{1}{a_i!}$$

a = (a1, a2, …, an), for ai the number of alleles represented by i replicates in a sample of size n

θ = 2Nu, for N the effective number of genes and u the per-locus, per-generation rate of mutation

Ewens (1972, Theoretical Population Biology)

Page 17: Genomic Conflict and  DNA Sequence Variation

Actual site frequency spectra

• Excess of rare and common types, deficiency of intermediate types

• Data from NIEHS Environmental Genome Project
  – Direct resequencing of loci considered environmentally-sensitive
  – Global representation of ethnicities

Hernandez, Williamson, and Bustamante (2007)

Page 18: Genomic Conflict and  DNA Sequence Variation

Spectrum shape

[Figure legend] Black: constant population size. Grey: recent expansion from small population size.

Braverman et al. (1995)

• Signature of expansion?
  – Expansions maintain more rare mutations

• Signature of selective sweep?
  – Neutral variants linked to the selected site experience the sweep as a population bottleneck

Page 19: Genomic Conflict and  DNA Sequence Variation

Arabidopsis SNP spectra

Kim et al. (2007, Nature Genetics 39: 1151)

Site frequency spectra differ among functional classes

[Figure: site frequency spectra by functional class; x-axis: minor allele counts, 2 to 8]

Page 20: Genomic Conflict and  DNA Sequence Variation

Modelling a SNP data set

• Single segregating mutation in the sample genealogy
  – Conditional on exactly one segregating site, determine the distribution of the size (number of descendants) of the branch on which the mutation occurs

• Exactly two alleles in the sample
  – Conditional on two haplotypes, bearing any number of segregating sites, determine the distribution of numbers of the two alleles

Nordborg (2001 Handbook of Statistical Genetics)

Page 21: Genomic Conflict and  DNA Sequence Variation

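Conditioning

• Two alleles

$$P(K = 2 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=1}^{m-1} \frac{\theta}{j}$$

• One segregating site

$$P(S = 1 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=1}^{m-1} \frac{\theta}{j+\theta}$$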

Page 22: Genomic Conflict and  DNA Sequence Variation

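Multiplicity conditioned on a SNP

• Single segregating site in a sample of size m

$$P(S = 1 \mid m) = \prod_{l=1}^{m-1} \frac{l}{l+\theta} \sum_{j=1}^{m-1} \frac{\theta}{j+\theta}$$

• Multiplicity i

$$f(i \mid \theta, m) = \frac{\sum_{l=1}^{m-i} \binom{m-l-1}{i-1} \frac{l}{l+\theta}}{i \binom{m-1}{i} \sum_{j=1}^{m-1} \frac{1}{j+\theta}}$$

dependent on θ!

Ganapathy and Uyenoyama (2009 Theoretical Population Biology)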
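A rough numerical sketch of the branch-size distribution conditioned on a single segregating site. It is assembled from two standard coalescent facts rather than taken from the cited paper. First, given exactly one mutation, it falls on the level with l + 1 lineages with weight proportional to 1/(l + θ). Second, a lineage drawn at random from that level has i descendants in a sample of m with probability C(m − i − 1, l − 1)/C(m − 1, l). The function name is hypothetical.

```python
from math import comb


def branch_size_distribution(m, theta):
    """f(i | theta, m) for i = 1..m-1: size of the branch carrying the one mutation."""
    # Weight of the level holding l + 1 lineages, given exactly one mutation.
    weights = [1.0 / (l + theta) for l in range(1, m)]
    norm = sum(weights)
    f = {}
    for i in range(1, m):
        total = 0.0
        for l in range(1, m - i + 1):
            # Chance that a lineage picked among l + 1 has i descendants in the sample.
            size_prob = comb(m - i - 1, l - 1) / comb(m - 1, l)
            total += size_prob * weights[l - 1] / norm
        f[i] = total
    return f


moderate = branch_size_distribution(10, 2.0)
tiny = branch_size_distribution(10, 1e-9)
```

As θ approaches 0 the result approaches the familiar 1/i shape, while larger θ shifts weight toward singletons, in line with the slide's remark that this spectrum depends on θ.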

Page 23: Genomic Conflict and  DNA Sequence Variation

Arabidopsis SNP spectra

Kim et al. (2007, Nature Genetics 39: 1151)

Site frequency spectra differ among functional classes

[Figure: site frequency spectra by functional class; x-axis: minor allele counts, 2 to 8]

Page 24: Genomic Conflict and  DNA Sequence Variation

Overview

• Population genetics
  – Historically model-rich
  – Present need: model-based interpretation of observed patterns of genomic variation
  – What are hallmarks of each model?

• Self-incompatibility systems in plants
  – Recognizing genomic conflict due to sexual antagonism

Page 25: Genomic Conflict and  DNA Sequence Variation

Genomic conflict

• Phenotypes
  – Multiple genes generally influence a given phenotype

• Conflict
  – Target trait value differs among genes that control phenotype
  – Sexual antagonism
    Male and female function collaborate in reproduction
    Genes influencing each function may come into conflict

Page 26: Genomic Conflict and  DNA Sequence Variation

Conflict and genomic variation

• Mating type regions as a battleground
  – S-locus controls self-incompatibility in flowering plants
  – How does sexual antagonism affect the pattern of molecular-level variation within the S-locus?
  – What are hallmarks of conflict?

• Develop a basis for inference
  – Model-based approach to the analysis of genetic variation

Page 27: Genomic Conflict and  DNA Sequence Variation

Image credit: Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Flower development
  – Basic perfect flower includes both male and female components

• Fertilization
  – Pollen grains deposited on stigma germinate and pollen tubes grow down style to the ovary

Page 28: Genomic Conflict and  DNA Sequence Variation

Image credits: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png; Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Gametophytic SI (GSI)
  – Specificity expressed by individual pollen grain or tube determined by own S-allele

• Pollen rejection
  – Growth of pollen tube arrested in style

Page 29: Genomic Conflict and  DNA Sequence Variation

Image credits: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png; Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

• Sporophytic SI (SSI)
  – Specificity expressed by individual pollen grain or tube determined by the S-locus genotype of its parent

• Pollen rejection
  – Germination of pollen grain may be arrested at stigma surface

Page 30: Genomic Conflict and  DNA Sequence Variation

Image credits: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png and http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png; Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

[Figure: S-allele Sn with components An and Bn]

• Pistil (A) component: rejection of recognized specificities
• Pollen (B) component: declaration of specificity

Page 31: Genomic Conflict and  DNA Sequence Variation

Mating type regions

Uyenoyama (2005)

Page 32: Genomic Conflict and  DNA Sequence Variation

Human Y chromosome

Skaletsky et al. (2003 Nature 423: 825)

• Non-recombining male-specific Y (MSY)
  – Euchromatic region ~23 Mb
  – Differences between two random Ys every 3–4 kb

• Mammalian sex determinant SRY
  – Y-linked regulator of transcription of many male-specific Y-linked genes

Page 33: Genomic Conflict and  DNA Sequence Variation

Mating type regions

Uyenoyama (2005)

Linkage between pistil (A) and pollen (B) components is essential to SI function
• Pollen: declaration of specificity
• Pistil: rejection of recognized specificities

Page 34: Genomic Conflict and  DNA Sequence Variation

Brassica S-locus

[Figure labels: pollen component, pistil component]

Nasrallah (2000 Curr. Opin. Plant Biol.)

Natural populations often contain 30 – 50 S-alleles

Page 35: Genomic Conflict and  DNA Sequence Variation

Vierstra (2009, Nature Reviews Molecular Cell Biology)

Ubiquitin tags proteins for degradation

• Style: S-RNase disrupts pollen tube growth
  – Upon entering a pollen tube, S-RNases initially sequestered in a vacuole
  – In incompatible crosses, vacuole breaks down, releasing S-RNases into cytoplasm of pollen tube

• Pollen: SLF (S-locus F-box)
  – Mediator of ubiquitinylation (attachment of ubiquitin)
  – Disables all S-RNases except those of the same specificity

Page 36: Genomic Conflict and  DNA Sequence Variation

Sexual antagonism

• Pistil: why reject fertilization?
  – Screening of potential mates may improve offspring quality
  – Cost under incomplete reproductive compensation: ovules may go unfertilized

• Pollen: why provoke rejection?
  – Self-rejection may improve quality of own ovules
  – Rejection by other plants reduces siring success
  – Hide behind another S-specificity in sporophytic SI?
  – Decline to declare S-specificity altogether?

Page 37: Genomic Conflict and  DNA Sequence Variation

GSI model

• Basic discrete time recursion

$$P'_{ij} = \frac{q_i}{2} \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + \frac{q_j}{2} \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k}$$

Wright (1937, Genetics)

• Symmetries in genotype and allele frequencies
  – Model change in frequency of focal allele i, assuming all other alleles in equal frequency

$$P_{ij} = P \ \text{for } j \ne i, \qquad P_{jk} = \frac{1 - (n-1)P}{\binom{n-1}{2}} \ \text{for } j, k \ne i$$

$$q_i = q = (n-1)P/2, \qquad q_j = (1-q)/(n-1) \ \text{for } j \ne i$$

Page 38: Genomic Conflict and  DNA Sequence Variation

Diffusion approximation

• Change in allele frequency

$$\Delta q = \frac{q(1 - nq)}{n - 3 + 2q}, \quad \text{for } n \text{ the number of common S-alleles}$$

$$\Delta q \approx \frac{nq(1 - nq)}{(n-1)(n-2)} \quad \text{for } q \approx 1/n$$

Wright (1937, Genetics)

• Diffusion equation coefficients

$$\mu(x) = \frac{nx(1 - nx)}{(n-1)(n-2)} - ux$$

$$\sigma^2(x) = \frac{x(1 - 2x)}{2N}$$

holds for large population size (N) and u (rate of mutation to new S-alleles) of order 1/N
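The step from the genotype recursion to the one-dimensional change in allele frequency can be checked numerically. The sketch below is illustrative only. It assumes the GSI recursion takes the form P'ij = (qi/2) Σk Pjk/(1 − qj − qk) + (qj/2) Σk Pik/(1 − qi − qk), iterates it once from a population in which the focal allele has frequency q and the other n − 1 alleles are equally frequent, and compares the resulting change with q(1 − nq)/(n − 3 + 2q). All function names are hypothetical.

```python
from itertools import combinations


def allele_frequencies(P, n):
    """Allele frequencies implied by heterozygote frequencies P[(i, j)], i < j."""
    q = [0.0] * n
    for (i, j), p in P.items():
        q[i] += p / 2
        q[j] += p / 2
    return q


def gsi_generation(P, n):
    """One generation of the gametophytic SI recursion."""
    q = allele_frequencies(P, n)

    def freq(a, b):
        return P[(min(a, b), max(a, b))]

    new = {}
    for i, j in combinations(range(n), 2):
        total = 0.0
        for k in range(n):
            if k == i or k == j:
                continue
            total += q[i] * freq(j, k) / (1 - q[j] - q[k])  # pollen i on style jk
            total += q[j] * freq(i, k) / (1 - q[i] - q[k])  # pollen j on style ik
        new[(i, j)] = total / 2
    return new


def symmetric_population(n, q):
    """Focal allele 0 at frequency q; the other n - 1 alleles equally frequent."""
    focal = 2 * q / (n - 1)
    other = (1 - 2 * q) / ((n - 1) * (n - 2) / 2)
    return {(i, j): (focal if i == 0 else other)
            for i, j in combinations(range(n), 2)}


def delta_q_closed_form(n, q):
    return q * (1 - n * q) / (n - 3 + 2 * q)


n, q = 6, 0.1
after = gsi_generation(symmetric_population(n, q), n)
delta_q = allele_frequencies(after, n)[0] - q
```

A rare allele (q < 1/n) increases and a common one (q > 1/n) decreases, which is the balancing selection that the diffusion drift term carries forward.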

Page 39: Genomic Conflict and  DNA Sequence Variation

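Wright’s diffusion model

[Figure: number of S-alleles (y-axis) against frequency in population (x-axis)]

• Diffusion with jumps

$$\mu(x) = \frac{nx(1 - nx)}{(n-1)(n-2)} - ux, \qquad \sigma^2(x) = \frac{x(1 - 2x)}{2N}$$

• Turnover rate

[Equation not recoverable from the transcript: an expression in N, u, and n that includes the factors 4 and (n − 1)(n − 2)]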

Page 40: Genomic Conflict and  DNA Sequence Variation

Expansion of time scale under balancing selection

Takahata (1993, Mechanisms of Molecular Evolution)

• High rate of invasion of rare alleles
  – Promotes invasion of new and retention of rare types
  – Maintains high numbers of alleles

• Genealogical relationships
  – Tree shape similar under symmetric balancing selection and neutrality
  – Greatly expanded time scale

Page 41: Genomic Conflict and  DNA Sequence Variation

S-allele turnover

• Quasi-equilibrium of S-alleles
  – Invasion of new, rare S-alleles balanced by extinction of common S-alleles

• Expansion of time scale
  – Rate of divergence among S-allele classes similar to rate among neutral lineages, but in a population of size fN:

[Equation not recoverable from the transcript: an expression for the scaling factor f in terms of n, N, and u]

Page 42: Genomic Conflict and  DNA Sequence Variation

Gametophytic SI models

• Basic discrete time recursion

$$P'_{ij} = \frac{q_i}{2} \sum_{k \ne i,j} \frac{P_{jk}}{1 - q_j - q_k} + \frac{q_j}{2} \sum_{k \ne i,j} \frac{P_{ik}}{1 - q_i - q_k}$$

• Diffusion approximation

$$\mu(x) = \frac{nx(1 - nx)}{(n-1)(n-2)} - ux, \qquad \sigma^2(x) = \frac{x(1 - 2x)}{2N}$$

• Parameters
  – Effective population size (N)
  – Rate of mutation to new S-specificities (u)

Page 43: Genomic Conflict and  DNA Sequence Variation

Simulation results

• Stationary distribution of allele frequency
  – Most time spent close to deterministic equilibrium (1/n) or in boundary layer close to extinction

• Number of S-alleles
  – Analytical expectation for number of common S-alleles

Vallejo-Marín and Uyenoyama (2008)
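To see why a diffusion with these coefficients spends most of its time near 1/n, one can evaluate the standard stationary-density expression for a one-dimensional diffusion, φ(x) ∝ exp(2∫μ/σ²)/σ²(x), on a grid. This is a rough sketch, not the simulation reported on the slide. It takes μ(x) = nx(1 − nx)/((n − 1)(n − 2)) − ux and σ²(x) = x(1 − 2x)/(2N), holds n fixed, and covers only interior frequencies, so the boundary layer near extinction is not resolved. Parameter values are arbitrary.

```python
import math


def drift(x, n, u):
    return n * x * (1 - n * x) / ((n - 1) * (n - 2)) - u * x


def variance(x, N):
    return x * (1 - 2 * x) / (2 * N)


def log_stationary_density(n, N, u, lo=0.01, hi=0.45, steps=4400):
    """Unnormalised log density on a grid of interior allele frequencies."""
    h = (hi - lo) / steps
    xs = [lo + k * h for k in range(steps + 1)]
    ratio = [2 * drift(x, n, u) / variance(x, N) for x in xs]
    integral = 0.0
    logs = []
    for k, x in enumerate(xs):
        if k > 0:
            integral += 0.5 * h * (ratio[k - 1] + ratio[k])  # trapezoid rule
        logs.append(integral - math.log(variance(x, N)))
    return xs, logs


# Arbitrary illustrative parameters: 10 common S-alleles, N = 1000, u = 1e-5.
xs, logs = log_stationary_density(n=10, N=1000, u=1e-5)
mode = xs[max(range(len(xs)), key=logs.__getitem__)]
```

For these values the interior mode sits just below 1/n = 0.1, consistent with the deterministic equilibrium.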

Page 44: Genomic Conflict and  DNA Sequence Variation

Image credits: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png and http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png; Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg

[Figure: S-allele Sn with components An and Bn]

• Pistil (A) component: rejection of recognized specificities
• Pollen (B) component: declaration of specificity

Page 45: Genomic Conflict and  DNA Sequence Variation

Pollen specificity in GSI

• Each pollen expresses its own specificity
  – Rarer specificities are incompatible with fewer plants

• Incompatible matings
  – For n S-alleles in equal frequencies, a pollen type is incompatible with a proportion 2/n of all plants

Image credit: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
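The 2/n figure can be confirmed by counting. The sketch below assumes that every plant is heterozygous at the S-locus and that all C(n, 2) genotypes are equally frequent, which is the case under GSI with n alleles in equal frequencies. The function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations


def gsi_incompatible_fraction(n, pollen_allele=0):
    """Fraction of plants whose style rejects pollen carrying `pollen_allele`."""
    plants = list(combinations(range(n), 2))  # all heterozygous genotypes
    blocked = sum(1 for plant in plants if pollen_allele in plant)
    return Fraction(blocked, len(plants))


fractions_by_n = {n: gsi_incompatible_fraction(n) for n in range(3, 13)}
```

Each value equals (n − 1)/C(n, 2) = 2/n, so pollen of any one specificity faces fewer incompatible plants as the number of S-alleles grows.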

Page 46: Genomic Conflict and  DNA Sequence Variation

Sexual antagonism

• Pistil: why reject fertilization?
  – Screening of potential mates may improve offspring quality
  – Cost under incomplete reproductive compensation: ovules may go unfertilized

• Pollen: why provoke rejection?
  – Self-rejection may improve quality of own ovules
  – Rejection by other plants reduces siring success
  – Hide behind another S-specificity in sporophytic SI?
  – Decline to declare S-specificity altogether?

Page 47: Genomic Conflict and  DNA Sequence Variation

Fate of style-part mutant

[Figure: regions labelled Full SC, Polymorphism, and Full SI for the style-part mutant Sa (An+1 Bn); x-axis: self-pollen fraction (s), 0 to 1; y-axis: relative viability of inbred offspring, 0 to 1]

Page 48: Genomic Conflict and  DNA Sequence Variation

Fate of pollen-part mutant

[Figure: regions labelled Full SC, Polymorphism, Full SI, and Disruption for the pollen-part mutant Sb (An Bn+1), with n = 10; x-axis: self-pollen fraction (s), 0 to 1; y-axis: relative viability of inbred offspring, 0 to 1]

Uyenoyama, Zhang, and Newbigin (2001)

Page 49: Genomic Conflict and  DNA Sequence Variation

Direction of pollen flow

[Figure: pollen flow among S-alleles Sn (An Bn), Sa (An+1 Bn), Sb (An Bn+1), and Sn+1 (An+1 Bn+1)]

Uyenoyama, Zhang, and Newbigin (2001)

Page 50: Genomic Conflict and  DNA Sequence Variation

[Figure: transitions among S-alleles Sn (An Bn), Sa (An+1 Bn), Sb (An Bn+1), and Sn+1 (An+1 Bn+1)]

• TURN OFF: partial breakdown of SI by pollen disablement
• TURN ON: restoration of SI by stylar recognition
• Evolutionarily unlikely (label appears twice in the figure)

Uyenoyama, Zhang, and Newbigin (2001)

Page 51: Genomic Conflict and  DNA Sequence Variation

Joint genealogies

Newbigin, Paape, and Kohn (2008)

Unlike S-RNase genes, SLF genes show
– Low divergence between allelic types
– No trans-specific sharing of lineages

[Figure panels: Solanaceae and Plantaginaceae; Rosaceae]

Page 52: Genomic Conflict and  DNA Sequence Variation

Cycles of loss/restoration of SI?

• Family-specific genealogies
  – Rosaceae: do highly-diverged, ancient SFB lineages reflect continuous operation or restoration of same F-box genes?
  – Solanaceae, Plantaginaceae: recruitment of new F-box genes?

• Turnover of pollen-specificity loci
  – Expression and recognition of a paralogue of the former pollen specificity gene?
  – Can homologues be distinguished from paralogues with new function?

Page 53: Genomic Conflict and  DNA Sequence Variation

Brassica S-locus

[Figure labels: pollen component, pistil component]

Nasrallah (2000 Curr. Opin. Plant Biol.)

Natural populations often contain 30 – 50 S-alleles

Page 54: Genomic Conflict and  DNA Sequence Variation

An appeal for inference methods

• Sexual antagonism in mating type regions
  – Neutral variation in linked regions
  – Rates of substitution at determinants of mating type

• Inference
  – Goal: use the pattern of variation in population samples of genomic regions as a basis for inference about the evolutionary process
  – Detection
    • genomic conflict and other forms of selection
    • mating systems and population structure

Page 55: Genomic Conflict and  DNA Sequence Variation

Image credit: Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png

Pollen specificity in SSI

• Codominance
  – Both specificities expressed
  – Almost twice as many incompatible styles under SSI as under GSI for the same number of S-alleles

• Complete dominance
  – One specificity expressed
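The "almost twice" comparison can be illustrated by counting incompatible styles. This sketch assumes full codominance in both pollen and style, n S-alleles, and all heterozygous genotypes equally frequent; under those assumptions a pollen grain from parent ij is rejected by every style that carries i or j. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def ssi_incompatible_fraction(n, pollen_parent=(0, 1)):
    """Fraction of styles sharing at least one allele with the pollen parent."""
    styles = list(combinations(range(n), 2))  # all heterozygous genotypes
    blocked = sum(1 for style in styles if set(style) & set(pollen_parent))
    return Fraction(blocked, len(styles))


# Ratio of the SSI fraction to the GSI fraction 2/n for the same number of alleles.
ratios = {n: ssi_incompatible_fraction(n) / Fraction(2, n) for n in range(4, 13)}
```

The SSI fraction works out to (4n − 6)/(n(n − 1)), so the ratio to the GSI value 2/n is (2n − 3)/(n − 1), which stays below 2 and approaches it as n grows.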

Page 56: Genomic Conflict and  DNA Sequence Variation

SRK genealogies

Edh, Widén and Ceplitis (2009)

• Sporophytic SI
  – Diploid genotype of pollen parent determines S-specificity of each pollen grain
  – Class I is dominant over Class II, with codominance within class

• Class II: pollen-recessive
  – Lower number of segregating alleles, each with relatively higher frequency in population
  – Greater genealogical relationship within class?

Page 57: Genomic Conflict and  DNA Sequence Variation

Is class II younger than class I?

Uyenoyama (1995)

• MRCA ages
  – Class I: 25.5 ± 8.1 MY
  – Class II: 3.1 ± 0.9 MY
  – I/II: 41.4 ± 12.7 MY

• Origin of SLG/SRK system
  – 42.1 ± 9.0 MY