Computational analysis of transposable element evolution...

Casey M. Bergman Faculty of Life Sciences University of Manchester [email protected] http://bioinf.manchester.ac.uk/bergman Computational analysis of transposable element evolution in Drosophila genomes.


Page 1: Computational analysis of transposable element evolution ...bergmanlab.genetics.uga.edu/wp-content/uploads/...A brief introduction to transposable element (TE) evolution: the current

Casey M. Bergman

Faculty of Life Sciences

University of Manchester

[email protected]

http://bioinf.manchester.ac.uk/bergman

Computational analysis of transposable element evolution in Drosophila genomes.


Overview of Talk

• Demographic history of TEs in D. melanogaster

• TE population genomics using 454 sequencing

• Discovery and detection of TEs in genomes

• Abundance and distribution of TEs in Drosophila genomes

• Noncoding DNA (ncDNA) & transposable elements (TEs)


Higher organisms have a higher proportion of ncDNA

Bacteria: 15%

Yeast: 30%

Fly: 75%

H: 98%


The function of most noncoding DNA is unknown & unannotated

[Figure: gene annotation of a genomic region; legend: box = exon. Gene labels: Mef2, CG15863, CG12130, CG1418, CG12133, Adam, CG12134, eve, TER94, Pka-R2, CG12128. Other label: BS 1360]


[Figure: gene annotation of a genomic region (gene labels: Mef2, CG15863, CG12130, CG1418, CG12133, Adam, CG12134, eve, TER94, Pka-R2, CG12128), with two added annotation tracks. Enhancers: AR3/72, APRCQ4/6, mes, 15RP2. Transposable elements: BS 1360, (A)n]

Goal: comprehensive functional annotation of noncoding sequences in Drosophila


3 major types of transposable element (TE)

DNA transposons (cut+paste)

• Terminal Inverted Repeat (TIR)

RNA retrotransposons (copy+paste)

• LINE-like (non-LTR), ending in (A)n

• Long Terminal Repeat (LTR)


Why is the discovery and detection of TE sequences in genomes important?

• Genome alignment

• Genome evolution

• Population genomics

• Genome organization

• Genome assembly


Discovery of new TE families

• Homology to TE proteins (e.g. transposase) => HMMer, tBLASTx

• Structural motifs (e.g. LTRs) => LTRstruc, LTRharvest

[Figure: LTR retrotransposon structure, left to right: TSD, 5’LTR, PBS, gag, pol, env, PPT, 3’LTR, TSD. b5 and e5 mark the begin and end of the 5’LTR; b3 and e3 mark the begin and end of the 3’LTR]

Dmin ≤ (b3 – b5) ≤ Dmax

Lmin ≤ (e5 – b5), (e3 – b3) ≤ Lmax

• Comparative genomics => compTE

• Dispersed repeats (all-by-all, k-mers) => RECON, PILER, RepeatScout, ReAS

Reviewed in: Bergman & Quesneville (2007) Brief Bioinf. 8:382-92
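The structural constraints on this slide can be read as a simple filter on candidate LTR pairs: the distance between the two LTR start positions must lie between Dmin and Dmax, and each LTR length must lie between Lmin and Lmax. The sketch below is a hypothetical illustration of that filter, not the LTRharvest or LTRstruc implementation; the class name, function name and threshold values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LtrCandidate:
    """Coordinates of a candidate LTR pair (hypothetical container)."""
    b5: int  # begin of 5' LTR
    e5: int  # end of 5' LTR
    b3: int  # begin of 3' LTR
    e3: int  # end of 3' LTR


def passes_ltr_constraints(c, d_min, d_max, l_min, l_max):
    """Apply Dmin <= (b3 - b5) <= Dmax and Lmin <= LTR lengths <= Lmax."""
    distance_ok = d_min <= (c.b3 - c.b5) <= d_max
    len5_ok = l_min <= (c.e5 - c.b5) <= l_max
    len3_ok = l_min <= (c.e3 - c.b3) <= l_max
    return distance_ok and len5_ok and len3_ok


# Illustrative thresholds only (assumed values, not taken from the slide).
LIMITS = dict(d_min=1000, d_max=15000, l_min=100, l_max=1000)

good = LtrCandidate(b5=5000, e5=5300, b3=10000, e3=10310)
too_close = LtrCandidate(b5=5000, e5=5300, b3=5500, e3=5800)

print(passes_ltr_constraints(good, **LIMITS))       # True
print(passes_ltr_constraints(too_close, **LIMITS))  # False
```

Real tools also score LTR similarity and look for TSDs and other motifs; this filter covers only the two coordinate inequalities.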


Detection of individual TE copies

[Figure: annotation pipeline; labels: hmm, all-by-all, RECON, BLASTER, RepeatMasker, TBLASTX, RMBLR; Release 3 and Release 4]

Quesneville, Bergman et al. (2005) PLoS Comp. Biol. 1:e22


Genomic TE distribution in D. melanogaster

[Figure: # TEs per 50 kb (y-axis, 10 to 50) along chromosome X (x-axis, 5 to 20). Labels: ~3%, ~20%; genome-wide average ~5.5%. Markers: ~ centromere, ~ high-low rec.]

Bergman, Quesneville et al. (2006) Genome Biology 7:R112


Drosophila 12 genomes project

Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18


TE abundance in 12 Drosophila genomes

[Figure: bar chart of proportion TE/repeat in scaffolds > 200 Kb (y-axis, 0 to 0.35) for D.mel, D.sim, D.sec, D.yak, D.ere, D.ana, D.pse, D.per, D.wil, D.vir, D.moj, D.gri. Methods compared: BLASTER-tx+Repbase-NoDros, BLASTER-tx+BDGP, BLASTER-tx+PILER, RepeatMasker+ReAS, RepeatRunner+PILER, CompTE. Value labels, in extraction order and not matched to species: 5.3%, 2.7%, 3.7%, 12.0%, 24.9%, 6.9%, 15.6%, 2.8%, 2.7%, 8.5%, 13.9%, 8.9%]

Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18


Abundance of major TE types is conserved across genus Drosophila

[Figure: proportion of TE class in scaffolds > 200 Kb (y-axis, 0 to 0.8) for D.mel, D.sim, D.sec, D.yak, D.ere, D.ana, D.pse, D.per, D.wil, D.vir, D.moj, D.gri. Legend: LTR, LINE, TIR, OTHER. Annotations: non-LTR, LTR, TIR]

Clark, Eisen, Smith, Bergman, et al (2007) Nature 450:203-18


Is the genomic distribution of TEs in D. melanogaster affected by historical activity?


A brief introduction to transposable element (TE) evolution: the current paradigm

• TEs are mobile DNA sequences, intra-genomic parasites

• Transposition rates >> excision rates

• Equilibrium maintained by transposition-selection balance

• Mode of natural selection is debated

- deleterious effects of transposition

- deleterious effects of TE insertion

- deleterious effects of TE-mediated ectopic recombination

✴ TE insertions observed at low frequency in nature


Estimating the age of ‘pseudogene-like’ retrotransposons

Petrov & Hartl (1998) Mol. Biol. Evol. 15:293-302

Alignment of paralogous TEs



Most retrotransposon families exhibit a ‘pseudogene-like’ mode of sequence evolution

              # families  # elements  Total bp surveyed    1st    2nd    3rd  Total point sub.    P (Ho)  Ψ
All non-LTR           19         377            836,819  1,515  1,424  1,917             4,884  3.56E-24  N
Ψ non-LTR             10         158            336,748    791    746    781             2,341     0.192  Y
All LTR               27         385          1,973,013    677    603  1,120             2,420  2.18E-44  N
Ψ LTR                 17         279          1,491,867    272    267    307               851     0.159  Y
Grand Total           46         762          2,809,832  2,192  2,027  3,037             7,304  5.18E-61  N
Total Ψ               27         437          1,828,615  1,063  1,013  1,088             3,192      0.06  Y

59% of retrotransposon families exhibit a pseudogene-like mode of evolution on terminal branches


Retrotransposon demographics in D. melanogaster

[Figure: divergence (sub/site, 0.00 to 0.12) and corresponding age (Mya, 0 to 10.81) by retrotransposon family, with the D. mel - D. sim speciation marked. Family labels: a_invader2_6, b_micropia_4, c_Tabor_3, d_17.6_11, e_Stalker_4, f_rover_3, g_flea_16, h_copia_28, i_mdg3_10, j_roo_86, k_Transpac_4, l_opus_16, m_blood_22, n_412_24, o_Burdock_13, p_diver_9, q_Tirant_20, r_jockey2_7, s_Helena_7, t_Cr1a_36, u_baggins_6, v_G4_10, w_Doc3_7, x_G5_8, y_BS_15, z_Juan_9, zz_Doc_53]

Bergman & Bensasson (2007) PNAS 104:11340-5
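The divergence and age axes of this figure appear to differ only by a constant factor. As a rough check, and assuming the neutral substitution rate of 0.0111 substitutions per site per million years quoted elsewhere in this talk, dividing each divergence tick by that rate reproduces the printed age ticks. The function name is illustrative, and the single-lineage interpretation (no factor of 2) is inferred from the tick values rather than stated on the slide.

```python
# Assumed neutral rate (substitutions per site per million years).
NEUTRAL_RATE = 0.0111


def branch_age_mya(divergence, rate=NEUTRAL_RATE):
    """Age of a terminal branch: divergence accumulated along one lineage."""
    return divergence / rate


divergence_ticks = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12]
age_ticks = [0, 1.80, 3.60, 5.41, 7.21, 9.01, 10.81]  # as printed on the axis

for d, printed in zip(divergence_ticks, age_ticks):
    print(f"{d:.2f} sub/site -> {branch_age_mya(d):.2f} Mya (axis: {printed})")
```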


Intra-element LTR-LTR comparisons support age estimates based on terminal branch length

5’ LTR ACGTAGCTAGGCTGACGTGGACTGTAC
       ||||||||||| ||||||| |||||||
3’ LTR ACGTAGCTAGGGTGACGTGCACTGTAC

T = D/(2*r)

T - absolute time

D - 5’ vs. 3’ LTR divergence

r - neutral substitution rate (0.0111/my)

see also Bowen and McDonald (2001) Genome Res 11:1527-1540
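As a worked example of T = D/(2*r), the sketch below applies the formula to the toy alignment on this slide. It makes one simplifying assumption that the slide does not state: D is taken as the uncorrected proportion of mismatched sites, with no correction for multiple hits. Function names are illustrative.

```python
NEUTRAL_RATE = 0.0111  # r, substitutions per site per million years (from the slide)


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def insertion_age_my(divergence, rate=NEUTRAL_RATE):
    """T = D / (2 * r): both LTRs accumulate substitutions since insertion."""
    return divergence / (2 * rate)


ltr5 = "ACGTAGCTAGGCTGACGTGGACTGTAC"
ltr3 = "ACGTAGCTAGGGTGACGTGCACTGTAC"

d = p_distance(ltr5, ltr3)
print(f"D = {d:.4f}, T = {insertion_age_my(d):.2f} My")
# D = 0.0741, T = 3.34 My
```

The factor of 2 reflects that the two LTRs are identical at insertion and then diverge independently, so their pairwise divergence grows at twice the per-lineage rate.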

[Figure: divergence (sub/site, 0.00 to 0.12) and corresponding age (Mya, 0 to 10.81) for families a_invader2_6, b_micropia_4, c_Tabor_3, d_17.6_11, e_Stalker_4, f_rover_3, g_flea_16, h_copia_28, i_mdg3_10, j_roo_86, k_Transpac_4, l_opus_16, m_blood_22, n_412_24, o_Burdock_13, p_diver_9, q_Tirant_20]


Intra-element LTR-LTR age estimates correlate with terminal branch age estimates

[Figure: scatter plot of sqrt(LTR-LTR divergence) (y-axis) against sqrt(branch length) (x-axis)]

Spearman’s Rank Correlation Test: p = 0.006531
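For readers unfamiliar with the statistic, the sketch below computes Spearman's rank correlation coefficient from scratch as the Pearson correlation of ranks, with tied values given their average rank. The input values are made up for illustration and are not data from the talk, and the sketch does not compute the p-value reported on the slide.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-family values, for illustration only (not data from the talk).
branch_length = [0.001, 0.004, 0.010, 0.020, 0.035]
ltr_divergence = [0.000, 0.006, 0.005, 0.030, 0.050]

print(round(spearman_rho(branch_length, ltr_divergence), 3))  # 0.9
```

Because the statistic depends only on ranks, the square-root transform used for plotting does not change it.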


Recent demographic history of D. melanogaster

Li & Stephan (2006) PLoS Genet. 2:e166

[Figure labels: 15,800 ya; 60,000 ya]


Current paradigm is based on LTR families

Maside et al. (2000) Genet. Res 75:275-284



Current paradigm assumes transposition-selection equilibrium

Carr et al. (2002) Chromosoma 110:511-518


Current paradigm interprets low TE frequency as evidence for purifying selection

Aquadro et al. (1986) Genetics 114:1165-1190


Summary of retrotransposon demographics inferred from intra-genomic comparisons

• LTR elements systematically younger than non-LTR elements

=> Low frequency of LTR insertions may not be due to selection

• non-LTR families inserted in waves since speciation

• most LTR families inserted since colonization of non-African habitats

=> LTR insertions not at transposition-selection equilibrium

• LTR elements evolve under pseudogene-like mode like non-LTR elements


From evolutionary statics to dynamics: population genomics of TEs using 454 sequencing


Population genomics of TEs using 454 sequencing

Hybrid TE-unique reads: “Unique Flank Tags”

[Figure: 454 reads from Strain X compared with the Reference; TEs are marked KNOWN ✓ or NEW !]
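One way to picture the KNOWN versus NEW labelling is as a lookup of each flank-tag junction against the annotated TEs of the reference. The sketch below is a hypothetical illustration only: the slides do not describe how reads were mapped or matched, so the data structure, the same-family requirement, the 20 bp tolerance and all coordinates are assumptions.

```python
from typing import NamedTuple


class Insertion(NamedTuple):
    chrom: str
    pos: int      # junction coordinate in the reference genome
    family: str   # TE family name


def classify(tag, reference_tes, tolerance=20):
    """Return 'KNOWN' if a reference TE of the same family lies within
    `tolerance` bp of the tag's junction, otherwise 'NEW'."""
    for te in reference_tes:
        same_site = te.chrom == tag.chrom and abs(te.pos - tag.pos) <= tolerance
        if same_site and te.family == tag.family:
            return "KNOWN"
    return "NEW"


# Hypothetical coordinates, for illustration only.
reference = [Insertion("2L", 150_000, "INE-1"), Insertion("X", 42_000, "roo")]
tags = [Insertion("2L", 150_004, "INE-1"), Insertion("3R", 9_500, "jockey")]

for tag in tags:
    print(tag.family, classify(tag, reference))
# INE-1 KNOWN
# jockey NEW
```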


Population genomics of TEs using 454 sequencing

TEs in reference sequence

Known INE-1 insertion present in NC and AF strains

Novel jockey insertion present in >1 strain


Preliminary findings using 454 sequencing

• 10 strains of D. melanogaster: 6 USA, 4 Malawi

• ~1/3 of INE-1 found in nature, so estimate ~3900 annotated TEs present in >=1 wild strain

• ~24% of annotated TEs found in >=1 wild strain (1300/5400)

• ~72% found in nature in low recomb. regions (950/1300)

• DNA transposons (~30%) found in nature more often than LTR/non-LTR retrotransposons (~10%)

• consistent with all TEs fixed in low recomb. regions plus few hundred segregating in high recomb. regions
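The counts quoted in these bullets can be checked with a few lines of arithmetic. The sketch assumes one reading of the INE-1 bullet: that annotated INE-1 copies are all expected to be present in wild strains, so the ~1/3 recovered is treated as a detection rate and used to scale up the observed count. That interpretation, and the variable names, are assumptions.

```python
annotated_tes = 5400           # annotated TEs in the reference (from the slide)
found_in_wild = 1300           # annotated TEs seen in >= 1 wild strain
found_low_recomb = 950         # of those, located in low-recombination regions
ine1_detection_rate = 1 / 3    # ~1/3 of INE-1 copies recovered (assumed sensitivity)

frac_found = found_in_wild / annotated_tes
frac_low_recomb = found_low_recomb / found_in_wild
estimated_present = found_in_wild / ine1_detection_rate

print(f"found in >= 1 wild strain: {frac_found:.0%}")
print(f"of those, in low recomb. regions: {frac_low_recomb:.0%}")
print(f"estimated present after correcting for detection: {estimated_present:.0f}")
```

With the rounded counts the second ratio comes out near 73% rather than the ~72% on the slide, presumably because the slide's percentage was computed from unrounded counts.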


Summary

• Mature methods exist for analysis of TEs in genomes: http://www.bioinf.manchester.ac.uk/bergman/te-tools.html

• Structural classes of TEs have different genome dynamics

• Recent LTR insertion has implications for transposition-selection balance paradigm

• Population genomics using next generation sequencing will help resolve forces controlling TE evolution


Hadi Quesneville

Ruqiang Li

Douda Bensasson

Post-doctoral & PhD positions open

Andy Clark


ento’09

RES Annual National Meeting

RES Symposium on insect infection and immunity: Evolution, Ecology and Mechanisms

University of Sheffield • 15-17 July 2009

Specialist topics to include: • Immunity • Comparative genomics • Reproduction • Range expansion/climate change • Insect Evo-devo • General Entomology • Insect chemoreception

Speakers include: Professor Fotis Kafatos, Imperial College, London, UK; Professor Paul Schmid-Hempel, ETH Zurich, Switzerland; Professor Shelley Adamo, Dalhousie University, Canada; Professor Bruno Lemaitre, EPF Lausanne, Switzerland

SYMPOSIUM CONVENORS: Professor Stuart Reynolds, University of Bath, [email protected]; Jens Rolff, University of Sheffield, [email protected]

NATIONAL MEETING CONVENORS: Professor Roger Butlin, University of Sheffield, [email protected]; Mike Siva-Jothy, University of Sheffield, [email protected]; Klaus Reinhardt, University of Sheffield, [email protected]

Further information, registration, abstract and accommodation booking forms available on www.royensoc.co.uk

Tel: +44 (0)1727 899387 Fax: +44 (0)1727 894797 E-mail: [email protected]

PHOTO CREDITS: PAUL DEAN, STAINED CELL SHOTS; RICHARD NAYLOR, BED BUG.