Molecular Phylogeny. 2 Phylogeny is the inference of evolutionary relationships. Traditionally,...

Page 1:

Molecular Phylogeny

Page 2:

Phylogeny is the inference of evolutionary relationships. Traditionally, phylogeny relied on the comparison of morphological features between organisms. Today, molecular sequence data are mainly used for phylogenetic analyses.

One tree of life: A sketch Darwin made soon after returning from his voyage on HMS Beagle (1831–36) showed his thinking about the diversification of species from a single stock (see Figure, overleaf). This branching, extended by the concept of common descent,

Page 3:

Haeckel (1879) Pace (2001)

Page 4:

Molecular phylogeny uses trees to depict evolutionary relationships among organisms. These trees are based upon DNA and protein sequence data.

Pre-molecular analysis: the great apes (chimpanzee, gorilla and orangutan) are separate from the human.

Molecular analysis: the chimpanzee is more closely related to the human than the gorilla is.

[Figure: two trees with the tips Human, Chimpanzee, Gorilla and Orangutan, one for each analysis.]
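The two hypotheses can be written down as nested groupings, which makes the difference between them easy to check. The sketch below is only an illustration: the pre-molecular grouping follows the slide, while the full molecular topology (orangutan branching first, then gorilla) is an assumption, and the helper names `leaves`, `clades` and `is_clade` are hypothetical.

```python
# Each tuple groups the lineages that share a common ancestor.
# pre_molecular follows the slide; the exact molecular topology is assumed.
pre_molecular = ("Human", ("Chimpanzee", "Gorilla", "Orangutan"))
molecular = ((("Human", "Chimpanzee"), "Gorilla"), "Orangutan")


def leaves(tree):
    """Return the set of tip names under a (sub)tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return every group of tips that descends from one internal node."""
    if isinstance(tree, str):
        return set()
    found = {frozenset(leaves(tree))}
    for child in tree:
        found |= clades(child)
    return found


def is_clade(tree, names):
    """True if the given tips form a complete group in the tree."""
    return frozenset(names) in clades(tree)


print(is_clade(molecular, {"Human", "Chimpanzee"}))
print(is_clade(pre_molecular, {"Human", "Chimpanzee"}))
print(is_clade(pre_molecular, {"Chimpanzee", "Gorilla", "Orangutan"}))
```

Human and chimpanzee form a group only in the molecular tree, while the three great apes form a group only in the pre-molecular one.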

Page 5:

What can we learn from a phylogenetic tree?

Page 6:

1. Determine the closest relatives of an organism in which we are interested

• Was the extinct quagga more like a zebra or a horse?

Page 7:

Which species is closest to human?

[Figure: two alternative trees with the tips Human, Chimpanzee, Gorilla and Orangutan.]

Page 8:

2. Help to find the relationships between species and identify new species

Example: Metagenomics

A new field in genomics that aims to study the genomes recovered from environmental samples.

A powerful tool for accessing the rich biodiversity of native environmental samples.

Page 9:

10^6 cells/ml seawater

10^7 virus particles/ml seawater

>99% uncultivated microbes

Incredible microbial diversity in a drop of seawater

Page 10:

Metagenomics

community sample (extraction bias) → community DNA → shear → 3–4 kb shotgun library (cloning bias) → paired-end sequence (F / R) → composite contig assembly

…ACGGCTGCGTTACATCGATCATTTACGAACATCGATCATTTACGATACCATTG…

Page 11:

From: “The Sorcerer II Global Ocean Sampling Expedition: Metagenomic Characterization of Viruses within Aquatic Microbial Samples”, Williamson et al., PLoS ONE 2008

Page 12:

3. Discover the function of an unknown gene or protein

[Figure: tree with the tips RBP1_HS, RBP2_pig, RBP_RAT, ALP_HS, ALPEC_BV, ALPA1_RAT and ECBLC, three tips labelled "Hypothetical protein" and one labelled X.]

Page 13:

Relationships can be represented by a phylogenetic tree or dendrogram

[Figure: tree with the nodes A, B, C, D, E and F.]

Page 14:

Phylogenetic Tree Terminology

• Graph composed of nodes and branches

• Each branch connects two adjacent nodes

[Figure: tree with the nodes A, B, C, D, E, F and R.]
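As a rough illustration of "a graph composed of nodes and branches", a tree can be stored as a list of branches, each joining two adjacent nodes. The topology below is hypothetical (the slide's figure is not recoverable): A to D are taken as tips, E and F as internal nodes and R as the root, and the helper name `adjacency` is an assumption.

```python
# Hypothetical topology: tips A-D, internal nodes E and F, root R.
branches = [("R", "E"), ("R", "F"), ("E", "A"), ("E", "B"), ("F", "C"), ("F", "D")]


def adjacency(branch_list):
    """Map every node to the set of nodes it shares a branch with."""
    adj = {}
    for u, v in branch_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


adj = adjacency(branches)
nodes = sorted(adj)
# Tips touch exactly one branch; every other node is internal.
tip_nodes = sorted(n for n in adj if len(adj[n]) == 1)
internal_nodes = sorted(n for n in adj if len(adj[n]) > 1)

print(len(nodes), "nodes,", len(branches), "branches")
print("tips:", tip_nodes)
print("internal:", internal_nodes)
```

A tree always has one branch fewer than it has nodes, which the first printed line makes visible.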

Page 15:

Phylogenetic Tree Terminology

Rooted tree (based on a priori knowledge)

Unrooted tree

[Figure: a rooted and an unrooted tree of Human, Chimp, Gorilla and Chicken.]

Page 16:

Rooted vs. unrooted trees

[Figure: a rooted and an unrooted tree of the three taxa 1, 2 and 3.]

Page 17:

How can we build a tree with molecular data?

- Trees based on DNA sequences (rRNA)

- Trees based on protein sequences

Page 18:

Questions:

• Can DNA and proteins from the same gene produce different trees?

• Can different genes have different evolutionary histories?

• Can different regions of the same gene produce different trees?

Page 19:

Methods

Page 20:

Approach 1 - Distance methods

• Two steps:
– Compute a distance between every pair of sequences in the MSA.
– Find the tree that agrees best with the distance table.

• Algorithms:
– Neighbor joining (NJ)

Approach 2 - State methods

• Algorithms:
– Maximum parsimony (MP)
– Maximum likelihood (ML)
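The first step of a distance method can be sketched as follows. The slides do not say which distance measure is used; the p-distance (the fraction of aligned positions that differ) is assumed here as the simplest choice, and the toy alignment and the names `p_distance` and `distance_table` are hypothetical.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Fraction of aligned, gap-free columns at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_table(msa):
    """Pairwise distances for every pair of sequences in the alignment."""
    return {
        (a, b): p_distance(msa[a], msa[b])
        for a, b in combinations(sorted(msa), 2)
    }


# Toy alignment (made up for illustration); '-' marks a gap.
toy_msa = {
    "seq1": "ACGTACGTAC",
    "seq2": "ACGTACGTTC",
    "seq3": "ACGAACGTTC",
    "seq4": "TCGAAC-TTC",
}

table = distance_table(toy_msa)
for pair, d in table.items():
    print(pair, round(d, 3))
```

Real analyses usually correct such raw distances with a substitution model; that refinement is left out of this sketch.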

Page 21:

Neighbor Joining (NJ)

• Reconstructs an unrooted tree

• Calculates branch lengths based on pairwise distances

• In each stage, the two nearest nodes of the tree are chosen and defined as neighbors in our tree. This is done recursively until all of the nodes are paired together.

Page 22:

Star Structure

Assumption: Divergence of sequences is assumed to occur at a constant rate, so the distance to the root is equal for all sequences.

[Figure: star diagram joining the nodes a, b, c and d.]

Page 23:

Basic Algorithm

Distance matrix:

     a   b   c   d
a    0   8   7   5
b    8   0   3   9
c    7   3   0   8
d    5   9   8   0

[Figure: initial star diagram of the nodes a, b, c and d.]

Page 24:

Selection step

Choose the nodes with the shortest distance and fuse them. Here the smallest entry is D(b,c) = 3, so b and c are fused.

     a   b   c   d
a    0   8   7   5
b    8   0   3   9
c    7   3   0   8
d    5   9   8   0
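The selection step amounts to scanning the distance matrix for its smallest off-diagonal entry. A minimal sketch using the matrix from the slide; the function name `closest_pair` is hypothetical.

```python
# Distance matrix from the slide.
D = {
    "a": {"a": 0, "b": 8, "c": 7, "d": 5},
    "b": {"a": 8, "b": 0, "c": 3, "d": 9},
    "c": {"a": 7, "b": 3, "c": 0, "d": 8},
    "d": {"a": 5, "b": 9, "c": 8, "d": 0},
}


def closest_pair(dist):
    """Return (x, y, distance) for the pair of nodes with the smallest distance."""
    names = sorted(dist)
    candidates = [
        (dist[x][y], x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    ]
    d, x, y = min(candidates)
    return x, y, d


print(closest_pair(D))  # ('b', 'c', 3)
```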

Page 25:

Next Step

Then recalculate the distances from the remaining sequences (a and d) to the new node (e), and remove the fused nodes (c and b) from the table.

D(EA) = (D(AC) + D(AB) - D(CB))/2 = (7 + 8 - 3)/2 = 6

D(ED) = (D(DC) + D(DB) - D(CB))/2 = (8 + 9 - 3)/2 = 7

Old table:

     a   b   c   d
a    0   8   7   5
b    8   0   3   9
c    7   3   0   8
d    5   9   8   0

New table:

     a   d   e
a    0   5   6
d    5   0   7
e    6   7   0
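The update rule on this page, D(new,k) = (D(k,x) + D(k,y) - D(x,y))/2, can be applied mechanically to the whole table. The sketch below does that for the fusion of c and b into e; the function name `fuse` is hypothetical.

```python
# Distance matrix from the slide.
D = {
    "a": {"a": 0, "b": 8, "c": 7, "d": 5},
    "b": {"a": 8, "b": 0, "c": 3, "d": 9},
    "c": {"a": 7, "b": 3, "c": 0, "d": 8},
    "d": {"a": 5, "b": 9, "c": 8, "d": 0},
}


def fuse(dist, x, y, new):
    """Replace nodes x and y by a new node and recompute the distances to it."""
    others = [k for k in dist if k not in (x, y)]
    # Copy the part of the table that is not affected by the fusion.
    updated = {k: {m: dist[k][m] for m in others} for k in others}
    updated[new] = {new: 0}
    for k in others:
        d = (dist[k][x] + dist[k][y] - dist[x][y]) / 2
        updated[k][new] = d
        updated[new][k] = d
    return updated


reduced = fuse(D, "c", "b", "e")
for name, row in reduced.items():
    print(name, row)
```

The result has the rows a, d and e, with D(EA) = 6 and D(ED) = 7 as in the reduced table.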

Page 26:

Next Step

In order to get a tree, un-fuse c and b by calculating their distances to the new node (e).

     a   d   e
a    0   5   6
d    5   0   7
e    6   7   0

[Figure: c and b attached to the new node e, with a and d still joined at the centre; branch labels as on the slide: Dce, Dde.]
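The slide does not show how the distances from c and b to e are obtained. One common choice, assumed here, is the neighbor-joining branch-length estimate: for every other node k, take (D(x,y) + D(x,k) - D(y,k))/2 and average the results. The function name `branch_lengths` is hypothetical.

```python
# Original distance matrix from the slides.
D = {
    "a": {"a": 0, "b": 8, "c": 7, "d": 5},
    "b": {"a": 8, "b": 0, "c": 3, "d": 9},
    "c": {"a": 7, "b": 3, "c": 0, "d": 8},
    "d": {"a": 5, "b": 9, "c": 8, "d": 0},
}


def branch_lengths(dist, x, y):
    """Lengths of the branches from x and from y to their common new node.

    Assumed formula (not given on the slide): average, over all other nodes k,
    of (D(x,y) + D(x,k) - D(y,k)) / 2 for the x branch; the y branch is the rest.
    """
    others = [k for k in dist if k not in (x, y)]
    to_x = sum((dist[x][y] + dist[x][k] - dist[y][k]) / 2 for k in others) / len(others)
    to_y = dist[x][y] - to_x
    return to_x, to_y


d_ce, d_be = branch_lengths(D, "c", "b")
print("c to e:", d_ce)
print("b to e:", d_be)
```

With this matrix the two branches sum to D(c,b) = 3, as they must.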

Page 27:

Next…

     a   d   e
a    0   5   6
d    5   0   7
e    6   7   0

[Figure: a and d are fused into a new node f, which is joined to e; c and b stay attached to e. Branch labels as on the slide: Dce, Dde.]

Page 28:

Final

D(EF) = (D(EA) + D(ED) - D(AD))/2 = (6 + 7 - 5)/2 = 4

     f   e
f    0   4
e    4   0

[Figure: final tree with the internal nodes e and f; a and d are attached to f, c and b to e. Branch labels as on the slide: Daf, Dde, Dce, Dbf.]
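As a check on the worked example, the final tree can be rebuilt and its path lengths compared with the original matrix. In the sketch below only D(EF) comes from the slide's formula; the lengths of the four outer branches are computed with a three-point formula, which is an assumption, and `path_length` is a hypothetical helper.

```python
# Reduced table after fusing c and b into e (values from the slides).
reduced = {
    "a": {"a": 0, "d": 5, "e": 6},
    "d": {"a": 5, "d": 0, "e": 7},
    "e": {"a": 6, "d": 7, "e": 0},
}

# Last fusion: a and d become f, leaving the internal branch e-f.
d_ef = (reduced["e"]["a"] + reduced["e"]["d"] - reduced["a"]["d"]) / 2

# Assumed three-point formula for the two branches attached to f.
d_af = (reduced["a"]["d"] + reduced["a"]["e"] - reduced["d"]["e"]) / 2
d_df = reduced["a"]["d"] - d_af

# Branches attached to e; these lengths are assumed, obtained by applying
# the same three-point formula to the original 4 x 4 table.
edges = {
    ("a", "f"): d_af, ("d", "f"): d_df, ("e", "f"): d_ef,
    ("c", "e"): 1.0, ("b", "e"): 2.0,
}


def path_length(edge_lengths, start, goal):
    """Sum the branch lengths along the unique path between two nodes."""
    adj = {}
    for (u, v), w in edge_lengths.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == goal:
            return dist
        for nxt, w in adj[node].items():
            if nxt != parent:
                stack.append((nxt, node, dist + w))
    raise ValueError("nodes are not connected")


original = {("a", "b"): 8, ("a", "c"): 7, ("a", "d"): 5,
            ("b", "c"): 3, ("b", "d"): 9, ("c", "d"): 8}
for (x, y), expected in original.items():
    print(x, y, path_length(edges, x, y), "expected", expected)
```

For this small example every path length in the tree matches the corresponding entry of the original matrix.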

Page 29:

[Figure: the three stages side by side. 1: c and b are fused into e. 2: a and d are fused into f. 3: the final tree with the internal nodes e and f.]

Page 30:

IMPORTANT!!!

• Usually we don't start from a star diagram, and in order to choose the nodes to fuse we have to calculate the relative distance matrix (Mij), which represents the relative distance of each node to all other nodes.

Page 31:

EXAMPLE

Original distance matrix:

       A      B      C      D      E
B      5
C      4      7
D      7     10      7
E      6      9      6      5
F      8     11      8      9      8

Relative distance matrix (Mij):

       A      B      C      D      E
B    -13
C  -11.5  -11.5
D    -10    -10  -10.5
E    -10    -10  -10.5    -13
F  -10.5  -10.5    -11  -11.5  -11.5

The Mij table is used only to choose the closest pairs, not for calculating the distances.
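The formula behind the Mij table is not given on the slide. The sketch below assumes the usual neighbor-joining definition, Mij = dij - (ri + rj)/(N - 2), where ri is the sum of all distances from node i and N is the number of nodes; with the original distance matrix above it reproduces the minimum of -13 for the pairs A-B and D-E. The names `lower_triangle` and `relative_distances` are hypothetical.

```python
from itertools import combinations

names = ["A", "B", "C", "D", "E", "F"]
# Lower triangle of the original distance matrix from the slide.
lower_triangle = {
    "B": [5],
    "C": [4, 7],
    "D": [7, 10, 7],
    "E": [6, 9, 6, 5],
    "F": [8, 11, 8, 9, 8],
}

# Expand the triangle into a full symmetric table.
dist = {n: {n: 0} for n in names}
for row, values in lower_triangle.items():
    for col, value in zip(names, values):
        dist[row][col] = value
        dist[col][row] = value


def relative_distances(dist):
    """Assumed NJ criterion: M_ij = d_ij - (r_i + r_j) / (N - 2)."""
    n = len(dist)
    totals = {i: sum(dist[i].values()) for i in dist}
    return {
        (i, j): dist[i][j] - (totals[i] + totals[j]) / (n - 2)
        for i, j in combinations(sorted(dist), 2)
    }


M = relative_distances(dist)
best = min(M.values())
closest = sorted(pair for pair, value in M.items() if value == best)
print(closest, best)
```

Either of the tied pairs can be fused first; the distances to the new node are then computed from the original matrix, not from Mij.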

Page 32:

Advantages and disadvantages of the neighbor-joining method

Advantages
- It is fast and thus suited for large datasets
- It permits lineages with largely different branch lengths

Disadvantages
- Sequence information is reduced
- It gives only one possible tree

Page 33:

More problems with phylogenetic trees

• It is wrong to assume that branch length is proportional to speciation time (molecular clock).

• It is wrong to produce a tree based on distance values of the whole alignment.

Page 34:

Problems with phylogenetic trees

[Figure: tree of seven sequences numbered 1 to 7 (Bacillus, E.coli, Pseudomonas, Salmonella, Aeromonas, Lechevaliera, Burkholderias); scale bar 0.2.]

Page 35:

Problems with phylogenetic trees

[Figure: four trees of the same seven sequences (Bacillus, E.coli, Pseudomonas, Salmonella, Aeromonas, Lechevaliera, Burkholderias), each with a different arrangement of the tips; scale bar 0.2.]

Page 36:

Problems with phylogenetic trees

• It is wrong to assume that branch length is proportional to speciation time (molecular clock).

• It is wrong to produce a tree based on distance values of the whole alignment: using different regions of the same alignment may produce different trees.

• What to do? Use the bootstrap.

Page 37:

Bootstrapped tree

• Bootstrapping is a method for estimating generalization error based on “resampling”.

• In the context of phylogenetic trees, it consists of randomly selecting different positions from an alignment and constructing a tree based on these positions.

• As a result we get the % of times a certain node was formed.

[Figure: bootstrapped tree of the seven sequences (Pseudomonas, Burkholderias, E.coli, Salmonella, Lechevaliera, Aeromonas, Bacillus) with node support values 77, 100, 83 and 58; scale bar 0.2. A node with high support is marked "highly reliable node", one with low support "less reliable node".]
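The resampling idea can be sketched in a few lines. Columns of the alignment are drawn at random with replacement (the usual bootstrap convention, assumed here), a grouping is inferred from each resampled alignment, and the support is the percentage of replicates in which a given group appears. The tree-building step is replaced by a deliberately simple stand-in, `closest_pair_group`, which only reports the most similar pair; all function names and the toy alignment are hypothetical.

```python
import random
from itertools import combinations


def bootstrap_alignment(msa, rng):
    """Resample alignment columns with replacement, keeping the length."""
    length = len(next(iter(msa.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in msa.items()}


def closest_pair_group(msa):
    """Stand-in for a tree builder: return only the most similar pair."""
    def differences(a, b):
        return sum(x != y for x, y in zip(msa[a], msa[b]))
    best = min(combinations(sorted(msa), 2), key=lambda pair: differences(*pair))
    return {frozenset(best)}


def bootstrap_support(msa, build_groups, group, replicates=100, seed=0):
    """Percentage of bootstrap replicates in which `group` is recovered."""
    rng = random.Random(seed)
    hits = sum(
        frozenset(group) in build_groups(bootstrap_alignment(msa, rng))
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Toy alignment, made up for illustration.
toy_msa = {
    "s1": "ACGTACGTACGT",
    "s2": "ACGTACGAACGT",
    "s3": "ACTTACGAACCT",
    "s4": "TCTTAGGAACCA",
}
print(bootstrap_support(toy_msa, closest_pair_group, {"s1", "s2"}))
```

In practice `build_groups` would be a full tree-building method such as neighbor joining, returning every group of tips found in the tree.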

Page 38:

Tools for tree reconstruction

• CLUSTALX (NJ method)

• PHYLIP (PHYLogeny Inference Package): includes parsimony, distance matrix, and likelihood methods, including bootstrapping.

• PhyML (maximum likelihood method)

• More phylogeny programs

Page 39:

362

Page 40:

http://www.phylogeny.fr