Construction of Phylogenetic Trees. Walter M. Fitch and Emanuel Margoliash. Science, New Series, Volume 155, Issue 3760 (Jan. 20, 1967), 279-284. Speaker: Fang-Ling Lin. Advisor: Prof. R. C. T. Lee. National Chi-Nan University.


Construction of Phylogenetic Trees

Walter M. Fitch and Emanuel Margoliash

Science, New Series, Volume 155, Issue 3760 (Jan. 20, 1967), 279-284

Speaker: Fang-Ling Lin

Advisor: Prof. R. C. T. Lee

National Chi-Nan University


Outline

Basic terms

Constructing the phylogenetic tree

Analyzing the phylogenetic tree

Reconstruction of the ancestral cytochrome c amino acid sequences


Introduction

Biochemists have attempted to use quantitative estimates of variance between substances obtained from different species to construct phylogenetic trees.

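These methods have not been completely satisfactory because:

1. the portion of the genome examined was often very restricted;
2. the variable measured did not reflect the mutation distance with sufficient accuracy;
3. no adequate mathematical treatment was available for data from large numbers of species.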


What is cytochrome c?

Cytochrome c is a protein that takes part in the metabolism of the mitochondrion, where it carries electrons in the respiratory chain.

When it is released from the mitochondrion into the cytoplasm, it triggers programmed cell death.


Determining the Mutation Distance

The mutation distance: the minimal number of nucleotides that would need to be altered in order for the gene for one cytochrome to code for the other.

Example: the sequences ACTGAT and TCTATC, aligned with gaps:

A C T G A T -
T C T - A T C

The aligned columns differ in three places.
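The alignment on this slide can be checked with a small dynamic-programming sketch. It assumes that the slide's example counts single-nucleotide substitutions, insertions and deletions (plain edit distance); the function name `edit_distance` is hypothetical. The distances in the paper itself are derived from amino acid sequences through the genetic code, so this is only an illustration of the slide's toy example.

```python
def edit_distance(s: str, t: str) -> int:
    """Minimal number of single-character substitutions, insertions
    or deletions needed to turn s into t (assumed cost model)."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cs
                curr[j - 1] + 1,           # insert ct
                prev[j - 1] + (cs != ct),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("ACTGAT", "TCTATC"))  # 3
```

The result agrees with the three differing columns in the gapped alignment.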


Problem

Given: the mutation distances between every pair of proteins

Output: phylogenetic tree


The construction of the tree

Assume there are three proteins, A, B and C, and that their mutation distances are known.

There are two fundamental problems:

1. Which pair does one join together first?
2. What are the lengths of legs a, b, and c?

      B    C
A    24   28
B         32


Which pair does one join together first?

Simply choose the pair with the smallest mutation distance. Here that pair is A and B, with distance 24.

      B    C
A    24   28
B         32

[Tree diagram: A and B joined first, then C]


What are the lengths of legs a, b, and c?

      B    C
A    24   28
B         32

a + b = 24
a + c = 28
b + c = 32

a = 10, b = 14, c = 18

[Tree diagram: leaves A, B and C with legs a, b and c]
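The three equations above have a closed-form solution, sketched here. The function name `three_legs` is hypothetical; the formulas follow directly from adding two of the equations and subtracting the third.

```python
def three_legs(d_ab: float, d_ac: float, d_bc: float) -> tuple[float, float, float]:
    """Solve a+b=d_ab, a+c=d_ac, b+c=d_bc for the leg lengths a, b, c."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


print(three_legs(24, 28, 32))  # (10.0, 14.0, 18.0)
```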


When information from more than three proteins is utilized

When information from more than three proteins is utilized, the basic procedure is the same.

One simply joins two subsets to create a single subset, and repeats this until all proteins are members of a single subset.
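The repeated joining can be sketched as a short loop. This is a minimal sketch under two assumptions: ties are broken by taking the first smallest pair, and the distance from a newly joined subset to any other subset is the simple average of the two merged entries, which is how the five-protein example in this talk fills in its reduced tables. The function name `join_order` and the dictionary layout are hypothetical.

```python
from itertools import combinations


def join_order(labels, dist):
    """Return the joins in order as (subset, distance at which it was joined).

    dist maps frozenset({x, y}) to the distance between subsets x and y.
    A joined subset is represented as the tuple (x, y).
    """
    clusters = list(labels)
    dist = dict(dist)  # work on a copy
    joins = []
    while len(clusters) > 1:
        # Pick the pair of subsets with the smallest distance.
        x, y = min(combinations(clusters, 2), key=lambda p: dist[frozenset(p)])
        merged = (x, y)
        joins.append((merged, dist[frozenset((x, y))]))
        rest = [c for c in clusters if c not in (x, y)]
        # Assumed rule: average the two merged entries.
        for c in rest:
            dist[frozenset((merged, c))] = (
                dist[frozenset((x, c))] + dist[frozenset((y, c))]
            ) / 2
        clusters = rest + [merged]
    return joins


# Five-protein distances from the worked example in this talk.
pairs = {(1, 2): 1, (1, 3): 13, (1, 4): 17, (1, 5): 16, (2, 3): 12,
         (2, 4): 16, (2, 5): 15, (3, 4): 10, (3, 5): 8, (4, 5): 1}
dist = {frozenset(k): v for k, v in pairs.items()}

for subset, d in join_order([1, 2, 3, 4, 5], dist):
    print(subset, d)
```

With these inputs the joins come out as (1, 2), then (4, 5), then 3 with (4, 5) at 9, and finally the two remaining subsets at 14.25.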


Example: 5 proteins

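      1    2    3    4    5
1     0    1   13   17   16
2          0   12   16   15
3               0   10    8
4                    0    1
5                         0

Joining proteins 1 and 2:

       1,2    3                  4                  5
1,2    0      (13+12)/2 = 12.5   (17+16)/2 = 16.5   (16+15)/2 = 15.5
3             0                  10                 8
4                                0                  1
5                                                   0

Legs for A = 1, B = 2, C = {3, 4, 5}:

a + b = 1
a + c = (13+17+16)/3 = 15.33
b + c = (12+16+15)/3 = 14.33

a = 1, b = 0, c = 14.33

[Tree diagram: leaves 1 and 2 with legs a = 1 and b = 0; leg c = 14.33 leads to the subset 3, 4, 5]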
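The leg calculation on this slide, with C taken as all remaining proteins, can be sketched as follows. The function name `legs_against_rest` and the dictionary layout are hypothetical; the averaging over the members of C is the one written out on the slide.

```python
def legs_against_rest(a_name, b_name, rest, d):
    """Legs a, b, c when A and B are joined and C is every other protein.

    d maps frozenset({i, j}) to the mutation distance between i and j.
    """
    d_ab = d[frozenset((a_name, b_name))]
    # Average distance from A (and from B) to each member of C.
    d_ac = sum(d[frozenset((a_name, r))] for r in rest) / len(rest)
    d_bc = sum(d[frozenset((b_name, r))] for r in rest) / len(rest)
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c


pairs = {(1, 2): 1, (1, 3): 13, (1, 4): 17, (1, 5): 16, (2, 3): 12,
         (2, 4): 16, (2, 5): 15, (3, 4): 10, (3, 5): 8, (4, 5): 1}
d = {frozenset(k): v for k, v in pairs.items()}

a, b, c = legs_against_rest(1, 2, [3, 4, 5], d)
print(round(a, 2), round(b, 2), round(c, 2))
```

Rounded to two decimals, the legs match the slide's a = 1, b = 0, c = 14.33.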


Example: 5 proteins

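       1,2    3       4,5
1,2    0      12.5    (16.5+15.5)/2 = 16
3             0       (10+8)/2 = 9
4,5                   0

Legs for A = 4, B = 5, C = {1, 2, 3}:

a + b = 1
a + c = (16.5+10)/2 = 13.25
b + c = (15.5+8)/2 = 11.75

a = 1.25, b = -0.25, c = 12

[Tree diagram: leaves 1 and 2 with legs 1 and 0; leaves 4 and 5 with legs a = 1.25 and b = -0.25; leg c = 12 leads to the subset 1, 2, 3]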


Example: 5 proteins

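         1,2    3,4,5
1,2      0      (12.5+16)/2 = 14.25
3,4,5           0

Legs for A = 3, B = {4, 5}, C = {1, 2}:

a + b = 9
a + c = 12.5
b + c = 16

a = 2.75, b = 6.25, c = 9.75

[Tree diagram: leaves 1 to 5; legs 1 and 0 for proteins 1 and 2, legs 1.25 and -0.25 for proteins 4 and 5, leg a = 2.75 for protein 3, b = 6.25 toward the subset 4, 5 and c = 9.75 toward the subset 1, 2]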


Example: 5 proteins

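         1,2    3,4,5
1,2      0      14.25
3,4,5           0

The legs b = 6.25 and c = 9.75 are average distances down to the leaves of a subset, so the interior legs x and y are what remains after the legs already assigned to those leaves are taken out:

((x + 1.25) + (x - 0.25))/2 = 6.25, so x = 5.75

((y + 1) + (y + 0))/2 = 9.75, so y = 9.25

[Tree diagram: leaves 1 to 5 with legs 1, 0, 2.75, 1.25 and -0.25; interior legs x = 5.75 and y = 9.25]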
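The two equations for x and y have the same shape, so one helper covers both. This is a minimal sketch; the name `interior_leg` is hypothetical, and it assumes the caller passes the path length from the subset's joining node down to each of its leaves.

```python
def interior_leg(avg_to_leaves: float, leaf_paths: list[float]) -> float:
    """Interior leg length, given the average distance to the subset's leaves
    and the path length from the subset's joining node to each leaf."""
    return avg_to_leaves - sum(leaf_paths) / len(leaf_paths)


x = interior_leg(6.25, [1.25, -0.25])  # leg toward the subset 4, 5
y = interior_leg(9.75, [1.0, 0.0])     # leg toward the subset 1, 2
print(x, y)  # 5.75 9.25
```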


Testing Alternative Trees

The method is deterministic: the output is determined by the input, so the same input gives the same tree every time.

Because a particular assignment of species to subsets A and B defines a tree, different assignments of species to A and B produce different trees, which can then be compared.

Fig. 1 is the best of the 40 phylogenetic trees tested.


Phylogenetic Tree of 20 species


Fig.1


Reconstructed distances

Values in the upper right half of the table are reconstructed distances found by summing the leg lengths in Fig.1.

[Table annotations: row index i, column index j, original input, reconstructed value]
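Summing leg lengths between two leaves can be sketched on the small five-protein tree from the worked example, since the 20-species legs of Fig. 1 are not reproduced in this transcript. The node names P, Q and R are hypothetical labels for the three joining nodes, and R is used as a root only for bookkeeping.

```python
# Each node maps to (node above it, length of the connecting leg).
# P joins proteins 1 and 2, Q joins 4 and 5, R joins 3, P and Q.
parent = {
    "1": ("P", 1.0), "2": ("P", 0.0),
    "4": ("Q", 1.25), "5": ("Q", -0.25),
    "3": ("R", 2.75), "P": ("R", 9.25), "Q": ("R", 5.75),
}


def path_up(node):
    """Map every node on the way up to R to its distance from the start node."""
    seen = {node: 0.0}
    total = 0.0
    while node in parent:
        node, leg = parent[node]
        total += leg
        seen[node] = total
    return seen


def reconstructed(i, j):
    """Sum of leg lengths on the path between leaves i and j."""
    up_i, up_j = path_up(i), path_up(j)
    # Dicts keep insertion order, so the first shared node is the meeting point.
    for node, dist_i in up_i.items():
        if node in up_j:
            return dist_i + up_j[node]
    raise ValueError("leaves are not connected")


print(reconstructed("1", "4"))  # 17.25 (input distance: 17)
print(reconstructed("3", "5"))  # 8.25  (input distance: 8)
```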


Standard deviation

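For each pair of species i and j, take the percentage of change of the reconstructed distance from the input data.

Percent “standard deviation”: computed from the squared percentage changes, summed over all values of i < j.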
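The formula on this slide did not survive in the transcript, so the following is a sketch under stated assumptions rather than the authors' exact definition. It assumes the squared percentage changes are summed over all pairs i < j, divided by the number of pairs minus one, and the square root is taken. The function name and the toy numbers are hypothetical.

```python
import math


def percent_standard_deviation(inputs, reconstructed):
    """inputs and reconstructed map each pair (i, j), i < j, to a distance."""
    pct = [100.0 * (reconstructed[k] - inputs[k]) / inputs[k] for k in inputs]
    # Assumption: divide by the number of distances summed, less one.
    return math.sqrt(sum(p * p for p in pct) / (len(pct) - 1))


# Toy values, not data from the paper.
inputs = {(1, 2): 10, (1, 3): 20, (2, 3): 40}
rebuilt = {(1, 2): 11, (1, 3): 18, (2, 3): 40}
print(percent_standard_deviation(inputs, rebuilt))
```

In the toy case the percentage changes are +10, -10 and 0, so the result is 10 under the assumed normalisation.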


The statistically optimal tree

In testing phylogenetic alternatives, one is seeking to minimize the percent “standard deviation.”

Fig.1 has a percent “standard deviation” of 8.7, the lowest of the 40 alternatives so far tested.

The percent “standard deviation” for the initial tree was 12.3.


The statistically optimal tree


The tree in Fig. 1 is remarkably like the one constructed in accord with classical zoological comparisons.

Almost all the alternative phylogenetic schemes tested involved rearrangements within the groups birds (turkey, chicken) and nonprimate mammals (cow, sheep, pig).


Three noticeable deviations

Birds of flight (Neognathae) and the penguin (Impennae)

The kangaroo vs. the nonprimate mammals, and placental mammals vs. marsupials

The turtle appears more closely associated with the birds than with its fellow reptile, the rattlesnake.

Fig.1


Indeed, from any phylogenetic ancestor, today’s descendants are equidistant with respect to time but not equidistant genetically.

The method indicates those lines in which the gene has undergone the more rapid changes.

For example, the mutation distance between mammals and primates is 7.5, whereas that between mammals and non-primates is 5.8. The change in the cytochrome c gene has therefore been much more rapid in the descent of the primates than in that of the other mammals (Fig. 1).


Reconstruction of the ancestral cytochrome c amino acid sequences.

The procedure is dependent upon the phylogenetic tree on which these sequence data are arranged.


Amino acid No.: 17, 18, 21, 39, 41, 50, 52, 53, 56, 64, 66, 68, 89, 94, 95, 98, 109

Rows: Ancestral Mammal, Ancestral Primate, Monkey, Man, Kangaroo, Rabbit, Dog, Ancestral Ungulate, Pig, Ancestral Perissodactyl, Donkey, Horse

[Table: the residue at each listed position for each ancestral and present-day sequence]


There is presently no detectable relationship between the primary structures of cytochrome c and those of hemoglobins. The reconstruction and comparison of the ancestral amino acid sequences may reveal a homology that cannot be detected in present-day proteins.

The employment of such ancestral sequences may be generally useful for detecting common ancestry not otherwise observable.


Thank you !