Introduction to Phylogenies Dr Laura Emery Laura.Emery@ebi.ac.uk


Page 1: Introduction to Phylogenies Dr Laura Emery Laura.Emery@ebi.ac.uk .

Introduction to Phylogenies

Dr Laura Emery

Laura.Emery@ebi.ac.uk

www.ebi.ac.uk/training

Page 2:

Objectives

After this tutorial you should be able to…

• Use essential phylogenetic terminology effectively

• Discuss aspects of phylogenies and their implications for phylogenetic interpretation

• Apply phylogenetic principles to interpret simple trees

Page 3:

Outline

• Applications of phylogenetics

• What is a phylogeny or tree?

• Aspects of a tree

• Phylogenetic Interpretation

Page 4:

What can I do with phylogenetics?

• Deduce relationships among species or genes

• Deduce the origin of pathogens

• Identify biological processes that affect how your sequence has evolved e.g. identify genes or residues undergoing positive selection

• Explore the evolution of traits through history

• Estimate the timing of major historical events

• Explore the impact of geography on species diversification

Page 5:

What is a phylogenetic tree?

A tree is an explanation of how sequences evolved, their genealogical relationships and thus how they came to be the way they are today (or at the time of sampling).

Darwin 1837

Page 6:

Phylogenies explain genealogical relationships

• Family tree

Page 7:

Aspects of a tree

1. Topology (branching order)

2. Branch lengths (indication of genetic change)

3. Nodes

i. Tips (sampled sequences known as taxa)

ii. Internal nodes (hypothetical ancestors)

iii. Root (oldest point on the tree)

4. Confidence (bootstraps/probabilities)

Page 8:

1. Topology

The topology describes the branching structure of the tree, which indicates patterns of relatedness.

Figure: three drawings of a tree with tips A, B and C. These trees display the same topology.

Figure: three further drawings of trees with tips A, B and C. These trees display different topologies.
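One way to make "same topology" concrete is to compare which tips group together, ignoring the left-to-right order in which they are drawn. The Python sketch below assumes rooted trees written as nested tuples; the example trees and the function names `clades` and `same_topology` are illustrative and are not taken from the slides.

```python
def clades(tree):
    """Collect every clade of a rooted tree as a frozenset of tip names.

    A tree is a nested tuple of tip names, e.g. (("A", "B"), "C").
    """
    found = set()

    def visit(node):
        if isinstance(node, str):          # a tip
            tips = frozenset([node])
        else:                              # an internal node: merge its children
            tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return found


def same_topology(tree1, tree2):
    """Two rooted trees share a topology if they contain the same clades."""
    return clades(tree1) == clades(tree2)


# Hypothetical examples: rotating branches around a node changes nothing...
print(same_topology((("A", "B"), "C"), ("C", ("B", "A"))))   # True
# ...but regrouping the tips does.
print(same_topology((("A", "B"), "C"), (("A", "C"), "B")))   # False
```

Comparing clade sets rather than the drawings themselves is what makes the check insensitive to how a tree happens to be laid out on the page.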

Page 9:

Topology Question

Are these topologies the same?

Answer = yes

Page 10:

Topology Question II

Which of these trees has a different topology from the others?

Figure: five trees, each with tips A, B, C, D, E and F.

Page 11:

2. Branch lengths indicate genetic change

• Longer branches indicate greater change

• Change is typically represented in units of number of substitutions per site (but check the legend)

Figure: example tree with branch lengths 1.2, 0.6, 0.8, 0.5, 0.5 and 0.5.
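Because branch lengths are additive, the genetic change separating two tips can be read off by summing the branches on the path between them. A minimal sketch, assuming a small hypothetical tree (the node names and lengths below are invented for illustration, in substitutions per site, and do not reproduce the figure):

```python
# Hypothetical rooted tree: child -> (parent, branch length in substitutions/site)
tree = {
    "A": ("n1", 0.5),
    "B": ("n1", 0.5),
    "n1": ("root", 0.6),
    "C": ("root", 1.2),
}


def distances_to_ancestors(tree, node):
    """Map each ancestor of `node` (and the node itself) to its distance from `node`."""
    distances = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def path_length(tree, tip1, tip2):
    """Sum the branch lengths on the path between two tips."""
    d1 = distances_to_ancestors(tree, tip1)
    d2 = distances_to_ancestors(tree, tip2)
    # The path turns around at the most recent shared ancestor,
    # which is the shared node giving the smallest total.
    return min(d1[n] + d2[n] for n in d1 if n in d2)


print(round(path_length(tree, "A", "B"), 2))  # 1.0
print(round(path_length(tree, "A", "C"), 2))  # 2.3
```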

Page 12:

A scale bar can represent branch lengths

Figure: a tree drawn with a 0.5 scale bar, and a tree drawn with labelled branch lengths (1.2, 0.6, 0.8, 0.5, 0.5 and 0.5).

These are alternative representations of the same phylogeny

Page 13:

Branch Length Question

Which of these statements are true?

1. For both gene trees, the Fish is the most genetically different of the four species compared

2. For both gene trees, more substitutions have occurred since the divergence of Dog and Snake than since the divergence of Cat and Snake

3. Gene B has accumulated more substitutions than Gene A on the Snake lineage

4. Gene B has accumulated more substitutions than Gene A on the Fish lineage

Figure: two gene trees, Gene A and Gene B, each with tips Fish, Snake, Dog and Cat, drawn with a 0.5 scale bar.

Page 14:

Alternative representations of phylogenies

All of these representations depict the same topology

Branch lengths are indicated in blue

Red lengths are meaningless

Page 15:

Not all trees include branch length data

Figure: a cladogram and a phylogram.

Page 16:

Distance and substitution rate are confounded

• Branch lengths indicate the genetic change that has occurred

• We often don’t know if long branch lengths reflect:

  • A rapid evolutionary rate

  • An ancient divergence time

  • A combination of both

• Genetic change (substitutions/site) = Evolutionary rate (substitutions/site/year) × Divergence time (years)

Figure: tree with tips A, B, C, D and E.
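The relationship above can be checked with a few lines of Python. The rates and times below are invented purely for illustration; the point is only that very different combinations yield the same branch length, so the branch length alone cannot tell them apart.

```python
import math


def genetic_change(rate, time_years):
    """Genetic change (substitutions/site) = rate (substitutions/site/year) x time (years)."""
    return rate * time_years


# Hypothetical lineages: one evolving quickly for a short time,
# one evolving slowly for a long time.
fast_and_recent = genetic_change(rate=1e-3, time_years=500)
slow_and_ancient = genetic_change(rate=1e-6, time_years=500_000)

# Both come out at about 0.5 substitutions/site.
print(math.isclose(fast_and_recent, slow_and_ancient))  # True
```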

Page 17:

Alternative Representations Question

Page 18:

3. Nodes

• Nodes occur at the ends of branches

• There are three types of nodes:

i. Tips (sampled sequences known as taxa)

ii. Internal nodes (hypothetical ancestors)

iii. Root (oldest point on the tree)

Figure: tree with tips A, B, C, D and E.

Figures Andrew Rambaut
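The three node types can be told apart from the tree structure alone: tips have no descendants, the root has no ancestor, and everything else is an internal node. A short sketch, assuming a hypothetical rooted tree stored as a parent-to-children dictionary (the labels `n1`, `n2` and `n3` are placeholders for unnamed hypothetical ancestors):

```python
# Hypothetical rooted tree: each non-tip node maps to its children.
children = {
    "root": ["n1", "n2"],
    "n1": ["A", "B"],
    "n2": ["C", "n3"],
    "n3": ["D", "E"],
}


def classify_nodes(children):
    """Split the nodes of a rooted tree into root, internal nodes and tips."""
    descendants = {child for kids in children.values() for child in kids}
    root = next(node for node in children if node not in descendants)
    internal = sorted(node for node in children if node in descendants)
    tips = sorted(descendants - set(children))
    return root, internal, tips


print(classify_nodes(children))
# ('root', ['n1', 'n2', 'n3'], ['A', 'B', 'C', 'D', 'E'])
```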

Page 19:

The root is the oldest point on the tree

• The root indicates the direction of evolution

• It is also the (hypothesised) most recent common ancestor (MRCA) of all of the samples in the tree

Figure: rooted tree with tips A, B, C, D and E and a time axis labelled from past to present.

Figures Andrew Rambaut
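The MRCA of a set of tips can be found by walking from each tip back towards the root and taking the first node that all of the lineages share. The sketch below assumes a hypothetical rooted tree stored as a child-to-parent dictionary; the node labels are invented for illustration.

```python
# Hypothetical rooted tree: child -> parent (the root has no entry).
parent = {
    "A": "n1", "B": "n1", "n1": "root",
    "C": "n2", "n3": "n2", "n2": "root",
    "D": "n3", "E": "n3",
}


def lineage(parent, node):
    """Return the nodes from `node` back to the root, `node` included."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def mrca(parent, tips):
    """Return the most recent common ancestor of the given tips."""
    lineages = [lineage(parent, tip) for tip in tips]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Walking from a tip towards the root, the first shared node is the most recent.
    return next(node for node in lineages[0] if node in shared)


print(mrca(parent, ["D", "E"]))                 # n3
print(mrca(parent, ["C", "E"]))                 # n2
print(mrca(parent, ["A", "B", "C", "D", "E"]))  # root
```

As the last call shows, the MRCA of every sample in a rooted tree is the root itself.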

Page 20:

Trees can be drawn in an unrooted form

Figure: a tree with tips A, B, C, D and E drawn in rooted form and in unrooted form.

These are alternative representations of the same topology

Page 21:

There are multiple rooted tree topologies for any given unrooted tree

• Most tree-building methods produce unrooted trees

• Identifying the correct root is often critical for interpretation!


Figure Aiden Budd
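Every branch of an unrooted tree is a possible root position, so an unrooted bifurcating tree with n tips (and therefore 2n - 3 branches) can be rooted in 2n - 3 different ways. The sketch below enumerates them for a hypothetical four-tip tree; the edge list and the internal labels `x` and `y` are assumptions made for illustration.

```python
# Hypothetical unrooted tree: A and B join at x, C and D join at y.
edges = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]


def build_adjacency(edges):
    """Map every node to the list of its neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def subtree(adjacency, node, came_from):
    """Nested-tuple subtree below `node` when it is entered from `came_from`."""
    kids = [n for n in adjacency[node] if n != came_from]
    if not kids:
        return node                      # a tip
    return tuple(subtree(adjacency, kid, node) for kid in kids)


def clade_set(tree):
    """All clades of a rooted nested-tuple tree, as a hashable set of tip sets."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            tips = frozenset([node])
        else:
            tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return frozenset(found)


adjacency = build_adjacency(edges)
# Place the root on each branch in turn.
rooted_trees = [(subtree(adjacency, u, v), subtree(adjacency, v, u)) for u, v in edges]
distinct = {clade_set(tree) for tree in rooted_trees}

print(len(edges), len(distinct))  # 5 5
```

With four tips there are 2 × 4 - 3 = 5 branches, and each root position gives a different rooted topology.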

Page 22:

How to root a tree

• Midpoint rooting

  • Assumes a constant evolutionary rate

  • This is often not the case!

• Outgroup rooting

  • The outgroup is one or more taxa that are known to have diverged prior to the group being studied

  • The node where the outgroup lineage joins the other taxa is the root

Figure: a tree shown unrooted, midpoint rooted and outgroup rooted, with a "Recommended" label.
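Outgroup rooting can be sketched in a few lines: find where the outgroup attaches to the unrooted tree and read the rest of the tree outwards from that point. The example assumes a single outgroup tip `O` and a hypothetical unrooted tree given as an edge list; the helper names are illustrative only.

```python
# Hypothetical unrooted tree with ingroup tips A, B, C and outgroup tip O.
edges = [("O", "x"), ("C", "x"), ("x", "y"), ("A", "y"), ("B", "y")]


def build_adjacency(edges):
    """Map every node to the list of its neighbours."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def subtree(adjacency, node, came_from):
    """Nested-tuple subtree below `node` when it is entered from `came_from`."""
    kids = [n for n in adjacency[node] if n != came_from]
    if not kids:
        return node                      # a tip
    return tuple(subtree(adjacency, kid, node) for kid in kids)


def root_with_outgroup(adjacency, outgroup):
    """Root the tree on the branch leading to a single outgroup tip."""
    neighbours = adjacency[outgroup]
    if len(neighbours) != 1:
        raise ValueError("the outgroup must be a tip")
    attachment = neighbours[0]           # where the outgroup joins the other taxa
    return (outgroup, subtree(adjacency, attachment, outgroup))


print(root_with_outgroup(build_adjacency(edges), "O"))
# ('O', ('C', ('A', 'B')))
```

In the rooted result the ingroup's own earliest split is at the node where the outgroup lineage attaches, so C is read as having diverged before A and B separated.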

Page 23:

Root Question

This tree shows a cladogram i.e. the branch lengths do not indicate genetic change.

Indicate any root positions where bird and crocodile are not sister taxa (each other's closest relatives).

Page 24:

4. Confidence

How good is a tree?

A tree is a collection of hypotheses, so we assess our confidence in each of its parts or branches independently

There are three main approaches:

• Bootstraps

• Bayesian methods

• Approximate likelihood ratio test (aLRT) methods

Figure: example trees annotated with bootstrap values (85, 63, 100) and with probabilistic support values (0.93, 0.81, 0.99).
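A bootstrap value for a grouping is the percentage of replicate trees in which that grouping appears. In a real analysis each replicate tree is inferred from an alignment whose columns have been resampled with replacement; the sketch below skips that step and simply assumes four hypothetical replicate trees, written as nested tuples, to show how the percentage is counted.

```python
def clades(tree):
    """All clades of a rooted nested-tuple tree, each as a frozenset of tip names."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            tips = frozenset([node])
        else:
            tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return found


def bootstrap_support(group, replicate_trees):
    """Percentage of replicate trees in which `group` forms a clade."""
    target = frozenset(group)
    hits = sum(target in clades(tree) for tree in replicate_trees)
    return 100 * hits / len(replicate_trees)


# Hypothetical replicate trees (a real analysis would use many more).
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]

print(bootstrap_support({"A", "B"}, replicates))  # 75.0
print(bootstrap_support({"C", "D"}, replicates))  # 50.0
```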

Page 26:

Confidence Question

Which of the bootstrap values indicates our confidence in the grouping of A, B, C, and D together as a monophyletic group? Do you think we can be confident in this grouping?

Figure: tree with tips A, B, C, D, E and F and bootstrap values 84, 63, 91 and 100.

Page 27:

Review

1. Topology (branching order)

2. Branch lengths (indication of genetic change)

3. Nodes

i. Tips (sampled sequences known as taxa)

ii. Internal nodes (hypothetical ancestors)

iii. Root (oldest point on the tree)

4. Confidence (bootstraps/probabilities)

Page 28:

Simple phylogenetic interpretation question

• Which is true?

• A) Mouse is more closely related to fish than frog is to fish

• B) Lizard is more closely related to fish than mouse is to fish

• C) Human and frog are equally related to fish
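Questions of this kind turn on how recently two taxa last shared a common ancestor: the taxon whose MRCA with the reference is more recent is the closer relative, and two taxa that share the same MRCA with the reference are equally related to it. The sketch below uses a hypothetical tree with generic tips P, Q, R and S, not the tree on the slide, and the function names are illustrative.

```python
def clades(tree):
    """All clades of a rooted nested-tuple tree, each as a frozenset of tip names."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            tips = frozenset([node])
        else:
            tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return found


def smallest_clade(tree, a, b):
    """Tips descended from the MRCA of a and b (the smallest clade holding both)."""
    return min((c for c in clades(tree) if a in c and b in c), key=len)


def compare_relatedness(tree, x, y, reference):
    """Report whether x or y shares the more recent ancestor with `reference`."""
    cx = smallest_clade(tree, x, reference)
    cy = smallest_clade(tree, y, reference)
    if cx == cy:
        return "equally related"
    # A strictly smaller clade means a more recent common ancestor.
    return f"{x} is closer" if cx < cy else f"{y} is closer"


# Hypothetical tree: P diverges first, then Q, leaving R and S as sister taxa.
tree = ("P", ("Q", ("R", "S")))

print(compare_relatedness(tree, "R", "Q", "P"))  # equally related
print(compare_relatedness(tree, "R", "Q", "S"))  # R is closer
```

Note that the comparison never counts how many nodes lie between two tips; only the position of the shared ancestor matters.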

Page 29:

Now it is your turn…

• Open your tutorial manual and begin Tree-thinking quiz 1 (appendix 1)

• The manual is available to download from:

http://www.ebi.ac.uk/training/course/scuola-di-bioinformatica-2013

• When you are finished you can mark your own (the answers are at the end of the quiz).

• Remember to ask for help at any stage!