Terminology of phylogenetic trees Types of phylogenetic trees Types of Data Character Evolution Approaches to Phylogeny Reconstruction

Terminology of phylogenetic trees
Types of phylogenetic trees
Types of Data
Character Evolution
Approaches to Phylogeny Reconstruction

Phylogenetic tree (dendrogram)

Nodes: branching points
Branches: lines
Topology: branching pattern

Sister Taxa: two taxa that are more closely related to each other than either is to a third taxon.

A + B

C + D

Branches can be rotated at a node without changing the relationships among the OTUs (operational taxonomic units).

Levels of Resolution on a Phylogenetic Tree

Hard polytomy: simultaneous divergence.
Soft polytomy: lack of resolution.

Rooted: a unique path leads from the root to each OTU.
Unrooted: shows degree of kinship, but no evolutionary path.

Number of possible phylogenetic trees

3 OTUs: 1 unrooted tree, 3 rooted trees

4 OTUs: 3 unrooted trees, 15 rooted trees
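
These counts follow a general pattern. As a minimal sketch (the closed-form counts for strictly bifurcating trees are a standard result that the slides do not state, and the function names are illustrative), the number of trees for n OTUs could be computed like this:

```python
def odd_double_factorial(k: int) -> int:
    """Return k * (k - 2) * ... * 3 * 1 for odd k (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted(n_otus: int) -> int:
    """Number of unrooted bifurcating trees: (2n - 5)!! for n >= 3."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs")
    return odd_double_factorial(2 * n_otus - 5)


def count_rooted(n_otus: int) -> int:
    """Number of rooted bifurcating trees: (2n - 3)!! for n >= 2."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs")
    return odd_double_factorial(2 * n_otus - 3)


for n in (3, 4, 5, 10):
    print(n, count_unrooted(n), count_rooted(n))
```

The rapid growth of these numbers is the reason exhaustive tree searches become impractical beyond a handful of OTUs.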

TYPES OF TREES

Newick (shorthand) format

- text-based representation of relationships.
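
As an illustration, the string "((A,B),(C,D));" describes a tree in which A + B and C + D are sister pairs. The sketch below is a minimal reader and writer for topology-only Newick strings (no branch lengths, no internal labels, no quoted names); the nested-tuple representation and the function names are assumptions made for this example, not part of any library.

```python
def parse_newick(text):
    """Parse a topology-only Newick string into nested tuples of OTU names."""
    s = "".join(text.split()).rstrip(";")  # drop whitespace and the final ';'
    pos = 0

    def parse_node():
        nonlocal pos
        if s[pos:pos + 1] == "(":
            pos += 1  # consume '('
            children = [parse_node()]
            while s[pos:pos + 1] == ",":
                pos += 1  # consume ','
                children.append(parse_node())
            if s[pos:pos + 1] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1  # consume ')'
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        if pos == start:
            raise ValueError("expected an OTU name at position %d" % pos)
        return s[start:pos]

    tree = parse_node()
    if pos != len(s):
        raise ValueError("unexpected text after position %d" % pos)
    return tree


def to_newick(tree):
    """Write nested tuples back out as a Newick string."""
    def fmt(node):
        if isinstance(node, tuple):
            return "(" + ",".join(fmt(child) for child in node) + ")"
        return node
    return fmt(tree) + ";"


tree = parse_newick("((A,B),(C,D));")
print(tree)
print(to_newick(tree))
```

A node with more than two children, such as "(A,B,C);", is read as a polytomy.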

Qualitative vs. quantitative data

Quantitative: continuous data (e.g. height or length)

Qualitative: discrete data (2 or more values)
Binary: 2 values
Multistate: more than 2 values

Most molecular data are qualitative
Binary: presence or absence of a band, or of a gap in a sequence
Multistate: nucleotide data (A, T, G, C)

Nucleotide character data

Characters: positions in the nucleotide sequence (e.g. position 352).

Character states: the nucleotide at that position in the nucleotide sequence (G, A, T, or C).

Unordered: a change from one character state to any other occurs in one step (e.g. nucleotide changes).

Ordered: the number of steps from one state to another equals the absolute value of the difference between their state numbers.

1 → 2 → 3 → 4 → 5 requires 4 steps
5 → 4 → 3 → 2 → 1 requires 4 steps

(reversible vs. irreversible)

Assumptions About Character Evolution

Phylogenetic reconstruction methods make assumptions about:

(1) the number of discrete steps required for one character state to change into another

(2) the probability with which such a change occurs.

Step matrix

- number of steps required between character states.
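
A small sketch of step matrices for unordered and ordered characters, following the definitions given earlier. The dictionary-of-dictionaries layout and the function names are illustrative choices, not a standard format.

```python
def unordered_step_matrix(states):
    """Every change between two different states costs one step."""
    return {a: {b: 0 if a == b else 1 for b in states} for a in states}


def ordered_step_matrix(states):
    """Cost is the absolute difference between the positions of the two states."""
    index = {state: i for i, state in enumerate(states)}
    return {a: {b: abs(index[a] - index[b]) for b in states} for a in states}


def show(matrix):
    """Print a step matrix as a small table."""
    states = list(matrix)
    print("   " + "  ".join(str(s) for s in states))
    for a in states:
        print(str(a) + "  " + "  ".join(str(matrix[a][b]) for b in states))


show(unordered_step_matrix("ACGT"))
show(ordered_step_matrix([1, 2, 3, 4, 5]))
```

Both matrices here are symmetric, so the characters are treated as reversible; an irreversible character would need different costs above and below the diagonal.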

Approaches to Phylogeny Reconstruction

Cladistics (parsimony): recency of common ancestry
Maximum Likelihood: model of sequence evolution
Phenetics (UPGMA, neighbor joining): overall similarity

Parsimony: General scientific criterion for choosing among competing hypotheses that states that we should accept the hypothesis that explains the data most simply and efficiently.

Maximum parsimony method of phylogeny reconstruction: The optimum reconstruction of ancestral character states is the one which requires the fewest mutations in the phylogenetic tree to account for contemporary character states.

PARSIMONY APPROACH

First step in maximum parsimony analysis: Identify all of the informative sites.

Invariant: all OTUs possess the same character state at the site.

Any invariant site is uninformative.

Two types of variable sites:

Informative: favors a subset of trees over other possible trees.
Uninformative: a character that contains no grouping information relevant to a cladistic problem (e.g. autapomorphies).

Uninformative: each tree requires 3 steps.
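
The site categories above can be sketched in code. The rule used here (a site is informative when at least two character states each occur in at least two OTUs) is the usual criterion under unweighted parsimony; it is an assumption of this example, as the slides do not spell it out. The sequences and function names are hypothetical.

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column as invariant, informative or uninformative."""
    counts = Counter(column)
    if len(counts) == 1:
        return "invariant"
    # States present in two or more OTUs can group taxa together.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"


def classify_alignment(seqs):
    """Classify every column of an alignment given as {OTU name: sequence}."""
    return [classify_site(column) for column in zip(*seqs.values())]


# Hypothetical four-OTU alignment, one column per character.
alignment = {
    "OTU1": "AAAA",
    "OTU2": "AGAC",
    "OTU3": "AGGG",
    "OTU4": "AGGT",
}
for position, kind in enumerate(classify_alignment(alignment), start=1):
    print(position, kind)
```

In this toy alignment only the third site is informative: the second site has a state unique to one OTU, and the fourth has four states that each occur once.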

Second step in parsimony analysis: Calculate the minimum number of substitutions at each informative site.

Informative: favors tree 1 over the other 2 trees.

Tree 1: 1 step; tree 2: 2 steps; tree 3: 2 steps

Final step in parsimony analysis: Sum the number of changes over all informative sites for each possible tree and choose the tree associated with the smallest number of changes.

[Figure: Site 3, Site 4, Site 5, Site 9; step counts shown: 3 steps, 3 steps, 4 steps]
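
The second and final steps can be sketched together. This example counts the minimum number of changes per site with Fitch's algorithm (a standard method for unordered characters that the slides do not name), sums the counts over sites, and picks the shortest of the three possible trees for four OTUs. The sequences are hypothetical, every site in them is informative, and each tree is rooted arbitrarily because the step count for unordered characters does not depend on where the root is placed.

```python
def fitch_steps(tree, states):
    """Minimum number of changes at one site on a rooted binary tree."""
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):  # leaf: the observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1  # the two subtrees disagree: one change is needed
        return left | right

    state_set(tree)
    return steps


def tree_length(tree, seqs):
    """Sum the per-site step counts over all sites."""
    n_sites = len(next(iter(seqs.values())))
    return sum(
        fitch_steps(tree, {name: seq[i] for name, seq in seqs.items()})
        for i in range(n_sites)
    )


# Hypothetical data: sites 1 and 2 group A with B, site 3 groups A with C.
seqs = {"A": "ACG", "B": "ACT", "C": "GTG", "D": "GTT"}
trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
lengths = [tree_length(tree, seqs) for tree in trees]
best = trees[lengths.index(min(lengths))]
print(lengths, best)
```

With more OTUs the list of candidate trees grows too quickly to enumerate by hand, which is where the search strategies become relevant.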

Parsimony Search Methods:

Exhaustive search method: searches all possible fully resolved topologies and guarantees that all of the minimum-length cladograms will be found (not a practical option; time consuming).

Branch and bound methods: begin with a cladogram. The length of the starting cladogram is retained as an upper bound for use during subsequent cladogram construction. As soon as the length of part of a tree exceeds the upper bound, that cladogram is abandoned. If a completed cladogram has equal length, it is saved as an optimal topology. If its length is less, it replaces the original as the optimal topology and its length becomes the new upper bound (good option for fewer than 20 taxa; time consuming).

Heuristic methods: approximate or “hill climbing” techniques. Begin with a cladogram, add taxa and swap branches until a shorter cladogram is found. The procedure can be replicated many times to increase the chance of finding the minimum-length cladogram.

Different types of parsimony analyses:

Unweighted parsimony: all character state changes are given equal weight in the step matrix.

Weighted parsimony: different weights assigned to different character state changes.

Transversion parsimony: transitions are completely ignored in the analysis, only transversions are considered.
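
The three schemes differ only in the cost given to each kind of change. A minimal sketch, assuming nucleotide characters; the weight values and function names are hypothetical and chosen only for illustration:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for purine <-> purine or pyrimidine <-> pyrimidine changes."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def step_cost(a, b, transition_weight=1, transversion_weight=1):
    """Cost of changing nucleotide a into nucleotide b."""
    if a == b:
        return 0
    return transition_weight if is_transition(a, b) else transversion_weight


# Unweighted parsimony: every change costs the same.
print(step_cost("A", "G"), step_cost("A", "C"))
# Weighted parsimony: here transversions cost twice as much (hypothetical weights).
print(step_cost("A", "G", 1, 2), step_cost("A", "C", 1, 2))
# Transversion parsimony: transitions are ignored by giving them zero weight.
print(step_cost("A", "G", 0, 1), step_cost("A", "C", 0, 1))
```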

Maximum Likelihood Method:

The likelihood (L) of a phylogenetic tree is the probability of observing the data (nucleotide sequences) under a given tree and a specified model of character state changes.

The aim is to find the tree (among all possible trees) with the highest L value.

Models of character state changes (sequence evolution):

Jukes and Cantor 1 parameter model: all changes equal probability
Kimura 2 parameter model: transitions more frequent than transversions
Other more complicated models...
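
For reference, the two named models can be written as probabilities of observing nucleotide b at the end of a branch that started with nucleotide a. The formulas are the standard published ones, but the parameterization is an assumption of this sketch: d is the expected number of substitutions per site for Jukes-Cantor, and alpha and beta are the transition rate and the rate of each transversion type for Kimura, acting over time t.

```python
import math

PURINES = {"A", "G"}


def jc69_prob(a, b, d):
    """Jukes-Cantor: all changes are equally probable."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def k2p_prob(a, b, alpha, beta, t):
    """Kimura 2-parameter: transitions and transversions have separate rates."""
    e_tv = math.exp(-4.0 * beta * t)
    e_ts = math.exp(-2.0 * (alpha + beta) * t)
    if a == b:
        return 0.25 + 0.25 * e_tv + 0.5 * e_ts
    if (a in PURINES) == (b in PURINES):  # transition
        return 0.25 + 0.25 * e_tv - 0.5 * e_ts
    return 0.25 - 0.25 * e_tv  # one specific transversion


for b in "ACGT":
    print(b, jc69_prob("A", b, 0.3), k2p_prob("A", b, 2.0, 1.0, 0.1))
```

When alpha equals beta, the Kimura probabilities reduce to the Jukes-Cantor ones with d = 3 * alpha * t, which is a convenient consistency check.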

1. Calculate the likelihood for each site on a specific tree.

2. Sum the log-likelihood (ln L) values for all sites on the tree; this is equivalent to multiplying the site likelihoods.

3. Compare the L values for all possible trees.

4. Choose the tree with the highest L value.
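
The four steps can be sketched for a four-OTU case. This example assumes the Jukes-Cantor model with equal base frequencies, computes site likelihoods with Felsenstein's pruning algorithm (not named on the slides), and uses fixed, hypothetical branch lengths; a real analysis would also optimize branch lengths for each tree. An internal node is written as a tuple of (child, branch length) pairs, a layout chosen only for this example.

```python
import math

BASES = "ACGT"


def jc69(a, b, d):
    """Jukes-Cantor probability of a -> b along a branch of length d."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def partials(node, states):
    """For each base x: probability of the tip data below node, given x at node."""
    if isinstance(node, str):  # leaf: only the observed base is possible
        return {x: 1.0 if x == states[node] else 0.0 for x in BASES}
    result = {x: 1.0 for x in BASES}
    for child, length in node:
        below = partials(child, states)
        for x in BASES:
            result[x] *= sum(jc69(x, y, length) * below[y] for y in BASES)
    return result


def log_likelihood(tree, seqs):
    """Steps 1 and 2: site likelihoods, then the sum of their logarithms."""
    n_sites = len(next(iter(seqs.values())))
    total = 0.0
    for i in range(n_sites):
        states = {name: seq[i] for name, seq in seqs.items()}
        site = sum(0.25 * p for p in partials(tree, states).values())
        total += math.log(site)
    return total


def quartet(p, q, r, s, tip=0.1, inner=0.05):
    """Tree ((p,q),(r,s)) with hypothetical fixed branch lengths."""
    return ((((p, tip), (q, tip)), inner), (((r, tip), (s, tip)), inner))


seqs = {"A": "ACGA", "B": "ACTA", "C": "GTGA", "D": "GTTA"}
trees = {
    "((A,B),(C,D))": quartet("A", "B", "C", "D"),
    "((A,C),(B,D))": quartet("A", "C", "B", "D"),
    "((A,D),(B,C))": quartet("A", "D", "B", "C"),
}
# Steps 3 and 4: compare the trees and keep the one with the highest value.
scores = {label: log_likelihood(tree, seqs) for label, tree in trees.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Working with logarithms avoids numerical underflow, since the product of many small site likelihoods quickly becomes too small to represent.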

Distance Methods: evolutionary distances (number of substitutions)are computed for all pairs of taxa.

UPGMA: unweighted pair group method with arithmetic means
- assumes equal rate of substitutions
- sequential clustering algorithm
- pairs of taxa are clustered in order of decreasing similarity

Neighbor Joining: finds the shortest (minimum evolution) tree by finding neighbors that minimize the total length of the tree. The shortest pairs are chosen to be neighbors and then joined in the distance matrix as one OTU.
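
A minimal sketch of the UPGMA clustering loop, returning only the topology as nested tuples (branch heights are omitted to keep it short). The distance values are hypothetical, and storing distances under frozenset keys is a choice made for this example.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster OTUs by UPGMA.

    dist maps frozenset({a, b}) to the distance between OTUs a and b.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of OTUs inside it
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        # Distance to every other cluster: size-weighted arithmetic mean.
        for other in sizes:
            if other not in (a, b):
                d[frozenset((merged, other))] = (
                    sizes[a] * d[frozenset((a, other))]
                    + sizes[b] * d[frozenset((b, other))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Hypothetical pairwise distances (numbers of substitutions).
distances = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6,
    frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10,
    frozenset(("C", "D")): 10,
}
print(upgma(["A", "B", "C", "D"], distances))
```

Neighbor joining uses the same kind of distance matrix but picks pairs by a criterion that corrects for unequal rates, so it is not shown here.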

Consensus Methods:

Consensus trees are derived from a set of trees and summarize the phylogenetic information of several trees in a single tree.

Most commonly used consensus trees:

Strict consensus: all conflicting branching patterns are collapsed.

50% majority rule consensus: branching patterns that occur in more than 50% of the trees are retained, all others are collapsed.
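
Both consensus types can be sketched by comparing the clades (groups of OTUs) found in each tree. The example trees are hypothetical, rooted, and written as nested tuples; the functions return the set of retained clades rather than drawing the consensus tree.

```python
from collections import Counter


def clades(tree):
    """Return every clade in a rooted nested-tuple tree as a frozenset of OTUs."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def strict_consensus(trees):
    """Clades present in every tree; anything in conflict is dropped."""
    return set.intersection(*(clades(tree) for tree in trees))


def majority_rule(trees):
    """Clades present in more than half of the trees."""
    counts = Counter(clade for tree in trees for clade in clades(tree))
    return {clade for clade, n in counts.items() if n > len(trees) / 2}


# Hypothetical input: two trees agree, the third groups A with C instead of B.
trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
]
print(sorted("".join(sorted(c)) for c in strict_consensus(trees)))
print(sorted("".join(sorted(c)) for c in majority_rule(trees)))
```

Here the A + B grouping survives in the majority rule result (two of three trees) but is collapsed in the strict consensus.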

[Figure: CONSENSUS METHODS, illustrated with trees for taxa A to G]

Bootstrap method of assessing tree reliability:

An inferred tree is constructed from the data set. Characters are resampled from the data set with replacement. Resampling is replicated many (100-1000) times.

Bootstrap trees are constructed from the resampled data sets.

Each bootstrap tree is compared to the original inferred tree.

The % of bootstrap trees supporting a node is determined for each node in the tree.
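
The resampling loop can be sketched as follows. Any tree-building method can be plugged in as `build_clades`, a hypothetical callback that returns the clades of the tree inferred from an alignment. The stand-in used here simply groups the two most similar sequences, which is a toy placeholder and not a real reconstruction method; the sequences are hypothetical as well.

```python
import random
from itertools import combinations


def bootstrap_columns(seqs, rng):
    """Resample alignment columns (characters) with replacement."""
    n_sites = len(next(iter(seqs.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in seqs.items()}


def bootstrap_support(seqs, build_clades, replicates=100, seed=0):
    """Percentage of bootstrap trees that contain each clade of the inferred tree."""
    rng = random.Random(seed)
    reference = build_clades(seqs)  # clades of the original inferred tree
    hits = {clade: 0 for clade in reference}
    for _ in range(replicates):
        found = build_clades(bootstrap_columns(seqs, rng))
        for clade in reference:
            if clade in found:
                hits[clade] += 1
    return {clade: 100.0 * n / replicates for clade, n in hits.items()}


def closest_pair_clade(seqs):
    """Toy stand-in for a tree builder: group the two most similar sequences."""
    def differences(pair):
        a, b = pair
        return sum(x != y for x, y in zip(seqs[a], seqs[b]))
    return {frozenset(min(combinations(sorted(seqs), 2), key=differences))}


seqs = {"A": "ACGTACGTAC", "B": "ACGTACGTTC", "C": "GCTTACCTAG", "D": "GCTAACCTTG"}
for clade, percent in bootstrap_support(seqs, closest_pair_clade, 200).items():
    print(sorted(clade), percent)
```

Fixing the random seed makes the support values reproducible between runs.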

Homoplasy: non-homologous similarity
- resemblance not due to common ancestry
- evolved independently
- considered “noise”

Known bacterial phylogeny: ancestors at each node known.

Hillis & Huelsenbeck 1992 tested the ability of different methods to find the “true” phylogeny.

Maximum parsimony and maximum likelihood performed well; UPGMA & neighbor joining did not.

Strengths and Weaknesses:

UPGMA & neighbor-joining: fast but not as accurate as other methods.

Maximum parsimony: time consuming, but more accurate. Can combine morphological characters with DNA characters in a single analysis.

Maximum likelihood: very time consuming; including information from morphology is a new technique (but it is controversial); can invoke a specific model of sequence evolution.

Reference: Hillis et al. (1996), Molecular Systematics, 2nd ed., Sinauer Associates. ISBN 0-87893-282-8