A Short Introduction on Cladistics

Christophe HENDRICKX, PhD Student

I. Theory and terminology

Systematics

Systematics: study of the diversification of living organisms, both past and present, and the evolutionary relationships among groups of organisms through time.

Systematics:
Provides scientific names for organisms.
Provides classifications for the organisms.
Describes organisms.
Preserves collections of them.
Investigates evolutionary histories of organisms.
Considers their environmental adaptations.

Taxonomy

Taxonomy: classification, identification, and naming of organisms.

Taxon

Taxon (pl. taxa): Group of two or more organisms. Usually a taxon is given a name (e.g. plants, dinosaurs, birds, dogs) and a rank (kingdom, class, family, genus, species, etc.), but neither is required.

Examples: Plants (Kingdom Plantae); Birds (Class Aves); Lion (species Panthera leo).

Cladistics

Cladistics: Method of classification that groups taxa hierarchically into nested sets based on shared characters.

Phylogenetics

Phylogenetics: study of evolutionary relationships among groups of organisms.

Cladogram

Cladogram (= phylogenetic tree): A branching diagram specifying hierarchical relationships among taxa based upon homologies.

Homology

Homology: Structural similarity, or correspondence of features in different organisms, that is due to inheritance from a common ancestor. Character present in an ancestor and its descendants. Richard Owen: "the same organ in different animals under every variety of form and function". Ex. the forelimb of tetrapods.

Analogy

Analogy: similarity of function and superficial resemblance of structures that have different origins. Ex. the wings of insects and of flying vertebrates.

Convergence

Convergence: acquisition of the same biological trait in unrelated lineages. Ex. the hydrodynamic, fish-like (pisciform) body shape of sharks, ichthyosaurs and dolphins.

Terminology in a cladogram

Branch: line on a cladogram connecting two nodes (internal branch or internode), a node and the root (basal branch), or a node and a terminal taxon (terminal branch).
Node: point on a cladogram where three or more branches meet.
Terminal taxon/node: a taxon placed at one end of a terminal branch; a taxon under comparison. Also called an operational taxonomic unit (OTU).
Internal node: ancestral unit.
Root: common ancestor of all OTUs under study.
The path from the root to a node, and from node to node, defines an evolutionary path.

Rooting the tree

Inferring evolutionary relationships between the taxa requires rooting the tree.

Outgroup: taxon used for comparative purposes. Serves as a reference taxon for determining the evolutionary relationships among three or more clades.
Ingroup: clade that includes all taxa of interest to the current study. Group under investigation, whose internal relationships are to be resolved.

Groups

Monophyletic group (= clade): group including a most recent common ancestor and all its descendants. Ex. dinosaurs, birds, mammals, etc. Monophyletic groups are characterized by shared derived characters.

Paraphyletic group: group including a most recent common ancestor and only some of its descendants. Only recognized by the absence of synapomorphies. Ex. gymnosperms (without angiosperms), fishes (without tetrapods), prosauropods (without sauropods), etc.

Polyphyletic group: group that does not include the most recent common ancestor of all its members. Ex. warm-blooded animals (mammals and birds).

Types of characters

Plesiomorphy: ancestral/primitive character or character state (usually coded 0 in a datamatrix).

Apomorphy: derived character or character state (usually coded 1, 2, etc. in a datamatrix).

Synapomorphy (= homology): Apomorphy (derived character) that unites two or more taxa into a monophyletic group (clade). Derived character(s) defining a clade.

Symplesiomorphy: Plesiomorphy (ancestral character) shared by two or more taxa. Being ancestral, it does not by itself unite them into a monophyletic group (clade).

Autapomorphy: Apomorphy (derived character) that is restricted to a single terminal taxon in a data set. Derived character(s) defining a taxon (ex. a genus or species).

Homoplasy: Similarity in species of different ancestry that is the result of convergent evolution. Correspondence between parts or organs acquired as the result of parallel evolution or evolutionary convergence. Any character that is not a synapomorphy.

II. Cladistic analysis

How to construct a cladogram?

Select your OTUs (Operational Taxonomic Units) = ingroup. They share primitive characters (plesiomorphies) and derived characters (synapomorphies).

Select an outgroup. The outgroup has one or several primitive characters that are common to all OTUs.

Construct a character table and code each OTU.

Construct a cladogram based on the number of shared characters. The more characters two OTUs share, the more closely related they are.
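As a rough illustration of the last step, the sketch below counts, for every pair of taxa, how many characters are coded as derived (1) in both. The codings are copied from the example table on this slide; the function and variable names are hypothetical, and this pairwise count is only a teaching aid, not what TNT, Winclada or PAUP* compute.

```python
from itertools import combinations

# Codings copied from the example table (taxon -> character-state string).
matrix = {
    "Lancelet": "00000",
    "Lamprey": "00011",
    "Tuna": "00011",
    "Salamander": "00111",
    "Turtle": "01111",
    "Leopard": "11111",
}


def shared_derived(a: str, b: str) -> int:
    """Count characters for which both taxa show the derived state (1)."""
    return sum(1 for x, y in zip(a, b) if x == "1" and y == "1")


# Score every pair of taxa once.
pair_scores = {
    (t1, t2): shared_derived(matrix[t1], matrix[t2])
    for t1, t2 in combinations(matrix, 2)
}

# Print pairs from most to fewest shared derived states.
for (t1, t2), score in sorted(pair_scores.items(), key=lambda kv: -kv[1]):
    print(f"{t1:<12}{t2:<12}{score}")
```

Turtle and Leopard share the most derived states, while the outgroup shares none with any ingroup taxon.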

Taxon        Coding
Lancelet     00000
Lamprey      00011
Tuna         00011
Salamander   00111
Turtle       01111
Leopard      11111

Characters

Character: Observable feature of an organism used to distinguish it from another.
Character state: Scored observation of a feature perceived in an organism chosen as an OTU.

Example:
Dentary ramus: (0) elongate; (1) shortened, not much longer than tall.
Character = elongation of the dentary ramus.
Character state = elongated dentary ramus.

Characters

Discrete characters: denumerable characters, i.e. characters whose states can be represented by a countable subset of the real numbers.
Binary characters: characters that have just two states. Usually coded as 0 and 1 (e.g. absence/presence).
Multistate characters: characters that have more than two observed states. Usually coded as 0, 1, 2, 3...n. Can be ordered or unordered.

Characters

Polarized characters: character or transformation series where the direction of character change or direction of evolution has been specified, thereby determining the relative plesiomorphy (primitive character) or apomorphy (derived character) of the characters or character states.

Characters

Ordered characters: multistate characters whose order has been determined.

A transformation between two adjacent states costs one step, whereas a transformation between two non-adjacent states costs the sum of the steps between the adjacent states it passes through. Ex. 0 → 1 → 2: 0 → 1 and 1 → 2 cost the same, but 0 → 2 costs twice as many steps.

Example: Tooth row: (0) extends posteriorly to approximately half the length of the orbit; (1) ends at the anterior rim of the orbit; (2) completely antorbital, tooth row ends anterior to the vertical strut of the lacrimal.
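The step-counting rule for ordered characters can be sketched as follows. This is a minimal illustration, assuming states are coded as consecutive integers; `step_cost` is a hypothetical helper, not a function of any phylogenetic package.

```python
def step_cost(start: int, end: int, ordered: bool) -> int:
    """Number of steps needed to change from state `start` to state `end`.

    Ordered (additive) character: the cost is the number of adjacent-state
    transformations crossed, i.e. the absolute difference between the states.
    Unordered (non-additive) character: any change costs a single step.
    """
    if ordered:
        return abs(end - start)
    return 0 if start == end else 1


for a, b in [(0, 1), (1, 2), (0, 2)]:
    print(f"{a} -> {b}: ordered = {step_cost(a, b, True)}, "
          f"unordered = {step_cost(a, b, False)}")
```

For the tooth row example, going from state 0 directly to state 2 costs two steps if the character is ordered, but only one if it is left unordered.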

Characters

Continuous characters: Character for which potential values are so infinitesimally close that there are potentially no disallowable real numbers.

Example: Quadrate, elongation (ratio: lateromedial width of mandibular articulation/ventrodorsal length from entocondyle to cotylus).

Coding methods

Characters 2 and 3 are inapplicable for taxon W (usually coded 9 or -).

Molecular data

Datamatrix

0: (usually) plesiomorphic characters.
1 or 2: (usually) apomorphic characters.
2: multistate characters.
[01]: polymorphic character.
?: unknown data, missing values.
-: inapplicable characters.
Taxa = OTUs (divided into outgroup and ingroup).

Instructions: Creating a datamatrix

Open Mesquite.

File → New → give a name to the .nex file you are creating. Ex. spino.nex

Name: Taxa (or genera).
Number of taxa: 5 (in our case).
Select Make character Matrix.
New character (let's say 10).

Create your datamatrix by naming your OTUs (taxa), defining your characters and character states, and coding your taxa for each character.

Once it's done, save it (Ctrl+S).

Open the .nex file with Notepad.

Remove all the text and only keep the newly created datamatrix. You must have something looking like this:

nstates 2
xread
10 5
Eustreptospondylus 1000000000
Baryonyx 11110011[12]1
Suchomimus 1111001121
Irritator_Angaturama 11?111-12?
Spinosaurus 111111-110
;
proc/;

Add at the beginning:
nstates 2: the number of different states, here two (up to 32).
xread
10 5: the number of characters, then the number of taxa.

Polymorphic characters have to be enclosed in square brackets, e.g. [01] or [012].

Inapplicable characters are coded - rather than 9. They are treated the same way as ?.

Add as the last lines:
;
proc/;

Save the file as a .txt file or as a .tnt file.

For numerical (discrete) characters, TNT accepts up to 32 states, noted 0 to 9, then A to V for states 10 to 31.

nstates 2
xread
10 5
Eustreptospondylus 1000000000
Baryonyx 11110011[12]1
Suchomimus 1111001121
Irritator_Angaturama 11?111-12?
Spinosaurus 111111-110
;
ccode +8;
;
proc/;

If you want to order some characters, add the following two lines after the datamatrix:
;
ccode + 35 64;
(here characters 35 and 64 are now ordered)

Be aware that, in TNT, the first character is numbered zero, not one! Here, for instance, there are 10 characters, numbered 0 to 9.
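To avoid hand-editing mistakes (wrong character count, off-by-one character numbers), the file can also be assembled by a small script. The sketch below only mirrors the layout shown on these slides; `count_characters` and `build_tnt_text` are hypothetical helpers, and the output has not been checked against TNT itself.

```python
import re
from typing import Dict, Sequence


def count_characters(coding: str) -> int:
    """Count coded characters; a bracketed polymorphism such as [12] is one character."""
    return len(re.findall(r"\[[^\]]+\]|.", coding))


def build_tnt_text(taxa: Dict[str, str], nstates: int,
                   ordered: Sequence[int] = ()) -> str:
    """Assemble a matrix file following the layout shown on the slides."""
    lengths = {count_characters(coding) for coding in taxa.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must be coded for the same number of characters")
    nchar = lengths.pop()
    # Character numbers are zero-based: valid values run from 0 to nchar - 1.
    invalid = [i for i in ordered if not 0 <= i < nchar]
    if invalid:
        raise ValueError(f"character numbers out of range 0..{nchar - 1}: {invalid}")
    lines = [f"nstates {nstates}", "xread", f"{nchar} {len(taxa)}"]
    lines += [f"{name} {coding}" for name, coding in taxa.items()]
    lines.append(";")
    if ordered:
        lines.append("ccode + " + " ".join(str(i) for i in ordered) + ";")
        lines.append(";")
    lines.append("proc/;")
    return "\n".join(lines) + "\n"


spino = {
    "Eustreptospondylus": "1000000000",
    "Baryonyx": "11110011[12]1",
    "Suchomimus": "1111001121",
    "Irritator_Angaturama": "11?111-12?",
    "Spinosaurus": "111111-110",
}
tnt_text = build_tnt_text(spino, nstates=2, ordered=[8])
print(tnt_text)
```

Asking for character 10 to be ordered in this 10-character matrix raises an error, which is exactly the zero-based numbering pitfall mentioned above.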

Instructions: Datamatrix with continuous characters

nstates 32
xread
3 5
& [cont]
A 1.23 3 8.7
B 2.35 ? 5.36
C 3.65 7.89 0.25
D 4.65 23.23 0.87
E 8.25 23.23 8
;
proc/;

nstates 32
xread
6 5
& [cont]
A 1.23 3 8.7
B 2.35 ? 5.36
C 3.65 7.89 0.25
D 4.65 23.23 0.87
E 8.25 23.23 8
& [num]
A 0 0 1
B 1 1 2
C 1 2 3
D 1 2 1
E 1 3 2
;
ccode + 5 6;
;
proc/;

In TNT, the values of continuous characters can go up to 65 and can have three decimals.

Principle of parsimony

Principle of parsimony: general scientific criterion for choosing among competing hypotheses, which states that we should accept the hypothesis that explains the data most simply and efficiently.

The principle of parsimony (Occam's razor) states that a theory about nature should be the simplest explanation that is consistent with the facts. Keep it simple.

A phylogenetic tree is a hypothesis. There may be many possible trees, but the simplest one is probably the most accurate.Simplest tree = shortest tree. Tree with the fewest character changes and the minimal number of nodes.

A cladistic analysis tries to find the most parsimonious trees (MPTs), all trees that minimize the number of evolutionary changes (steps).
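To make "number of steps" concrete, here is a minimal sketch of Fitch's algorithm, which counts the minimum number of changes of unordered characters on a given rooted binary tree. The four-taxon matrix and the two candidate trees are invented for illustration, and the nested-tuple tree encoding is an assumption of this sketch; real programs such as TNT use far more elaborate implementations.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, steps) for one unordered character.

    `tree` is either a taxon name (str) or a 2-tuple of subtrees;
    `states` maps each taxon name to its state for this character.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    steps = left_steps + right_steps
    shared = left_set & right_set
    if shared:
        return shared, steps                 # no change needed at this node
    return left_set | right_set, steps + 1   # one extra step at this node


def tree_length(tree, matrix):
    """Sum the Fitch step counts over all characters of the matrix."""
    nchar = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(nchar)
    )


# Hypothetical data: one outgroup and three ingroup taxa, four binary characters.
matrix = {"Out": "0000", "A": "1000", "B": "1110", "C": "1101"}
tree_a = ("Out", ("A", ("B", "C")))
tree_b = ("Out", ("B", ("A", "C")))

print(tree_length(tree_a, matrix))  # 4 steps
print(tree_length(tree_b, matrix))  # 5 steps
```

Under the parsimony criterion, tree_a is preferred because it explains the same data with one step fewer.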

Heuristic search

Heuristic search: Algorithm for constructing cladograms. It tries to find the best tree by reducing the set of trees examined and calculating the score only for some likely trees. It is NOT guaranteed to find the best tree.

Instructions: Heuristic search

Open the software TNT.

File → Open input file → open the newly created .tnt file.

Analyze → New Technology search

Then select these options.

Search

Click here to visualize the consensus tree.

Instructions: Visualizing the MPTs

Consensus tree

Consensus tree: convenient way to summarise the agreement between two or more trees. Branching diagram produced using a consensus method, a method combining the grouping information contained in a set of cladograms for the same taxa into a single topology.

Resulting consensus tree:

Polytomy: node that has more than two immediate descending branches.

Strict consensus tree: contains only those clusters found in all the trees (100%).

Majority rule consensus tree: contains only those clusters found in a majority (> 50%) of the trees in the profile.

Semi-strict consensus tree: contains all the uncontradicted clusters in a profile of trees. Includes the clusters retained by the strict consensus tree, but also contains any clusters that are not contradicted by any other clusters in the profile.
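The strict and majority rule definitions can be sketched by treating each tree as a set of clusters (groups of taxa). The three trees below are invented for illustration, and the function names are hypothetical; the semi-strict method is not covered because it also needs a compatibility test between clusters.

```python
from collections import Counter


def strict_consensus(trees):
    """Clusters present in every tree of the profile."""
    return set.intersection(*(set(tree) for tree in trees))


def majority_rule_consensus(trees, threshold=0.5):
    """Clusters present in more than `threshold` (default 50%) of the trees."""
    counts = Counter(cluster for tree in trees for cluster in set(tree))
    return {c for c, n in counts.items() if n / len(trees) > threshold}


def cluster(taxa: str) -> frozenset:
    """Build a cluster from single-letter taxon names, e.g. 'AB' -> {A, B}."""
    return frozenset(taxa)


# Hypothetical profile of three most parsimonious trees for taxa A to D.
trees = [
    {cluster("AB"), cluster("ABC")},
    {cluster("AB"), cluster("ABC")},
    {cluster("AB"), cluster("ABD")},
]

print(strict_consensus(trees))         # only the cluster {A, B}
print(majority_rule_consensus(trees))  # clusters {A, B} and {A, B, C}
```

The cluster (A, B, C) occurs in two of three trees, so it survives in the majority rule consensus but collapses into a polytomy in the strict consensus.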

Measures of character fit

Tree length: minimum number of character changes (steps) required on a cladogram to account for the data.

Consistency index (CI): Measure of the amount of homoplasy in a character relative to a given cladogram. CI = m / s

Retention index (RI): Measure of the amount of similarity in a character that can be interpreted as a synapomorphy. RI = (g - s) / (g - m)

m = minimum number of steps a character can exhibit on any cladogram.
s = minimum number of steps a character can exhibit on the cladogram in question.
g = greatest number of steps a character can exhibit on any cladogram.

Support for individual clades

Bremer support (= branch support, decay index): number of extra steps required before a clade is lost from the strict consensus tree.

Bootstrap analysis: method consisting of creating a large number of pseudoreplicate data sets of the same size as the original by randomly sampling characters with replacement.

How does it work? The analysis consists of randomly deleting some characters and randomly reweighting the rest. The MPTs for these pseudoreplicates are then calculated. The percentage of pseudoreplicates that recover a given group is the measure of confidence in that group.
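The resampling step can be sketched as follows: drawing character columns with replacement means some characters are left out and others are counted several times, which is the "delete and reweight" effect described above. The matrix is invented, each state is assumed to be a single symbol (no bracketed polymorphisms), and `bootstrap_replicate` and `support_percent` are hypothetical helpers; the tree search on each pseudoreplicate is left to the phylogenetic program.

```python
import random


def bootstrap_replicate(matrix, rng):
    """Build one pseudoreplicate with the same number of characters as the
    original, by sampling character columns with replacement."""
    nchar = len(next(iter(matrix.values())))
    columns = [rng.randrange(nchar) for _ in range(nchar)]
    return {taxon: "".join(row[i] for i in columns) for taxon, row in matrix.items()}


def support_percent(groups_per_replicate, group):
    """Percentage of pseudoreplicates whose trees recover `group`."""
    hits = sum(1 for groups in groups_per_replicate if group in groups)
    return 100.0 * hits / len(groups_per_replicate)


# Hypothetical matrix: one outgroup and three ingroup taxa, five characters.
matrix = {"Out": "00000", "A": "10010", "B": "11010", "C": "11101"}
rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(matrix, rng) for _ in range(100)]
print(replicates[0])

# If 3 out of 4 pseudoreplicates recover the group (B, C), its support is 75%.
found = [{frozenset("BC")}, {frozenset("BC")}, {frozenset("AB")}, {frozenset("BC")}]
print(support_percent(found, frozenset("BC")))
```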

Instructions: Bremer support, CI and RI

Copy and paste your .tnt file into the TNT folder, which must include the scripts STATS.run and aquickie.run.

Both can be downloaded from these links:
http://tnt.insectmuseum.org/index.php/Scripts/stats
http://tnt.insectmuseum.org/index.php/Scripts/aquickie.run

Open the software TNT.

File → Open input file → open the newly created .tnt file.

In the command line, enter the command aquickie.

The resulting consensus tree will be displayed, as well as the Bremer support.

Then enter the command stats, which will display the consistency and retention indices (CI and RI).
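For reference, the two indices follow directly from the definitions given under "Measures of character fit" (CI = m / s, RI = (g - s) / (g - m)). The sketch below computes them for a single character; the function names and the example values are made up, and this is not how the stats script itself is implemented.

```python
def consistency_index(m: int, s: int) -> float:
    """CI = m / s, with m the minimum possible steps and s the observed steps."""
    return m / s


def retention_index(m: int, s: int, g: int) -> float:
    """RI = (g - s) / (g - m), with g the greatest possible number of steps."""
    if g == m:
        raise ValueError("RI is undefined when g equals m")
    return (g - s) / (g - m)


# Hypothetical binary character: at best 1 step, at worst 3 steps, 2 steps observed.
m, s, g = 1, 2, 3
print(consistency_index(m, s))   # 0.5
print(retention_index(m, s, g))  # 0.5
```

A character that changes only once on the tree (s = m) has CI = 1 and RI = 1, meaning no homoplasy.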

Instructions: Bootstrap analysis

In order to perform a bootstrap analysis:

Analyze → Resampling

Then choose the following options.

Ok

Instructions: Performing a cladistic analysis

In order to visualize the list of synapomorphies for each clade:

Optimize → Synapomorphies → List synapomorphies, then select the tree (the last one is the consensus tree).

To add the list to a publication:

File → Output → Print display buffer

Then use the software CutePDF Writer, freely downloadable on the Web, to save the list in a .pdf file.

To save the trees and arrange them using Dendroscope:

File → Tree Save file → Open, parenthetical → give a name to the file and save it as a .tre file.

Open the .tre file with Notepad.

Delete all the text except the last line (the consensus tree), written like this:

(Eustreptospondylus ((Baryonyx Suchomimus )(Irritator_Angaturama Spinosaurus )))

Then replace the following:

" " (space) by ":1.0,"
")(" by "),("
",)" by ")"
")))" by "):1.0))"

Add as a first line: #DENDROSCOPE{TREE 'Tree'
And as a last line: ;}

Save the new file as a .tre file.

You must then have something like this:

#DENDROSCOPE{TREE 'Tree'
(Eustreptospondylus:1.0,((Baryonyx:1.0,Suchomimus:1.0),(Irritator_Angaturama:1.0,Spinosaurus:1.0):1.0))
;}

Open the .tre file with Dendroscope.

Choose to display the graph as a rectangular phylogram, a rectangular cladogram, a slanted cladogram, or a circular cladogram like this one:

You can name clades (e.g. Spinosauridae, Baryonychinae, etc.), change the font and the colour of each taxon, and add colours to each clade or stem.
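The manual find-and-replace conversion of the TNT parenthetical tree described above can also be scripted. The sketch below applies the same replacements in the same order; the last rule is written so that the result matches the example shown for this tree, so it should be read as a transcription of the slide's recipe for this particular tree shape rather than a general converter. The function names are hypothetical, and the header and footer lines are copied from the slides, not taken from the Dendroscope documentation.

```python
# Replacement rules in the order given on the slide.
RULES = [
    (" ", ":1.0,"),       # a space after a taxon name becomes a branch length
    (")(", "),("),        # sister clades are separated by a comma
    (",)", ")"),          # drop the comma left before a closing parenthesis
    (")))", "):1.0))"),   # branch length for the last internal clade
]


def tnt_to_newick(parenthetical: str) -> str:
    """Apply the slide's replacements to a TNT parenthetical tree line."""
    tree = parenthetical.strip()
    for old, new in RULES:
        tree = tree.replace(old, new)
    return tree


def wrap_for_dendroscope(newick: str, label: str = "Tree") -> str:
    """Add the first and last lines shown on the slides."""
    return f"#DENDROSCOPE{{TREE '{label}'\n{newick}\n;}}\n"


tnt_line = "(Eustreptospondylus ((Baryonyx Suchomimus )(Irritator_Angaturama Spinosaurus )))"
newick = tnt_to_newick(tnt_line)
print(wrap_for_dendroscope(newick))
```

Writing the returned text to a .tre file reproduces the hand-edited file from the previous slides.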

Instructions: Winclada

Open the newly created .tps file with Winclada.

Select all characters with your mouse (they must appear in green when selected), or Chars → Select all chars.

Chars → Make sel chars NONADDITIVE (fitch) → Ok

If you have ordered characters, select the characters to order, then:

Chars → Make sel chars ADDITIVE (farris) → Ok

To perform a heuristic search:

Analyze → Ratchet (Island Hopper)

Island Hop → Yes.

In order to display the results and visualize the synapomorphies, select the following options:

The length of the tree as well as the consistency index (CI) and retention index (RI) are displayed at the bottom of the window.

To perform a bootstrap analysis:

Analyze → Bootstrap/Jackknife/CR with NONA → Bootstrap.

To save trees:

Trees → Save ALL Trees to file → Name taxa (full names, NOT NONA readable) → do it!

Give a name to the .tre file.

Do the same procedure as with TNT in order to read the file with Dendroscope (step 11).

Instructions: PAUP*

Open the .nex file newly created in Mesquite with PAUP*.

File → Open

In the command line, write the following commands and press Enter:

Hsearch
Performs a heuristic search. Reset Maxtrees (automatically increase by 100) if necessary.

Contree all/majrule treefile=name_tree.tre
Gives the strict and majority rule consensus trees, which will both be saved under the name name_tree.tre.

Describetrees
Gives the tree length, the consistency index (CI), the homoplasy index (HI), the retention index (RI) and the rescaled consistency index (RC).

BootStrap all/treefile= name_tree2.tre

That will perform a Bootstrap analysis on the MPTs and save the results with the name name_tree2.tre.