Multiple Sequence Alignment (MSA) and Phylogeny


Page 1: Multiple Sequence Alignment (MSA) and Phylogeny

Multiple Sequence Alignment (MSA) and Phylogeny

Page 2: Multiple Sequence Alignment (MSA) and Phylogeny

Clustal X

Page 3: Multiple Sequence Alignment (MSA) and Phylogeny

Input: multiple sequence Fasta file

>gi|21536452|ref|NP_002762.2| mesotrypsin preproprotein [Homo sapiens]
MNPFLILAFVGAAVAVPFDDDDKIVGGYTCEENSLPYQVSLNSGSHFCGGSLISEQWVVSAAHCYKTRIQ
VRLGEHNIKVLEGNEQFINAAKIIRHPKYNRDTLDNDIMLIKLSSPAVINARVSTISLPTAPPAAGTECL
ISGWGNTLSFGADYPDELKCLDAPVLTQAECKASYPGKITNSMFCVGFLEGGKDSCQRDSGGPVVCNGQL
QGVVSWGHGCAWKNRPGVYTKVYNYVDWIKDTIAANS

>gi|114051746|ref|NP_001040585.1| protease, serine, 2 [Macaca mulatta]
MNPLLILAFVGVAVAAPFDDDDKIVGGYTCEENSVPYQVSLNSGYHFCGGSLINEQWVVSAAHCYKTRIQ
VRLGEHNIEVLEGTEQFINAAKIIRHPDYDRKTLNNDILLIKLSSPAVINARVSTISLPTAPPAAGAEAL
ISGWGNTLSSGADYPDELQCLEAPVLSQAECEASYPGKITSNMFCVGFLEGGKDSCQGDSGGPVVSNGQL
QGIVSWGYGCAQKNRPGVYTKVYNYVDWIRDTIAANS

>gi|6755891|ref|NP_035775.1| mesotrypsin [Mus musculus]
MNALLILALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKTRIQ
VRLGEHNINVLEGNEQFVNAAKIIKHPNFNRKTLNNDIMLLKLSSPVTLNARVATVALPSSCAPAGTQCL
ISGWGNTLSFGVSEPDLLQCLDAPLLPQADCEASYPGKITGNMVCAGFLEGGKDSCQGDSGGPVVCNREL
QGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN

>gi|6981422|ref|NP_036861.1| protease, serine, 2 [Rattus norvegicus]
MRALLFLALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKSRIQ
VRLGEHNINVLEGNEQFVNAAKIIKHPNFDRKTLNNDIMLIKLSSPVKLNARVATVALPSSCAPAGTQCL
ISGWGNTLSSGVNEPDLLQCLDAPLLPQADCEASYPGKITDNMVCVGFLEGGKDSCQGDSGGPVVCNGEL
QGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN

>gi|27819626|ref|NP_777115.1| pancreatic anionic trypsinogen [Bos taurus]
MHPLLILAFVGAAVAFPSDDDDKIVGGYTCAENSVPYQVSLNAGYHFCGGSLINDQWVVSAAHCYQYHIQ
VRLGEYNIDVLEGGEQFIDASKIIRHPKYSSWTLDNDILLIKLSTPAVINARVSTLALPSACASGSTECL
. . .
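To make the input format concrete, here is a minimal sketch of how a multiple sequence Fasta file can be read: every line starting with ">" opens a new record, and the lines after it are joined into one sequence. The function name parse_fasta is made up for this illustration, and the example uses only the first two sequence lines of the Mus musculus entry from this slide. Clustal X reads the file by itself; this is only meant to show the structure.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header line (without the leading '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequence lines are concatenated
    return records


# First two sequence lines of the Mus musculus record (truncated on purpose).
example = """\
>gi|6755891|ref|NP_035775.1| mesotrypsin [Mus musculus]
MNALLILALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKTRIQ
VRLGEHNINVLEGNEQFVNAAKIIKHPNFNRKTLNNDIMLLKLSSPVTLNARVATVALPSSCAPAGTQCL
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```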

Page 4: Multiple Sequence Alignment (MSA) and Phylogeny

One of the options to get a multiple sequence Fasta file

Page 5: Multiple Sequence Alignment (MSA) and Phylogeny

One of the options to get a multiple sequence Fasta file

Page 6: Multiple Sequence Alignment (MSA) and Phylogeny

Input: multiple sequence Fasta file (the same five sequences as on Page 3)

Page 7: Multiple Sequence Alignment (MSA) and Phylogeny

Input: multiple sequence Fasta file (the same five sequences as on Page 3)

Page 8: Multiple Sequence Alignment (MSA) and Phylogeny

Step 1: Load the sequences

Page 9: Multiple Sequence Alignment (MSA) and Phylogeny

Sequences and conservation view

Page 10: Multiple Sequence Alignment (MSA) and Phylogeny

Step 2: Perform Alignment

Page 11: Multiple Sequence Alignment (MSA) and Phylogeny

Sequences and conservation view

Page 12: Multiple Sequence Alignment (MSA) and Phylogeny

Sequences and conservation view

Page 13: Multiple Sequence Alignment (MSA) and Phylogeny

Step 3: Create tree

Page 14: Multiple Sequence Alignment (MSA) and Phylogeny

Step 4: NJPlot

Page 15: Multiple Sequence Alignment (MSA) and Phylogeny

Step 4: NJPlot

Page 16: Multiple Sequence Alignment (MSA) and Phylogeny

The Newick tree format is used to represent trees as strings

[Figure: tree with leaves A, C, B and D]

In Newick format: ((A,C),(B,D));

Each pair of parentheses () encloses a clade in the tree, and the comma separates the members of the corresponding clade. “;” is always the last character
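As a small illustration of the rule above, the sketch below walks over a Newick string and collects the leaf set of every pair of parentheses. The function name newick_clades is invented for this example, and it assumes the simplest form of the format (leaf names, commas and parentheses only, no branch lengths or node labels), so it is not a full Newick reader.

```python
from typing import FrozenSet, List, Set


def newick_clades(newick: str) -> List[FrozenSet[str]]:
    """Return the leaf set of every parenthesized group, innermost first.

    Assumes plain leaf names only (no branch lengths, no internal labels).
    """
    text = newick.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    clades: List[FrozenSet[str]] = []
    stack: List[Set[str]] = []  # one open leaf set per unclosed '('
    name = ""
    for ch in text[:-1]:
        if ch == "(":
            stack.append(set())
        elif ch in ",)":
            if name.strip():
                stack[-1].add(name.strip())  # finish the current leaf name
            name = ""
            if ch == ")":
                done = stack.pop()
                clades.append(frozenset(done))
                if stack:
                    stack[-1] |= done  # leaves also belong to the enclosing clade
        else:
            name += ch
    return clades


for clade in newick_clades("((A,C),(B,D));"):
    print(sorted(clade))
```

For the tree on this slide the groups found are {A, C}, {B, D} and the whole tree {A, B, C, D}.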

Page 17: Multiple Sequence Alignment (MSA) and Phylogeny

How robust is our tree?

Page 18: Multiple Sequence Alignment (MSA) and Phylogeny

How robust is our tree?

We need some statistical way to estimate the confidence in the tree topology

But we don’t know anything about the tree topology distribution or parameters

The only data source we have is our data (MSA)

So, we must rely on our own resources: “pull up by your own bootstraps”

Page 19: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap (and jackknife)

Page 20: Multiple Sequence Alignment (MSA) and Phylogeny

Jackknife

1. We create n (typically 100-1000) new MSAs (pseudo-data sets) by randomly sampling half of the characters (random samples without replacement).

We do not change the number of sequences, just the number of positions!

POS: 52316
1  : TATTT
2  : CATTT
3  : CACTT
N  : AACTT

POS: 18745
1  : TTTAT
2  : TAACC
3  : TAACC
N  : TGGGA

POS: 18394
1  : TTGTA
2  : TAGAC
3  : TAAAC
N  : TGAGG
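The sampling step above can be sketched in a few lines. This is only an illustration: the function name jackknife_sample and the 10-column toy alignment are invented here (the slide does not show the full original alignment), and positions are 0-based in the code.

```python
import random
from typing import Dict, List, Tuple


def jackknife_sample(
    msa: Dict[str, str], rng: random.Random
) -> Tuple[List[int], Dict[str, str]]:
    """Pick half of the alignment columns without replacement.

    Every sequence is kept; only the number of positions changes.
    """
    length = len(next(iter(msa.values())))
    positions = rng.sample(range(length), length // 2)  # no column twice
    resampled = {
        name: "".join(seq[p] for p in positions) for name, seq in msa.items()
    }
    return positions, resampled


# Made-up toy alignment: 4 sequences, 10 columns.
toy_msa = {
    "1": "TATTCTTAGA",
    "2": "TACTCATAAC",
    "3": "TACCCATAAC",
    "N": "TGCAAGGGGA",
}

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
pseudo_data_sets = [jackknife_sample(toy_msa, rng) for _ in range(3)]
for positions, resampled in pseudo_data_sets:
    print(positions, resampled)
```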

Page 21: Multiple Sequence Alignment (MSA) and Phylogeny

Jackknife

2. We reconstruct a tree from each data set, using the same method used for reconstructing the original tree

POS: 52316
1  : TATTT
2  : CATTT
3  : CACTT
N  : AACTT

POS: 18745
1  : TTTAT
2  : TAACC
3  : TAACC
N  : TGGGA

POS: 18394
1  : TTGTA
2  : TAGAC
3  : TAAAC
N  : TGAGG

[Figure: one tree over Sp1, Sp2, Sp3 and Sp4 for each of the three data sets]

Page 22: Multiple Sequence Alignment (MSA) and Phylogeny

Back to Jackknife

3. For each node in our original tree, we count the number of times it appeared in the jackknife analysis

[Figure: the three jackknife trees over Sp1, Sp2, Sp3 and Sp4, and the original tree with node values 67% and 100%]

In 67% of the data sets, the node Sp1+Sp2 was found
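The counting step can be sketched as follows, with each tree reduced to the set of clades (groups of leaves) it contains. Everything specific here is an assumption made for illustration: the function name clade_support, the three replicate topologies, and the guess that the 100% on the slide belongs to the Sp1+Sp2+Sp3 node. The replicates were chosen so that the numbers come out as on the slide.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Clade = FrozenSet[str]


def clade_support(
    original: Iterable[Clade], replicates: List[Set[Clade]]
) -> Dict[Clade, float]:
    """Percentage of replicate trees that contain each clade of the original tree."""
    return {
        clade: 100.0 * sum(clade in rep for rep in replicates) / len(replicates)
        for clade in original
    }


sp12 = frozenset({"Sp1", "Sp2"})
sp23 = frozenset({"Sp2", "Sp3"})
sp123 = frozenset({"Sp1", "Sp2", "Sp3"})

# Assumed original tree: ((Sp1,Sp2),Sp3) with Sp4 outside.
original_tree = [sp12, sp123]

# Three made-up replicate trees; the last one groups Sp2 with Sp3 instead.
replicate_trees = [{sp12, sp123}, {sp12, sp123}, {sp23, sp123}]

support = clade_support(original_tree, replicate_trees)
for clade, percent in support.items():
    print(sorted(clade), round(percent))
```

With these three replicates, Sp1+Sp2 is found in 2 of 3 trees (67%) and Sp1+Sp2+Sp3 in all 3 (100%).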

Page 23: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap

The same as jackknife, but instead of sampling K/2 positions, we sample K positions with replacement

Page 24: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap

1. Resample K positions n times

    12345 K
1 : ATCTG…A
2 : ATCTG…C
3 : ACTTA…C
N : ACCTA…T

    11244 K
1 : AATTT…T
2 : AATTT…G
3 : AACTT…T
N : AACTT…T

    47789…K
1 : TTTAT…T
2 : TAACC…G
3 : TAACC…T
N : TGGGA…T

    15578…K
1 : AGGTA…T
2 : AGGAC…G
3 : AAAAC…A
N : AAAGG…C
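The resampling on this slide can be reproduced with a short sketch. It uses only the first five columns of the original alignment shown above, because the rest is elided on the slide; the function names bootstrap_positions and take_columns are invented for this illustration, and positions are 0-based in the code (the slide counts from 1).

```python
import random
from typing import Dict, List


def bootstrap_positions(k: int, rng: random.Random) -> List[int]:
    """Draw k column indices from 0..k-1 with replacement, in ascending order."""
    return sorted(rng.choices(range(k), k=k))


def take_columns(msa: Dict[str, str], positions: List[int]) -> Dict[str, str]:
    """Build a pseudo-data set from the chosen columns (repeats allowed)."""
    return {name: "".join(seq[p] for p in positions) for name, seq in msa.items()}


# First five columns of the original alignment on this slide.
slide_msa = {"1": "ATCTG", "2": "ATCTG", "3": "ACTTA", "N": "ACCTA"}

# Columns 1,1,2,4,4 on the slide are indices 0,0,1,3,3 here.
first_replicate = take_columns(slide_msa, [0, 0, 1, 3, 3])
print(first_replicate)

# A random replicate: same number of columns, some repeated, some missing.
random_positions = bootstrap_positions(5, random.Random(1))
print(random_positions, take_columns(slide_msa, random_positions))
```

The first call gives AATTT, AATTT, AACTT and AACTT, which matches the second block on the slide.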

Page 25: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap

2. Reconstruct a tree from each data set using the same method used for reconstructing the original tree

    11244 K
1 : AATTT…T
2 : AATTT…G
3 : AACTT…T
N : AACTT…T

    47789…K
1 : TTTAT…T
2 : TAACC…G
3 : TAACC…T
N : TGGGA…T

    15578…K
1 : AGGTA…T
2 : AGGAC…G
3 : AAAAC…A
N : AAAGG…C

[Figure: one tree over Sp1, Sp2, Sp3 and Sp4 for each of the three data sets]

Page 26: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap

3. For each node in our original tree, we count the number of times it appeared in the bootstrap analysis

[Figure: the three bootstrap trees over Sp1, Sp2, Sp3 and Sp4, and the original tree with node values 67% and 100%]

• The jackknife method is less general than bootstrap
• Jackknife explores the data differently
• Jackknife is easier to apply to complex sampling schemes

Page 27: Multiple Sequence Alignment (MSA) and Phylogeny

Step 3.5 - Bootstrap

Page 28: Multiple Sequence Alignment (MSA) and Phylogeny

Bootstrap values on NJPlot

Note: ClustalX saves trees as .ph files; trees with bootstrap values are saved as .phb files

You might have to reopen the tree…