A Bayesian method for DNA barcoding
Kasper Munch, Wouter Boomsma, Eske Willerslev, Rasmus Nielsen,
University of Copenhagen
Varieties of barcoding
• Assignment to existing species.
• Identification of new species.
• Assignment to taxonomic levels in general.
Motivation
1. Environmental aDNA samples.
2. Putative Neandertal DNA.
• Often short query sequences.
– Little information.
• Permissive PCR conditions.
– Not always from the intended locus.
Given a set of database reference sequences from different species, according to which criteria should we assign new query sequences to taxonomic levels?
True species assignment
• Requires proper population genetic analyses quantifying variability within species.
• Often not possible:
– small database sample size for each species.
– short query PCR products.
Phylogenetic alternative
- Purely phylogenetic criteria, which ignore population genetic problems.
- Taxonomic annotation of database sequences is used to map phylogenetic groups to taxonomic levels.
- The simpler approach has its own advantages: less data required and fewer assumptions.
[Figure: a tree with a monophyletic taxonomic group and the query. Is the query ingroup or outgroup?]
Estimating trees
• Estimation of a single tree is not sufficient because of the uncertainty regarding the phylogeny.
• We suggest instead using a Bayesian approach, which quantifies this uncertainty.
Bayesian approach
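• Let Q be the query sequence, X the database data, G a gene tree, and F a desired taxonomic group. Then

$$\Pr(Q \in F \mid X) = \int_G I(Q, F \text{ monophyletic in } G)\, p(G \mid X)\, dG \approx \frac{1}{k} \sum_{i=1}^{k} I(Q, F \text{ monophyletic in } G_i),$$

where $I$ is the indicator function and $G_i$ is the $i$th gene tree sampled from $p(G \mid X)$.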
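The Monte Carlo estimate on this slide is the fraction of sampled trees in which Q and F are monophyletic. A minimal Python sketch of that fraction follows. It rests on assumptions that are not in the slides: trees are rooted and written as nested tuples of leaf names, and "Q, F monophyletic in G" is read as "some clade of G contains exactly the query plus the database sequences of F". The function names and the toy trees are hypothetical.

```python
def clades(tree):
    """Return (leaf set of `tree`, leaf sets of every clade inside it)."""
    if isinstance(tree, str):            # a leaf is a clade of size one
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)                 # the subtree itself is a clade
    return leaves, found


def is_monophyletic(tree, query, group):
    """Indicator I(Q, F monophyletic in G), under the assumed reading."""
    target = frozenset(group) | {query}
    _, all_clades = clades(tree)
    return target in all_clades


def posterior_monophyly(trees, query, group):
    """Average the indicator over trees sampled from p(G | X)."""
    trees = list(trees)
    return sum(is_monophyletic(t, query, group) for t in trees) / len(trees)


# Toy "posterior sample" of four rooted trees (illustrative only).
sampled_trees = [
    ((("Q", "a"), "b"), ("c", "d")),
    ((("a", "b"), "Q"), ("c", "d")),
    ((("a", "b"), "c"), ("Q", "d")),
    (("Q", "c"), (("a", "b"), "d")),
]
estimate = posterior_monophyly(sampled_trees, "Q", {"a", "b"})
print(estimate)
```

In the toy sample the query groups with {a, b} in two of the four trees, so the estimate is 0.5. A real analysis would parse the trees sampled by the phylogenetic software instead of hand-written tuples.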
Assignment pipeline
Query sequence → NCBI blast against Database (GenBank): retrieval of sequences and taxonomy annotation → Homology set → ClustalW → Alignment → MrBayes → Sampled trees → Summary statistics → Taxonomy summary
Summary statistics
• For each tree:
– Find the sister group to the query.
– Find the list of taxonomic levels shared by the sequences in the sister group (consensus taxonomy).
[Figure: tree showing the query and its sister group.]
• For each name of each taxonomic level:
– Find the fraction of sampled trees where the consensus taxonomy includes that name.
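The two summary-statistic steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: trees are rooted nested tuples of leaf names, the sister group is taken to be all leaves under the query's sibling subtrees, and the taxonomy annotation is a hypothetical dictionary mapping each database sequence to its names at each level. All function names and sequence labels are made up for the example.

```python
from collections import Counter


def leaves(tree):
    """Set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    out = set()
    for child in tree:
        out |= leaves(child)
    return out


def sister_group(tree, query):
    """Leaves of the query's sibling subtrees, or None if the query is absent."""
    if isinstance(tree, str):
        return None
    if query in tree:                    # query is a direct child of this node
        out = set()
        for child in tree:
            if child != query:
                out |= leaves(child)
        return out
    for child in tree:
        found = sister_group(child, query)
        if found is not None:
            return found
    return None


def consensus_taxonomy(group, taxonomy):
    """(level, name) pairs shared by every sequence in the group."""
    shared = None
    for leaf in group:
        names = set(taxonomy[leaf].items())
        shared = names if shared is None else shared & names
    return shared or set()


def taxonomy_summary(trees, query, taxonomy):
    """Fraction of sampled trees whose consensus taxonomy includes each name."""
    counts = Counter()
    n = 0
    for tree in trees:
        n += 1
        counts.update(consensus_taxonomy(sister_group(tree, query) or (), taxonomy))
    return {key: count / n for key, count in counts.items()}


# Hypothetical annotation and a toy sample of four trees (illustrative only).
taxonomy = {
    "poa1": {"order": "Poales", "family": "Poaceae", "genus": "Poa"},
    "poa2": {"order": "Poales", "family": "Poaceae", "genus": "Poa"},
    "jun1": {"order": "Poales", "family": "Juncaceae", "genus": "Juncus"},
    "pin1": {"order": "Coniferales", "family": "Pinaceae", "genus": "Pinus"},
}
trees = [
    ((("Q", "poa1"), "poa2"), ("jun1", "pin1")),
    (("Q", ("poa1", "poa2")), ("jun1", "pin1")),
    (("Q", ("poa1", "jun1")), ("poa2", "pin1")),
    (("Q", "pin1"), (("poa1", "poa2"), "jun1")),
]
summary = taxonomy_summary(trees, "Q", taxonomy)
for (level, name), fraction in sorted(summary.items()):
    print(level, name, fraction)
```

With these toy trees the order Poales is supported in three of four trees (0.75), while the family Poaceae and the genus Poa are each supported in two (0.5). Support naturally grows toward higher taxonomic levels, because a sister group that mixes families can still share an order.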
Example taxonomy summary
Environmental Samples
• 379 environmental samples (aDNA)
• rbcL and trnL markers.
• Aim is the identification of environmental flora.
Orders assigned with >90% posterior probability
Asterales Brassicales Caryophyllales Coniferales
Dipsacales Ericales Fabales Fagales
Lamiales Lepidoptera Malpighiales Poales
Pottiales Ranunculales Rosales Sapindales
Saxifragales Solanales Zingiberales
Families assigned with >90% posterior probability
Amaranthaceae Asteraceae Betulaceae Brassicaceae
Caprifoliaceae Caryophyllaceae Ericaceae Fabaceae
Fagaceae Juncaceae Musaceae Papaveraceae
Pinaceae Plantaginaceae Poaceae Rosaceae
Rutaceae Salicaceae Saxifragaceae Solanaceae
Taxaceae Theaceae
Genera assigned with >90% posterior probability
Achillea Alnus Aruncus Cerastium
Fagus Musa Picea Pinus
Plantago Poa Saxifraga Symphoricarpos
Taxus
Botanical evaluation
Temperate climate, similar to central Sweden.
Testing putative Neandertal DNA
• Needless to say, we have had several negative examples.
• One positive example:
– Posterior probability of 91%.
Problems
• No population genetic modelling:
– Outgroup problem.
– Species issues are not addressed.
– Lineage sorting: not reciprocal monophyly.
• Incomplete database.
Advantages
• Phylogenetic uncertainty and statistical uncertainty of assignment are addressed.
• Posterior probability of assignment.
• Alternative to single tree assignment.
• Can be used on any database.
Conclusions
• Phylogenetic barcoding does not model the coalescence process.
• It is the appropriate method for assignment with little data, or when assigning to higher taxonomic levels.
• The Bayesian approach offers a measure of confidence in assignment.