Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database
![Page 1: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/1.jpg)
Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database Module
Hilmar Lapp and Todd Vision
National Evolutionary Synthesis Center (NESCent)
![Page 2: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/2.jpg)
Rich diversity of online data repositories
![Page 3: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/3.jpg)
Most data is not online
Clark J.R. et al. (2008) A Comparative Study in Ancestral Range Reconstruction Methods: Retracing the Uncertain Histories of Insular Lineages. Systematic Biology, 57(5), 693-707
Syst. Biol. Data Archive
![Page 4: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/4.jpg)
Little standards support
![Page 5: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/5.jpg)
Accelerating knowledge dissemination: A Story
• Jane and her lab have accumulated molecular data to resolve the phylogeny of a certain clade of frogs, many of which are endangered species.
• Her group assembles a multiple alignment and reconstructs the phylogeny using a variety of methods, some developed by her lab, resulting in 1000s of trees.
• The results show overwhelming support for several new branch points. The results are interesting and solid enough to be useful for others working on those species.
![Page 9: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/9.jpg)
• Jane downloads and installs PhyloDOM, a freely available open source software package. The software creates a database, and Jane uses the programs that come with it to import all her data.
• As a result, Jane’s lab now has a web interface to her results that others can use to query for novel topologies and to explore her data.
• Her lab also updates the database from their ongoing work, and uses it to add provenance data and links to protocols, publications, and taxonomic concepts.
![Page 13: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/13.jpg)
• Other researchers easily download and integrate her results in their own analyses.
• Even where Jane used new methods, other software understands the meaning of the metadata and can take advantage of it.
• Before long, her results appear in data aggregators such as iSpecies, EOL, or Scratchpads, along with those from other labs.
• Jane herself uses the LifeMap widget to map her trees onto geo-coordinates and to link branches to ecological and biodiversity parameters of the respective areas.
![Page 18: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/18.jpg)
How to get there?
[Architecture diagram; labels recovered from the slide:]
• Data types: Phylogenetic Trees (Gene, Species); ITIS, NCBI Taxonomies; Taxonomies; Character Data; Molecular Data (Sequences, Annotation); Metadata (Evolutionary, Biodiversity, Computational); Ontologies
• Phylogenetic Database supporting ontologies and arbitrary metadata (PhyloDB / BioSQL)
• Precompute; Query Optimization
• Data loading tools (BioSQL)
• Data Management Tools
• Language binding for database model (BioPerl, Biojava, Biopython, Bioruby)
• Parser libraries for data and semantics standards (NeXML, CDAO)
• Middleware: Query & Persistence Management
• Data and other services API (PhyloWS) supporting exchange standards (NeXML, CDAO)
• Topology-oriented Queries
• Client-based Query Interfaces
• Embeddable Tools (PhyloWidget, GBrowse TreeWidget)
• Data Aggregators, Mash-up Applications
![Page 19: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/19.jpg)
Achieving the Vision: Coordinated & open development, nurturing & harnessing existing efforts
![Page 20: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/20.jpg)
Database: PhyloDB module
[Schema diagram; tables and the attributes shown on the slide:]
• Tree: Name, Identifier, Is_Rooted
• Node: Label, Left_Idx, Right_Idx
• Edge
• Node_Path: Distance
• Tree_Root: Is_Alternate, Significance
• Tree_Qualifier_Value, Node_Qualifier_Value, Edge_Qualifier_Value: Value, Rank
• Tree_Dbxref, Node_Dbxref
• Node_Taxon: Rank
• Node_Bioentry: Rank
• Linked BioSQL tables: Biodatabase, Bioentry, Taxon, Term, Ontology, Dbxref
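The Left_Idx and Right_Idx attributes on Node suggest a nested-set encoding, in which every node stores the interval spanned by its subtree so that topology-oriented queries reduce to range comparisons. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. The table definitions, column names, function names, and taxon labels are simplified assumptions for illustration, not the actual PhyloDB DDL or any of its tools.

```python
import sqlite3

# Hypothetical, simplified tables loosely modelled on the Node and Edge
# boxes of the schema slide; this is not the actual PhyloDB DDL.
SCHEMA = """
CREATE TABLE node (
    node_id   INTEGER PRIMARY KEY,
    label     TEXT,
    left_idx  INTEGER,
    right_idx INTEGER
);
CREATE TABLE edge (
    child_node_id  INTEGER REFERENCES node(node_id),
    parent_node_id INTEGER REFERENCES node(node_id)
);
"""


def load_tree(conn, edges, root):
    """Insert (parent, child) label pairs and assign nested-set indexes."""
    conn.executescript(SCHEMA)
    labels = {root} | {p for p, _ in edges} | {c for _, c in edges}
    ids = {}
    for label in sorted(labels):
        cur = conn.execute("INSERT INTO node (label) VALUES (?)", (label,))
        ids[label] = cur.lastrowid
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        conn.execute("INSERT INTO edge VALUES (?, ?)", (ids[child], ids[parent]))

    counter = 0

    def visit(label):
        # Depth-first walk: left_idx is set on entry, right_idx on exit.
        nonlocal counter
        counter += 1
        left = counter
        for child in children.get(label, []):
            visit(child)
        counter += 1
        conn.execute(
            "UPDATE node SET left_idx = ?, right_idx = ? WHERE node_id = ?",
            (left, counter, ids[label]),
        )

    visit(root)
    conn.commit()


def descendants(conn, label):
    """Labels of all nodes strictly below `label`, in depth-first order."""
    rows = conn.execute(
        """
        SELECT d.label
        FROM node AS a
        JOIN node AS d
          ON d.left_idx > a.left_idx AND d.right_idx < a.right_idx
        WHERE a.label = ?
        ORDER BY d.left_idx
        """,
        (label,),
    )
    return [r[0] for r in rows]


def common_ancestor(conn, label_a, label_b):
    """Most recent common ancestor: the tightest interval enclosing both."""
    row = conn.execute(
        """
        SELECT anc.label
        FROM node AS anc, node AS a, node AS b
        WHERE a.label = ? AND b.label = ?
          AND anc.left_idx <= MIN(a.left_idx, b.left_idx)
          AND anc.right_idx >= MAX(a.right_idx, b.right_idx)
        ORDER BY anc.left_idx DESC
        LIMIT 1
        """,
        (label_a, label_b),
    ).fetchone()
    return row[0] if row else None
```

For example, after loading the edges root→frogA, root→clade1, clade1→frogB, and clade1→frogC into an in-memory database, `descendants(conn, "clade1")` lists the two tips of that clade and `common_ancestor(conn, "frogB", "frogC")` returns `clade1`, without any recursive traversal at query time. The trade-off of this design is that inserting or moving nodes requires renumbering the indexes, which is why such values are typically precomputed after loading.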
![Page 23: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/23.jpg)
Semantics: CDAO
http://www.evolutionaryontology.org
![Page 24: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/24.jpg)
Service API: PhyloWS
http://evoinfo.nescent.org/PhyloWS
![Page 25: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/25.jpg)
Embeddable tools:
![Page 26: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/26.jpg)
Community-owned, reusable software
![Page 27: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/27.jpg)
Nurturing the community
![Page 28: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/28.jpg)
Phyloinformatics Hackathon, Dec 2006
![Page 29: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/29.jpg)
• James Estill (U. Georgia): “A Perl-based Command Line Interface to a Topological Query Application for BioSQL in Support of High Throughput Classification and Analysis of LTR Retrotransposons in Plant Genomes”
![Page 30: Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database](https://reader034.fdocuments.us/reader034/viewer/2022051313/548fe949b47959763e8b4e3c/html5/thumbnails/30.jpg)
Acknowledgments
• Phyloinformatics Hackathon participants
• BioHackathon 2008 participants
• EvoInformatics Working Group participants
• Google Summer of Code Students: Jamie Estill
• Sponsors & support:
• NESCent
• BioSynC
• TDWG
• DBCLS, CBRC (Japan)