Open Tree of Life @NSF
Karen Cranston
National Evolutionary Synthesis Center
@kcranstn
http://www.slideshare.net/kcranstn
opentreeoflife.org
What does it mean to “have” the tree of life?
complete & dynamic
browse, download, query
use for research questions
implies digital access
[Figure: Phylogeny papers, 1978-2008. Number of papers published per year (y-axis, 0 to 12000) by year (x-axis). Source: ISI Web of Science.]
Rapid increase in applications of phylogeny, beginning in early 1990s
graph from David Hillis
Goals
1. Synthesize a complete draft tree of life from existing phylogenies
2. Release in year 1 with:
a. engaging public interface
b. ability to upload new data, explore conflict, see provenance
c. open data: tree, subtrees and source data
Graph databases of taxonomy + source trees
• filter / weight input trees
• combine into synthetic trees
• feedback
• input new data sets
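The filter, weight and combine steps listed above can be sketched in a few lines. This is only a toy illustration, not the project's synthesis method: the tree format (a dict with `weight` and `edges`), the function names `build_graph` and `synthesize`, and the "keep the best-supported parent" rule are all assumptions made for the example.

```python
from collections import defaultdict

def build_graph(source_trees, min_weight=0.0):
    """Merge source trees into one graph.

    Each (parent, child) edge accumulates the weights of the trees that
    contain it. Trees below min_weight are filtered out first.
    """
    support = defaultdict(float)
    for tree in source_trees:
        if tree["weight"] < min_weight:  # filter step
            continue
        for parent, child in tree["edges"]:
            support[(parent, child)] += tree["weight"]  # weight step
    return dict(support)

def synthesize(graph):
    """Resolve conflict by keeping, for each child, its best-supported parent."""
    best = {}
    for (parent, child), weight in graph.items():
        if child not in best or weight > best[child][1]:
            best[child] = (parent, weight)
    return {child: parent for child, (parent, _) in best.items()}

# Hypothetical input: two trees that disagree on where B and C attach,
# plus a low-weight tree that the filter drops.
trees = [
    {"weight": 2.0, "edges": [("root", "n1"), ("n1", "A"), ("n1", "B"), ("root", "C")]},
    {"weight": 1.0, "edges": [("root", "n1"), ("n1", "A"), ("n1", "C"), ("root", "B")]},
    {"weight": 0.1, "edges": [("root", "A")]},
]

graph = build_graph(trees, min_weight=0.5)
synthetic = synthesize(graph)
```

Because the graph keeps every edge along with its support, conflicting placements stay visible after the merge, and the synthetic tree is just one way of reading the graph.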
~ 4% of all published phylogenetic trees
Stoltzfus et al 2012
Inputs: Phylogenetic data
Archiving sequence data is a community norm
assembly, alignment, inference
expertise, time, $$$
[Figure: page 3 of 6 from Wiegmann et al., PNAS Early Edition (Evolution section), showing Fig. 1, the combined molecular phylogenetic tree for Diptera calculated in RAxML.]
Why do we need to database phylogenetic trees?
Heroic data collection efforts
Surveyed >7000 phylogenetic studies in plants, fungi, animals and unicellular organisms
Result: repository of data for >2300 studies, >4800 trees
Remaining data not available digitally
Manuscript accepted to PLoS Biology
Inputs: Taxonomy
Large fraction of species not represented in phylogenies
taxonomy provides backbone & coverage at tips
Need name resolution services for data cleaning
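As a rough illustration of what name resolution for data cleaning involves, the sketch below normalizes tip labels and matches them against a taxonomy list, first exactly and then approximately. It assumes a tiny in-memory list of names; the function names `normalize` and `resolve` and the 0.85 cutoff are hypothetical, and a real service would use a full taxonomy and handle synonyms and authorship strings.

```python
import difflib

def normalize(name):
    """Lower-case, turn underscores into spaces, collapse repeated whitespace."""
    return " ".join(name.replace("_", " ").split()).lower()

def resolve(label, taxonomy, cutoff=0.85):
    """Return the taxonomy name matching a tree tip label, or None.

    Tries an exact match on the normalized form first, then falls back
    to the closest approximate match at or above the cutoff.
    """
    index = {normalize(taxon): taxon for taxon in taxonomy}
    key = normalize(label)
    if key in index:
        return index[key]
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None

# Hypothetical taxonomy and tip labels as they might appear in a source tree.
taxonomy = ["Homo sapiens", "Drosophila melanogaster", "Saccharomyces cerevisiae"]

exact = resolve("Homo_sapiens", taxonomy)
fuzzy = resolve("Drosophila  melanogastr", taxonomy)
missing = resolve("Xyz abc", taxonomy)
```

Labels that resolve to nothing are worth flagging for manual review rather than guessing.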
Process
Source trees (Phylografter) + Taxonomies (taxomachine)
→ Data storage & synthesis (treemachine)
→ OpenTree: visualization, search, download
Source tree management
phylografter.opentreeoflife.org
Source tree & taxonomy synthesis
Novel graph database for phylogenies (treemachine) and taxonomy (taxomachine)
Allows for efficient storage and retrieval
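To make "storage and retrieval" concrete, here is a minimal in-memory sketch of a tree stored as a graph, with three typical queries: the lineage of a node, the most recent common ancestor of two nodes, and the tips below a node. It assumes nothing about how treemachine or taxomachine are built; the class `TreeGraph` and its methods are hypothetical names for illustration only.

```python
class TreeGraph:
    """Toy graph store for a rooted tree, kept as parent and child maps."""

    def __init__(self):
        self.parent = {}    # child -> parent
        self.children = {}  # node -> list of children

    def add_edge(self, parent, child):
        self.parent[child] = parent
        self.children.setdefault(parent, []).append(child)
        self.children.setdefault(child, [])

    def lineage(self, node):
        """Path from a node up to the root, starting with the node itself."""
        path = [node]
        while path[-1] in self.parent:
            path.append(self.parent[path[-1]])
        return path

    def mrca(self, a, b):
        """Most recent common ancestor of two nodes, or None if unrelated."""
        ancestors = set(self.lineage(a))
        for node in self.lineage(b):
            if node in ancestors:
                return node
        return None

    def tips(self, node):
        """All leaf nodes below (or equal to) the given node."""
        kids = self.children.get(node, [])
        if not kids:
            return [node]
        return [tip for kid in kids for tip in self.tips(kid)]

g = TreeGraph()
g.add_edge("Opisthokonta", "Metazoa")
g.add_edge("Opisthokonta", "Fungi")
g.add_edge("Metazoa", "Homo sapiens")
g.add_edge("Metazoa", "Drosophila melanogaster")
g.add_edge("Fungi", "Saccharomyces cerevisiae")

ancestor = g.mrca("Homo sapiens", "Saccharomyces cerevisiae")
subtree_tips = g.tips("Metazoa")
```

Subtree extraction follows the same pattern: start at a node and walk the child map.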
OpenTree
dev.opentreeoflife/opentree
Public tree of life
publictreeoflife.com/tree
open data: requiring CC0 license on source trees
open source software: https://github.com/OpenTreeOfLife
wiki: http://opentree.wikispaces.com/ (52 members)
public mailing list (67 members)
“Open” Tree of Life
Community engagement
~50 visitors per day to blog.opentreeoflife.org
@opentreeoflife on Twitter (~900 followers)
Tree of Life symposium: Evolution 2013
Hackathon in year 2 (joint with Arbor)
Collaborations
providing images and text for public tree
developing methods for subtree extraction
summer student providing links to ToLWeb pages
treeviz project from U Indiana MOOC, upcoming summer intern
year 2-3 plans for data archiving / harvest
Assessment: PI survey
general satisfaction with progress on data collection, synthesis and software development
more focus on incentives for users
more integration across labs
Assessment: Advisory board
Members:
David Hillis (UT Austin)
Jan Reichelt (Mendeley)
Andy Sinauer (Sinauer Associates)
Planning meeting for start of year 2
On track for year 1 release
1. Synthesize a complete draft tree of life from existing phylogenies
2. Release in year 1 with:
a. engaging public interface
b. ability to upload new data, explore conflict, see provenance
c. open data: tree, subtrees and source data
Goals for year 2
Refine draft tree based on user feedback
Empirical use cases drive development
Incentives for users / data contributors
Collaboration with external projects (AVAToL, ToLWeb, Phylotastic, Dryad)
opentreeoflife.org