OpenTree at NESCent Academy 2012
A community-assembled, continually updated evolutionary history of all life
Karen A. Cranston
National Evolutionary Synthesis Center
Duke University
[Figure: Phylogeny papers, 1978–2008. Number of papers published per year (axis 0 to 12,000). Source: ISI Web of Science.]
Rapid increase in applications of phylogeny, beginning in early 1990s
Where can I browse, search and download the
tree of life?
You can’t. (Yet)
DATA AVAILABILITY
~4% of all published phylogenetic trees
High archival rate of sequence data
[Figure: page 3 of 6 from Wiegmann et al., PNAS Early Edition, showing Fig. 1, the combined molecular phylogenetic tree for Diptera (partitioned ML analysis calculated in RAxML, with bootstrap support marked on nodes).]
Most trees published as (beautiful) figures
in PDF files
not reusable!
Wiegmann et al. PNAS, 2011
Pictures of independent phylogenies
• Ideas Lab = 5-day workshop
• Self-assembly into groups
• Pitched pre-proposals at end of lab
• NSF invited full proposals
opentreeoflife.org
Karen Cranston, lead PI (Duke)
Gordon Burleigh (Florida)
Keith Crandall (BYU)
Karl Gude (MSU)
David Hibbett (Clark)
Mark Holder (Kansas)
Laura Katz (Smith)
Rick Ree (FMNH)
Stephen Smith (Michigan)
Doug Soltis (Florida)
Tiffani Williams (TAMU)
AVAToL: Assembling, Visualizing and Analyzing the Tree of Life
• 1.8 million named species
• Millions more unnamed / undiscovered
Tree of life
COMPARATIVE BIOLOGY
Modified from Garland and Carter, 1994
Conventional statistics assume:
Evolutionary trees provide:
PHYLOGENETIC PLACEMENT
Metagenomic reads +
Reference phylogeny
Kembel et al 2011
1. Build the first complete draft tree of life
2. Engage the community in refinement and annotation
3. Promote a culture of data sharing through software products
4. Develop novel methods for phylogenetic synthesis
+ taxonomies of living and extinct species
+ any digital phylogenetic data we can get:
• NSF Assembling the Tree of Life projects
• recent high-profile phylogenies
• ribosomal RNA trees for Bacteria and Archaea
• TreeBASE and Dryad trees
Graph database holding a ‘cloud’ of thousands of input trees with millions of nodes
Filter / weight input data (number of taxa, size of alignment, year of publication, etc)
Synthesis (supertrees, grafting)
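The filter / weight step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `InputTree` record, the `min_taxa` threshold, and the weighting formula are all hypothetical and are not taken from the Open Tree of Life software. The sketch uses the three criteria named on the slide (number of taxa, size of alignment, year of publication).

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class InputTree:
    """Metadata for one input tree (hypothetical record, not a project schema)."""
    study: str             # label for the source study
    num_taxa: int          # number of tips in the tree
    alignment_length: int  # number of alignment sites behind the tree
    year: int              # year of publication


def weight(tree, newest_year=2012):
    """Illustrative weight: larger and more recent studies score higher.

    The formula is an assumption made for this sketch only.
    """
    age = max(newest_year - tree.year, 0)
    size = math.log1p(tree.num_taxa) * math.log1p(tree.alignment_length)
    return size / (1 + age)


def rank_inputs(trees, min_taxa=4, newest_year=2012):
    """Drop trees with fewer than min_taxa tips, then sort by descending weight."""
    kept = [t for t in trees if t.num_taxa >= min_taxa]
    return sorted(kept, key=lambda t: weight(t, newest_year), reverse=True)


if __name__ == "__main__":
    # Made-up studies, for demonstration only.
    studies = [
        InputTree("small", 3, 1000, 2011),
        InputTree("recent", 50, 2000, 2010),
        InputTree("older", 50, 2000, 2000),
    ]
    for t in rank_inputs(studies):
        print(t.study, round(weight(t), 2))
```

The ranked list would then feed whichever synthesis method is chosen (supertrees or grafting); the sketch deliberately stops before that step.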
Graph database holding a ‘cloud’ of thousands of input trees with millions of nodes
• filter input trees
• synthesize into summary trees
• compare to previous trees
• invite annotation
• input new data sets
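One way to picture a ‘cloud’ of input trees is an index from each clade to the input trees that contain it. The sketch below is an assumption-laden toy: it swaps the graph database for a plain Python dictionary, represents trees as nested tuples with placeholder tip names, and is not the project's data model.

```python
from collections import defaultdict


def clades(tree):
    """Return the tip set under every internal node of a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of children.
    """
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(walk(child) for child in node))
        found.append(tips)
        return tips

    walk(tree)
    return found


def build_cloud(input_trees):
    """Map each clade to the set of input-tree names that contain it."""
    cloud = defaultdict(set)
    for name, tree in input_trees.items():
        for clade in clades(tree):
            cloud[clade].add(name)
    return cloud


if __name__ == "__main__":
    # Placeholder topologies; the tip names A-D stand for no real taxa.
    inputs = {
        "t1": (("A", "B"), ("C", "D")),
        "t2": (("A", "B"), "C", "D"),
        "t3": (("A", "C"), ("B", "D")),
    }
    cloud = build_cloud(inputs)
    for clade, sources in sorted(cloud.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(clade), sorted(sources))
```

Keeping the source names on every clade is what makes the later steps possible in principle: conflicting clades stay visible side by side, and a summary tree can cite the inputs that support each grouping.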
Add citations
Flag as disputed
Upload alternative
Request reanalysis
Tree image modified from Tree of Life Web Project page http://tolweb.org/Nymphalidae/12172. Pictures by Katja Schulz (queen butterfly; CC Attribution-NonCommercial) and Charles Lam (via Flickr; CC Attribution-ShareAlike)
Flag
Get citations
Annotate
Upload alternate
Ability to annotate and improve
Clear links to source data and methods
Compare your results with synthetic tree
Lonicera ciliosa, Heptacodium miconioides, Diervilla rivularis, Valeriana celtica, Viburnum densiflorum
Lonicera ciliosa
Heptacodium miconioides
Diervilla rivularis
Viburnum densiflorum
Valeriana celtica
http://www.evoio.org/wiki/Phylotastic
NESCent hackathon to architect and implement a phylogenetic pruning service for megatrees
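The core operation behind a pruning service, reducing a megatree to the subtree induced by a list of species, can be sketched as follows. This is a minimal illustration, not the Phylotastic implementation: the nested-tuple tree format, the function names, and the placeholder tip labels are all assumptions made for the example.

```python
def prune(tree, keep):
    """Return the subtree induced by the tip names in keep.

    Leaves are strings; internal nodes are tuples of children.
    Returns None when no requested tip is found under this node.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]  # collapse a node left with a single child
    return tuple(children)


def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string (topology only)."""
    def fmt(node):
        if isinstance(node, str):
            return node.replace(" ", "_")
        return "(" + ",".join(fmt(child) for child in node) + ")"
    return fmt(tree) + ";"


if __name__ == "__main__":
    # Placeholder megatree; the labels A-H stand for no real taxa.
    megatree = ((("A", "B"), "C"), (("D", "E"), ("F", ("G", "H"))))
    wanted = {"A", "C", "F", "H"}
    print(to_newick(prune(megatree, wanted)))
```

A real service would also have to resolve the submitted names against a taxonomy and carry branch lengths through collapsed nodes; both are left out here.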
YEAR 2 & 3: SMART GENERATION OF FIGURES FOR PUBLICATION
• Semantic annotation layers
• Collaborative editing
• Integrated submission of topology, branch lengths and annotations to archives
YEAR 2 & 3: AUTOMATIC UPDATING
update trees with new sequence data
detect and incorporate newly published trees
Community assembly of the tree of life (Open Tree of Life)
Next generation Phenomics (PI O’Leary)
Arbor: Comparative Analysis Workflows (PI Harmon)
POTENTIAL IMPACTS
• Phylogenies for any set of species easily available
• Benchmark for current state of phylogenetic knowledge
• Increasing rate of data archiving
• Placing “dark taxa” in global informatics framework
BIGGEST CHALLENGES?
• Lack of digitally-available trees
• Visualization
• Engaging community to annotate and update
• Producing usable and visually appealing software