Open Tree of Life @NSF

Karen Cranston, National Evolutionary Synthesis Center, @kcranstn, http://www.slideshare.net/kcranstn, opentreeoflife.org

description

Presentation about Open Tree of Life given at NSF, May 2013

Transcript of Open Tree of Life @NSF

Page 1: Open Tree of Life @NSF

Karen Cranston, National Evolutionary Synthesis Center

@kcranstn, http://www.slideshare.net/kcranstn

opentreeoflife.org

Page 2: Open Tree of Life @NSF

What does it mean to “have” the tree of life?

complete & dynamic

browse, download, query

use for research questions

implies digital access

Page 3: Open Tree of Life @NSF

[Figure: Phylogeny papers, 1978-2008. Number of papers published per year (y-axis, 0 to 12000) against year (x-axis, 1978 to 2008). Source: ISI Web of Science.]

Rapid increase in applications of phylogeny, beginning in early 1990s

graph from David Hillis

Page 4: Open Tree of Life @NSF

Goals

1. Synthesize a complete draft tree of life from existing phylogenies

2. Release in year 1 with:

a. engaging public interface

b. ability to upload new data, explore conflict, see provenance

c. open data: tree, subtrees and source data

Page 5: Open Tree of Life @NSF

Graph databases of taxonomy + source trees

• filter / weight input trees

• combine into synthetic trees

• feedback

• input new data sets
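The pipeline on this slide can be illustrated with a toy sketch. This is not treemachine's algorithm; the names `SourceTree`, `build_graph` and `synthesize`, the numeric weights, and the "heaviest parent wins" rule are all assumptions made for illustration. It only shows the general idea of loading a taxonomy and source trees into one graph, filtering and weighting the inputs, and reading a single synthetic tree back out while keeping provenance on every edge.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SourceTree:
    name: str            # study or taxonomy identifier (hypothetical)
    parent_of: dict      # maps each child node to its parent node
    weight: float = 1.0  # curator-assigned weight (an assumption)


def build_graph(trees, min_weight=0.0):
    """Load every tree's edges into one graph, recording the source of each edge."""
    graph = defaultdict(list)  # (child, parent) -> [(source name, weight), ...]
    for tree in trees:
        if tree.weight < min_weight:  # filter step: skip low-weight inputs
            continue
        for child, parent in tree.parent_of.items():
            graph[(child, parent)].append((tree.name, tree.weight))
    return graph


def synthesize(graph):
    """For each child, keep the parent whose edge has the largest total weight."""
    support = defaultdict(dict)  # child -> {parent: summed weight}
    for (child, parent), sources in graph.items():
        support[child][parent] = sum(weight for _, weight in sources)
    return {child: max(parents, key=parents.get)
            for child, parents in support.items()}


taxonomy = SourceTree("taxonomy", {"Homo": "Primates", "Pan": "Primates",
                                   "Mus": "Rodentia"}, weight=1.0)
study = SourceTree("study_A", {"Homo": "Hominini", "Pan": "Hominini",
                               "Hominini": "Primates"}, weight=2.0)

graph = build_graph([taxonomy, study])
synthetic = synthesize(graph)
print(synthetic["Homo"])             # parent chosen for Homo
print(graph[("Homo", "Hominini")])   # provenance for that edge
```

In this sketch the taxonomy supplies a parent for taxa that no study covers (`Mus`), while the higher-weighted study overrides the taxonomy where the two conflict. A real synthesis method would also have to map nodes between trees and guard against cycles, which this toy version ignores.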

Page 6: Open Tree of Life @NSF

Inputs: Phylogenetic data

Archiving sequence data is a community norm

Archived trees: ~4% of all published phylogenetic trees (Stoltzfus et al 2012)

Page 7: Open Tree of Life @NSF

assembly, alignment, inference

expertise, time, $$$

[Slide image: page 3 of Wiegmann et al., PNAS Early Edition, showing Fig. 1, "Combined molecular phylogenetic tree for Diptera", a partitioned ML analysis calculated in RAxML.]

Why do we need to database phylogenetic trees?

Page 8: Open Tree of Life @NSF

Heroic data collection efforts

Surveyed >7000 phylogenetic studies in plants, fungi, animals and unicellular organisms

Result: repository of data for >2300 studies, >4800 trees

Remaining data not available digitally

Manuscript accepted to PLoS Biology

Page 9: Open Tree of Life @NSF

Inputs: Taxonomy

Large fraction of species not represented in phylogenies

taxonomy provides backbone & coverage at tips

Need name resolution services for data cleaning
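As a rough illustration of what name resolution for data cleaning involves, the sketch below normalizes a tip label from a source tree and tries to match it against a set of accepted taxonomy names. It is a minimal sketch, not the service the project uses: the function names `normalize` and `resolve`, the synonym table, and the similarity cutoff are hypothetical, and fuzzy matching comes from Python's standard `difflib` module.

```python
import difflib


def normalize(name):
    """Replace underscores, collapse whitespace, and fix capitalization."""
    parts = name.replace("_", " ").split()
    if not parts:
        return ""
    # Genus capitalized, remaining epithets lower case
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])


def resolve(name, accepted, synonyms=None, cutoff=0.85):
    """Return the accepted name for a tip label, or None if nothing matches.

    accepted: set of accepted taxonomy names (hypothetical input)
    synonyms: dict mapping known synonyms to accepted names (hypothetical)
    cutoff:   minimum difflib similarity for a fuzzy match (an assumption)
    """
    cleaned = normalize(name)
    synonyms = synonyms or {}
    if cleaned in accepted:      # exact match after cleaning
        return cleaned
    if cleaned in synonyms:      # known synonym
        return synonyms[cleaned]
    close = difflib.get_close_matches(cleaned, sorted(accepted), n=1, cutoff=cutoff)
    return close[0] if close else None


accepted = {"Homo sapiens", "Pan troglodytes", "Panthera leo"}
synonyms = {"Felis leo": "Panthera leo"}

for label in ["homo_sapiens", "Pan troglodites", "Felis leo", "Xyz abc"]:
    print(label, "->", resolve(label, accepted, synonyms))
```

Labels that come back as `None` would need manual curation. A fuzzy match can also be wrong, so a real workflow would likely flag such matches for review rather than accept them silently.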

Page 10: Open Tree of Life @NSF

Process

Source trees (Phylografter)

Taxonomies (taxomachine)

Data storage & synthesis (treemachine)

OpenTree: visualization, search, download

Page 11: Open Tree of Life @NSF

Source tree management

phylografter.opentreeoflife.org

Page 12: Open Tree of Life @NSF

Source tree & taxonomy synthesis

Novel graph database for phylogenies (treemachine) and taxonomy (taxomachine)

Allows for efficient storage and retrieval

Page 13: Open Tree of Life @NSF

OpenTree

dev.opentreeoflife/opentree

Page 14: Open Tree of Life @NSF

Public tree of life

publictreeoflife.com/tree

Page 15: Open Tree of Life @NSF

“Open” Tree of Life

open data: requiring CC0 license on source trees

open source software: https://github.com/OpenTreeOfLife

wiki: http://opentree.wikispaces.com/ (52 members)

public mailing list (67 members)

Page 16: Open Tree of Life @NSF

Community engagement

~50 visitors per day to blog.opentreeoflife.org

@opentreeoflife on Twitter (~900 followers)

Tree of Life symposium: Evolution 2013

Hackathon in year 2 (joint with Arbor)

Page 17: Open Tree of Life @NSF

Collaborations

providing images and text for public tree

developing methods for subtree extraction

summer student providing links to ToLWeb pages

treeviz project from U Indiana MOOC, upcoming summer intern

year 2-3 plans for data archiving / harvest

Page 18: Open Tree of Life @NSF

Assessment: PI survey

general satisfaction with progress on data collection, synthesis and software development

more focus on incentives for users

more integration across labs

Page 19: Open Tree of Life @NSF

Assessment: Advisory board

Members:

David Hillis (UT Austin)

Jan Reichelt (Mendeley)

Andy Sinauer (Sinauer Associates)

Planning meeting for start of year 2

Page 20: Open Tree of Life @NSF

On track for year 1 release

1. Synthesize a complete draft tree of life from existing phylogenies

2. Release in year 1 with:

a. engaging public interface

b. ability to upload new data, explore conflict, see provenance

c. open data: tree, subtrees and source data

Page 21: Open Tree of Life @NSF

Goals for year 2

Refine draft tree based on user feedback

Empirical use cases drive development

Incentives for users / data contributors

Collaboration with external projects (AVAToL, ToLWeb, Phylotastic, Dryad)

Page 22: Open Tree of Life @NSF

opentreeoflife.org