EuKRef. A community effort towards phylogenetic-based curation of ribosomal databases for...

Post on 14-Aug-2015

A community effort towards phylogenetic-based curation of ribosomal databases for environmental sequencing

Javier del Campo

Laura Parfrey

Motivation

Integrate expert views on taxonomy into public database resources

Improve resources for high throughput sequence annotation

• Encourage experts in protist taxonomy to curate and validate a ribosomal DNA marker gene database for eukaryotic lineages across the tree of life.

• Synthesize the efforts of individual curators to produce a phylogenetically curated ribosomal DNA marker gene database for eukaryotes.

• Use the improved reference database to characterize the environmental distribution of eukaryotic microbes from large-scale HTES datasets.

Aims

1) HTES 18S rDNA sequence retrieval

2) Reference database annotation

3) Community analysis using classification

High-throughput environmental sequence (HTES) analysis of eukaryotes

Starting reference database 18S phylogeny

Use the phylogeny to improve classification

Reference database 18S phylogeny after curation

Integrate environmental metadata

Where was this sequence isolated?

• Fresh water or marine?

• Aerobic or anoxic?

• Host information? (symbiotic clades)
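The metadata questions above could be captured as one small record per reference sequence. The following is a minimal Python sketch, not the project's actual schema: the class name `EnvMetadata`, its field names, the helper `count_by_habitat`, and the example values are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvMetadata:
    """Hypothetical per-sequence environmental record (field names are assumptions)."""
    accession: str                  # identifier of the reference sequence
    habitat: Optional[str] = None   # e.g. "freshwater" or "marine"
    oxygen: Optional[str] = None    # e.g. "aerobic" or "anoxic"
    host: Optional[str] = None      # host organism, for symbiotic clades


def count_by_habitat(records):
    """Count records per habitat; records without a habitat are tallied as 'unknown'."""
    return Counter(r.habitat or "unknown" for r in records)


# Made-up example records, for illustration only.
records = [
    EnvMetadata("seq1", habitat="marine", oxygen="aerobic"),
    EnvMetadata("seq2", habitat="freshwater", oxygen="anoxic"),
    EnvMetadata("seq3", habitat="marine", host="hypothetical host"),
    EnvMetadata("seq4"),
]
habitat_counts = count_by_habitat(records)
```

Leaving every field optional reflects that many public sequences carry incomplete isolation information.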

A manually curated reference DB: the opisthos example

[Figure: HTS reads, axis from 0 to 3500]

Outputs for each group

• Set of curated sequences

– Files with chimeric sequences and short sequences

• Alignment of these sequences

• Phylogenetic tree

• Database

– Full classification (unlimited ranks)

– Environmental metadata
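As an illustration of how short sequences might be set aside in their own file, here is a minimal Python sketch. It is not the EukRef pipeline itself: the function names `parse_fasta` and `split_by_length` and the 500-base default cutoff are assumptions, and chimera detection is not attempted because it needs a dedicated tool.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_by_length(records, min_len=500):
    """Split records into (kept, short); min_len is an assumed cutoff, not the project's."""
    kept = [r for r in records if len(r[1]) >= min_len]
    short = [r for r in records if len(r[1]) < min_len]
    return kept, short


# Toy input with a tiny cutoff so the split is easy to see.
example = ">a\nACGT\nACGT\n>b\nAC\n"
parsed = parse_fasta(example)
kept, short = split_by_length(parsed, min_len=5)
```

Each of the two lists could then be written to its own file, so the short sequences stay available for inspection rather than being discarded.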

• Open access after 1 year embargo (if desired)

18S reference DB curation pipeline (simplified)

18S reference DB curation pipeline

Workshop Outputs

• A refined curation pipeline, associated computational tools, and curation instructions

• Reference databases for individual lineages

• Synthesis of classification for each group

After the workshop…

• Continued curation.

• Recruit new curators using refined tools.

• Coordinate with other groups.

• Integrate data from different curation efforts (into a cohesive database).

• Data sharing and distribution.

Acknowledgments

Thank you!

Advisers and participants