Page 1:

DNA Barcodes: Linking GenBank records to Museum Specimens

David E. Schindel, Executive Secretary, CBOL

Robert Hanner, University of Guelph

Page 2:

Washington Airport Gate 3

Dulles, National, or Baltimore-Washington?

2 concourses at BWI: concourse A or B?

3 concourses at National

4 Dulles concourses

Page 3:

The Controlled Vocabulary of Airport Codes

Page 4:

Biodiversity Informatics: What Connects the Parts?

Journal Publication

Species Name

Specimen

??

Page 5:

DNA Barcodes: A Key Variable for Biodiversity Informatics

Journal Publication

Species Name

Voucher Specimen

Barcode Sequence

Authority files of taxonomic names

Museum databases of associated data

Databases of species occurrences and distribution (OBIS)

Page 6:

A DNA barcode is a short gene sequence taken from standardized portions of the genome, used to identify species

Page 7:

The Mitochondrial Genome

[Figure: map of the mitochondrial genome showing the H-strand and L-strand, with labels for D-Loop, Cyt b, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunit 6, ATPase subunit 8, small ribosomal RNA, and large ribosomal RNA.]

Page 8:

• Promote barcoding as a global standard

• Build participation

• Working Groups

• BARCODE standard

• International Conferences

• Increase production of public BARCODE records

Projects, Networks, Organizations

Page 9:

Uses of DNA Barcodes

• Research tool for improving species-level taxonomy:
  • Associating all life history stages, genders
  • Testing species boundaries, finding new variants

• Applied tool for identifying regulated species:
  • Disease vectors, agricultural pests, invasives
  • Environmental indicators, protected species
  • Using minimal samples, damaged specimens, gut contents, droppings

• “Triage” tool for flagging potential new species:
  • Undescribed and cryptic species
  • Taxonomic groups with few morphological features

Page 10:

Adoption by Regulators

• US Federal Aviation Administration – All Birds

• US Environmental Protection Agency
  • $250K pilot test, water quality bioassessment

• US Food and Drug Administration
  • Reference barcodes for commercial fish

• FISH-BOL and fish regulatory agencies
  • CBOL workshop in Taipei, September 2007

• FAO International Plant Protection Commission
  • Proposal for Diagnostic Protocols for fruit flies

• CITES, National Agencies, Conservation NGOs
  • International Steering Committee, identifying pilot projects

Page 12:

International Nucleotide Sequence Database Collaboration

http://www.insdc.org/

Page 13:

Direct Submission to GenBank

Page 14:

BOLD Data System

• Developed/hosted by Univ. Guelph

• Workbench for assembling data

• 300,000 records from 30,000 species

• Management and Analysis System

• Identification system for matching unknowns to reference records

• Uploading to GenBank
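
Matching an unknown to reference records can be illustrated with a small sketch: compare the unknown sequence against each reference and report the closest one. This is not BOLD's actual algorithm. The uncorrected p-distance, the function names, and the toy sequences are all assumptions made for illustration, and the sequences are assumed to be already aligned and of equal length.

```python
from __future__ import annotations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_reference(unknown: str, references: dict[str, str]) -> tuple[str, float]:
    """Return (label, distance) for the reference nearest to the unknown."""
    return min(
        ((label, p_distance(unknown, seq)) for label, seq in references.items()),
        key=lambda item: item[1],
    )


# Toy data invented for this example (not real barcode sequences).
references = {
    "Species A": "ACGTACGTAC",
    "Species B": "ACGTTCGAAA",
}
label, dist = closest_reference("ACGTACGTAA", references)
print(label, dist)
```

A production identification system would also need a distance threshold for reporting "no match", which this sketch leaves out.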

Page 15:

Barcode of Life Data Systems (BOLD)

Page 16:

BARCODE Data Standards

• Consensus results of Front Royal meeting
  • GBIF, ITIS, GRIN, NBII, Species2000, IPNI, ICZN, ZooRecord, OBIS

• Structured link to voucher specimen

• Species name selected from authority

• Trace files, primers, and quality scores

• Minimum sequence length
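
The four record-level requirements on this slide read naturally as a checklist. A minimal sketch follows, assuming a hypothetical in-memory record. The class, field names, and `min_length` parameter are illustrative and do not come from the BARCODE standard itself; the slide does not state the actual minimum length, so the caller has to supply it.

```python
from dataclasses import dataclass, field


@dataclass
class BarcodeRecord:
    # Hypothetical fields that mirror the slide's bullets, not a real schema.
    sequence: str
    species_name: str
    voucher: str  # structured link to the voucher specimen
    trace_files: list = field(default_factory=list)
    primers: list = field(default_factory=list)
    quality_scores: list = field(default_factory=list)


def missing_elements(record, authority_names, min_length):
    """Return the checklist items the record fails (empty list means it passes)."""
    problems = []
    if not record.voucher:
        problems.append("structured link to voucher specimen")
    if record.species_name not in authority_names:
        problems.append("species name selected from authority")
    if not (record.trace_files and record.primers and record.quality_scores):
        problems.append("trace files, primers, and quality scores")
    if len(record.sequence) < min_length:
        problems.append("minimum sequence length")
    return problems


# Placeholder values invented for the example.
authority = {"Genus alpha"}
good = BarcodeRecord("ACGT" * 5, "Genus alpha", "NHM:LEP:123456",
                     ["trace.ab1"], ["fwd", "rev"], [40])
bad = BarcodeRecord("ACGT", "Unknown sp.", "")
print(missing_elements(good, authority, min_length=20))
print(missing_elements(bad, authority, min_length=20))
```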

Page 17:

Barcode Sequence

Voucher Specimen

Species Name

Specimen Metadata

Literature (link to content or citation)

BARCODE Records in INSDC

Indices - Catalogue of Life - GBIF/ECAT

Nomenclators - Zoo Record - IPNI - NameBank

Publication links - New species

Georeference

Habitat

Character sets

Images

Behavior

Other genes

Trace files

Primers

Other Databases - Phylogenetic - Pop’n Genetics - Ecological

Databases - Provisional sp.

Page 19:

Link from GenBank to Museums

Page 21:

Linkout from GenBank to BOLD

Page 22:

Linkout from GenBank to Taxonomy

Page 23:

Barcode Sequence

Voucher Specimen

Species Name

Specimen Metadata

Literature (link to content or citation)

BARCODE Records in INSDC

Indices - Catalogue of Life - GBIF/ECAT

Nomenclators - Zoo Record - IPNI - NameBank

Publication links - New species

Georeference

Habitat

Character sets

Images

Behavior

Other genes

Trace files

Primers

Other Databases - Phylogenetic - Pop’n Genetics - Ecological

Databases - Provisional sp.

Page 24:

Structured Link to Vouchers

Institutional Acronym : Collection Code : Catalog ID

Page 25:

Structured Link to Vouchers

NHM:LEP:123456

personal:DHJanzen:SRNP12345
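
The two voucher links on this slide follow a three-part pattern: institutional acronym, collection code, and catalog ID, separated by colons. A minimal parsing sketch, assuming exactly three non-empty fields; the `VoucherLink` type and `parse_voucher` function are hypothetical names introduced here, not part of any GenBank or BOLD tool.

```python
from typing import NamedTuple


class VoucherLink(NamedTuple):
    institution: str  # institutional acronym, e.g. "NHM"
    collection: str   # collection code, e.g. "LEP"
    catalog_id: str   # catalog ID within that collection


def parse_voucher(text: str) -> VoucherLink:
    """Split 'Institution:Collection:CatalogID' into its three parts."""
    parts = [p.strip() for p in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three colon-separated fields, got {text!r}")
    return VoucherLink(*parts)


museum = parse_voucher("NHM:LEP:123456")
private = parse_voucher("personal:DHJanzen:SRNP12345")
print(museum)
print(private)
```

Keeping the three parts as separate fields, rather than one free-text string, is what lets a sequence record be resolved to a specific specimen in a specific collection.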

Page 26:

NCBI’s Biorepository List

• Compiled from literature sources, GenBank submissions

• 6,936 institutions

• 1,177 institutions with non-unique acronyms

• 660 homonymous acronyms
  • 514 shared by two institutions
  • 146 shared by three institutions
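
Counts like these come from grouping institutions by acronym and looking for acronyms that more than one institution uses. A minimal sketch of that grouping, using invented registry entries; the function name and the sample data are assumptions, and the real list's format is not shown in the slides.

```python
from collections import Counter, defaultdict


def homonymous_acronyms(institutions):
    """Map each acronym used by more than one institution to those institutions."""
    by_acronym = defaultdict(list)
    for acronym, name in institutions:
        by_acronym[acronym.upper()].append(name)
    return {a: names for a, names in by_acronym.items() if len(names) > 1}


# Hypothetical registry entries, invented for the example.
registry = [
    ("XYZ", "Example Museum One"),
    ("XYZ", "Example Museum Two"),
    ("ABC", "Example Herbarium"),
    ("QRS", "Example Collection A"),
    ("QRS", "Example Collection B"),
    ("QRS", "Example Collection C"),
]

clashes = homonymous_acronyms(registry)
# Tally how many acronyms are shared by two institutions, by three, and so on.
shared_by = Counter(len(names) for names in clashes.values())
print(clashes)
print(shared_by)
```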

Page 27:

CBOL/GBIF/NCBI Registry of Biorepositories

www.biorepositories.org

Page 28:

www.biorepositories.org

Page 32:

Long-term data curation of BARCODE records

Data records assembled in BOLD

IDs consistent with other records?

Compliant with BARCODE standards?

Data records released on INSDC

Data records published in BOLD

Community feedback

Update records (audit trail of species names retained)

CBOL control of BARCODE flag

GenBank adds BARCODE flag
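
The "audit trail of species names retained" step can be pictured as a record that never overwrites a name without first storing the old one. The sketch below is one way to do that, under assumptions: the class, its fields, and the placeholder names are hypothetical, and the slides do not describe how BOLD or INSDC actually store the trail.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedRecord:
    record_id: str
    species_name: str
    # Earlier names with the reason for each change, oldest first.
    name_history: list = field(default_factory=list)

    def update_species_name(self, new_name: str, reason: str) -> None:
        """Change the name but keep the previous one in the audit trail."""
        if new_name == self.species_name:
            return  # nothing to record
        self.name_history.append((self.species_name, reason))
        self.species_name = new_name


# Placeholder identifiers and names, invented for the example.
record = CuratedRecord("rec-1", "Genus alpha")
record.update_species_name("Genus beta", "community feedback")
print(record.species_name)
print(record.name_history)
```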