BioDBCore: Current Status and Next Developments
![Page 1: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/1.jpg)
Pascale Gaudet Chair, International Society for Biocuration Scientific Manager, neXtProt, SIB Swiss Institute of Bioinformatics
BioDBCore: Current status and future developments
![Page 2: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/2.jpg)
International Society for Biocuration: Mission statement
• Define and promote the work of biocurators
• Foster connections with user communities to ensure that databases and accompanying tools meet specific user needs
• Promote communication and exchanges between curators: meetings, workshops
• Encourage best practices by providing documentation on standards and annotation procedures
![Page 3: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/3.jpg)
The need
• Databases: improve data integration from published papers
• Journals: link to database objects
• Researchers: identify resources
• Grant submitters: enforce data sharing plans
![Page 4: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/4.jpg)
Goals
1) Gather information required to provide a general overview of the database landscape and compare the various resources
2) Encourage consistency and interoperability
3) Promote the use of standards
4) Provide guidance for users
5) Maximize the collective impact of the resources
![Page 5: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/5.jpg)
BioDBcore group organization
• Led by Pascale Gaudet (ISB/SIB) and Philippe Rocca-Serra (BioSharing)
• Guidelines proposed in 2011 paper
• Implemented in 2012 NAR database issue
![Page 6: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/6.jpg)
Use cases
• Show all resources of type database which use MIMARKS guidelines
• Show all resources where John Smith is involved
• Show all resources for mouse phenotypes
• Where can I submit my data?
and also:
• Guidance for grants' data sharing policies
• Improving integration of data from papers into databases
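The first four use cases amount to simple filters over structured resource descriptions. The sketch below illustrates that idea only; the `Resource` class, its field names, and the three records are hypothetical placeholders invented for this example, not real BioDBCore entries or the project's actual query interface.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical, simplified resource description (not the BioDBCore schema)."""
    name: str
    resource_type: str                      # e.g. "database" or "tool"
    standards: list = field(default_factory=list)
    contacts: list = field(default_factory=list)
    keywords: list = field(default_factory=list)
    accepts_submissions: bool = False


# Placeholder records, made up purely to exercise the queries below.
RESOURCES = [
    Resource("ExampleDB-A", "database", ["MIMARKS"], ["John Smith"],
             ["microbial diversity"], True),
    Resource("ExampleDB-B", "database", ["Gene Ontology"], ["Jane Doe"],
             ["mouse", "phenotypes"], False),
    Resource("ExampleTool-C", "tool", ["MIMARKS"], ["John Smith"],
             ["sequence analysis"], False),
]


def databases_using(standard, resources=RESOURCES):
    """Show all resources of type database which use a given standard."""
    return [r.name for r in resources
            if r.resource_type == "database" and standard in r.standards]


def resources_involving(person, resources=RESOURCES):
    """Show all resources where a given person is listed as a contact."""
    return [r.name for r in resources if person in r.contacts]


def resources_about(*keywords, resources=RESOURCES):
    """Show all resources tagged with every one of the given keywords."""
    return [r.name for r in resources
            if all(k in r.keywords for k in keywords)]


def submission_targets(resources=RESOURCES):
    """Where can I submit my data? Resources that accept submissions."""
    return [r.name for r in resources if r.accepts_submissions]


print(databases_using("MIMARKS"))
print(resources_involving("John Smith"))
print(resources_about("mouse", "phenotypes"))
print(submission_targets())
```

Exact string matching is used here for brevity; a real registry would more plausibly match against controlled identifiers for standards, people, and keywords.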
![Page 7: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/7.jpg)
Collaborative philosophy
• Many groups/resources have been providing registries and lists of databases
• Often not funded, not maintained
• BioDBCore seeks to collaborate with all interested parties to work together to provide a more permanent solution to database descriptions
![Page 8: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/8.jpg)
BioDBcore: Participating groups
• BioDB100
• BioSharing
• BioCatalogue
• Bioinformatics Links Directory
• Biositemaps
• CASIMIR
• MIBBI
• MIRIAM
• Model Organism Databases
• NIF registry
• … and your group!
![Page 9: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/9.jpg)
BioDBCore descriptors
1. Database name
2. Main resource URL
3. Contact information (e-mail; postal mail)
4. Date resource established (year)
5. Conditions of use (Free, or type of license)
6. Scope: data types captured, curation policy, standards used
7. Standards: MIs, Data formats, Terminologies
8. Taxonomic coverage
9. Data accessibility/output options
10. Data release frequency
11. Versioning policy and access to historical files
12. Documentation available
13. User support options
14. Data submission policy
15. Relevant publications
16. Resource's Wikipedia URL
17. Tools available
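As a data structure, the 17 descriptors can be pictured as one flat record per database. The following is a minimal sketch, assuming free-text values for most fields; the class name `BioDBCoreRecord`, the Python field names, and the helper `missing_descriptors` are illustrative choices, not part of the BioDBCore specification. The four values filled in are the dictyBase ones given in this talk.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BioDBCoreRecord:
    """One field per BioDBCore descriptor (field names are hypothetical)."""
    database_name: str = ""                  # 1
    main_resource_url: str = ""              # 2
    contact_information: str = ""            # 3
    year_established: Optional[int] = None   # 4
    conditions_of_use: str = ""              # 5
    scope: str = ""                          # 6
    standards: str = ""                      # 7
    taxonomic_coverage: str = ""             # 8
    data_accessibility: str = ""             # 9
    release_frequency: str = ""              # 10
    versioning_policy: str = ""              # 11
    documentation: str = ""                  # 12
    user_support: str = ""                   # 13
    submission_policy: str = ""              # 14
    publications: str = ""                   # 15
    wikipedia_url: str = ""                  # 16
    tools: str = ""                          # 17


def missing_descriptors(record):
    """Return the names of descriptors that are still empty."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in ("", None)]


# Partially filled record; the remaining descriptors are left empty on purpose.
record = BioDBCoreRecord(
    database_name="dictyBase",
    main_resource_url="http://dictybase.org",
    year_established=2003,
    conditions_of_use="Free",
)

print(missing_descriptors(record))
```

A completeness check of this kind is one simple way a submission form could tell a database provider which descriptors remain to be filled in.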
![Page 10: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/10.jpg)
Database name: dictyBase
Main resource URL: http://dictybase.org
Contact information: [email protected]
Date resource established (year): 2003
Conditions of use: Free
Scope, data types captured: Genome sequence; gene models including CDS and predicted proteins; Phenotypes, Gene Ontology annotations, Functional annotation (gene product names), Gene nomenclature; Strains; Plasmids; Free text descriptions, Domains (via InterPro), Orthologs (via OrthoMCL and inParanoid), Protein subcellular location (via Swiss-Prot); Protein existence (via Swiss-Prot), Citations, Researchers database
![Page 11: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/11.jpg)
Curation policy: manual curation
Standards (MIs, Data formats, Terminologies): Gene Ontology, Dicty Anatomy Ontology, Dicty Gene Nomenclature
Data formats: FASTA, OBO, GAF, GFF3 (standard)
Taxonomic coverage (use NCBI Taxid): D. discoideum (44689) including all strains [PRIMARY], also some genome/EST/gene model info for D. purpureum (5786), and gene model sequences for P. pallidum (13642) and D. fasciculatum (261658)
Data accessibility/output options: HTML, text, database reports
Data release frequency: curators work on the 'live' database, weekly data dumps (sequences) or monthly (other data)
Versioning policy/access to historical files: no versioning but access to historical files is possible
![Page 12: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/12.jpg)
Documentation available: http://dictybase.org/FAQ/HelpFilesIndex.html
User support options: documents, email, webform
Data submission policy: Data from published literature. Some HTP data corresponding to published analyses is incorporated
Relevant publications: PMID: 18974179, PMID: 14681427
Resource's Wikipedia URL: http://en.wikipedia.org/wiki/DictyBase
Tools available: BLAST, BioMart, Generic Genome Browser, TextPresso, MetaCyc (dictyCyc)
![Page 13: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/13.jpg)
Implementation of BioDBCore at BioSharing (Many thanks to Philippe RS !)
![Page 14: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/14.jpg)
BioDBcore announcement
Published in Nucleic Acids Research database issue 2011 and in the DATABASE journal
![Page 15: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/15.jpg)
Implementation plan
• Goal: BioDBCore data public and linked
• Community-aware approach: reuse existing vocabularies and resources
• Current data model: RDF based on categories from Biositemaps, MIRIAM, NIF, Dublin Core, Darwin Core
• Defined extension mechanisms
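To make the RDF idea concrete, here is a rough sketch of how a few descriptors could be written as N-Triples by reusing existing vocabularies. It is an assumption-laden illustration, not the actual BioDBCore data model: the subject URI under `example.org` is made up, and the mapping of descriptors to the Dublin Core terms `title`, `created` and `rights` and to FOAF `homepage` is only one plausible choice. The values are the dictyBase ones from this talk.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"


def uri(value):
    """Format a URI as an N-Triples IRI reference."""
    return f"<{value}>"


def literal(value):
    """Format a value as a plain N-Triples string literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(subject, record):
    """Serialize a few descriptors; the property mapping is an assumption."""
    statements = [
        (DCTERMS + "title", literal(record["name"])),
        (FOAF + "homepage", uri(record["url"])),
        (DCTERMS + "created", literal(record["year"])),
        (DCTERMS + "rights", literal(record["conditions_of_use"])),
    ]
    return "\n".join(f"{uri(subject)} {uri(p)} {o} ." for p, o in statements)


# Hypothetical subject URI; descriptor values taken from the dictyBase example.
triples = to_ntriples(
    "http://example.org/biodbcore/dictybase",
    {
        "name": "dictyBase",
        "url": "http://dictybase.org",
        "year": 2003,
        "conditions_of_use": "Free",
    },
)
print(triples)
```

Reusing widely known properties in this way is what lets other registries consume the records without first learning a BioDBCore-specific vocabulary.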
![Page 16: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/16.jpg)
www.biodbcore.org
![Page 17: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/17.jpg)
Example BioDBCore entry (1/2)
![Page 18: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/18.jpg)
Example BioDBCore entry (2/2)
![Page 19: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/19.jpg)
Creating, editing, maintaining entries
• Until now: records are manually created from data provided by NAR at publication of the Database issue and by the Life Sciences Registry (Michel Dumontier and Nick Juty)
- Those mostly come as xls files that need to be manually entered
- Close to 200 records have been entered out of over 2,000 obtained
![Page 20: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/20.jpg)
Beyond maintenance at BioSharing
Ideally, database providers would keep their BioDBCore record up to date
• Claim ownership
- A database provider can now (in theory) maintain their own BioDBCore record
Encouraging best practices
• DATABASE and Nucleic Acids Research journals: Editors in chief request BioDBCore information from submitters
• ISB seal of approval
• BioDB100, launched at InCoB 2011: examples of 100 well-annotated databases
![Page 21: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/21.jpg)
What's next?
• Continue to extend participating groups and journals
• Refine scope
• Integrate semantic support
• Develop querying system
• Implement validation tests
• Set up mechanisms for exchange of data among collaborating groups (in BioDBCore RDF format, or other)
![Page 22: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/22.jpg)
Identifying or developing semantic support
• Policies and guidelines: BioSharing
• Publications and taxon info: identifiers.org
• Authors: ORCID (will also implement organizations)
• Keywords/database scope: NIF when possible
Identifying resources is preferable to developing them!
![Page 23: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/23.jpg)
For BioHackathon 2013
• Evaluate need for BioDBCore in today's landscape of metadatabase resources
• Evaluate further collaboration opportunities
• Set up a better system for creating and maintaining BioDBCore records
• Identify/develop ontologies pertinent to BioDBCore
![Page 24: BioDBCore: Current Status and Next Developments](https://reader034.fdocuments.us/reader034/viewer/2022042623/54c69fc14a7959a0798b4580/html5/thumbnails/24.jpg)
Acknowledgements
Philippe Rocca-Serra, Susanna-Assunta Sansone, Eamonn Maguire, Alejandra Gonzalez Beltran
International Society for Biocuration
Michael Galperin, David Landsman, Francis Ouellette
OXFORD UNIVERSITY PRESS
collaborators