EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
Web service demo for EURISCO
GBIF Tools and Darwin Core extension for germplasm
N.I. Vavilov Research Institute of Plant Industry (VIR), April 26th – 29th 2010, St Petersburg, Russian Federation
Dag Endresen, Jonas Nordling, Nordic Genetic Resource Center (NordGen)
Topics for this session
• Web service installations for EURISCO
• Overview of the current project
• Darwin Core and the extension for germplasm
• GBIF informatics tools
• Integrated Publishing Toolkit (IPT)
• Distributed datasets
Possible Upgraded PGR Network Model
The gene bank dataset is shared from the holding gene bank.
The National Inventory (NI) endorses all national gene banks (and eventually individual accessions) for EURISCO.
ECPGR Crop databases can access passport data from EURISCO and additional crop specific data from the genebank IPT interface.
Standard data sharing tools ensure that the genebank dataset is available to other relevant decentralized thematic, regional or global networks.
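The flow described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (datasets stay with the holding gene bank, and the central catalogue takes in only what the National Inventory has endorsed). The class name, the function name, the accession numbers and the institute code "XXX999" are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A passport dataset published by the gene bank that holds the material."""
    institute_code: str                      # e.g. "RUS001" for VIR
    accessions: list = field(default_factory=list)


def harvest(datasets, endorsed_institutes):
    """Collect accessions only from gene banks endorsed by a National Inventory."""
    catalogue = []
    for dataset in datasets:
        if dataset.institute_code not in endorsed_institutes:
            continue  # not endorsed, so it is not passed on to the central catalogue
        for accession in dataset.accessions:
            catalogue.append((dataset.institute_code, accession))
    return catalogue


# Invented sample data: one endorsed gene bank and one that is not endorsed.
datasets = [
    Dataset("RUS001", ["ACC-1", "ACC-2"]),
    Dataset("XXX999", ["ACC-9"]),
]
catalogue = harvest(datasets, endorsed_institutes={"RUS001"})
print(catalogue)
```

Endorsement is modelled here per gene bank. Endorsing individual accessions, as the slide anticipates, would need a finer-grained check.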
Objectives of the EURISCO demo project
Evaluate the GBIF decentralized architecture
Install the IPT for 8 genebanks in Europe that, as far as possible, are also EURISCO/ECPGR partners.
Test the registration of the IPT installations through the GBIF registry, the Global Biodiversity Resources Discovery System (GBRDS).
Test the Harvesting and Indexing Toolkit (HIT) installation for the EURISCO platform (Bioversity HQ, Rome).
Project runs until 20 December 2010.
2010 : IPT installations for EURISCO
• EURISCO
• NordGen (Nordic)
• Bioversity-Montpellier (France)
• IPK Gatersleben (Germany)
• BLE (Germany)
• WUR CGN (The Netherlands)
• CRI (Czech Republic)
• VIR (Russian Federation)
• SeedNET (Balkan)
• Baltic (Estonia, Latvia, Lithuania)
2005 : BioCASE demo
http://chm.grinfo.net/
Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community (TDWG, GBIF).
Potential of the GBIF technology
http://data.gbif.org/datasets/network/2
Darwin Core
The purpose of DwC terms is to facilitate data sharing:
• a well-defined standard core vocabulary
• a flexible framework to maximize re-usability
The Darwin Core can be extended by adding new terms to share additional information.
Approved as a TDWG standard in 2009
“The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, and samples, and related information.”
http://rs.tdwg.org/dwc/
http://code.google.com/p/darwincore-germplasm
http://rs.nordgen.org/dwc
DwC extension for germplasm
DwC Germplasm : DRAFT 0.1 : August 26, 2009
• “MCPD in Darwin Core”
• Maintained by gene banks worldwide
• Additional terms to describe germplasm samples
• Includes the new terms for crop trait experiments developed as part of the European EPGRIS3 project
• Includes a few additional terms for new international crop treaty regulations
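A minimal sketch of how core Darwin Core terms and extension terms can sit side by side in one record, told apart only by their namespace. The Darwin Core namespace below is the published one. The germplasm namespace is a placeholder, and the term name "biologicalStatus", the accession number and the values are assumptions made for the example, not taken from the draft extension.

```python
DWC = "http://rs.tdwg.org/dwc/terms/"
# Placeholder namespace for the germplasm extension (an assumption, not the real URI).
GERMPLASM = "http://example.org/germplasm/terms/"

record = {
    DWC + "institutionCode": "RUS001",
    DWC + "catalogNumber": "ACC-1",               # invented accession number
    DWC + "genus": "Triticum",
    GERMPLASM + "biologicalStatus": "landrace",   # hypothetical extension term
}


def split_by_namespace(record):
    """Separate core Darwin Core terms from terms added by an extension."""
    core, extension = {}, {}
    for term, value in record.items():
        if term.startswith(DWC):
            core[term[len(DWC):]] = value
        else:
            # Keep only the local name after the last slash.
            extension[term.rsplit("/", 1)[-1]] = value
    return core, extension


core, extension = split_by_namespace(record)
print(core)
print(extension)
```

A consumer that only understands the core vocabulary can ignore the extension part and still use the record.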
Mapping of DwC-G terms to the MCPD descriptors
(EURISCO data exchange format)
Mapping of DwC-G terms to the MCPD descriptors (continued)
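The mapping tables themselves are not reproduced in this transcript. As a rough illustration of what such a mapping does, the sketch below renames a few MCPD fields to Darwin Core terms. The pairings are an assumption for the example, not the table from the slides, and the sample row is invented.

```python
# Illustrative pairing of MCPD field names with core Darwin Core terms.
# This is an assumption for the example, not the mapping shown on the slides.
MCPD_TO_DWC = {
    "INSTCODE": "institutionCode",
    "ACCENUMB": "catalogNumber",
    "COLLNUMB": "recordNumber",
    "GENUS": "genus",
    "SPECIES": "specificEpithet",
    "SPAUTHOR": "scientificNameAuthorship",
    "COLLSITE": "locality",
    "ELEVATION": "minimumElevationInMeters",
}


def mcpd_to_dwc(row):
    """Rename MCPD fields to Darwin Core terms; return (mapped, unmapped)."""
    mapped, unmapped = {}, {}
    for field_name, value in row.items():
        if value in ("", None):
            continue  # skip empty descriptors
        target = MCPD_TO_DWC.get(field_name)
        if target is not None:
            mapped[target] = value
        else:
            # No core term in this sketch; a germplasm extension term would be needed.
            unmapped[field_name] = value
    return mapped, unmapped


# Invented sample passport row.
row = {
    "INSTCODE": "RUS001",
    "ACCENUMB": "ACC-1",
    "GENUS": "Triticum",
    "SPECIES": "aestivum",
    "SAMPSTAT": "300",
    "REMARKS": "",
}
mapped, unmapped = mcpd_to_dwc(row)
print(mapped)
print(unmapped)
```

Descriptors that end up in `unmapped` are the ones an extension for germplasm is meant to cover.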
MCPD -> ABCD 2.06 (2004) for BioCASE
• National Inventory Code
• Institute Code
• Accession Number
• Collecting Number
• Collecting Institute Code
• Genus
• Species
• Species Authority
• „Subtaxa“
• „Subtaxa“ Authority
• Common Crop Name
• Accession Name
• Acquisition Date
• Country of Origin
• Location of Collection Site
• Latitude of CS
• Longitude of CS
• Elevation of CS
• Collecting Date of Sample
• Breeding Institute Code
• Biological Status of Accession
• Ancestral Data
• Collecting/Acquisition Source
• Donor Institute Code
• Donor Accession Number
• Other Identification (Number) associated with the accession
• Location of Safety Duplicates
• Type of Germplasm Storage
• Remarks
• Decoded Collecting Institute
• Decoded Breeding Institute
• Decoded Donor Institute
• Decoded Safety Duplication Location
• Accession URL
Helmut Knüpffer, IPK Gatersleben
Walter Berendsohn, BGBM
http://www.ecpgr.cgiar.org/epgris/Tech_papers/EURISCO_Descriptors.pdf
GBIF Informatics Suite
GBIF tools to empower decentralized thematic or regional networks
Darwin Core extension for germplasm makes these tools usable for crop gene banks.
A tool for data publishers.
A simple mechanism to share primary biodiversity data following the Darwin Core standard.
Open-source, Java-based web application.
Provides a local tool for data quality assessment, etc.
Integrated Publishing Toolkit (IPT)
• Embeds its own database
• Multilingual
• Has a user management feature based on roles, which allows for multiple data managers to share a common instance
• Manages multiple data sources
• Several upload options: relational database management systems or data files
• Public web interface allows for data browsing and full text search
• Customised detail pages
The IPT user interface includes the germplasm extension
XML interface includes the germplasm extension
VIR (RUS001) Passport data
VIR (RUS001) Crop departments
Global Crop Registries
European EURISCO Catalog
European ECPGR Crop Databases
VIR (RUS001) Passport data
VIR Crop dataset
Global Crop Registries
EURISCO
ECPGR Crop Databases
Same dataset available from multiple information systems...
?!
VIR (RUS001) Passport data
VIR Crop dataset
Global Crop Registries
EURISCO
ECPGR Crop Databases
Resolvable persistent identifiers can direct the user to the publisher of the primary dataset (official original dataset)
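A small sketch of the idea: if every copy of a record carries the same persistent identifier, copies found in different information systems can be recognised as one record and traced back to the primary publisher. The identifier and the lookup table are invented for the example. In practice the identifier would be resolved over the network rather than looked up in a dictionary.

```python
from collections import defaultdict

# Invented identifier shared by all copies of one accession record.
PID = "urn:lsid:example.org:accessions:ACC-1"

# The same record as seen in three information systems.
copies = [
    {"id": PID, "system": "EURISCO"},
    {"id": PID, "system": "ECPGR Crop Databases"},
    {"id": PID, "system": "VIR"},
]

# Hypothetical resolver table: identifier -> publisher of the primary dataset.
PRIMARY_PUBLISHER = {PID: "VIR"}


def group_copies(copies):
    """Group the systems that expose a record by its persistent identifier."""
    by_id = defaultdict(list)
    for copy in copies:
        by_id[copy["id"]].append(copy["system"])
    return dict(by_id)


def primary_publisher(identifier):
    """Return the publisher of the primary dataset, or None if unknown."""
    return PRIMARY_PUBLISHER.get(identifier)


print(group_copies(copies))
print(primary_publisher(PID))
```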
• The Persistent Identifier (PI) is a digital name tag
  – Also called Globally Unique Identifiers (GUID)
  – Life Science Identifiers (LSID) is one example
  – Digital Object Identifier (DOI) is another example
• The Persistent Identifier concept provides for the naming and identification of data resources stored in multiple, distributed data stores.
• Effective identification of data objects is essential for linking the world’s biodiversity data.
Persistent Identifier
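To make the two examples concrete: an LSID is a URN of the form urn:lsid:authority:namespace:object, with an optional revision, and a DOI is commonly resolved through the https://doi.org/ proxy. The sketch below parses the first and builds a resolver URL for the second. The sample identifiers are invented, and the function names are just for illustration.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text):
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: " + text)
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


def doi_url(doi):
    """Build the URL at which the doi.org proxy resolves a DOI."""
    if not doi.startswith("10."):
        raise ValueError("not a DOI: " + doi)
    return "https://doi.org/" + doi


# Invented sample identifiers.
lsid = parse_lsid("urn:lsid:example.org:accessions:ACC-1")
print(lsid.authority, lsid.namespace, lsid.object_id)
print(doi_url("10.1234/example"))
```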
Global crop collections
Moving towards… global integration of information
Threatened species
Migratory species
Spatial data
Global crop system
Crop standards
Legislation and regulations etc.
Crop collections in Europe
Genebank datasets
Special thanks to:
• GBIF, Global Biodiversity Information Facility http://www.gbif.org
• TDWG, Biodiversity Information Standards http://www.tdwg.org
• Bioversity International http://www.bioversityinternational.org
Things can happen in a band, or any type of collaboration, that would not otherwise happen. (Jim Coleman, Musician)