Global Information Systems for Plant Genetic Resources (2009)

Documentation of Genetic Resources: Global Information Systems. CAC Training Course, January 29, 2009, NordGen, Alnarp. Dag Terje Filip Endresen, Nordic Genetic Resource Center / Bioversity International

description

Global information systems for plant genetic resources. Prepared for the Caucasus germplasm network training course at the Nordic Genetic Resource Center (NordGen), Alnarp, Sweden, 29 January 2009.

Transcript of Global Information Systems for Plant Genetic Resources (2009)

Page 1: Global Information Systems for Plant Genetic Resources (2009)

CAC course at NordGen, Alnarp, January 2009

Documentation of Genetic Resources

Global Information Systems

CAC Training Course, January 29, 2009, NordGen, Alnarp

Dag Terje Filip Endresen, Nordic Genetic Resource Center / Bioversity International

Page 2: Global Information Systems for Plant Genetic Resources (2009)

Global PGR Information Systems for the CAC course at NordGen, January 29, 2009

TOPICS

Documentation of genetic resources:

Information systems

Data standards

Data exchange

Distributed data network

Page 3: Global Information Systems for Plant Genetic Resources (2009)

Global PGR Information Systems

Page 4: Global Information Systems for Plant Genetic Resources (2009)

Germplasm catalogues

The three large germplasm catalogues are indexed by the GBIF data portal

EURISCO is the data catalogue of the European genebanks (more than 1 000 000 accessions)

SINGER is the portal to the international CGIAR collections (more than 650 000 accessions)

USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (more than 400 000 accessions)

Page 5: Global Information Systems for Plant Genetic Resources (2009)

ECPGR (AEGIS, ECCDB, EURISCO)

A European Genebank Integrated System (AEGIS)

Sharing of responsibilities (Most Appropriate Accession; commonly agreed quality standards for ex situ conservation).

Conservation of the genetically unique and important accessions for Europe and making them available for breeding and research.

Four model crops: Allium, Avena, Brassica and Prunus species.

Membership in AEGIS is open to all European countries (ECPGR).

EURISCO and the Central Crop Databases play a key role in information management.

Page 6: Global Information Systems for Plant Genetic Resources (2009)

ECPGR Central Crop Databases

Page 7: Global Information Systems for Plant Genetic Resources (2009)

ECPGR Central Crop Databases

Page 8: Global Information Systems for Plant Genetic Resources (2009)

EURISCO [http://eurisco.ecpgr.org/]

EURISCO is the data catalogue of the European genebanks (more than 1 000 000 accessions from 35 European countries).

EURISCO holds accession-level data on 1 300 genera and 8 500 species.

EURISCO was released in September 2003 as a result of the EU-funded EPGRIS project.

EURISCO is hosted by Bioversity International on behalf of the ECPGR.

Page 9: Global Information Systems for Plant Genetic Resources (2009)

EURISCO (new layout)

Page 10: Global Information Systems for Plant Genetic Resources (2009)

Data flow from genebanks to EURISCO and ECCDBs

Page 11: Global Information Systems for Plant Genetic Resources (2009)

EPGRIS3 [http://www.epgris3.eu/]

EPGRIS3 is a volunteer, self-funded follow-up to the EU-funded EPGRIS project.

EPGRIS3 aims to improve the exchange of European genebank datasets and to further develop the IT infrastructure for genetic resources in Europe.

Page 12: Global Information Systems for Plant Genetic Resources (2009)

EPGRIS3 Wiki Environment

An EPGRIS3 wiki environment is hosted by NordGen. Please register and contribute to the discussions. [http://wwwdev.ngb.se/epgris3/]

Please contact one of the EPGRIS3 contact persons if you want to contribute to the EPGRIS3 project. [http://www.epgris3.eu/EPGRIS3contacts.htm]

Page 13: Global Information Systems for Plant Genetic Resources (2009)

SINGER [http://singer.grinfo.net/]

The System-wide Information Network for Genetic Resources (SINGER).

More than 650 000 accessions from the 12 international CGIAR organizations.

SINGER is hosted by Bioversity International on behalf of the CGIAR.

Page 14: Global Information Systems for Plant Genetic Resources (2009)

CGIAR [http://www.cgiar.org/]

AVRDC - The World Vegetable Center

Bioversity - Bioversity International

CIAT - Centro Internacional de Agricultura Tropical

CIMMYT - Centro Internacional de Mejoramiento de Maiz y Trigo

CIP - Centro Internacional de la Papa

ICARDA - International Center for Agricultural Research in the Dry Areas

ICRAF - The World Agroforestry Centre

ICRISAT - International Crops Research Institute for the Semi-Arid Tropics

IITA - International Institute of Tropical Agriculture

ILRI - International Livestock Research Institute

IRRI - International Rice Research Institute

WARDA - The Africa Rice Center

Page 15: Global Information Systems for Plant Genetic Resources (2009)

GCP: Generation Challenge Programme [http://www.generationcp.org/]

The GCP Mission: To use advanced genomics science and plant genetic diversity to overcome complex agricultural bottlenecks that condemn millions of the world’s neediest people to a future of poverty and hunger.

The GCP Vision: A future where plant breeders have the tools to breed crops in marginal environments with greater efficiency and accuracy for the benefit of the resource-poor farmers and their families.

Page 16: Global Information Systems for Plant Genetic Resources (2009)

NordGen [http://www.nordgen.org/]

The Nordic Genetic Resource Center (NordGen) was established in January 2008.

NordGen replaces the former institute, the Nordic Gene Bank (NGB), established in 1979.

NordGen is the joint regional genetic resource center for the five Nordic countries: Denmark, Finland, Iceland, Norway and Sweden.

NordGen reports to the Nordic Council of Ministers [http://www.norden.org].

The mandate of NordGen is the conservation and utilization of Nordic genetic resources.

Page 17: Global Information Systems for Plant Genetic Resources (2009)

Regional Programs on Genetic Resources

SEEDNet, the South East European Development Network on Plant Genetic Resources, was established in 2004. [http://seednet.geminova.net/]

SADC, the Southern African Development Community program on genetic resources, was started in 1989. [http://www.spgrc.org/]

USDA GRIN, the Germplasm Resources Information Network of the US. [http://www.ars-grin.gov/]

… and more

Page 18: Global Information Systems for Plant Genetic Resources (2009)

GBIF: Global Biodiversity Information Facility

Page 19: Global Information Systems for Plant Genetic Resources (2009)

GBIF Data Portal [http://data.gbif.org/]

Page 20: Global Information Systems for Plant Genetic Resources (2009)

GBIF PGR Network 2

[http://data.gbif.org/datasets/network/2]

Page 21: Global Information Systems for Plant Genetic Resources (2009)

GBIF NordGen [http://data.gbif.org/]

Page 22: Global Information Systems for Plant Genetic Resources (2009)

GBIF SINGER [http://data.gbif.org/]

Page 23: Global Information Systems for Plant Genetic Resources (2009)

GBIF USDA

Page 24: Global Information Systems for Plant Genetic Resources (2009)

FAO WIEWS

Page 25: Global Information Systems for Plant Genetic Resources (2009)


FAO WIEWS [http://apps3.fao.org/wiews/]

Page 26: Global Information Systems for Plant Genetic Resources (2009)


FAO WIEWS, GPA [http://www.pgrfa.org/gpa/]

Leipzig Declaration 1996, 150 countries [http://www.globalplanofaction.org/]

Page 27: Global Information Systems for Plant Genetic Resources (2009)


Data Standards

Page 28: Global Information Systems for Plant Genetic Resources (2009)

Crop Descriptors

The crop descriptor lists from Bioversity International provide global standards for characterization and evaluation data on crop genetic resources.

The MCPD (Multi Crop Passport Descriptor List) provides a global standard for "passport data" across the crops.

The MCPD descriptor list is compatible with the TDWG standard: ABCD 2.06.
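As a rough illustration of what "passport data" in MCPD form can look like, the sketch below writes two accession records to CSV using a handful of MCPD descriptor names (INSTCODE, ACCENUMB, GENUS, SPECIES, ORIGCTY, SAMPSTAT). The record values, the placeholder institute code XXX001 and the helper names are assumptions made up for this example; the full descriptor list and its coding rules are in the MCPD publication itself.

```python
import csv
import io

# A small subset of MCPD descriptor names; the full list is longer.
MCPD_FIELDS = ["INSTCODE", "ACCENUMB", "GENUS", "SPECIES", "ORIGCTY", "SAMPSTAT"]

# Made-up example records. "XXX001" is a placeholder, not a real institute code.
records = [
    {"INSTCODE": "XXX001", "ACCENUMB": "ACC 0001", "GENUS": "Hordeum",
     "SPECIES": "vulgare", "ORIGCTY": "SWE", "SAMPSTAT": "300"},
    {"INSTCODE": "XXX001", "ACCENUMB": "ACC 0002", "GENUS": "Avena",
     "SPECIES": "sativa", "ORIGCTY": "NOR", "SAMPSTAT": "300"},
]


def to_csv(rows):
    """Serialize passport records to CSV text with one column per descriptor."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MCPD_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def missing_fields(row):
    """List the descriptors from MCPD_FIELDS that are absent or empty in a record."""
    return [name for name in MCPD_FIELDS if not row.get(name)]


csv_text = to_csv(records)
```

Sharing one set of column names across genebanks is what lets a catalogue merge records from many institutes without a per-institute mapping step.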

Page 29: Global Information Systems for Plant Genetic Resources (2009)

Accession-level data standards

Multi Crop Passport (MCPD) [http://www.bioversityinternational.org/publications/pubfile.asp?id_pub=124]

Darwin Core (DwC v2) [http://wiki.tdwg.org/twiki/bin/view/DarwinCore/]

Access to Biological Collection Data (ABCD 2.06) [http://wiki.tdwg.org/twiki/bin/view/ABCD]

Generation Challenge Programme (GCP Passport v1.05) [http://gcpcr.grinfo.net/include/webservices/schema-documentation.php]

Page 30: Global Information Systems for Plant Genetic Resources (2009)

W3C :: RDF

Resource Description Framework

Scenario: You have a dataset of genebank accessions with pointers to the source datasets of the holding genebanks. You produce phenotypic evaluation data on accessions in this dataset. You find evaluation data from other sources on some of the accessions in your dataset. Some of the evaluation data were produced in areas of different day length, rainfall, soils… Some of the accessions in your dataset originate from areas of higher population density; other accessions originate from more natural habitats. Unfortunately, most of the different sources of information are located on different web sites, and it is difficult to bring the information together.

You would need to go through more or less the same process as researchers in many other domains: gathering heterogeneous data from multiple sources, combining it and analysing it. This is the challenge that faces the web as a whole and is being addressed by the Semantic Web project.

RDF can help you relate information from different sources.

An RDF triple looks like this: subject-predicate-object

<rdf:Description rdf:about="http://www.example.org/index.html"> <dc:creator>John Smith</dc:creator></rdf:Description>
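To make the subject-predicate-object reading of that snippet concrete, here is a minimal sketch that extracts the triple using only the Python standard library. It assumes the snippet is wrapped in an rdf:RDF element that declares the standard RDF and Dublin Core namespaces (the slide omits the wrapper), and the function name extract_triples is made up for this example. It handles only simple literal properties, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's snippet, wrapped in an rdf:RDF element with namespace declarations
# (the wrapper is an assumption; the slide shows only the inner element).
RDF_XML = (
    '<rdf:RDF xmlns:rdf="' + RDF_NS + '" xmlns:dc="' + DC_NS + '">'
    '<rdf:Description rdf:about="http://www.example.org/index.html">'
    "<dc:creator>John Smith</dc:creator>"
    "</rdf:Description>"
    "</rdf:RDF>"
)


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple literal properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree spells qualified names as "{namespace}localname".
            namespace, local_name = prop.tag[1:].split("}")
            triples.append((subject, namespace + local_name, (prop.text or "").strip()))
    return triples


triples = extract_triples(RDF_XML)
```

Here the subject is the web page, the predicate is the Dublin Core "creator" property, and the object is the literal "John Smith".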

[Figure: tag cloud of related terms, including rdf, owl, ontology, semantic web, LSID, GUID, accession number, unitID, INSTCODE, ABCD, SDD, Full Scientific Name, landrace, germplasm, plant genetic resources]

Page 31: Global Information Systems for Plant Genetic Resources (2009)

W3C :: LSID

Life Science IDentifiers

LSID is a digital name tag. LSIDs are GUIDs, Globally Unique Identifiers.

[http://lsid.sourceforge.net/]

Structure: urn:lsid:authority:namespace:object:revision

Example (fictitious): urn:lsid:eurisco.org:accession:H451269

The LSID concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data stores.

LSID defines a simple, common way to identify and access biologically significant data.

LSID provides a naming standard to support interoperability. Developed by OMG-LSR and W3C, implemented by IBM.
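The LSID structure given on this slide can be split into its parts with a few lines of code. The sketch below is an illustration only: the class and function names are made up for this example, the revision part is treated as optional (the slide's fictitious example has none), and it does not attempt the full validation rules of the LSID specification.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Parts of an identifier shaped like urn:lsid:authority:namespace:object[:revision]."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its parts, raising ValueError if the shape is wrong."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError("expected 5 or 6 colon-separated parts: %r" % text)
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("an LSID must start with 'urn:lsid:': %r" % text)
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError("authority, namespace and object must be non-empty: %r" % text)
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


# The fictitious example from the slide.
example = parse_lsid("urn:lsid:eurisco.org:accession:H451269")
```

In this reading, the authority names who issued the identifier, the namespace names the kind of object, and the object part is the local identifier, here an accession number.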

Page 32: Global Information Systems for Plant Genetic Resources (2009)

Taxon References

http://www.catalogueoflife.org

http://www.itis.gov/

Page 33: Global Information Systems for Plant Genetic Resources (2009)


Biodiversity data exchange tools

Page 34: Global Information Systems for Plant Genetic Resources (2009)


Page 35: Global Information Systems for Plant Genetic Resources (2009)


Page 36: Global Information Systems for Plant Genetic Resources (2009)

Test resource with client form: http://localhost/tapirlink/tapir_client.php

The XML client form is very illustrative for understanding exactly how the wrapper software works!

Page 37: Global Information Systems for Plant Genetic Resources (2009)


Data Provider Software

PyWrapper v3, based on the BioCASE Python software.

[http://www.pywrapper.org/]

[http://www.biocase.org/]

DiGIR, Distributed Generic Information Retrieval. [http://digir.net]

TapirLink [http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink]

TapirDotNet [http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirDotNET]

Page 38: Global Information Systems for Plant Genetic Resources (2009)

Distributed BioCASE/PyWrapper network

Page 39: Global Information Systems for Plant Genetic Resources (2009)


Page 40: Global Information Systems for Plant Genetic Resources (2009)

Example of a service request

All exchanged data is formatted with XML tags.
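The general idea of an XML request and response can be sketched as follows. This is not the BioCASE, DiGIR or TAPIR message format: the element names (request, search, filter, response, record) and both function names are hypothetical, chosen only to show how a client might build a query as XML and read records back out of an XML reply.

```python
import xml.etree.ElementTree as ET


def build_request(genus):
    """Build a hypothetical XML search request that filters on the GENUS field."""
    request = ET.Element("request")
    search = ET.SubElement(request, "search")
    ET.SubElement(search, "filter", {"field": "GENUS"}).text = genus
    return ET.tostring(request, encoding="unicode")


# A made-up response in the same hypothetical format.
SAMPLE_RESPONSE = """<response>
  <record><ACCENUMB>ACC 0001</ACCENUMB><GENUS>Hordeum</GENUS></record>
  <record><ACCENUMB>ACC 0002</ACCENUMB><GENUS>Hordeum</GENUS></record>
</response>"""


def parse_response(xml_text):
    """Turn each <record> element into a dict mapping child tag to text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in record}
        for record in root.findall("record")
    ]


request_xml = build_request("Hordeum")
records = parse_response(SAMPLE_RESPONSE)
```

A real provider wrapper publishes its own schemas for both messages, so a client would follow those instead of the ad hoc tags used here.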

Page 41: Global Information Systems for Plant Genetic Resources (2009)

Example of a service response

Page 42: Global Information Systems for Plant Genetic Resources (2009)


Data portal and decentralized data networks with web services

Page 43: Global Information Systems for Plant Genetic Resources (2009)

Data warehouse model

Page 44: Global Information Systems for Plant Genetic Resources (2009)

EURISCO (Europe)

NordGen (Northern Europe)

IPK Gatersleben (Germany)

IHAR (Poland)

(Other European gene banks...)

SINGER (CGIAR)

(CGIAR International Future Harvest gene banks...)

USDA GRIN (USA)

(USDA ARS National Germplasm Repositories...)

WUR CGN (Netherlands)

GBIF (Global Biodiversity Information Facility)

USER

ALIS (Accession Level Information System)

Web Services

MCPD

Svalbard Global Seed Vault (Safe Backup)

Page 45: Global Information Systems for Plant Genetic Resources (2009)

Germplasm data indexing tools

We have recently built data indexing tools for access to gene bank datasets provided with the BioCASE/PyWrapper.

These tools are planned as the basis for a global Accession Level Information System (ALIS).

This work is done in cooperation with GBIF, which indexes basic biodiversity data using a similar approach.

[http://chm.grinfo.net/]

Page 46: Global Information Systems for Plant Genetic Resources (2009)


[http://wwwdev.ngb.se/portal/]

Page 47: Global Information Systems for Plant Genetic Resources (2009)

Crop Wild Relatives

ARM

LKA

BOL

MDG

UZB

National datasets are shared with the central CWR data index.

The national datasets, as well as access to other international datasets, are provided from the CWR data portal.

EURISCO

SINGER

[http://www.cropwildrelatives.org]

Page 48: Global Information Systems for Plant Genetic Resources (2009)

The Taxon and Country pages provide access to the relevant external datasets.

Taxonomy level metadata

Page 49: Global Information Systems for Plant Genetic Resources (2009)


Country level metadata

Page 50: Global Information Systems for Plant Genetic Resources (2009)

Participation and the sharing of your institute's datasets with global and national biodiversity projects is important for your public and scientific visibility, for promoting the use (usefulness) of your data, and ultimately for the continued funding of your institutional activities.

Page 51: Global Information Systems for Plant Genetic Resources (2009)


Bioversity International [http://www.bioversityinternational.org]

GBIF, Global Biodiversity Information Facility [http://www.gbif.org]

BioCASE, The Biological Collection Access Service for Europe. [http://www.biocase.org]

TDWG, Biodiversity Information Standards [http://www.tdwg.org]


Page 52: Global Information Systems for Plant Genetic Resources (2009)


Thank you for listening!