Development of the Generation Challenge Program Ontology for Crops Elizabeth Arnaud (Bioversity...
Development of the Generation Challenge Program Ontology for Crops
Elizabeth Arnaud (Bioversity International)
and
Rosemary Shrestha (CRIL-CIMMYT), Richard Bruskiewich (IRRI)
TDWG 2008 Annual Conference, 20-25 October 2008
Fremantle, Western Australia
TDWG annual conference, 20-25 October 2008, Fremantle, Western Australia
The Generation Challenge Programme
Science for better crops in the tropics
For the majority of crop farmers in the developing world, the ravages of drought, low soil fertility, crop pests and diseases are aggravated by their limited access to improved crops.
By using advances in molecular biology and harnessing the rich global stocks of crop genetic resources, the Generation CP creates and provides a new generation of plants that meet farmer needs.
http://www.generationcp.org/
Consultative Group on International Agricultural Research (CGIAR)
GCP subprograms
SP1 - Genetic Diversity of Global Genetic Resources
SP2 - Genomics towards gene discovery
SP3 - Trait Capture for Crop Improvement
SP4 - Bioinformatics and Crop Information Systems: building an 'integrated platform' of molecular biology and bioinformatics tools = Molecular breeding platform
SP5 - Capacity Building and Enabling Delivery
The Generation Challenge Programme
Target areas: Drought-prone environments
Mandate crops: All the CGIAR mandate crops = 22 crops
Commissioned and competitive projects: 275 projects in 5 years
GCP New Challenge initiatives
Cereals
• Rice / drought / Africa
• Wheat / drought / Asia
• Sorghum / drought / Africa
• Rice-Sorghum-Maize / soil problem / Asia & Africa
Legumes
1. Cowpeas / drought / Africa
2. Chickpeas / drought / Africa and Asia
Root and tubers
1. Cassava / virus / Africa
Integration across diverse crop datasets
The volume and complexity of biological data are increasing
Historical data are scattered in numerous crop-specific databases
Each database uses slightly different terminologies for terms related to phenotypes
Integration across Diverse GCP Crop Data
[Diagram: five linked data domains, annotated with the subprogramme labels SP1, SP2 and SP3 and connected by the relations "has", "determines" and "affects"]
Germplasm: • Inventory • Identification (passport) • Genealogy
Genotype: • Genetic Maps • Physical Maps • DNA Sequence • Functional Annotation • Molecular Variation (Natural or Induced)
Phenotype: • Anatomical • Developmental • Field Performance • Stress Response
Molecular Expression: • Transcriptome • Proteome • Metabolome • Physiology
Environment: • Location (GIS) • Climate • Day Length • Ecosystem • Agronomy • Stresses
An integrated platform for molecular breeding
To support and encourage researchers to share and reuse information among agricultural databases
To form the basis for the generation of data templates, web services and software.
GCP Scientific Domain Model
• Germplasm identification ("passport") and pedigree data
• Phenotypic characterization and evaluation data
• Geographic location and environmental descriptions
• Genotype and molecular data
• Genomic map data for markers and loci
• Functional genomics data
The exchange of new findings and joint work on projects presuppose that all those involved have the same understanding of the terms they use. This calls the need for an extensively standardized description of plant development stages with phenological characteristics and coding.
Prof. Dr. F. Klingauf, President of the Federal Biological Research Centre for Agriculture and Forestry, Berlin and Braunschweig
Importance of crop ontology
Similar plant structures are described by their species-specific terms.
Fruit
Kernel in Maize
Grain in Wheat
Pod in Beans
Grain or caryopsis in Rice
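The species-specific terms listed above can be treated as synonyms of one shared concept. The following is a minimal sketch of that idea, not the GCP implementation: the crop and term pairs come from the slide, while the names `SHARED_CONCEPT`, `to_shared_concept` and `group_by_concept`, and the sample records, are hypothetical.

```python
# Species-specific terms from the slide, keyed by (crop, local term).
# The shared concept "fruit" is the slide's own label; every identifier
# and the sample records below are illustrative assumptions.
SHARED_CONCEPT = {
    ("maize", "kernel"): "fruit",
    ("wheat", "grain"): "fruit",
    ("beans", "pod"): "fruit",
    ("rice", "grain"): "fruit",
    ("rice", "caryopsis"): "fruit",
}


def to_shared_concept(crop, term):
    """Return the shared concept for a crop-specific term, or None if unmapped."""
    return SHARED_CONCEPT.get((crop.lower(), term.lower()))


def group_by_concept(records):
    """Group (crop, term, value) records under their shared concept."""
    grouped = {}
    for crop, term, value in records:
        concept = to_shared_concept(crop, term)
        grouped.setdefault(concept, []).append((crop, value))
    return grouped


# Hypothetical records from two crop databases that use different terms.
records = [("Maize", "Kernel", "yellow"), ("Rice", "Caryopsis", "white")]
by_concept = group_by_concept(records)
```

Once both records resolve to the same concept, a single query on "fruit" retrieves data from the maize and the rice database alike, which is the cross-species integration the slide argues for.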
The GCP Ontology
"Thesaurus" of biological concepts that can be shared and used across species to which genetic and phenotypic data can be associated
Integrative data mining on GCP annotated data using the platform and web services
Developed with crop experts, for plant structure, developmental stages, traits and expression of the traits
for selected priority GCP crops: Wheat, Maize, Sorghum, Chickpea, Banana & Plantain
GCP Sources for mapping the terms
International Crop Information Systems ICIS model (http://www.icis.cgiar.org)
IMIS (maize), IRIS (rice), IWIS (wheat)
Musa germplasm information system (http://www.musa-diversity.org)
ICRISAT information system (Sorghum, chickpea)
CIP information system (potato)
Crop descriptors for traits (Bioversity International)
GCP data templates
GCP datasets: http://www.generationcp.org
Developing the GCP ontology
[Workflow diagram with four numbered steps. Recoverable labels: Crop DB; GCP crop ontology; mapping; Plant Structure ontology; Trait Ontology; Data annotation with GCP ontology; GCP data templates]
GCP concept ID
PO concept ID & TO concept ID (DBXref)
www.plantontology.org/
www.gramene.org/plant_ontology/
GCP ontology term has:
Term: plant height
ID: GCP_322*.0000021
Namespace: maize_trait
Definition: Measurement of plant height from soil surface to the highest point in plant.
Synonyms: PHT, PTHT, Planth., Shoot height
Dbxrefs: PO:10202, TO:0000207, IMIS_TRAITID:1008
is_a: GCP_322.0000108
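A term record like the one above is a set of tag and value pairs, so it can be read into a simple structure and its cross-references grouped by source. The sketch below is illustrative only. It assumes the asterisk in the slide's ID is a footnote mark and drops it, it uses lower-case tag names of its own rather than the exact OBO flat-file tags, and `parse_stanza` and `xrefs_by_source` are hypothetical helpers.

```python
from collections import defaultdict

# Values transcribed from the slide. Assumptions: the "*" in the slide's ID
# is treated as a footnote mark and omitted, and the tag names are
# simplified rather than taken from the OBO flat-file specification.
STANZA = """\
term: plant height
id: GCP_322.0000021
namespace: maize_trait
synonym: PHT
synonym: PTHT
dbxref: PO:10202
dbxref: TO:0000207
dbxref: IMIS_TRAITID:1008
is_a: GCP_322.0000108
"""


def parse_stanza(text):
    """Parse 'tag: value' lines into a dict mapping each tag to a list of values."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values such as "PO:10202" stay whole.
        tag, _, value = line.partition(":")
        fields[tag.strip()].append(value.strip())
    return dict(fields)


def xrefs_by_source(fields):
    """Group cross-references by prefix, e.g. 'TO' -> ['0000207']."""
    grouped = defaultdict(list)
    for xref in fields.get("dbxref", []):
        prefix, _, local_id = xref.partition(":")
        grouped[prefix].append(local_id)
    return dict(grouped)


term = parse_stanza(STANZA)
xrefs = xrefs_by_source(term)
```

Grouping the cross-references by prefix shows how one GCP concept ID links a crop database trait (IMIS) to the Plant Ontology and Trait Ontology identifiers.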
Building ontology with OBO-Edit
Terms are linked by relationships such as is-a, part-of, has-a, disjoint from, derived from, etc.
It is structured as a hierarchical directed acyclic graph (DAG)
Terms can have more than one parent and zero, one or more children
Draft releases of the OBO formatted ontology files for rice, wheat and maize traits are available at http://cropforge.org/projects/gcpontology/
http://oboedit.org/
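The DAG structure described on this slide can be sketched as a mapping from each term to its typed parent links. This is a toy illustration, not output from OBO-Edit: the term names and the graph itself are hypothetical, and only the relationship types (is_a, part_of) come from the slide.

```python
# Hypothetical toy ontology: each term maps to (relationship, parent) pairs.
# "flag leaf" has two parents, which a DAG allows and a strict tree does not.
PARENTS = {
    "flag leaf": [("is_a", "leaf"), ("part_of", "shoot")],
    "leaf": [("part_of", "shoot")],
    "shoot": [("part_of", "plant")],
    "plant": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def children(term, parents=PARENTS):
    """Return the terms that list `term` as a direct parent."""
    return {child for child, links in parents.items()
            if any(parent == term for _rel, parent in links)}


def is_acyclic(parents=PARENTS):
    """A graph is acyclic when no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)
```

Here `ancestors("flag leaf")` collects terms along both the is_a and the part_of paths, and `is_acyclic` is the kind of consistency check an ontology editor has to enforce when new links are added.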
%HSATIVUM_TILLER1_FLAG_1
Complex trait name
Description: The trait is scored for severity of the disease caused by Helminthosporium sativum (leaf spot) at tiller 1 and flag 1 stage, in percentage.
Complex trait names created by breeders in the crop databases are to be decomposed into simple terms that are readable by both humans and computers and mapped against the ontology
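As a rough illustration of such a decomposition, the sketch below splits the example name into a unit, a causal agent and growth stages. The slide does not give the actual decomposition rules, so everything here is an assumption: the token meanings are read off the slide's description of this one trait, and `PREFIX_UNITS`, `AGENTS` and `decompose` are hypothetical names.

```python
import re

# Hypothetical lookup tables, inferred from the slide's description of this
# single trait name; they are not taken from any GCP vocabulary.
PREFIX_UNITS = {"%": "percentage"}
AGENTS = {"HSATIVUM": "Helminthosporium sativum"}


def decompose(name):
    """Split a breeder's trait name into simple, separately mappable parts."""
    unit = PREFIX_UNITS.get(name[:1])
    body = name[1:] if unit else name
    # Assumed layout: the first underscore-separated token names the causal agent.
    agent_token, _, rest = body.partition("_")
    # Assumed layout: each remaining word followed by a number is a growth stage,
    # with or without an underscore in between ("TILLER1", "FLAG_1").
    stages = [f"{word.lower()} {number}"
              for word, number in re.findall(r"([A-Z]+)_?(\d+)", rest)]
    return {
        "unit": unit,
        "causal_agent": AGENTS.get(agent_token),
        "stages": stages,
    }


parts = decompose("%HSATIVUM_TILLER1_FLAG_1")
```

Each resulting part could then be mapped to a separate ontology (units, organisms, development stages) instead of keeping the whole string as one opaque trait name.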
[Diagram: ontologies and factors used to describe a phenotype. Recoverable labels follow.]
Ontology for Crops
Phenotype; phenotypic qualities; Qualifier
"values" have "units" (units implicitly indicate attribute)
Plant Ontology: Plant structure, Development stages
Qualities & Units Ontology; PATO
Assessment Methods Ontology (e.g. ICIS)
Genotype factor (G): Markers/alleles/sequence ontology
External environmental data (E): Treatment, Location, Climatic variables/water, Growth conditions, Stress, Management/agronomy
Temporal factor (T): Time Ontology
Experiment factor (ED): Experimental design
Other labels: EFF, EC, TS
GCP Ontology – present and future prospects
[Diagram contrasting "Present Status" with "Future Plan". Recoverable labels follow.]
GCP Ontology
Data Source CGIAR
GCP data templates
ICIS dataset
GCP Domain Module Ontology
• General Germplasm Ontology
• Taxonomic Ontology
• Plant Anatomy & Development Ontology
• Phenotype & Trait Ontology
• Structural & Functional Genomic Ontology
• Location & Environment Ontology
• General Science Ontology
Web Interface (Chado/koios)
Query
Linkage to external ontologies
http://pantheon.generationcp.org