Integrated solutions for wheat data sets at CIMMYT
David Marshall
Clermont-Ferrand November 2015
CIMMYT plant breeding
Broad scope of maize and wheat breeding
Land races in genebanks to elite lines
Maize or wheat growing areas in developing countries
300+ partners in germplasm testing
Annual budget US $ 150 million
Restricted project funding
Total staff 1500
9 Biometrics + consultants
8 in germplasm IT
Vision for breeding and Breeding IS
High‐throughput and lots of data on:
Genealogy
Phenotyping
Genotyping
Environmental and sensor data
Many options for integrated analysis of data
Simplified model of a breeding information system
Databases: Phenotype, Genealogy, Seed Inventory
Data access tools
Data collection tools: Field book, Field Log, Sample tracker, LIMS
Read and write to DB
Data query and analysis tools: statistics, visualisation, datamarts
Mainly read from DB
Simplified data model for breeding data
Experimental design
Management
Environment
Plant/plots
Genealogy
Phenotyping
Molecular markers
Seed
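One way to picture how these entities connect is the minimal sketch below. It is an illustration only: the class names, fields, and the helper trait_values_by_germplasm are hypothetical and do not describe the schema of any CIMMYT database. It assumes that a plot ties one germplasm entry to a trial and an environment, and that phenotypic observations hang off plots.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Germplasm:
    """Genealogy: a line and its (optional) recorded parents."""
    name: str
    female_parent: Optional[str] = None
    male_parent: Optional[str] = None


@dataclass(frozen=True)
class Plot:
    """A field plot: links a germplasm entry to a trial and an environment."""
    plot_id: str
    trial: str          # experimental design unit (hypothetical field)
    environment: str    # e.g. site and season (hypothetical field)
    germplasm: str


@dataclass(frozen=True)
class Observation:
    """Phenotyping: one trait value measured on one plot."""
    plot_id: str
    trait: str
    value: float


def trait_values_by_germplasm(
    plots: List[Plot], observations: List[Observation], trait: str
) -> Dict[str, List[float]]:
    """Collect all values of one trait per germplasm entry via the plot link."""
    plot_to_line = {p.plot_id: p.germplasm for p in plots}
    result: Dict[str, List[float]] = {}
    for obs in observations:
        if obs.trait == trait:
            result.setdefault(plot_to_line[obs.plot_id], []).append(obs.value)
    return result
```

Marker and seed inventory records would attach to the germplasm entry in the same way; they are left out here to keep the sketch short.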
Phenotypic data
Foundation for breeding decisions
Expensive to generate, and the cost of traditional phenotyping
is increasing
Costly to have people walk around repeatedly to each plot
Quality issues in phenotypic data
Potential for enhanced genetic gain
More effective / precise phenotyping, better decisions
More efficient phenotyping, larger breeding populations
In cereal breeding we need high throughput (precision?)
phenotyping
Remote sensing potential
Low-cost UAVs and lightweight, high-resolution
hyperspectral remote sensors are game changers for the use of
remote sensing in phenotyping
Airborne remote sensing can be used for non‐destructive
screening of plant physiological properties
Enough resolution to obtain information at plot level while
being able to measure several hundred plots in one take
Potential benefits of remote sensing
Reduce cost in human resources and time
Increase the size of each breeding cycle and allow higher selection intensity
Quicker selection process
Traits and current challenges
Potential traits or measures include for example:
Canopy temperature
Early vigor or biomass
Grain yield estimates
Flowering date
Plant height
Challenge 1: High‐throughput automated analysis procedures
must be developed to process high volumes of data
Challenge 2: Research is needed to test trait measures across
combinations of: spectral bands/indices, time of measurement, and
environmental or management conditions
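As an illustration of the kind of automated procedure both challenges call for, the sketch below computes one spectral index per plot and flight date. NDVI is used only because it is a widely known index; the slides do not say which indices are used. The input layout (pixel reflectance pairs keyed by plot and date) and the function names are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, List, Tuple


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        raise ValueError("NIR and red reflectance sum to zero; NDVI is undefined")
    return (nir - red) / total


def plot_level_ndvi(
    pixels: Dict[Tuple[str, str], List[Tuple[float, float]]]
) -> Dict[Tuple[str, str], float]:
    """Average per-pixel NDVI within each (plot_id, flight_date) group.

    `pixels` maps (plot_id, flight_date) to a list of (nir, red) reflectance
    pairs extracted from the image for that plot (hypothetical layout).
    """
    return {
        key: mean(ndvi(nir, red) for nir, red in values)
        for key, values in pixels.items()
    }
```

Keeping the flight date in the key makes it straightforward to compare the same index across measurement times, which is the comparison Challenge 2 asks for.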
Climate data from partners
Met services
CRU
WorldClim
NASA GPM
NASA POWER
Issues: Accessibility, costs, quality, calibrations,
extrapolated data, time series, daily data, coverage: only
ca 5000 stations for the whole African continent
Climate data: On Station
Investment of between 1,000 and 15,000 USD for met stations
Mainly depending on durability, connectivity options and sensors
Maintenance is needed (clogged pluviometers, insects, birds) and sensors
have to be calibrated: at least one person in charge, plus one technician
in the country who can travel to multiple sites
Connectivity:
Cable
Wifi
GSM modem
Problems with sites without connectivity
Genealogy data
Simple data to document pedigree / family relationship
among breeding material
Very cheap data to generate
Full potential requires discipline, coordination, and central
genealogy data base
Main challenge is to render the information according to
crop / breeding program traditions
Useful for example for
Analyzing sources of traits
Calculating Coefficient of Parentage
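The coefficient of parentage can be computed directly from recorded pedigrees. The sketch below uses the standard recursive kinship rule: the coefficient between two lines is the average of the coefficients between the parents of the younger line and the other line. It assumes unrelated founders, equal contribution from both parents, and, by default, fully inbred lines (so a line's coefficient with itself is 1). The function name make_cop, the dictionary layout, and the lines_are_inbred flag are hypothetical and not taken from any CIMMYT tool.

```python
from functools import lru_cache
from typing import Callable, Dict, Optional, Tuple

Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def make_cop(pedigree: Pedigree, lines_are_inbred: bool = True) -> Callable:
    """Return a function cop(a, b) for the given pedigree.

    `pedigree` maps a line to its two parents; founders are simply absent
    (or map to (None, None)). Unknown parents are treated as unrelated.
    The pedigree must not contain cycles.
    """

    def parents(line):
        return pedigree.get(line, (None, None))

    @lru_cache(maxsize=None)
    def depth(line):
        # Generations between this line and its most distant founder.
        known = [p for p in parents(line) if p is not None]
        return 0 if not known else 1 + max(depth(p) for p in known)

    @lru_cache(maxsize=None)
    def cop(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            if lines_are_inbred:
                return 1.0
            p1, p2 = parents(a)
            return 0.5 * (1.0 + cop(p1, p2))
        # Expand the deeper line: it cannot be an ancestor of the other.
        if depth(a) < depth(b):
            a, b = b, a
        if depth(a) == 0:
            return 0.0  # two distinct founders are assumed unrelated
        p1, p2 = parents(a)
        return 0.5 * (cop(p1, b) + cop(p2, b))

    return cop
```

For example, with a pedigree in which C is the cross A x B and E is the backcross C x A, cop("C", "A") evaluates to 0.5 and cop("E", "A") to 0.75 under the inbred-line assumption.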
Genotypic data
Getting cheaper, but the difference between phenotyping and genotyping
costs is not as drastic as in, e.g., cattle or trees
We need to manage quality genotypic data at scale
Sample generation e.g. seed chipping or similar
Sample tracking
Ship for genotyping & get data
Quality control
Analysis of data in time for selection decisions
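The quality control step can be sketched as a simple marker filter on call rate and minor allele frequency. This is a generic illustration, not the procedure of any particular genotyping provider: genotypes are assumed to be coded as counts of one allele (0, 1, or 2) with None for a missing call, and the thresholds and function names are hypothetical.

```python
from typing import Dict, List, Optional

Calls = List[Optional[int]]


def call_rate(calls: Calls) -> float:
    """Fraction of samples with a non-missing call for this marker."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: Calls) -> float:
    """Frequency of the rarer allele among non-missing calls (0/1/2 coding)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1.0 - freq)


def filter_markers(
    genotypes: Dict[str, Calls],
    min_call_rate: float = 0.8,   # hypothetical threshold
    min_maf: float = 0.05,        # hypothetical threshold
) -> Dict[str, Calls]:
    """Keep markers that pass both the call-rate and the MAF threshold."""
    return {
        marker: calls
        for marker, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    }
```

A filter like this is cheap enough to run automatically as soon as data come back from the genotyping provider, which helps keep the analysis within the selection window.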
Particular challenges with crop research information systems
Scientists unaware of the cost of technological choices
Changes in breeding cost structure not reflected in budgets
IT staff don’t understand the biology of breeding (or
biometrics)
IT staff mostly trained on administrative systems with defined
workflows
Scientists change approach, workflow, data etc. frequently
Breeding is a large-numbers game, and automation is valuable
Research institutions hesitant to impose institutional standards
or changes on research staff
Conceptual data integration model
Toolbox and collaborators
Georeferenced Passport and Climate Data
Flapjack components
Overview map
Trait heat map
Zoom
QTL tracks
Status info
Genotype display
Window map
Let’s start with a simple example
Sorting by traits – habit and row type
Sorting by similarity to line
Select markers under QTL
Other similarity options
Germplasm relatedness
Helium
Some wheat-specific issues at CIMMYT
Heterogeneity of marker platforms depending on source of
funding and which partners are involved
This imposes limitations on integration of genotypic data
Possibility of using genomic contigs/scaffolds as integration
substrate for wheat and for comparative cereal genomics
Need for a simple interface into the wheat genome from
marker-based maps, GWAS, etc.
Generic map dressed with landscape of known important
wheat genes
Heterogeneity of data sources and analysis tools.
Plant Breeding API
Some Conclusions
Acknowledgments
Jens Riis‐Jacobsen, Jose Crossa, Juan Burgueño, Maria
Tattaris, Kai Sonder, Sarah Hearne, Kate Dreher
Iain Milne, Gordon Stephen, Paul Shaw, Sebastian
Raubach
IBP Team
DArT team
Breeding API group