
Page 1: Seeds of Discovery (SeeD) - KSIConnectksiconnect.icrisat.org/wp-content/uploads/2014/02/peter...Atlas 1 DNA sample each Started to genotype up to 40,000 accessions Just finished 20,000

Seeds of Discovery (SeeD)

Sarah Hearne, Alberto Romero, Huihui Li, Carolina Sansaloni, Cesar Petroli, Martha Willcox, Aleyda Sierra, Hector Galvez, Manuel Martinez, Sukhwinder Singh, Marc Ellis, Giovanny Soca, Gary Atlin, Andrzej Kilian, Ed Buckler, Peter Wenzl

Large-scale application of GbS in the SeeD project: ‘Rightsizing’ of methods and initial results

Page 2:

Seeds of Discovery (SeeD): New genetic variation to raise future crop production

[Diagram labels: Genetic resources; Breeding programs; Cultivar adoption, agronomy; Increased agricultural production]

Take it to the Farmers (TTF)
International Maize Improvement Consortium (IMIC)
Wheat Yield Consortium (WYC)

Page 3:

Genetic resources for food security

Why SeeD?
• Climate change
• Soil degradation and falling water tables
• Costs of fertilizer and energy
• Genetic erosion

[Figure: Global average yield (tons per hectare) of wheat and maize by year, 1960 to 2050, with anticipated demand by 2050 (FAO). Source: USDA PDS database]

Page 4:

Research emphasis

[Matrix labels: Upstream; Breeding-oriented; Genetically simple traits [some diseases, phenology]; Genetically complex traits [heat/drought tolerance]]

Main emphasis: Mobilize novel alleles for complex traits into breeding programs
‘Low-hanging fruits’ for breeding
Seek collaborations to mine data for basic research

Page 5:

Strategy

1 Molecular atlases
• Underutilized sources of genetic variation
• Selection imprints
• Heterotic patterns (maize)
• Hidden translocations (wheat)
• Rare recombinants

2 Genomic association: novel alleles and allele donors
• Novel, beneficial alleles, haplotypes
• Markers linked to loci and alleles that control priority traits
• Genetically distinct ‘donor accessions’

3 ‘Bridging germplasm’

Page 6:

Project areas

1 Molecular atlases (diversity surveys)
2 Novel alleles and ‘allele donors’ (GWAS)
3 Pre-breeding ‘bridging germplasm’
4 Information management
5 Capacities

GbS
(genetic-analysis service)

Page 7:

Genotyping by sequencing (GbS)

• Transition from genotyping-by-assay (gel, hybridization) towards genotyping-by-sequencing
• Similar to the transition from analogue to digital photography
• Simultaneously discovers DNA polymorphisms and classifies their allelic states: an advantage for characterizing unknown genetic diversity in genebanks
• Minimizes ascertainment bias
• Configurable platform: adjust the No. of markers vs. the No. of DNA samples. Two ‘flavors’:
   • DArT: ~60-70K markers, SNP & PAV, ~20-35% missing data, lower error rates, calling of heterozygotes for a subset of SNP markers, no imputation. Used for the maize & wheat diversity surveys
   • Cornell: ~800K markers, SNP only, ~60% missing data, higher error rates, no heterozygote detection, imputation. Used for maize GWAS

Page 8:

Genetic-analysis service (SAGA)

● Provide services, based on modern genomics platforms, which address the needs of demand-driven, impact-oriented agricultural R&D
● Partnership with DArT (Diversity Arrays Technology) in Australia
● Objectives:
   • Economies of scale for characterizing SeeD samples using GbS
   • Genome-profiling & value-adding services to scientists in Mexico and the region
   • Vehicle for capacity-building

Page 9:

IT ‘ecosystem’ of SeeD

• Web portal & data warehouse (Germinate): [...], and validated genotypic & phenotypic data
• High-level data repository (Genesys): passport & summarized data
• Genebank management (GrinGlobal)
• Web services
• Database modules
• Data access layer
• Database & interfaces for primary data (KDDart, IBFieldbook) for managing experiments (inventories, germplasm evaluation, etc.)
• Visualization tools (Flapjack, CurlyWhirly, …)

To be open-sourced from the first production version onwards (2015)
(Collaboration with DArT and James Hutton Inst.)

Page 10:

Wheat diversity survey

● 42,000 accessions sequenced to date using DArTseq
● One individual per accession
● ~30,000 SNP and ~30,000 PAV per sample
● Comprehensive diversity analysis and design of AM panels are underway
● Positioning of markers using new consensus map
● Target: Characterize up to 160,000 accessions (120,000 from CIMMYT)

Page 11:

Building AM panels

Genetic diversity

Phenotypic values

Core set / AM panel

Page 12:

Maize diversity survey

● To get an accurate representation of maize landraces we need to score heterozygotes
● DArTseq is based on multiple restriction enzymes (REs) whose combination deliberately generates a smaller number of fragments for deeper sequencing
● The PstI enzyme used for DArTseq partly overlaps with ApeKI (Cornell), giving partly overlapping representations
● Can score heterozygotes at many loci because multiple copies of each tag are sequenced (ca. 2 M fragments are typically sequenced per sample)

Page 13:

Genotyping bulks

● Most accessions are genetically heterogeneous landraces, so multiple individuals need to be genotyped (SSRs: 15–30 individuals)
● Can genotype bulks and derive population-level allele frequencies. This reduces the costs of the diversity survey by more than an order of magnitude
● The allele frequencies derived are representative of allele frequencies in the accessions (populations)
● PAVs: Genetic distances among populations

[Figure axes: Individual samples; Pools]
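A minimal sketch of how population-level allele frequencies could be derived from a bulk and compared between accessions. It assumes that read counts at a SNP are proportional to allele copies in the pool; the slides do not state how frequencies were actually computed, and they mention PAVs rather than SNP frequencies for distances. The function names and read counts are hypothetical.

```python
def pooled_allele_frequency(ref_reads, alt_reads):
    """Estimate the alternate-allele frequency in a bulk from read counts.

    Assumes reads are sampled in proportion to allele copies in the pool.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads at this marker")
    return alt_reads / total


def mean_frequency_distance(freqs_a, freqs_b):
    """Mean absolute allele-frequency difference between two accessions."""
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need two equally long, non-empty frequency lists")
    return sum(abs(a - b) for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)


# Hypothetical (ref, alt) read counts at three SNPs for two bulked accessions.
accession_1 = [pooled_allele_frequency(r, a) for r, a in [(30, 10), (20, 20), (40, 0)]]
accession_2 = [pooled_allele_frequency(r, a) for r, a in [(10, 30), (20, 20), (40, 0)]]
print(mean_frequency_distance(accession_1, accession_2))
```

A real pipeline would also need to handle low read depth and missing markers, which this sketch ignores.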

Page 14:

No. of individuals per bulk?

Rows and columns: bulk size (No. of individuals)

Bulk size     4     8    12    16    20    24    28    32    36    40    44    48
4          22.3  21.5  18.8  17.2  17.8  17.1  17.5    17  17.9  17.7  17.4  17.2
8          11.7   9.8   6.4   4.5   4.5   4.1   3.2   4.3   4.3   4.2   4.2   3.5
12         12.2     9   6.1   3.9   3.7   2.1   3.2   3.3   2.8   3.1   3.3   2.6
16         11.2     9   5.1   2.6   2.4   1.8   2.3   2.4   1.9   2.3   1.7     2
20         11.9   9.4   3.7   2.6   3.3   3.1   2.4   2.1     3   2.1   2.8   2.3
24         12.2   9.7   5.9   2.1   1.4   2.2   2.7   1.6     3   2.4   1.4     2
28         10.2   9.7   6.6   3.9     4   3.5   2.3     3   3.2   2.7   3.4   2.5
32         11.9   9.1   5.2   2.5   2.3   1.7   2.2     2   1.7   2.2   1.6   1.8
36         11.9   8.8   4.3   2.2   2.7   2.5   2.1   1.5   2.5   1.7   2.3   1.7
40         11.3   8.8     5   2.4   2.3   1.4   1.2   1.7   1.6   1.2   1.6     1
44         11.7     9   4.7   1.9   2.1   2.3   2.1   1.4   2.4   1.8   1.8   1.6
48         11.1   8.1   4.7   2.6   2.5   1.7   1.7   2.3   2.3   1.8     2   1.1

● Compared separately assembled bulks of increasing size
● Little change above bulk sizes of 32
● Used bulks of 30 leaf discs from 30 individuals for diversity survey
● Pooling at leaf-disc and DNA sample levels gave indistinguishable results
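One way to see why returns diminish at larger bulk sizes is the textbook binomial sampling error of an allele frequency estimated from n diploid individuals, sqrt(p(1-p)/(2n)). The sketch below is not the analysis behind the table above. It assumes random sampling of individuals and independent allele copies, and it ignores sequencing noise. The function name is hypothetical.

```python
import math


def sampling_se(p, n_individuals):
    """Standard error of an allele frequency estimated from n diploid individuals.

    Textbook binomial approximation: 2n allele copies drawn at frequency p.
    """
    if not 0.0 <= p <= 1.0 or n_individuals <= 0:
        raise ValueError("need 0 <= p <= 1 and a positive number of individuals")
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))


# Worst case p = 0.5 for a range of bulk sizes.
for n in (4, 8, 16, 32, 48):
    print(n, round(sampling_se(0.5, n), 4))
```

Under these assumptions the standard error at p = 0.5 is 0.125 for 8 individuals and 0.0625 for 32, and it falls only slightly further (to about 0.051) at 48.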

Page 15:

[Diagram: Accession 1, Accession 2, Accession 3, ... Accession 40,000]

30 plants each
1 DNA sample each
High-density genome profiles from “bulk” samples
Allele frequencies within accessions
Genetic relationships amongst accessions, selection footprints, race classification, etc.
Molecular Atlas

Started to genotype up to 40,000 accessions

Page 16:

Just finished 20,000 accessions…

• > 230,000 SNP identified (likely to increase upon re-calling the entire set)
• Only 20% map to the B73 reference genome! Whole-genome re-sequencing of ca. 20 landraces in progress
• Enriched for gene-rich regions (methylation filtration effect)
• Target: Characterize up to 40,000 accessions (27,000 from CIMMYT)

[Figure: No. of SNP within window vs. position on chromosome]
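The SNP-density plot described on this slide can be reproduced in outline by counting SNPs in fixed windows along a chromosome. This is a minimal sketch: the function name, the non-overlapping window scheme, and the example positions are assumptions, not details from the slides.

```python
from collections import Counter


def snp_counts_per_window(positions, window_size):
    """Count SNPs in consecutive, non-overlapping windows along one chromosome.

    Returns a dict mapping each window's start coordinate to its SNP count;
    windows without SNPs are omitted.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    counts = Counter(pos // window_size for pos in positions)
    return {index * window_size: n for index, n in sorted(counts.items())}


# Hypothetical SNP positions (bp) on one chromosome, 100-bp windows.
print(snp_counts_per_window([5, 120, 180, 950], 100))
```

With real data the window would be much larger (for example 1 Mb), and peaks in the counts would be expected in gene-rich regions if the enrichment noted above holds.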

Page 17:

Next steps

● Environmental-selection footprints
   • 18,500 accessions with good-quality geo-location data
   • Extracted long-term abiotic environment data
   • Identify allele/haplotype-frequency gradients across environmental clines in entire genebank collection
● Breeding-selection footprints
   • Multiple cycles of recurrent-selection populations genotyped
   • Identify response to selection
● Race-specific footprints
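As an illustration of what an allele-frequency gradient across an environmental cline could look like in code, the sketch below fits an ordinary least-squares slope of allele frequency on one environmental variable. The choice of elevation as the variable, the function name, and the data are hypothetical. A real analysis would also need to account for population structure, which this sketch does not.

```python
def cline_slope(env_values, allele_freqs):
    """Least-squares slope of allele frequency on an environmental variable."""
    n = len(env_values)
    if n != len(allele_freqs) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(env_values) / n
    mean_y = sum(allele_freqs) / n
    sxx = sum((x - mean_x) ** 2 for x in env_values)
    if sxx == 0:
        raise ValueError("environmental variable has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(env_values, allele_freqs))
    return sxy / sxx


# Hypothetical accessions: collection-site elevation (m) and allele frequency.
elevation = [0, 1000, 2000]
frequency = [0.1, 0.3, 0.5]
print(cline_slope(elevation, frequency))  # change in frequency per metre
```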

Page 18:

Maize GWAS

● Existing core collection of 4,500 landraces, three adaptation zones
● Assumption: haplotypes are replicated across accessions, so testcross one individual per accession with an adaptation-zone-specific hybrid
● Genotyped testcross parents

[Diagram labels: Accession 1 … Accession 4,500; Tester; GbS; Field trials; GWAS]

Page 19:

Page 20:

Field trials for GWAS

Traits evaluated
• Abiotic stresses: heat, drought, low N
• Biotic stresses: tar spot, ear rot, stalk rot, Turcicum, Cercospora
• Grain quality: hardness, starch, oil, amino acids, phenolics

Collected 700,000 data points from 34 trials across 14 locations

Page 21:

GbS profiles of testcross parents

● Genotyped with both Cornell GBS and DArTseq methods
   • Maximize marker density (Cornell)
   • Enable identification of heterozygous regions (DArTseq)
● Imputation based on prevalent haplotypes detected in ca. 40,000 maize samples genotyped on Cornell platform
● Little genetic structure

[Figure labels: 36 Latin American countries; Highland; Subtropical; Tropical]

Page 22:

Proof of concept: Days to silking

● GWAS approach works

● Marker density just sufficient
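For readers unfamiliar with GWAS, the core of a single-marker test can be sketched as a comparison of trait means between the two allele classes at one marker. This is an illustration only, not the model used in the SeeD analysis. It uses a Welch t statistic with a normal-approximation p-value, ignores population structure and kinship, and the function name and days-to-silking values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def marker_association(trait_allele_0, trait_allele_1):
    """Welch t statistic and normal-approximation p-value for one biallelic marker.

    Returns (difference in trait means, t statistic, two-sided p-value).
    """
    n0, n1 = len(trait_allele_0), len(trait_allele_1)
    if n0 < 2 or n1 < 2:
        raise ValueError("need at least two observations per allele class")
    diff = mean(trait_allele_1) - mean(trait_allele_0)
    se = math.sqrt(variance(trait_allele_0) / n0 + variance(trait_allele_1) / n1)
    if se == 0:
        raise ValueError("no trait variation within allele classes")
    t = diff / se
    # Normal approximation; only reasonable for large samples.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return diff, t, p


# Hypothetical days to silking for testcrosses carrying allele 0 vs. allele 1.
diff, t, p = marker_association([60, 62, 64], [70, 72, 74])
print(diff, round(t, 2), p)
```

In a genome-wide scan this test would be repeated for every marker, and the significance threshold would have to be corrected for the number of tests.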

Page 23:

Anthesis: Teosinte-derived inversion

Page 24:

Tar spot disease complex

● Up to 46% yield loss
● Caused by Phyllachora maydis and Monographella maydis in association

Chromosome 9 (position 139,172,758): P = 1.01e-7

[Figure labels: Tar spot incidence; Yield; Accessions; Testcrosses]

Page 25:

Next steps

1 Molecular atlases
2 Genomic association: novel alleles and allele donors
3 ‘Bridging germplasm’

• Breeder-ready lines & populations with new, beneficial alleles for priority characters in elite genetic backgrounds: joint linkage/association mapping & trait mobilization into breeding programs
• Molecular markers linked to beneficial alleles and statistical models for estimating breeding values to accelerate genetic progress in breeding programs

Elite germplasm selected by breeders
New breeding approaches and technologies; new tools such as GS

Page 26:

Maize ‘bridging germplasm’

Useful novel alleles & haplotypes
Early generation lines & pools enriched for favorable alleles
…using multiple strategies defined by trait complexity and breeder needs (desired input germplasm, demand for new sources)

Strategies by breeder demand and trait complexity:

Urgent
• Monogenic (1-3): DH from landrace & landrace / line crosses, selfing
• Oligogenic (4-10): DH from landrace & landrace / line crosses, selfing
• Polygenic (>10): GS with MABC for BC1S1 development

Medium-term
• Monogenic (1-3): MABC
• Oligogenic (4-10): MARS & prediction index
• Polygenic (>10): GS with MABC for BC1S2 development

Long-term
• Monogenic (1-3): MABC & GS
• Oligogenic (4-10): MARS, prediction index & GS
• Polygenic (>10): GS with MABC for BC1S2 development

Page 27:

Wheat ‘bridging germplasm’

Linked Topcross Panel (LTP) for joint linkage/association mapping

[Diagram: linked topcrosses with partly overlapping parents. Labels: Exotic 1, Elite 1, Elite 2, 50:50, 25:75, Family 1 of fixed lines; Exotic 2, Elite 2, Elite 3, 50:50, 25:75, Family 2 of fixed lines]

Linked topcross panel (LTP)
• TC chains with partly overlapping elite parents
• 200 exotics (synthetics, landraces; FIGS)
• 10 elites selected by breeders
• Currently at TC1F3 stage

Joint linkage/association mapping to identify novel exotic alleles that are expressed across several elite genetic backgrounds
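The 50:50 and 25:75 labels in the topcross diagram are consistent with the expected exotic:elite genome proportions after one and two crosses to elite parents. That reading is an interpretation; the slide does not spell it out. The sketch below computes these expectations, assuming no selection, with a hypothetical function name.

```python
from fractions import Fraction


def expected_exotic_fraction(n_crosses_to_elite):
    """Expected exotic genome share after successive crosses to elite parents.

    Each cross to an elite parent halves the expected exotic contribution.
    """
    if n_crosses_to_elite < 0:
        raise ValueError("number of crosses cannot be negative")
    return Fraction(1, 2) ** n_crosses_to_elite


for n in (1, 2):
    exotic = expected_exotic_fraction(n)
    print(f"{n} cross(es): exotic {float(exotic):.0%}, elite {float(1 - exotic):.0%}")
```

These are averages across a family. Individual fixed lines will deviate from them, which is part of what makes the families useful for mapping exotic alleles.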

Page 28:

Thank you!

Jonás Aguirre (UNAM), Flavio Aragón (INIFAP), Odette Avendaño (LANGEBIO), Ed Buckler (Cornell Univ.), Juan Burgueño, Vijay Chaikam, Alain Charcosset (AMAIZING), Gabriela Chávez (INIFAP), Jiafa Chen, Charles Chen, Andrés Christen (CIMAT), Angelica Cibrian (LANGEBIO), Héctor M. Corral (AGROVIZION), Moisés Cortés (CNRG), Sergio Cortez (UPFIM), Denise Costich, Lino de la Cruz (UdeG), Armando Espinosa (INIFAP), Néstor Espinosa (INIFAP), Gilberto Esquivel (INIFAP), Luis Eguiarte (UNAM), Gaspar Estrada (UAEM), Juan D. Figueroa (CINVESTAV), Pedro Figueroa (INIFAP), Jorge Franco (UDR), Guillermo Fuentes (INIFAP), Amanda Gálvez (UNAM), Héctor Gálvez (SAGA), Karen García, Silverio García (ITESM), Noel Gómez (INIFAP), Gregor Gorjanc (Roslin Inst.), Sarah Hearne, Carlos Hernández, Juan M. Hernández (INIFAP), Víctor Hernández (INIFAP), Luis Herrera (LANGEBIO), John Hickey (Roslin Inst.), Huntington Hobbs, Puthick Hok (DArT), Javier Ireta (INIFAP), Andrzej Kilian (DArT), Huihui Li, Francisco J. Manjarrez (INIFAP), David Marshall (JHI), César Martínez, Carlos G. Martínez (UAEM), Manuel Martínez (SAGA), Iain Milne (JHI), Terrence Molnar, Moisés M. Morales (UdeG), Henry Ngugi, Alejandro Ortega (INIFAP), Iván Ortíz, Leodegario Osorio (INIFAP), Natalia Palacios, José Ron Parra (UdeG), Tom Payne, Javier Peña, Cesar Petroli (SAGA), Kevin Pixley, Ernesto Preciado (INIFAP), Matthew Reynolds, Sebastian Raubach (JHI), María Esther Rivas (BIDASEM), Carolina Roa, Alberto Romero (Cornell Univ.), Ariel Ruíz (INIFAP), Carolina Saint-Pierre, Jesús Sánchez (UdeG), Gilberto Salinas, Yolanda Salinas (INIFAP), Carolina Sansaloni (SAGA), Ruairidh Sawers (LANGEBIO), Sergio Serna (ITESM), Paul Shaw (JHI), Rosemary Shrestha, Aleyda Sierra (SAGA), Pawan Singh, Sukhwinder Singh, Giovanni Soca, Ernesto Solís (INIFAP), Kai Sonder, Maria Tattaris, Maud Tenaillon (AMAIZING), Fernando de la Torre (CNRG), Heriberto Torres (Pioneer), Samuel Trachsel, Grzegorz Uszynski (DArT), Ciro Valdés (UANL), Griselda Vásquez (INIFAP), Humberto Vallejo (INIFAP), Víctor Vidal (INIFAP), Eduardo Villaseñor (INIFAP), Prashant Vikram, Martha Willcox, Peter Wenzl, Víctor Zamora (UAAAN)

Contributed at the beginning: Gary Atlin, Michael Baum (ICARDA), David Bonnett, Paul Brennan (CropGen), Etienne Duveiller, Mustapha El-Bouhssini (ICARDA), Marc Ellis, Ky Matthews, Bonnie Furman, Marta Lopes, George Mahuku, Francis Ogbonnaya (ICARDA), Ken Street (ICARDA)

Participants from other countries

Participants from Mexican institutions

Participants from CIMMYT

http://seedsofdiscovery.org