Artificial Selection and the Genome: “Deep Pedigree” Analysis in an Elite Soybean Cultivar



Chris M Grainger, Elizabeth A Lee and Istvan Rajcan
Department of Plant Agriculture, University of Guelph, Guelph ON, N1G 2W1

Introduction

Figure 1: Ancestral and next-generation cultivars comprising OAC Bayfield’s pedigree. Institution labels in the figure: University of Guelph, Pioneer Hi-Bred, Agriculture Canada, Ridgetown College, La Coop Fédérée, Semences Prograin.

One strategy for characterizing molecular changes in the development of elite varieties is to genotype the members of the pedigrees from which those varieties were derived. Because pedigrees represent a record of breeder manipulations [1], genotyping not only the ancestors of commercially successful varieties but also the elite varieties developed from these landmark varieties can be of great value in identifying genomic regions of importance. As selection purges unfavourable alleles while maintaining favourable allele combinations in the form of linkage blocks, the net result is a further reduction in genetic diversity within elite germplasm and a build-up of allelic structure. From a molecular breeding perspective, it is of interest to characterize this process and to identify the historical contribution of ancestral alleles that ultimately become incorporated into elite varieties developed for a specific environment. In addition to determining historical molecular changes, identifying genomic regions that exhibit molecular “selection signatures” through breeder-imposed genetic bottlenecks (i.e. the use of elite varieties for subsequent variety development) can serve as a method for discovering genomic regions of potential importance across a range of genotypes.

OAC Bayfield is a landmark variety for soybean growers in Ontario. Developed by the University of Guelph soybean breeding program in 1985, it was commercially released in 1994. At its peak in 1998, it was grown on over 400,000 acres, which represented ~20% of the total soybean acreage in Ontario for that year. From 1994 to 2004, the estimated economic value of OAC Bayfield to the Ontario economy was in excess of $750 million. Its value is considerably higher when its role as a parent or grandparent of a number of commercially successful varieties, such as OAC Wallace, OAC Champion and OAC Kent, is taken into account. The use of OAC Bayfield as a parent is not limited to cultivars developed at the University of Guelph: through germplasm exchange, it has also been used for cultivar development in independent breeding programs. Given the overall impact OAC Bayfield has had on commercial soybean breeding in Ontario, its pedigree was selected for a detailed characterization of changes in both genetic diversity and chromosome composition through multiple generations of the breeding process.

Conclusion and Future Research

• By understanding the nature of the molecular changes to the genome over multiple generations of selection through applied breeding, breeders can gain further knowledge about the molecular landscape of their germplasm with respect to a given environment. This knowledge can guide the incorporation of novel beneficial genetic diversity by screening germplasm introductions, by targeting linkage blocks for recombination, or by identifying linkage blocks that are being selectively maintained. In addition, QTL that show evidence of selection can be examined in more detail for specific genes located in these regions, especially given the array of high-density genomic technologies currently available.

• The next phase of research will focus on extending this methodology across the pedigrees of the soybean breeding program at the University of Guelph, for a comprehensive analysis of the molecular changes to the elite germplasm over multiple generations of applied breeding.

Methods

• In total, 5-6 generations of cultivars comprising the pedigree of OAC Bayfield, as well as 2 generations of cultivars developed by multiple breeding programs (private and public) using OAC Bayfield as a parent or grandparent, were evaluated (Figure 1). The allelic composition of the chromosomes was characterized and traced through the generations by genotyping all 20 chromosomes with SSR molecular markers at a density of approximately 1 marker every 10 cM.

• Various aspects of the breeding process were investigated. Genetic diversity was assessed with a phylogenetic analysis of the SSR data. In addition, the microsatellite alleles were displayed as graphical genotypes to visualize the allelic changes over the generations. Finally, a genome scan using two statistics was performed to identify genomic regions that may have been subjected to breeder selection, and these regions were compared to genetic maps on SoyBase as an in silico QTL mapping strategy.

• The development of an elite variety and its subsequent use as a parent in multiple crosses to create new varieties can be viewed as a “breeding population bottleneck”, as genetic diversity is eroded through generations of selection to create the elite variety. The elite variety in effect becomes a founder for a new population expansion when it is crossed with multiple lines to develop new varieties. For this study, OAC Bayfield is considered a genetic bottleneck event, and the members of the pedigree are assigned to either an “ancestral” group (i.e. pre-OAC Bayfield) or a “current” group (i.e. post-OAC Bayfield). By comparing the changes in genetic diversity between the groups, genomic regions that show unusual patterns of allelic variation (which may be attributed to the effects of selection) can be identified.

• Two statistics were used to test for selection. The first was LN[(RH)], which is based on genetic hitchhiking: selectively neutral markers are surveyed, and markers that exhibit a significant reduction (selective sweep) or increase in genetic diversity between the two groups indicate a genomic region that may have experienced selection [2,3]. The second was the classical Fst test, in which genetic variation is partitioned into within-group and among-group components; higher Fst values at a given locus indicate greater genetic differentiation between the two groups.
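The two statistics can be sketched for a single locus as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in this study: it assumes one allele call per cultivar per locus (inbred lines), uses the heterozygosity-ratio form of ln RH without a sample-size correction, takes the ratio as current over ancestral, and substitutes a simple Nei-style Fst, (H_T - H_S)/H_T, for whichever variance partitioning was actually applied. All function names and the toy allele sizes are hypothetical.

```python
from collections import Counter
from math import log
from statistics import mean, stdev


def gene_diversity(alleles):
    """Expected heterozygosity H = 1 - sum(p_i^2) for one locus in one group."""
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in Counter(alleles).values())


def ln_rh(ancestral, current):
    """ln of the ratio of diversity-based theta estimates (current / ancestral).

    Returns None when either group is monomorphic, because the ratio is
    then zero or undefined; a real analysis must handle that case explicitly.
    """
    def theta(h):
        return (1.0 / (1.0 - h)) ** 2 - 1.0

    t_anc = theta(gene_diversity(ancestral))
    t_cur = theta(gene_diversity(current))
    if t_anc == 0.0 or t_cur == 0.0:
        return None
    return log(t_cur / t_anc)


def fst(ancestral, current):
    """Nei-style Fst = (H_T - H_S) / H_T with the two groups weighted equally."""
    h_s = (gene_diversity(ancestral) + gene_diversity(current)) / 2.0
    freq_a = {k: v / len(ancestral) for k, v in Counter(ancestral).items()}
    freq_c = {k: v / len(current) for k, v in Counter(current).items()}
    pooled = set(freq_a) | set(freq_c)
    h_t = 1.0 - sum(
        ((freq_a.get(a, 0.0) + freq_c.get(a, 0.0)) / 2.0) ** 2 for a in pooled
    )
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


def standardize(values):
    """Convert per-locus ln RH values to z-scores so outlier loci can be flagged."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Toy data: SSR allele sizes (bp) at one hypothetical locus.
ancestral_group = [150, 150, 156, 162]
current_group = [150, 150, 150, 156]
print(ln_rh(ancestral_group, current_group))  # negative: less diversity in the current group
print(fst(ancestral_group, current_group))
```

In a genome scan, the per-locus ln RH values would typically be standardized across all markers, and loci in the tails of that distribution, together with loci showing the highest Fst, would be treated as candidates.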

Genetic Diversity

• Cluster analysis grouped the varieties into six clades (Figure 2), which in general showed high congruence with the known pedigree record. The dendrogram revealed three distinct ancestral clades that are separate from each other as well as from the current varieties.

• Ancestral group 3 was the most divergent clade, as cultivars in this group had the greatest number of rare alleles that were not transmitted to later generations.

• There was clear allelic stratification among the cultivars derived from OAC Bayfield, which is related to the phenotypic characteristics shared by the cultivars that group together. The major phenotypic characteristics distinguishing the clades are maturity (maturity groups 00 to II) and either high total oil content (>20%) or high total protein content (>40%).
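A cluster analysis of this kind can be sketched as below. The distance measure and clustering method used for Figure 2 are not stated here, so this example assumes a shared-allele distance (one allele per line per locus) and average-linkage (UPGMA) clustering purely for illustration. The line names, marker names and allele sizes are invented toy values.

```python
def shared_allele_distance(g1, g2):
    """1 - proportion of loci (scored in both lines) at which the alleles match."""
    loci = [m for m in g1 if g1[m] is not None and g2.get(m) is not None]
    if not loci:
        raise ValueError("no loci scored in both lines")
    same = sum(g1[m] == g2[m] for m in loci)
    return 1.0 - same / len(loci)


def upgma(genotypes):
    """Average-linkage clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    names = list(genotypes)
    dist = {
        frozenset((a, b)): shared_allele_distance(genotypes[a], genotypes[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def average(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min(
            (average(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Toy genotypes: SSR allele sizes (bp) at four hypothetical markers.
toy = {
    "line_A": {"m1": 150, "m2": 200, "m3": 310, "m4": 120},
    "line_B": {"m1": 150, "m2": 200, "m3": 310, "m4": 124},
    "line_C": {"m1": 156, "m2": 204, "m3": 310, "m4": 130},
    "line_D": {"m1": 156, "m2": 204, "m3": 316, "m4": 130},
}
merges = upgma(toy)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The order of the merges corresponds to reading a dendrogram from its tips to its root: closely related lines join first, and the last merge separates the most divergent groups.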

Chromosome Composition and Transmission

• From the base generation of the pedigree, the change in the allelic structure of each chromosome can be observed as it is tracked through the pedigree. A striking feature of the transmission pattern is the high level of allelic structure that is built up and conservatively transmitted in particular chromosomes, while others lack any such structure. The allelic structure is primarily in the form of linkage blocks that fall into two general categories: ancestral linkage blocks that are combined through recombination and subsequently transmitted in conserved form, and novel linkage blocks introduced from Fiskeby-V.

• As an example, four chromosomes that highlight this build-up of allelic structure are shown in Figures 3a to 3d. The graphical genotype profiles depict the allelic series (different alleles are shown in different colours) at each microsatellite locus (map positions are based on the composite map in SoyBase). The boxes mark either regions of high allelic structure that is conservatively transmitted (Figures 3a and 3b) or specific linkage blocks that trace back to Fiskeby-V, an early-maturity plant introduction (Figures 3c and 3d). By identifying these regions, breeders can gain insight into which genomic regions are being manipulated through the breeding process or, in the case of Fiskeby-V, identify alleles or linkage blocks that have been mined from key ancestors.
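The idea of tracing a linkage block back to an ancestor can be sketched as a search for runs of consecutive markers at which a derived cultivar carries the same allele as that ancestor. This is an illustrative simplification: matching alleles are identical by state, which does not by itself prove descent from the ancestor, and the function name, the `min_markers` threshold and the toy data are assumptions, not part of the study.

```python
def shared_blocks(positions_cm, derived, ancestor, min_markers=2):
    """Return (start_cM, end_cM, n_markers) for each run of consecutive markers
    at which the derived line carries the same allele as the ancestor."""
    blocks, run = [], []
    for pos, d_allele, a_allele in zip(positions_cm, derived, ancestor):
        if d_allele is not None and d_allele == a_allele:
            run.append(pos)
            continue
        # A mismatch or missing call ends the current run.
        if len(run) >= min_markers:
            blocks.append((run[0], run[-1], len(run)))
        run = []
    if len(run) >= min_markers:
        blocks.append((run[0], run[-1], len(run)))
    return blocks


# Toy chromosome: markers roughly every 10 cM, alleles coded as integers.
positions = [0, 10, 20, 30, 40, 50, 60]
derived_line = [1, 1, 2, 3, 3, 3, 4]
ancestor_line = [1, 1, 5, 3, 3, 3, 9]
print(shared_blocks(positions, derived_line, ancestor_line))
```

Repeating the comparison for each ancestor and each generation gives the kind of block-by-block view that a graphical genotype presents visually.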

Microsatellite Scan for Selection

Graph: genome scan for selection across chromosomes 1 to 20, with series labelled Ancestors and Current; coloured arrows mark SSR markers significant for LN[(RH)], Fst, or both.

| Chromosome | SSR marker | Detected with | Marker map position (cM) | Trait QTL | QTL map position interval (cM) |
|---|---|---|---|---|---|
| 1 | Satt531 | LN[(RH)] | 40.86 | Protein | 39.80-41.80 |
| 2 | Satt274 | LN[(RH)] | 116.34 | Oil | 115.35-117.35 |
| 3 | Satt387 | LN[(RH)] | 53.25 | Multi QTL (yield/plant height/pod maturity) | 52.25-54.25 |
| 4 | Satt578 | LN[(RH)] | 65.08 | Multi QTL (protein/pod maturity) | 64.08-66.08 |
| 4 | Satt524 | LN[(RH)] | 120.1 | N/A | N/A |
| 5 | Satt050 | LN[(RH)] | 46.45 | Leaf width | 45.50-47.50 |
| 6 | Satt319 | Fst | 113.4 | E1 gene | 113 |
| 6 | Satt357 | Both | 151.91 | SDS | 149-151 |
| 9 | Satt242 | Both | 14.35 | Protein | 5.80-25.80 |
| 10 | Satt358 | LN[(RH)] | 5.44 | Multi QTL (yield/seed weight/protein) | 4.44-6.44 |
| 11 | Satt426 | Fst | 28.3 | Multi QTL (protein/oil) | 28.17-30.17 |
| 11 | Satt332 | Fst | 80.9 | Seed weight | 80.3-82.31 |
| 12 | Satt568 | LN[(RH)] | 27.6 | N/A | N/A |
| 12 | Satt442 | LN[(RH)] | 46.95 | Leaf width | 45.95-47.95 |
| 12 | Satt469 | Fst | 58.9 | Corn earworm | 53.40-61.30 |
| 15 | Satt369 | Fst | 56.3 | Leaf shape | 55.30-57.30 |
| 15 | Satt230 | Fst | 71.3 | Plant height | 69.23-71.23 |
| 16 | Satt249 | Both | 11.74 | Multi QTL (yield/isoflavone) | 10.70-12.70 |
| 17 | Satt186 | Fst | 105.4 | Yield | 104.15-106.15 |
| 17 | Satt386 | Fst | 125 | Sclerotinia | 123.31-125.31 |
| 19 | Satt561 | Fst | 71.4 | Multi QTL (yield/oil) | 70.44-72.44 |
| 20 | Sat104 | Both | 65.6 | N/A | N/A |

• The graph and table summarize the results from the genome scan for selection. The coloured arrows in the graph indicate specific SSR markers that were significant for either or both of the test statistics.

• In a reverse genetics approach, 19/22 loci mapped to regions of previously identified QTL on SoyBase, with the QTL spanning a wide range of traits, especially adaptation-related traits (e.g. maturity and disease resistance).
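The comparison of significant markers against mapped QTL can be sketched as a simple interval check. The rows below are copied from the results table; the function names and the optional `tolerance_cm` padding are assumptions for illustration, and the criterion actually used to call a marker as falling within a QTL region is not specified in this poster.

```python
def within_interval(marker_cm, interval, tolerance_cm=0.0):
    """True if the marker lies inside the QTL interval, padded by tolerance_cm."""
    low, high = interval
    return (low - tolerance_cm) <= marker_cm <= (high + tolerance_cm)


def markers_in_qtl(rows, tolerance_cm=0.0):
    """Names of markers that have a QTL interval and fall within it."""
    return [
        name
        for name, pos, qtl in rows
        if qtl is not None and within_interval(pos, qtl, tolerance_cm)
    ]


# (marker, marker position in cM, QTL interval in cM or None if no QTL was listed)
rows = [
    ("Satt531", 40.86, (39.80, 41.80)),
    ("Satt524", 120.1, None),
    ("Satt242", 14.35, (5.80, 25.80)),
    ("Satt357", 151.91, (149.0, 151.0)),
    ("Satt469", 58.9, (53.40, 61.30)),
]
print(markers_in_qtl(rows))                    # strict containment
print(markers_in_qtl(rows, tolerance_cm=1.0))  # allow 1 cM of slack
```

A small tolerance is useful because marker and QTL positions come from different maps and rarely agree to the last decimal; with strict containment, a marker such as Satt357 falls just outside its listed interval.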

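Figure 2: Dendrogram of genetic relatedness among members of OAC Bayfield’s pedigree. Clade labels: Ancestral Group 1, Ancestral Group 2, Ancestral Group 3, Food Grade/MG II, Yield/Protein/MG 00-0, Yield/Oil/MG 00-1.

Figure 3a: Chromosome 1
Figure 3b: Chromosome 7
Figure 3c: Chromosome 8
Figure 3d: Chromosome 16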

References

[1] Shoemaker, R.C., R.D. Guffy, L.L. Lorenzen, and J.E. Specht. 1992. Molecular genetic mapping of soybean: Map utilization. Crop Sci. 32:1091-1098.
[2] Schlotterer, C. 2002. A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics 160(2):753-763.
[3] Casa, A.M., S.E. Mitchell, M.T. Hamblin, H. Sun, J.E. Bowers, A.H. Paterson, C.F. Aquadro, and S. Kresovich. 2005. Diversity and selection in sorghum: simultaneous analysis using simple sequence repeats. Theor. Appl. Genet. 111:23-30.

Contact: [email protected]
