

  • Morphological and DNA evidence support recognition of Alabama chinquapin, Castanea pumila var. alabamensis

    M. Taylor Perkins (1), Tatyana Zhebentyayeva (2), Paul H. Sisco (3), J. Hill Craddock (1)

    (1) Department of Biology, Geology, and Environmental Science, The University of Tennessee at Chattanooga
    (2) The Schatz Center for Tree Molecular Genetics, Department of Ecosystem Science and Management, The Pennsylvania State University
    (3) The American Chestnut Foundation (Carolinas Chapter)

    American Chestnut Foundation Annual Meeting, October 26-28, 2018, Huntsville, Alabama

    Abstract

    Characterizing the genetic diversity of species within the North American Castanea is critical to restoration of the American chestnut (Castanea dentata) because such efforts provide clues regarding past patterns of hybridization between American chestnut and its sister species, North American chinquapin (C. pumila), as well as information regarding the partitioning of genetic diversity within the genus. Historical reports based on morphology and recent studies involving small numbers of molecular markers have posited that natural hybridization between American chestnut and Allegheny chinquapin has produced phenotypically intermediate populations in the Southeast. These reports have resulted in description of multiple putative hybrid taxa, including Castanea alabamensis, a taxon from northern Alabama. We investigated the hypothesized hybrid origin of C. alabamensis by performing sequence-based genotyping of plants representing American chestnut, Allegheny chinquapin, Ozark chinquapin, Japanese chestnut, Chinese chestnut, and several populations matching the taxonomic description of C. alabamensis. Alignment of Illumina reads to the Chinese chestnut reference genome v1.1 yielded 190,656 single nucleotide polymorphism (SNP) loci for analyses. Phylogenetic analysis indicated that C. alabamensis samples cluster in a distinct group within the broader chinquapin clade. Analyses of population structure provided evidence of limited hybridization between American chestnut and Allegheny chinquapin and more extensive admixture among different chinquapin varieties. C. alabamensis, however, exhibited no signature of American chestnut ancestry. Principal component analysis revealed the existence of four genetic clusters in our North American samples: American chestnut, Allegheny chinquapin, Ozark chinquapin, and C. alabamensis. Our results do not support the hybridization hypothesis, but instead suggest that C. alabamensis is a distinct variety of chinquapin, and we therefore refer to it as C. pumila var. alabamensis. The results of our study demonstrate the potential of high-throughput sequencing to uncover and characterize cryptic diversity in the North American Castanea species.

    Fig. 1. Type specimen of C. alabamensis. Collected in 1924 by WW Ashe and described as a species by Ashe (1925).

    [Fig. 3 tip labels: C. pumila var. pumila; C. mollissima; C. crenata; C. pumila var. ozarkensis; C. alabamensis; C. dentata. Two samples are marked with asterisks (*).]

    Materials and Methods

    Plant materials: We collected herbarium vouchers and leaf tissue for DNA extraction from C. dentata, C. pumila var. pumila, C. pumila var. ozarkensis, C. crenata, C. mollissima, and five populations matching the taxonomic description of C. alabamensis.

    Phenotyping: We made morphological comparisons of our samples to the entire collection of Castanea specimens (870 in total) at the UNC-Chapel Hill herbarium, which contains representatives of all extant Castanea species and the type specimens of C. alabamensis (Fig. 1). Specimens were assessed using trichome, leaf, twig, flower, and fruit characters that differentiate the various Castanea taxa (examples in Fig. 2).

    Illumina library prep and DNA sequencing: Illumina libraries were prepared using a modification of the genotyping-by-sequencing (GBS) protocol of Elshire et al. (2011), as described by Zhebentyayeva et al. (in press). Library prep included double digestion with the restriction enzymes PstI and MseI. GBS libraries were paired-end sequenced on an Illumina HiSeq 2500.

    Bioinformatics: Raw Illumina reads were processed using Stacks v1.45 software (Rochette and Catchen 2017). Demultiplexed reads were aligned to the C. mollissima reference genome v1.1 (https://www.hardwoodgenomics.org) using GSNAP software (Wu and Nacu 2010). We called single nucleotide polymorphisms (SNPs) against the C. mollissima reference genome and filtered the SNPs to retain only one SNP per locus, remove tightly linked SNPs, and ensure good coverage of the entire genome.

    Data analysis: We inferred a phylogeny of C. alabamensis and its congeners using maximum likelihood in RAxML software (Stamatakis 2014). We used STRUCTURE software (Pritchard et al. 2000) to detect admixture and population structure. To provide a second test of admixture, we performed principal component analysis (PCA) using the SNPRelate package (Zheng et al. 2012) in R software (R Core Team 2018).
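The SNP filtering step in the Bioinformatics paragraph (one SNP per locus, removal of tightly linked SNPs) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used Stacks. The record layout, the function name `thin_snps`, and the 1,000 bp minimum spacing are hypothetical choices for illustration, and physical distance is used here only as a simple stand-in for linkage.

```python
from collections import namedtuple

# Hypothetical SNP record: scaffold name, 1-based position, and GBS locus ID.
Snp = namedtuple("Snp", ["scaffold", "pos", "locus"])


def thin_snps(snps, min_dist=1000):
    """Keep at most one SNP per locus and drop SNPs closer than
    `min_dist` bp to the previously kept SNP on the same scaffold."""
    kept = []
    seen_loci = set()
    last_pos = {}  # scaffold -> position of the last SNP kept
    for snp in sorted(snps, key=lambda s: (s.scaffold, s.pos)):
        if snp.locus in seen_loci:
            continue  # one SNP per locus
        prev = last_pos.get(snp.scaffold)
        if prev is not None and snp.pos - prev < min_dist:
            continue  # too close to the last kept SNP: treated as tightly linked
        kept.append(snp)
        seen_loci.add(snp.locus)
        last_pos[snp.scaffold] = snp.pos
    return kept


# Made-up example records, not data from the study.
example = [
    Snp("scaffold1", 100, "L1"),
    Snp("scaffold1", 150, "L1"),    # same locus as the first SNP
    Snp("scaffold1", 600, "L2"),    # within 1,000 bp of the kept SNP at 100
    Snp("scaffold1", 5000, "L3"),
    Snp("scaffold2", 40, "L4"),
]
thinned = thin_snps(example)
print([(s.scaffold, s.pos) for s in thinned])
```

Sorting by scaffold and position first means each SNP only has to be compared with the last one kept on its scaffold, so the filter runs in a single pass.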


    Fig. 5. Partitioning of genetic variation in North American Castanea based on 138,722 SNPs. Dot colors correspond to the following taxa: blue = C. pumila var. pumila, green = C. pumila var. ozarkensis, black = C. alabamensis, and red = C. dentata.
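The PCA summarized in Fig. 5 was run with the SNPRelate package in R. As a rough illustration of the underlying computation only, the Python sketch below finds the first principal component of a small, made-up genotype matrix (samples by SNPs, coded as 0, 1, or 2 copies of an allele). The toy data, the function name, and the use of power iteration are assumptions for illustration, not the authors' code.

```python
def first_principal_component(genotypes, n_iter=200):
    """Return PC1 scores for a samples x SNPs matrix of allele counts.

    Columns are mean-centered and the leading eigenvector of the
    SNP-by-SNP cross-product matrix is found by power iteration.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    # Unnormalized covariance matrix (SNPs x SNPs): X^T X.
    cov = [[sum(r[a] * r[b] for r in centered) for b in range(m)]
           for a in range(m)]

    vec = [float(j + 1) for j in range(m)]  # arbitrary non-zero starting vector
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]

    # Project each centered sample onto the leading eigenvector.
    return [sum(r[j] * vec[j] for j in range(m)) for r in centered]


# Made-up toy data: two groups of three samples carrying different alleles.
toy = [
    [0, 0, 2, 2, 0],
    [0, 1, 2, 2, 0],
    [0, 0, 2, 1, 0],
    [2, 2, 0, 0, 1],
    [2, 2, 0, 0, 2],
    [2, 1, 0, 0, 2],
]
pc1 = first_principal_component(toy)
print([round(x, 2) for x in pc1])
```

In this toy case the first three samples and the last three fall on opposite sides of zero along PC1, which is the kind of separation a PCA plot makes visible. The sign of a principal component is arbitrary, so only the grouping matters.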

    [Fig. 4 group labels. Panel A: C. dentata; C. pumila sensu lato; E. Asian spp. Panel B: var. pumila; alabamensis and var. ozarkensis.]

    Fig. 2. Morphological features distinguishing C. alabamensis from C. dentata and C. pumila var. ozarkensis. (A) Eciliate leaf margin of C. dentata. (B) Ciliate leaf margin of C. alabamensis. (C) Abaxial leaf surface of C. pumila var. ozarkensis covered with stellate and simple trichomes. (D) Abaxial leaf surface of C. alabamensis lacking stellate trichomes. Scale bars = 5 mm.

    Results

    Phenotyping: Comparison of our samples to those from Ruffner Mtn., AL, and Floyd Co., GA, showed that the putative hybrids analyzed by Li and Dane (2013) and Shaw et al. (2012) match the taxonomic description of C. alabamensis (Ashe 1925). Plants matching the C. alabamensis morphology occur from northwestern Georgia to central Alabama.

    Sequencing and bioinformatics: A total of 385 million reads were obtained for 96 plants; 98% of these were retained as high-quality reads, an average of about 4 million reads per individual. In total, 190,656 genome-wide SNPs met the filtering criteria and were used for analyses.

    Phylogenetic inference: C. alabamensis samples clustered together as a distinct group within the broader chinquapin clade (Fig. 3).

    Population structure analysis: No genomic contribution from C. dentata was observed in C. alabamensis (Fig. 4A). STRUCTURE grouped C. alabamensis within C. pumila sensu lato (Fig. 4A). In an analysis of only chinquapins, STRUCTURE identified two distinct genetic groups: (1) C. pumila var. pumila and (2) C. pumila var. ozarkensis + C. alabamensis (Fig. 4B). Low to moderate admixture between these two groups was observed in most chinquapin populations.

    PCA: North American Castanea samples clustered into four discrete groups: C. dentata, C. pumila var. pumila, C. pumila var. ozarkensis, and the C. alabamensis samples. The first two principal components separated samples into distinct species (y-axis in Fig. 5) and botanical varieties within species (x-axis in Fig. 5).
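As a quick consistency check on the sequencing figures reported in the Results (385 million reads, 98% retained, 96 plants), the per-individual average can be recomputed. This is simple arithmetic on the poster's rounded numbers, so the result is approximate; the variable names are ours.

```python
total_reads = 385_000_000   # reads obtained, as reported (rounded)
retained_fraction = 0.98    # share retained as high-quality
n_plants = 96               # plants sequenced

retained_reads = total_reads * retained_fraction
reads_per_plant = retained_reads / n_plants

print(f"{retained_reads / 1e6:.1f} million reads retained")
print(f"{reads_per_plant / 1e6:.2f} million reads per plant on average")
```

The result, a little under 4 million reads per plant, agrees with the rounded average given in the Results.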

    Concluding Remarks

    Nonhybrid origin of C. alabamensis: Our results do not support the hypothesis of a hybrid origin for C. alabamensis, which contradicts the conclusions of Li and Dane (2013) and Shaw et al. (2012), who used smaller numbers of molecular markers to analyze morphologically intermediate plants from the same populations.

    Genetic affinities of C. alabamensis: The placement of C. alabamensis in all of our ancestry analyses indicates that this taxon falls within the chinquapin group, and is closely related to Ozark chinquapin, which aligns partially with Johnson’s (1988) assertion that these plants are Ozark chinquapin. We tentatively conclude that these plants are a distinct variety of chinquapin, and we therefore refer to them as C. pumila var. alabamensis.

    Future directions: Using genome-wide data, we have carried out the most thorough analysis of evolutionary relationships and admixture among North American Castanea to date. The ability of our genome-wide approach to detect admixture and fine population structure in Castanea species demonstrates that our methods can be applied to important questions regarding chestnut evolution and conservation. Future avenues include investigations into the nature of gene flow between different chinquapin varieties.

    Acknowledgments

    Foundation for the Carolinas, David and Judi Morris, Bruce and Francine Hutchinson, Glenda Frames, Dr. Jack Agricola, Will Calhoun, Dr. Jim Lacefield, Marty Schulman, Dr. Larry Brasher, Dr. Lisa W. Alexander, Ed Schwartzman, Dan Thornton, Dr. Stylianos Chatzimanolis, Tennessee Chapter-TACF, Alabama Chapter-TACF, Kendra Collins, Eric Evans, Dr. Penny Xia, Dr. Christopher Saski, the Clemson University Genomics and Computational Biology Laboratory.

    References

    Ashe (1925) Notes on woody plants. Quarterly of the Charleston Museum 1:28-32.
    Elshire et al. (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6:e19379.
    Johnson (1988) Revision of Castanea sect. Balanocastanon (Fagaceae). Journal of the Arnold Arboretum 69:25-49.
    Li and Dane (2013) Comparative chloroplast and nuclear DNA analysis of Castanea species in the southern region of the USA. Tree Genetics and Genomes 9:107-116.
    Pritchard et al. (2000) Inference of population structure using multilocus genotype data. Genetics 155:945-959.
    R Core Team (2018) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
    Rochette and Catchen (2017) Deriving genotypes from RAD-seq short-read data using Stacks. Nature Protocols 12:2640.
    Shaw et al. (2012) Phylogeny and phylogeography of North American Castanea Mill. (Fagaceae) using cpDNA suggests gene sharing in the Southern Appalachians (Castanea Mill., Fagaceae). Castanea 77:186-211.
    Stamatakis (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312-1313.
    Wu and Nacu (2010) Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics 26:873-881.
    Zhebentyayeva et al. (in press) Genetic characterization of world-wide Prunus domestica (plum) germplasm using sequence-based genotyping. Horticulture Research.
    Zheng et al. (2012) A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics 28:3326-3328.

    Fig. 3. Evolutionary relationships based on 103,616 genome-wide SNPs. Asterisks (*) indicate interspecific hybrids identified by STRUCTURE analysis. Scale bar = 0.03 substitutions per site.

    Fig. 4. Levels of admixture between Castanea taxa based on 583 SNPs. Sample site information along the lower edge of the plots indicates species, site code, and state. (A) Analysis of North American and E. Asian Castanea species. (B) Analysis using only North American chinquapins.
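Under its admixture model, STRUCTURE estimates for each plant the proportion of ancestry assigned to each of K genetic clusters, which is what the bars in Fig. 4 summarize. As a minimal sketch of how such proportions might be screened for admixture, the Python snippet below flags individuals whose largest ancestry proportion falls below a cutoff. The sample names, the proportions, and the 0.90 cutoff are invented for illustration and are not values or criteria from this study.

```python
def flag_admixed(q_matrix, cutoff=0.90):
    """Return names of samples whose largest ancestry proportion is below `cutoff`.

    `q_matrix` maps a sample name to its list of ancestry proportions,
    one per genetic cluster; each list is expected to sum to 1.
    """
    flagged = []
    for name, proportions in q_matrix.items():
        if abs(sum(proportions) - 1.0) > 1e-6:
            raise ValueError(f"proportions for {name} do not sum to 1")
        if max(proportions) < cutoff:
            flagged.append(name)
    return flagged


# Made-up ancestry proportions for K = 2 clusters.
toy_q = {
    "sample_A": [0.99, 0.01],
    "sample_B": [0.62, 0.38],
    "sample_C": [0.03, 0.97],
}
print(flag_admixed(toy_q))
```

Any fixed cutoff is a judgment call: a stricter value flags more individuals as admixed, so in practice it would need to be justified against the uncertainty of the ancestry estimates.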