Genomic diversity and population structure in switchgrass, Panicum virgatum:
Genotyping-by-sequencing and population genomics
Geoff Morris*, Paul Grabowski, Justin Borevitz
Dept. of Ecology and Evolution
University of Chicago
Genomic diversity and population structure
• Geographic patterns of genomic diversity reflect: drift, migration, and adaptation
• Genomic diversity: nucleotide variation and insertions/deletions across many loci in the nuclear and organellar genomes.
• Leads to design of mapping populations for quantitative genetics and molecular breeding
Genomic diversity and natural history
Emerson et al. PNAS 2010
Example: Pitcher plant mosquito (Wyeomyia smithii)
Ecotypic diversity in switchgrass
• Switchgrass and other wide-ranging grassland species have many ecotypes
• Great variability in size, shape, color, and habitat preference
• Example: Upland/lowland divergence
– Upland (Michigan): adapted to shorter growing season, drier climates
– Lowland (Oklahoma): adapted to long growing season, wet climates
Effects of ecotype diversity on productivity
• Three-year plot (6 m²) experiment at Fermilab
• ~20% overyield in switchgrass mixtures compared to monocultures
“Genomic diversity and population structure in switchgrass, Panicum virgatum: from the continental scale to a dune landscape”
Morris, Grabowski, and Borevitz
Accepted, Molecular Ecology
Biogeography of Indiana Dunes flora
Coastal Plain flora: e.g. Seaside spurge, Marramgrass
Boreal flora: e.g. Jack Pine, Bearberry
Great Plains flora: e.g. Sandreed, Little Bluestem
Eastern deciduous flora: e.g. Tulip tree
Recolonized post-glaciation: ~10,000 years ago
Switchgrass gene pools
Zhang et al. 2011
?
Landscapes in Indiana Dunes
Landscape features are dynamic and can be dated:
• 100s – 1000s of years for dunes
• 10s – 100s of years for blowouts
Big blowout ~ 150 years old
Study questions
• Can switchgrass population structure be confirmed with a genome-wide sample of non-ascertained markers?
• In a hierarchical sample of switchgrass, how much diversity is there on a landscape, regional, and continental scale?
• Did multiple switchgrass gene pools contribute to the Indiana Dunes populations?
• Is there genomic diversity in a single landscape feature (blowout)?
• Is there local (private) genetic diversity in the Indiana Dunes?
Switchgrass plant samples
• Switchgrass cultivated varieties (cultivars)
– Kanlow (Oklahoma - lowland)
– Blackwell (Oklahoma - upland)
– High Tide (Maryland - Coastal)
– Forestburg and Sunburst (South Dakota)
– Dacotah (North Dakota)
– Cave-in-Rock (Illinois)
– Southlow (Southern Michigan “ecopool”)
• Indiana Dunes switchgrass
– Big Blowout
– Jack pine savanna
– Interdune
Problems with traditional marker systems
• Locus sampling:
– Typically only a few kb are sequenced in a few loci (rDNA, cp introns)
– Large stochastic error and locus-specific bias
– e.g. Plant chloroplast has a 100X lower rate of evolution than animal mitochondria
• Ascertainment bias:
– Occurs whenever markers are discovered and typed separately
– Worst when the ascertainment panel is a geographically restricted subpopulation
– e.g. Inferred genetic diversity in Africans is spuriously low when European markers are used
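The ascertainment-bias point can be illustrated with a toy calculation. The sketch below is not from the study: the allele frequencies are made up, and the 5% minor-allele cutoff and all function names are assumptions chosen for illustration. It compares the mean expected heterozygosity of a target population over all loci with the mean over only those loci that look polymorphic in a separate discovery panel.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def mean_heterozygosity(freqs):
    """Average expected heterozygosity over a list of allele frequencies."""
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)


# Made-up allele frequencies per locus: (discovery panel, target population).
LOCI = [
    (0.50, 0.50),
    (0.40, 0.05),
    (0.30, 0.00),
    (0.00, 0.50),
    (0.02, 0.40),
    (1.00, 0.60),
]


def ascertain(loci, min_maf=0.05):
    """Keep only loci that look polymorphic in the discovery panel."""
    return [(a, b) for a, b in loci if min_maf <= a <= 1.0 - min_maf]


# Diversity of the target population over every locus vs. panel-chosen loci.
all_target = mean_heterozygosity([b for _, b in LOCI])
ascertained_target = mean_heterozygosity([b for _, b in ascertain(LOCI)])

print(f"all loci: {all_target:.3f}, ascertained loci: {ascertained_target:.3f}")
```

In this toy data the loci that vary only in the target population are never discovered, so the ascertained estimate comes out lower than the estimate over all loci.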
Genomic diversity from de novo sequencing
1) PstI digest of genomic DNA
2) End-polish, blunt-end ligation; Illumina barcodes
3) PCR amplify and pool fragments from multiple samples
4) Assemble and map reads to “stacks” and call SNPs
• Reduced representation + multiplexing = more samples
• 10,000+ candidate SNPs
• No reference genome needed
• Data here from 76 or 100 bp paired-end reads
• 40 billion base pair data set
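The PstI digest that starts the library preparation can be mimicked in silico. This is only a sketch, not part of the authors' pipeline: the function names and the toy sequence are assumptions. The one fact it relies on is that PstI recognizes CTGCAG and cuts the top strand after the A (CTGCA^G).

```python
PSTI_SITE = "CTGCAG"  # PstI recognition sequence


def find_sites(sequence, site=PSTI_SITE):
    """Return 0-based start positions of every occurrence of the site."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def digest(sequence, site=PSTI_SITE, cut_offset=5):
    """Split the top strand at each site; PstI cuts CTGCA^G, 5 bases into the site."""
    sequence = sequence.upper()
    cuts = [pos + cut_offset for pos in find_sites(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy sequence with two PstI sites (hypothetical, for illustration only).
toy = "AAACTGCAGTTTTCTGCAGGG"
print(find_sites(toy))
print(digest(toy))
```

Only the fragment ends next to a site are sequenced, so the number of loci sampled scales with the number of PstI sites rather than with genome size.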
Plastome sequence in RRLs
• Nuclear whole genome shotgun sequence is too light (<<1X) for assembly
• Plastome WGS is very high (>>1X)
1) PstI digest of genomic DNA, with star activity and random shearing
2) End-polish, blunt-end ligation
Analysis of chloroplast data
• Chloroplast genome sequence (plastome) included in data
• Random (shotgun) sequence + 20 PstI sites
• Switchgrass chloroplast reference available (Upland and Lowland)
• Mapped reads to both ~140,000 base pair chloroplast genomes
• Coverage (# of times each position is read): 1X – 786X
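Coverage as defined above (the number of reads overlapping each position) can be computed from read alignments with a difference array. The sketch assumes alignments are given as 0-based, end-exclusive `(start, end)` pairs; that representation, the function name, and the toy reads are illustrative and do not reflect the mapping software used in the study.

```python
def coverage(reference_length, alignments):
    """Per-position read depth from (start, end) alignments (0-based, end-exclusive)."""
    diff = [0] * (reference_length + 1)
    for start, end in alignments:
        diff[start] += 1  # a read begins covering here
        diff[end] -= 1    # and stops covering here
    depth, running = [], 0
    for delta in diff[:reference_length]:
        running += delta
        depth.append(running)
    return depth


# Three hypothetical reads on a 10 bp reference.
toy_alignments = [(0, 4), (2, 6), (3, 8)]
depth = coverage(10, toy_alignments)
print(depth, "max depth:", max(depth))
```

The same per-position depth vector is what a coverage-versus-position plot displays.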
Chloroplast coverage and polymorphisms
[Figure: chloroplast genome coverage plotted against position (kb)]
Chloroplast phylogeny
• Neighbor joining tree based on 140kb
• Named haplogroups have >50% bootstrap
• Unfilled lines indicate low-coverage sample
Population analysis of nuclear loci
• Create “pseudoreference” of RRL loci with de novo assembly
• Map reads to pseudoreference to create stacks (150-1500 reads)
• Map reads to switchgrass chloroplast and sorghum mitochondria, and drop stacks that match organelles
• Select single-nucleotide variants that:
– Have high sequence quality (Phred-based error probability < 0.001, i.e. Q > 30, for both alleles)
– Vary in frequency across samples (chi-square test, p < 0.01)
– Are nearest to restriction site, closest to beginning of read
• Randomly select one allele per sample (weighted by observed frequency)
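The final step, drawing one allele per sample weighted by its observed frequency, can be sketched as follows. The nested-dictionary layout of read counts, the function names, and the fixed seed are assumptions made for illustration; the authors' pipeline may store and sample the data differently.

```python
import random


def sample_allele(read_counts, rng):
    """Pick one allele at a locus, weighted by the reads supporting each allele.

    read_counts maps allele -> number of reads observed for that allele.
    """
    alleles = sorted(read_counts)
    weights = [read_counts[a] for a in alleles]
    return rng.choices(alleles, weights=weights, k=1)[0]


def pseudo_haploid(genotypes, seed=0):
    """Draw one allele per sample per locus.

    genotypes has the (hypothetical) layout {sample: {locus: {allele: count}}}.
    """
    rng = random.Random(seed)
    return {
        sample: {locus: sample_allele(counts, rng) for locus, counts in loci.items()}
        for sample, loci in genotypes.items()
    }


# Toy read counts for two samples at two loci.
toy = {
    "s1": {"loc1": {"C": 3, "T": 7}, "loc2": {"A": 12}},
    "s2": {"loc1": {"C": 9, "T": 0}, "loc2": {"A": 4, "G": 5}},
}
print(pseudo_haploid(toy))
```

Sampling a single allele avoids having to call heterozygous genotypes from low read counts; repeating the draw with different seeds and averaging the results gives a sense of how much the sampling step itself contributes to downstream variation.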
Coding sequence variation in the chloroplast
• 77 coding genes in chloroplast (including Rubisco, ribosome, etc)
– 60kb of coding sequence
• Constraint on non-synonymous (NS) vs. synonymous (S) variation provides biological validation for SNPs
• Upland vs. Lowland (~1 million years):
– 23 NS : 16 S (ratio = 1.4)
• Within upland (< 0.5 million years):
– 16 NS : 3 S (ratio = 5.3)
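The two ratios can be checked directly from the substitution counts given above. The function name is illustrative only.

```python
def ns_to_s_ratio(nonsynonymous, synonymous):
    """Ratio of non-synonymous to synonymous substitution counts."""
    return nonsynonymous / synonymous


between = ns_to_s_ratio(23, 16)  # upland vs. lowland
within = ns_to_s_ratio(16, 3)    # within upland

print(f"Upland vs. lowland: {between:.1f}")  # 1.4
print(f"Within upland: {within:.1f}")        # 5.3
```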
Nuclear genome: Multidimensional scaling
~11000 nuclear loci, mean of 100 random allele samples
Nuclear loci: Structure analysis
Bayesian clustering algorithm; ~11000 nuclear loci, random allele sample; burn-in 10K, run 10K
Conclusions
• Confirmed upland vs. lowland differentiation and differentiated a local population using non-ascertained markers
• Lake Michigan switchgrass is distinct from the broader upland population in the Midwest and Great Plains
• Post-glacial gene flow into the Indiana Dunes included genotypes from across the Great Plains and Midwest
• The chloroplast diversity in the Indiana Dunes did not evolve in the current midwestern population, but originated one or more glacial cycles ago
• A single blowout in the dunes can have as much chloroplast diversity as the Midwest
New GBS methods for population genomics
• For true population analysis we need 10+ individuals in multiple populations
• Illumina multiplexing is too expensive – separate prep cost for each library adds $100s/sample
• Read count overdispersion (variance up to ~200X greater than Poisson) requires technical replicates to even out counts
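One common way to quantify read count overdispersion is the variance-to-mean ratio, which is about 1 for Poisson-distributed counts and larger when counts are overdispersed. The sketch below uses made-up counts, and the helper names are assumptions; it also shows the simplest form of pooling technical replicates by summing per-locus counts.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, > 1 when overdispersed."""
    return variance(counts) / mean(counts)


def pool_replicates(replicates):
    """Sum per-locus read counts across technical replicates of one sample."""
    return [sum(locus_counts) for locus_counts in zip(*replicates)]


# Hypothetical read counts at five loci in one library.
even = [9, 10, 11, 10, 10]
uneven = [1, 2, 40, 3, 4]

print(dispersion_index(even))    # well below 1
print(dispersion_index(uneven))  # far above 1

# Two hypothetical technical replicates of the same sample, three loci each.
print(pool_replicates([[1, 2, 40], [30, 5, 2]]))
```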
• Sticky-end ligation increases specificity and removes random sequence (including plastome)
Genotype-By-Sequencing (GBS)
Based on Elshire et al. 2011, PLoS ONE
GBS on continental + dunes switchgrass
New population genomic studies with GBS
1. Continental population structure (126 individuals)
– 50/50 deep diversity and shallow diversity based on chloroplast markers and SSRs
2. Tetraploid cultivars (24 each for TX, OK, NE, ND cultivars)
– Ploidy differences may be confounded with genetic diversity
– High sample size should allow traditional pop gen analyses (Fst etc.)
3. Dune half-sibs (4 mothers and 10 offspring each)
– True SNPs will segregate in the offspring while homeologous substitutions will not
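The logic of the half-sib design can be sketched as a simple classifier: a true SNP should produce more than one genotype class among a mother's offspring, whereas a homeologous substitution (a fixed difference between subgenomes) should make every offspring look heterozygous. The genotype labels ('AA', 'AB', 'BB'), the category names, and the function are hypothetical, and a real analysis would need to allow for genotyping error and chance.

```python
def classify_locus(offspring_genotypes):
    """Classify a locus from offspring genotype calls ('AA', 'AB', 'BB').

    'segregating'        -> more than one genotype class among offspring
    'fixed_heterozygous' -> every offspring looks heterozygous, the pattern
                            expected from a homeologous substitution
    'monomorphic'        -> every offspring has the same homozygous call
    """
    classes = set(offspring_genotypes)
    if len(classes) > 1:
        return "segregating"
    if classes == {"AB"}:
        return "fixed_heterozygous"
    return "monomorphic"


# Hypothetical calls for the 10 offspring of one mother at three loci.
print(classify_locus(["AA", "AB", "AB", "BB", "AB", "AA", "AB", "BB", "AB", "AA"]))
print(classify_locus(["AB"] * 10))
print(classify_locus(["AA"] * 10))
```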
Bioinformatics overview
• No software package for population genomic analysis on GBS
• Stacks (U. Oregon) comes closest but multinomial sampling model expects high frequency SNPs (e.g. mapping population)
• Buckler lab TASSEL package (Java) may be appropriate
• We’ve been using a custom pipeline (CLC, MySQL, R) for analysis
– http://create.ly/gefxsub43