Genotyping by sequencing on barley with Ion PGM™ Sequencer

Genotyping by Sequencing (GBS) on Barley using Ion PGM™ Sequencer: a Feasibility Study Alexander Sartori, Ph.D. Market Development, Agriculture Biotechnology May 2012 (Rev. B) www.lifetechnologies.com/gbs


Drs. Nils Stein (IPK, Gatersleben, Germany) and Jesse Poland (USDA-ARS, Manhattan, KS, USA) partnered with Life Technologies to develop a protocol for plant genotyping by sequencing (GBS) in barley using two restriction enzymes. From their first pilot, they concluded that the Ion PGM™ Sequencer has great potential for large GBS studies due to its high SNP calling accuracy, attractive cost per sample, and unmatched speed in the sequencing workflow. Whether used to discover and identify SNPs or to screen panels of thousands of known markers, GBS using next-generation sequencing technologies is becoming increasingly important as a cost-effective tool for association studies and genomics-assisted breeding in a range of plant species, including those with complex genomes that lack a reference sequence. The Ion PGM™ Sequencer is well suited to GBS due to its scalability, simplicity, and speed.

Transcript of Genotyping by sequencing on barley with Ion PGM™ Sequencer

Page 1: Genotyping by sequencing on barley with Ion PGM™ Sequencer

Genotyping by Sequencing (GBS) on Barley using Ion PGM™ Sequencer: a Feasibility Study

Alexander Sartori, Ph.D.
Market Development, Agriculture Biotechnology
May 2012 (Rev. B)
www.lifetechnologies.com/gbs

Presenter
Presentation Notes
This work presents the results of a collaboration with leading researchers in the barley community to co-develop a Genotyping by Sequencing workflow using the Ion PGM™ Sequencer. The partners in this project are Dr. Nils Stein at IPK in Gatersleben, Germany, and Dr. Jesse Poland, a USDA researcher at Kansas State University in the USA. The following slides cover the first phase of the collaboration, in which the partners tested basic feasibility and the adaptation to Ion semiconductor technology on a small subset of samples and at low throughput.
Page 2: Genotyping by sequencing on barley with Ion PGM™ Sequencer

6/5/2012 | Life Technologies™ Proprietary and confidential | For Research Use only, not intended for diagnostic purposes

Molecular Markers in Crop Plants

Goal: Development of crops with improved traits (e.g. drought tolerance, higher yield)

− Discovery of high-density molecular markers in crops required for better understanding of genetics of complex traits for breeding

− Approach: whole genome association studies and genomic selection

Challenge for the development of molecular markers in crops like barley and wheat

− massive, complex genomes; no complete genome sequence available to date

− barley genome ~5.5 Gb (diploid)

− wheat genome ~16 Gb (hexaploid)

NGS has greatly increased SNP discovery in crop plant species

− e.g. rice, maize, soybean, sorghum, even in the wheat predecessor, Aegilops tauschii

Presenter
Presentation Notes
The development of molecular markers and genomic resources in barley and wheat has always been a challenge due to their massive, complex, and, in the case of wheat, polyploid genomes. The diploid barley genome is over 5.5 Gb, and the hexaploid wheat genome is roughly three times larger at 16 Gb. The development of new sequencing technologies has greatly increased the discovery of SNPs in many species, including important model and non-model crop plants such as rice, maize, and soybean. SNP discovery in the wheat D-genome predecessor, Aegilops tauschii, was recently completed using next-generation sequencing, marking a step forward for SNP markers in large and complex genomes. The discovery of high-density molecular markers in crop species will lead to a better understanding of the genetic architecture of complex traits and to its application in breeding programs for crop improvement through whole genome association studies and genomic selection.
Page 3: Genotyping by sequencing on barley with Ion PGM™ Sequencer


GBS Approach

Genotyping by Sequencing (GBS) – a simple but robust approach for complexity reduction in large genomes combined with multiplex sequencing

GBS targets the genomic sequence flanking restriction enzyme sites

GBS is similar to RAD (restriction-site associated DNA) tagging but

GBS has greatly simplified library construction that …

− … requires less DNA and avoids random shearing

− … is completed in two steps followed by PCR of the pooled library

For barley the original GBS protocol [1] was extended to a two-restriction-enzyme system [2]

Here we describe a GBS feasibility study using the Ion Torrent PGM™ Sequencer

[1] Elshire et al. (2011) A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species. PLoS ONE 6(5): e19379. doi:10.1371/journal.pone.0019379

[2] Poland et al. (2012) Development of High-Density Genetic Maps for Barley and Wheat Using a Novel Two-Enzyme Genotyping-by-Sequencing Approach. PLoS ONE 7(2): e32253. doi:10.1371/journal.pone.0032253

Presenter
Presentation Notes
Genotyping by Sequencing is a method that reduces the enormous complexity of the genomes of many crop plants, such as wheat and barley, which not only lack complete reference sequences but also have large genomes, often with ploidy levels higher than two. In the GBS approach the genome is cleaved with a species-specific restriction enzyme, or with a set of two enzymes as in the case of barley, and the sequences flanking the restriction sites are then targeted. GBS is comparable to RAD (restriction-site associated DNA) tagging, but with a simplified library construction concept and a lower required amount of starting material (gDNA). In this work we refer to two related publications in PLoS ONE that describe the original protocol for GBS and its adaptation for barley, respectively. Jesse Poland, one of our partners in this collaboration, was a co-author of the first paper and first author of the second. The original protocol needed only slight modification to adapt it to Ion semiconductor sequencing technology.
Page 4: Genotyping by sequencing on barley with Ion PGM™ Sequencer


GBS Principle

Simplified workflow chart for GBS library preparation using two restriction enzymes for barley; (1) plant gDNA cleavage using PstI and MspI for desired restriction fragments, (2) ligation of specific and common adapters, and (3) fragment pre-amplification followed by NGS on Ion PGM™ Sequencer


Presenter
Presentation Notes
This flowchart illustrates the principle of GBS library construction followed by Ion PGM™ sequencing. The first panel shows how genomic barley DNA is cut by the two selected restriction enzymes, with MspI being a frequent cutter and PstI being a rare cutter, indicated by the red and yellow triangles, respectively. For the desired depth of sequencing information, only those fragments carrying both restriction sites are ‘wanted’. To enrich them, two kinds of double-stranded adapters are ligated to the ends, one specific to each restriction site. One of the two adapters also contains the DNA barcode sequences required for sample multiplexing. Panels 2 and 3 are simplified for demonstration purposes. The actual, slightly more complex adapter design helps to prevent the amplification of unwanted side products, which are mainly fragments with the same restriction site at both ends. The ligation product is then amplified for a few cycles to give the final library, which is pooled for subsequent emulsion PCR and Ion PGM™ sequencing (panel 3).
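The selection of fragments that carry one PstI end and one MspI end can be illustrated with a small in-silico digest. This is only a sketch and not part of the study's protocol: the toy sequence and the function names are made up for illustration, a complete digest is assumed, and fragment boundaries are reported as recognition-site start positions rather than exact cut positions. The recognition sites themselves (CTGCAG for PstI, CCGG for MspI) are the standard ones.

```python
import re

PSTI_SITE = "CTGCAG"  # PstI recognition site (rare cutter)
MSPI_SITE = "CCGG"    # MspI recognition site (frequent cutter)


def find_sites(sequence, site, enzyme):
    """Return (position, enzyme) for every occurrence of a recognition site."""
    # The lookahead reports overlapping occurrences as well.
    return [(m.start(), enzyme) for m in re.finditer(f"(?={site})", sequence)]


def mixed_end_fragments(sequence):
    """Return (start, end) spans bounded by one PstI site and one MspI site."""
    sites = sorted(
        find_sites(sequence, PSTI_SITE, "PstI")
        + find_sites(sequence, MSPI_SITE, "MspI")
    )
    # After a complete digest, only neighbouring sites bound a fragment.
    return [
        (left_pos, right_pos)
        for (left_pos, left_enzyme), (right_pos, right_enzyme)
        in zip(sites, sites[1:])
        if left_enzyme != right_enzyme
    ]


# Hypothetical toy sequence, not barley data.
toy_sequence = "AACCGGTTTTCTGCAGAAAACCGGTTCCGGAA"
print(mixed_end_fragments(toy_sequence))
```

In the toy sequence the MspI-MspI fragment between the last two CCGG sites is dropped, which mirrors the 'unwanted' same-site fragments described in the notes.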
Page 5: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Barley GBS using Ion PGM™ Sequencer

Feasibility study

4 barley samples; 2 parental, 2 DH-derived lines

Sample prep using custom protocol

Individual library preparation (previous slide)

Multiplexed sequencing (barcodes)

− Ion PGM™ Sequencer, Ion 316™ Chip, 200 bp sequencing

2-day protocol

− Day 1: Library prep, template prep (Ion OneTouch™ System)

− Day 2: Enrichment (Ion OneTouch™ ES), Ion PGM™ Sequencing

Presenter
Presentation Notes
The aim of the study was to demonstrate the technical feasibility of the application. For this purpose a simple experimental setup was chosen, consisting of 4 barley samples of known genotypes, of which two are parental lines and the other two are doubled haploid (DH) lines. The four libraries were pooled and templates were prepared with the Ion OneTouch™ System. Sequencing of the four samples was conducted on one Ion 316™ Chip using the 200-bp chemistry. All steps of the workflow were completed within two days, and this duration will not change even with increased sample numbers. For more on doubled haploidy, see: http://en.wikipedia.org/wiki/Doubled_haploidy
Page 6: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Sequencing and SNP Results

~200 Mb Q20 sequence

− approx. 500 k restriction fragments sequenced at 200 bp per sample

− 1-fold base coverage achieved in this study

Good sample separation through barcodes

− >90% barcodes separated

− Barcode sequence followed by exact match to restriction site

Roughly 5,000 SNPs per sample called

− SNP agreement >99.5% between Ion PGM™ Sequencer and Illumina HiSeq System (currently used NGS platform)

− Customer statement: “Concordance is as high as between runs on our platform”

Technical feasibility acknowledged
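The barcode check listed above (barcode sequence followed by an exact match to the restriction site) can be sketched as a simple prefix test. This is a hedged illustration, not the software used in the study: the barcode sequences and sample names are hypothetical, and it is assumed that the barcoded adapter sits on the PstI end so that each read continues with the PstI remnant TGCAG directly after its barcode.

```python
PSTI_REMNANT = "TGCAG"  # assumed bases at the insert start after PstI cleavage

# Hypothetical barcodes; the study's actual barcode sequences are not given.
BARCODES = {
    "CTCG": "parent_1",
    "TGAC": "parent_2",
    "AAGT": "dh_line_1",
    "GCTA": "dh_line_2",
}


def assign_sample(read, barcodes=BARCODES, remnant=PSTI_REMNANT):
    """Return (sample, insert) if the read starts with barcode + remnant."""
    for barcode, sample in barcodes.items():
        if read.startswith(barcode + remnant):
            # Trim the barcode but keep the restriction-site remnant.
            return sample, read[len(barcode):]
    return None


def demultiplex(reads):
    """Group inserts by sample and count reads that cannot be assigned."""
    by_sample = {}
    unassigned = 0
    for read in reads:
        hit = assign_sample(read)
        if hit is None:
            unassigned += 1
            continue
        sample, insert = hit
        by_sample.setdefault(sample, []).append(insert)
    return by_sample, unassigned


# Hypothetical reads: two valid, one without the remnant, one without a barcode.
example_reads = [
    "CTCGTGCAGAATT",
    "TGACTGCAGCCGA",
    "CTCGAAAAATTTT",
    "GGGGTGCAGAAAA",
]
example_by_sample, example_unassigned = demultiplex(example_reads)
print(example_by_sample, example_unassigned)
```

Requiring the restriction-site remnant in addition to the barcode is a cheap way to reject reads whose adapter-to-insert junction is not as expected.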

Presenter
Presentation Notes
With about 200 Mb of Q20 sequence we achieved very high sequencing quality. The main analysis of the sequencing results was conducted by our collaboration partners at Kansas State University, and the results were compared to the known genotypes derived from sequencing on Illumina’s HiSeq System. As expected, approximately 500,000 restriction fragments were sequenced per sample at an average coverage of 1-fold. This value was one of the unknowns of the feasibility study and is meant to guide the scaling of the second project phase with a higher sample number. Barcode separation of the four samples worked well, and the adapter-to-insert junction showed the correct sequence in all reads. Our partners identified approximately 5,000 SNPs per sample, and SNP agreement with the previous results is >99.5%, i.e. the concordance between platforms is as high as that observed for different runs on the same platform. These results are very promising and prompted the collaboration to move to the second phase.
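One simple way to express SNP agreement between two platforms is the fraction of shared sites at which both call sets report the same allele. The sketch below assumes this definition; the notes do not state how the >99.5% figure was computed, and the site keys and calls shown are invented for illustration.

```python
def concordance(calls_a, calls_b):
    """Fraction of shared sites at which two call sets report the same call."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two call sets share no sites")
    agreeing = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return agreeing / len(shared)


# Hypothetical calls keyed by (contig, position); not data from the study.
platform_a_calls = {("c1", 10): "A", ("c1", 50): "G", ("c2", 7): "T", ("c3", 3): "C"}
platform_b_calls = {("c1", 10): "A", ("c1", 50): "G", ("c2", 7): "C", ("c4", 1): "T"}

print(f"{concordance(platform_a_calls, platform_b_calls):.3f}")
```

Sites called on only one platform are ignored here, so the value measures agreement where both platforms made a call, not overlap between the call sets.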
Page 7: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Data Analysis

Mapping/Alignment

− Torrent Suite Software v2.1; TMAP (Torrent Mapping Alignment Program)

− Input is SFF file format, output is SAMtools BAM file format

SNP Calling – Two approaches

1. KSU: TASSEL pipeline [1]

2. Life Technologies: SAMtools [2] mpileup (http://samtools.sourceforge.net/mpileup.shtml)

> Output is ‘Variant Call Format’ (VCF)

[1] More on GBS bioinformatics and TASSEL pipeline: Buckler Lab for Maize Genetics and Diversity, a USDA-ARS lab at Cornell’s Institute for Genomic Diversity (http://www.maizegenetics.net/)

[2] The Sequence Alignment/Map (SAM) Format and SAMtools: http://bioinformatics.oxfordjournals.org/content/early/2009/06/08/bioinformatics.btp352

Presenter
Presentation Notes
The raw reads were mapped against a barley assembly using the Torrent Suite Software v2.1. The assembly file was provided by our collaboration partners since there is no complete reference genome available for barley yet, an issue shared with other crops. SNP analysis was conducted by Dr. Poland at Kansas State University using the TASSEL pipeline developed for maize in Dr. Ed Buckler’s lab at Cornell University. In parallel, bioinformaticians at Life Technologies used the SAMtools mpileup pipeline. SNP positions were visualized in the IGV viewer. In this example we can see a homozygous A to G base change in three samples.
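Since the SAMtools route ends in a VCF file, SNP positions can be pulled out of that output with a few lines of parsing. The sketch below relies only on the standard tab-separated VCF columns (CHROM, POS, ID, REF, ALT); the example records are invented, and it is not the analysis code used by either group.

```python
def parse_snps(vcf_lines):
    """Yield (chrom, pos, ref, alt) for single-base substitutions in VCF text."""
    for line in vcf_lines:
        # Skip meta-information lines, the column header and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _record_id, ref, alt = fields[:5]
        for allele in alt.split(","):
            # Keep only one-base REF/ALT pairs, which drops indels and '.'.
            if len(ref) == 1 and len(allele) == 1 and allele != ".":
                yield chrom, int(pos), ref, allele


# Hypothetical VCF records; contig names and values are made up.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig_1\t120\t.\tA\tG\t50\t.\tDP=4",
    "contig_1\t300\t.\tAT\tA\t40\t.\tINDEL;DP=3",
    "contig_2\t15\t.\tC\tT,G\t60\t.\tDP=5",
]
example_snps = list(parse_snps(example_vcf))
for snp in example_snps:
    print(snp)
```

Multi-allelic records are split into one tuple per alternate allele so that each substitution can be compared across samples separately.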
Page 8: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Conclusions and Outlook

Promising results led to an extended study (phase 2; in progress)

Design:

− Increased sample number (2 × 24-plex pools)

− Increased coverage for higher SNP counts per sample

> Ion 318™ Chips, 200 bp sequencing

− Comparison of Life Tech sample prep solutions with customer protocol

Data to be compared to Illumina HiSeq results

Ion Semiconductor Sequencing has huge potential for large GBS studies:

− High SNP calling accuracy

− Highly competitive cost per sample

− Unmatched sequencing workflow speed

Presenter
Presentation Notes
In phase 1 we demonstrated the technical feasibility of the Ion PGM™ workflow, and the project team agreed to proceed to a second phase. The aim of phase 2 is to test the application with a higher sample number. Specifically, technical replicates of two library pools with 24 samples each will be sequenced on Ion 318™ Chips. This will increase the base coverage and lead to a higher number of SNP calls. Results will be compared with Illumina HiSeq data for the same set of samples in terms of SNP calling accuracy, cost efficiency and time to result. Phase 2 is expected to confirm what phase 1 suggested: Ion Semiconductor Sequencing is the ideal solution for the GBS application in large plant genomes.
Page 9: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Acknowledgements

Academic Partners

− Nils Stein, IPK 1), Gatersleben, Germany

− Jesse Poland, USDA-ARS 2), KSU 3), Manhattan, KS, USA

> Interview with Dr. Poland

Life Technologies, Europe

− Alain Rico, FALCON – Ion PGM™ Sequencing

− Robert Greither – Ion PGM™ Sequencing

− Samuel Thoraval – Bioinformatics support

1) The Leibniz Institute of Plant Genetics and Crop Plant Research
2) United States Department of Agriculture, Agricultural Research Service
3) Kansas State University

Presenter
Presentation Notes
We want to thank Dr. Nils Stein at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) in Gatersleben, Germany, and Dr. Jesse Poland at the USDA Agricultural Research Service at Kansas State University in Manhattan, Kansas, USA for the very productive and open collaboration and the willingness to proceed to the second phase of this project. Thanks also to Alain Rico, Robert Greither and Samuel Thoraval for the great internal support on this project.
Page 10: Genotyping by sequencing on barley with Ion PGM™ Sequencer


Life Technologies Products for GBS Studies

Ion Xpress™ Plus Fragment Library Kit

Ion PGM™ 200 Xpress™ Template Kit

Ion OneTouch™ 200 System Template Kit

Ion OneTouch™ System

Ion PGM™ Sequencer

Ion 316™ Chip Kit

Ion 318™ Chip Kit

Torrent Suite Software v2.1

Learn more

www.lifetechnologies.com/gbs

www.lifetechnologies.com/iontorrent

Presenter
Presentation Notes
This list of Life Technologies’ products that we used in this study will also help you to design your own Genotyping by Sequencing experiments. Also visit our new website on Plant Agricultural Biotechnology and the Ion Torrent web portal: www.lifetechnologies.com/gbs www.lifetechnologies.com/iontorrent