Ken McGrath - Next Gen Sequencing - Game of Thrones edition
Next-Generation Sequencing: an overview of technologies and applications
July 2014
Ken McGrath, Australian Genome Research Facility
Next-Gen Sequencing Edition
• Current rulers of the “throne”
• Sequencing by synthesis
• Each cycle extends and reads a single base
• Reads of up to 2x300bp
Illumina Sequencing Technology: Robust Reversible Terminator Chemistry Foundation
[Figure: workflow from DNA (0.1-1.0 µg) through sample preparation, cluster growth, sequencing (cycles 1-9), image acquisition and base calling (T G C T A C G A T …)]
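As a rough illustration of the image acquisition and base calling steps, the sketch below assumes each cycle yields one intensity per fluorescent channel (A, C, G, T) for a cluster and calls the brightest channel. The intensity values and the function name `call_bases` are made up for illustration; real base callers also correct for effects such as channel cross-talk and phasing.

```python
# Hypothetical per-cycle intensities for one cluster, in channel order A, C, G, T.
CHANNELS = "ACGT"

def call_bases(intensities):
    """Call one base per cycle by taking the brightest channel."""
    read = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)

# Invented example values: four cycles, one tuple of four intensities each.
example = [
    (0.1, 0.0, 0.2, 0.9),  # cycle 1: T channel brightest
    (0.1, 0.1, 0.8, 0.1),  # cycle 2: G channel brightest
    (0.0, 0.9, 0.1, 0.2),  # cycle 3: C channel brightest
    (0.1, 0.1, 0.0, 0.7),  # cycle 4: T channel brightest
]
print(call_bases(example))  # TGCT
```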
Illumina platforms: GAIIx, MiSeq, NextSeq 500, HiSeq 2500, HiSeq X Ten
ILLUMINA SEQUENCING SYSTEMS
• NextSeq 500: 150 bp paired-end reads, ~120 Gbp/run (~1 day)
• HiSeq 2500 Rapid SBS: 150 bp paired-end reads, ~180 Gbp/run (2 days)
• HiSeq 2500 v4 SBS: 125 bp paired-end reads, ~1000 Gbp/run (6 days)
• MiSeq v3: 300 bp paired-end reads, ~15 Gb/run (2.3 days)
• HiSeq X Ten: 150 bp paired-end reads, ~1800 Gb/run (3 days)
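To put these yields in context, average coverage is roughly total yield divided by genome size. The sketch below uses the per-run yields listed above; the human genome size of about 3.1 Gb and the 30x target depth are assumptions added for illustration, not figures from the slides.

```python
# Per-run yields (Gb) as listed on the slide.
RUN_YIELD_GB = {
    "NextSeq 500": 120,
    "HiSeq 2500 Rapid SBS": 180,
    "HiSeq 2500 v4 SBS": 1000,
    "MiSeq v3": 15,
    "HiSeq X Ten": 1800,
}
HUMAN_GENOME_GB = 3.1   # assumption: approximate haploid human genome size
TARGET_COVERAGE = 30    # assumption: illustrative target depth

def mean_coverage(yield_gb, genome_gb=HUMAN_GENOME_GB):
    """Average depth = total sequenced bases / genome size."""
    return yield_gb / genome_gb

def genomes_per_run(yield_gb, genome_gb=HUMAN_GENOME_GB, target=TARGET_COVERAGE):
    """Whole number of genomes that reach the target depth from one run."""
    return int(mean_coverage(yield_gb, genome_gb) // target)

for platform, gb in RUN_YIELD_GB.items():
    print(f"{platform}: ~{mean_coverage(gb):.0f}x in total, "
          f"{genomes_per_run(gb)} genome(s) at {TARGET_COVERAGE}x")
```

This ignores duplicates, unmapped reads and uneven coverage, so real usable depth is lower.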
ILLUMINA SEQUENCING SYSTEMS
MiSeq v2
• 10-15 million pass-filter clusters per run
• 50 cycles: 50 bp single reads (0.5-0.75 Gb/run), ~6 hrs, ≥ 90% of bases higher than Q30 at 50 bp
• 300 cycles: 2x150 bp paired-end reads (3.0-4.5 Gb/run), ~24 hrs, ≥ 80% of bases higher than Q30 at 2x150 bp
• 500 cycles: 2x250 bp paired-end reads (5.0-7.5 Gb/run), ~40 hrs, ≥ 75% of bases higher than Q30 at 2x250 bp
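The Q30 thresholds above refer to Phred quality scores, where Q = -10·log10(p) and p is the estimated probability that a base call is wrong. A minimal sketch of the conversion (the function names are illustrative):

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

for q in (10, 20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 error in {round(1 / p)} bases")
```

So "≥ 80% of bases higher than Q30" means at least 80% of base calls have an estimated error probability below 1 in 1000.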
MiSeq v3
• 20-25 million pass-filter clusters per run
• 150 cycles: 2x75 bp paired-end reads (3.0 – 2.5 Gb/run), ~20 hrs, ≥ 85% of bases higher than Q30 at 2x75 bp
• 600 cycles: 2x300 bp paired-end reads (12.0-15.0 Gb/run), ~55 hrs, ≥ 70% of bases higher than Q30 at 2x300 bp
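The quoted yields appear to follow from simple arithmetic: pass-filter clusters multiplied by total sequenced cycles. The sketch below recomputes them from the cluster counts on the slides. For the 50-, 300-, 500- and 600-cycle kits the product matches the quoted range. For the v3 150-cycle kit it gives 3.0-3.75 Gb rather than the quoted "3.0 – 2.5", so that figure may be a typo on the slide.

```python
# Cluster counts (millions) and kit sizes (total cycles) as listed on the slides.
CLUSTERS_MILLIONS = {"MiSeq v2": (10, 15), "MiSeq v3": (20, 25)}
KIT_CYCLES = {"MiSeq v2": (50, 300, 500), "MiSeq v3": (150, 600)}

def yield_range_gb(clusters_millions, total_cycles):
    """Yield in Gb = clusters x total cycles (one base per cluster per cycle)."""
    low, high = clusters_millions
    return (low * 1e6 * total_cycles / 1e9, high * 1e6 * total_cycles / 1e9)

for chemistry, kits in KIT_CYCLES.items():
    for cycles in kits:
        low, high = yield_range_gb(CLUSTERS_MILLIONS[chemistry], cycles)
        print(f"{chemistry}, {cycles} cycles: {low:.2f}-{high:.2f} Gb/run")
```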
Illumina Summary
Strengths:
• Lots of data
• Low error rates
• Great choice of platform sizes
• Paired-end reads
• Pretty awesome
Weaknesses:
• Too much data
• Slower run times
• Shorter reads
• Slept with brother
• Competing with Illumina for market share
• Two technologies (sequencing by ligation, and semiconductor sequencing)
• Reads of up to 400bp
Ion Torrent
• Ion Semiconductor Sequencing
• Detection of hydrogen ions released during the polymerization of DNA
• Sequencing occurs in microwells with ion (pH) sensors
– No modified nucleotides
– No optics
Ion Torrent
• DNA → Ions → Sequence
– Nucleotides flow sequentially over Ion semiconductor chip
– One sensor per well per sequencing reaction
– Direct detection of natural DNA extension
– Millions of sequencing reactions per chip
– Fast cycle time, real time detection
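A simplified model of the flow-based detection described above: nucleotides are flowed one type at a time, and because the nucleotides are unmodified, each flow extends the strand by however many consecutive matching bases come next, releasing a proportional number of hydrogen ions. The flow order `TACG` and the function name `simulate_flows` are assumptions for illustration. The sketch works on the bases being added to the growing strand and ignores signal noise.

```python
def simulate_flows(new_strand, flow_order="TACG", max_flows=400):
    """Return (nucleotide, incorporations) per flow for the strand being synthesized."""
    signals = []
    pos = 0
    for flow in range(max_flows):
        if pos >= len(new_strand):
            break  # strand fully extended
        nuc = flow_order[flow % len(flow_order)]
        count = 0
        # A homopolymer run is incorporated in a single flow.
        while pos < len(new_strand) and new_strand[pos] == nuc:
            count += 1
            pos += 1
        signals.append((nuc, count))
    return signals

for nuc, count in simulate_flows("TTAGC"):
    print(f"flow {nuc}: {count} incorporation(s)")
```

A count of 2 for the first T flow shows why homopolymer lengths must be inferred from signal strength on this kind of platform.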
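[Figure: ion sensor schematic. dNTP incorporation releases H+ above the sensing layer and sensor plate, which sit on a silicon substrate (drain, source, bulk); ∆pH → ∆Q → ∆V, output to column receiver]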
Life Technologies platforms: SOLiD, Ion Torrent PGM, Ion Torrent Proton
Ion Torrent Chips
PGM
• 314 Chip: 200 bp and 400 bp reads, 30-100 Mb/run (1.5 hrs)
• 316 Chip: 200 bp and 400 bp reads, 300-1000 Mbp/run (2 hrs)
• 318 Chip: 200 bp and 400 bp reads, 600 Mb-2 Gbp/run (4.5 hrs)
Proton
• P1 Chip: 200 bp reads, 5-10 Gbp/run
• P2 Chip: 100 bp reads, ~20 Gbp/run (Coming soon!)
Life Technologies Summary
Strengths:
• Fast run times
• Scalable data outputs
• Longer reads (400bp)
• Pretty
Weaknesses:
• Lower maximum data output
• Read quality can vary
• Haven’t done much recently
• One of the first NGS platforms
• Pyrosequencing-based
• Each cycle flows a single nucleotide type (A, C, G or T) for extension
• Reads up to 800bp
454 Pyrosequencing
454: Data Processing
[Figure: T, A, C and G base flows → raw image files → image processing → base-calling → quality filtering → SFF file]
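The base-calling step can be sketched as a mapping from flow signals to bases: each flow's normalized signal is rounded to the nearest whole number of incorporations, and the flowed nucleotide is emitted that many times. The signal values, the flow order `TACG` and the function name below are invented for illustration; a real pipeline also assigns quality scores and filters reads before writing the SFF file.

```python
def call_from_flowgram(signals, flow_order="TACG"):
    """Convert normalized flow signals into a base sequence."""
    bases = []
    for i, signal in enumerate(signals):
        nuc = flow_order[i % len(flow_order)]
        n = max(0, round(signal))  # nearest integer homopolymer length
        bases.append(nuc * n)
    return "".join(bases)

# Hypothetical normalized signals for flows T, A, C, G, T, A, C.
flowgram = [2.04, 0.97, 0.08, 1.10, 0.05, 0.02, 0.93]
print(call_from_flowgram(flowgram))  # TTAGC
```

Signals near the midpoint between two integers (for example 2.5 for a homopolymer) are where this kind of rounding becomes unreliable.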
Roche platforms: GS-FLX, FLX Jr
Roche
• Not over yet…
Stratos Genomics, Genia, something else?
Roche Summary
Strengths:
• Long reads (up to 800bp)
• Had wolves
Weaknesses:
• High $ per base
• Older technology
• Platform soon unavailable
• Pretty much dead
• Single-molecule real-time sequencing (SMRT)
• Detection of individual bases as they extend (by light emission)
• Long Reads (up to 4x2.5kb)
PacBio
• Higher error rates (raw single-pass accuracy ~90%)
• Compensate by “looping” DNA to create multiple passes
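To see why multiple passes help, assume (as a simplification) that each pass miscalls a given base independently with probability p and that the consensus is a majority vote. The consensus can then be wrong only if more than half of the passes are wrong at that position, which gives an upper bound on the consensus error. The 10% per-pass error and the function name are illustrative assumptions.

```python
from math import comb

def consensus_error(p, passes):
    """Probability that more than half of `passes` independent reads are wrong at one base."""
    need = passes // 2 + 1  # smallest number of wrong passes that forms a majority
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(need, passes + 1)
    )

for n in (1, 3, 5, 7):
    print(f"{n} passes: consensus error <= {consensus_error(0.10, n):.5f}")
```

Under these assumptions five passes already bring a 10% per-pass error below 1%, which is the intuition behind reading the same looped molecule several times.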
PacBio
Zero-Mode Waveguides (ZMW)
PacBio Summary
Strengths:
• Long reads (4x2.5kb)
• Single-molecule detection
• Capable of Epigenetics
• Freakin’ Dragons!
Weaknesses:
• High $ per base
• Higher error rate
• Still to prove itself
• Keeps losing dragons
Oxford Nanopore
• Direct detection of individual bases as they pass through a “nanopore”
• MinION and GridION
• No synthesis/extension
• Capable of VERY Long Reads (>100kb)
Oxford Nanopore Summary
Strengths:
• Extra-Long reads (>100kb)
• Single-molecule detection
• Capable of Epigenetics
• Very cost effective
• Exotic and powerful
Weaknesses:
• Not yet available (alpha testing)
• Very high error rates
• Immature platform
• Steal babies
NGS Applications
• Whole genome sequencing (today)
  » De novo assembly
  » Structural variant detection
  » Comparative genomics
• RNAseq (later today)
  » Gene expression
  » Splice variants
  » Transcriptomics
  » MicroRNA
• Epigenomics (tomorrow)
  » Indirect (bisulphite)
  » Direct
• Targeted sequencing (Wed)
  » Hybrid capture
  » Amplicon resequencing
Data Quality
Read Length
Yield/Coverage
Hodor! (Thank You)