Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity
![Page 1: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/1.jpg)
Introduction to DNA Sequencing Technologies
Advanced Genetic Epidemiology and Statistical Molecular Genetics Workshop
October 22, 2010
Gregory A. Buck, Ph.D.
Director, Center for the Study of Biological Complexity
Professor, Microbiology and Immunology
Virginia Commonwealth University
![Page 2: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/2.jpg)
Holy Grail: the Human Genome
Complexity (number of bases per haploid genome) of the human genome:
- 3 x 10^9 base pairs (nucleotides)
![Page 3: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/3.jpg)
Human Genome
How much does it cost to sequence?
- First genome: $3-5 billion
- James Watson: ~$300,000
- Today: $5,000 - $100,000
- Goal: $1000 (soon < $100?)
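The cost figures above can be put on a per-base footing. This is a rough sketch, assuming each figure buys exactly one haploid genome's worth of sequence (3 x 10^9 bases) and ignoring redundant coverage; the helper name `cost_per_base` is made up for illustration.

```python
GENOME_SIZE_BP = 3e9  # haploid human genome size, from the earlier slide

# Cost figures from the slide, in US dollars, kept as (low, high) ranges.
costs = {
    "First genome": (3e9, 5e9),
    "James Watson": (3e5, 3e5),
    "Today (2010)": (5e3, 1e5),
    "Goal": (1e3, 1e3),
}

def cost_per_base(total_cost, genome_size=GENOME_SIZE_BP):
    """Dollars per base, assuming one pass over one haploid genome."""
    return total_cost / genome_size

for label, (low, high) in costs.items():
    print(f"{label}: ${cost_per_base(low):.2e} to ${cost_per_base(high):.2e} per base")
```

On these assumptions the first genome works out to roughly a dollar or more per base, and the $1000 goal to well under a millionth of a dollar per base.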
![Page 4: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/4.jpg)
Human Genome
How much time to sequence?
- First genome sequenced (2004):
. Estimated - 15 years (1990's)
. Actual - 13 years (capillary sequencing)
- James Watson (2008): ~ 2 months
. So-called 'next generation' sequencing
- Now: two weeks?
- Goal: tricorder (Star Trek)
![Page 5: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/5.jpg)
X Prize: $10 million award is set for faster DNA maps
(2006)
By Nicholas Wade
Published: Thursday, October 5, 2006
A $10 million prize for cheap and rapid sequencing of the human genome was announced by the X Prize Foundation of Santa Monica, California.
The terms of the prize require competitors to sequence 100 human genomes of their choice within 10 days, and within six months, those of a further 100 people chosen by the foundation.
http://www.iht.com/articles/2006/10/05/news/genome.php
![Page 6: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/6.jpg)
NHGRI Grants Support for 'Revolutionary' Sequencing for $1,000 Genome
August 5, 2008 By a GenomeWeb staff reporter
Under one program, NHGRI may grant as much as $5 million in fiscal 2009 to between two and seven awardees. Applicants for these funds may seek up to $1.5 million per year for a period of up to five years.
A parallel grant program would give up to $2 million over three years to between two and seven grantees, for direct costs of up to $200,000 per year.
A Small Business Innovation Research Grant from NHGRI will grant between four and six small businesses up to a total of $3.6 million in fiscal 2009 to propose novel technologies to bring down the cost of sequencing. Phase I of this program will give up to $250,000 of total costs per year for up to two years, and Phase II applicants may seek up to $1.5 million total costs per year for up to three years.
A parallel Small Business Technology Transfer program will spend up to $2 million in fiscal 2009 to support between two and five awards to small businesses investigating the development of new sequencing methods. This program will award up to $250,000 total costs per year for up to two years for Phase I programs, and it will support up to $1.5 million in total costs per year for up to three years for Phase II programs.
![Page 7: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/7.jpg)
Sequencing Technologies
1977: Fred Sanger (Cambridge, England) and Walter Gilbert (Harvard University)
- Chemical sequencing (Gilbert)
- Dideoxy nucleotide triphosphate chain termination sequencing (Sanger)
- Both used for 8-10 years (different strengths/drawbacks)
Chain termination sequencing proves most versatile, robust
- Applicable to automation
- First automated sequencers commercially available ~1985
![Page 8: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/8.jpg)
[Figure: chemical structures of DNA strands with 5' and 3' ends labeled (BASE, CH2, and phosphate groups), alongside incoming nucleotide triphosphates with and without a 3' OH]
Sanger or di-deoxy- method of sequencing
![Page 9: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/9.jpg)
Sequencing Technologies
Commercially available (1985):
- Dideoxy- (Sanger, enzymatic, termination method)
- Applied Biosystems, Inc., uses fluorescent primers
  - Requires four primers (four dyes) per sequence read
  - Requires four reactions (one for each primer)
  - Works, but expensive, laborious
- DuPont: Genesis 1000 DNA Sequencer
  - Fluorescent chain termination sequencing
  - One primer, four terminators (one for each base, A, G, C, T)
  - One reaction per sequence read
  - Very efficient
  - Sells IP to ABI...
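The one-primer, four-terminator scheme can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions (a fixed termination probability at every position, perfect size separation, no noise); the function names and parameters are hypothetical, not part of any instrument software.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def chain_termination_fragments(template, n_molecules=2000, p_terminate=0.1, seed=0):
    """Simulate extension products as (length, terminal_dye) pairs.

    template is the strand being copied, written 3'->5', so the new strand
    grows left to right. At each position the polymerase adds either a normal
    nucleotide (extension continues) or, with probability p_terminate, a
    dye-labelled dideoxy terminator (extension stops).
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for i, base in enumerate(template):
            if rng.random() < p_terminate:
                fragments.append((i + 1, COMPLEMENT[base]))
                break
    return fragments

def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the dyes."""
    dye_by_length = {}
    for length, dye in fragments:
        dye_by_length[length] = dye
    return "".join(dye_by_length[n] for n in sorted(dye_by_length))

template = "TACGGATCCGTA"  # made-up template, written 3'->5'
print(read_sequence(chain_termination_fragments(template)))
```

Because each terminator carries its own dye, a single reaction yields fragments of every length, and the dye on each fragment identifies the base at that position.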
![Page 10: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/10.jpg)
Fluorescent chain termination sequencing
See video
![Page 11: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/11.jpg)
High Throughput Genome Sequencing: The main player...
The PE/ABI 3700 Prism:
- automated, easy to use
- capillaries (not slab gel)
- 10 runs per day
- 96 sequences per run
- ~1000 sequences/day
- >300,000 sequences/year
- >150 million bases/year
- $300,000 per machine
First truly automated high throughput sequencing
Sequenced the first human genome…..
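The throughput figures for the 3700 follow from simple multiplication. The sketch below assumes about 500 usable bases per read and year-round operation; neither number is on the slide.

```python
RUNS_PER_DAY = 10
READS_PER_RUN = 96
BASES_PER_READ = 500   # assumption: typical capillary read length, not from the slide
DAYS_PER_YEAR = 365    # assumption: the machine runs every day

reads_per_day = RUNS_PER_DAY * READS_PER_RUN       # 960, i.e. ~1000 sequences/day
reads_per_year = reads_per_day * DAYS_PER_YEAR     # 350,400, i.e. >300,000/year
bases_per_year = reads_per_year * BASES_PER_READ   # 175,200,000, i.e. >150 million/year

# Years for a single machine to produce one 3 x 10^9 base pass (about 17).
years_for_one_pass = 3e9 / bases_per_year

print(reads_per_day, reads_per_year, bases_per_year, round(years_for_one_pass, 1))
```

A single machine would need well over a decade for even one pass over the genome, so genome-scale capillary projects depended on running many machines in parallel.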
![Page 12: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/12.jpg)
Output from Fluorescent Chain Termination Sequencing
![Page 13: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/13.jpg)
Fluorescent chain termination sequencing: dominates market until ~ 2005:
Next Generation (NextGen) Sequencing
First out of the blocks:
Roche 454 FLX Genome Sequencer
![Page 14: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/14.jpg)
Genome Sequencer FLX System Customer Training Technical Overview
400 million bases/day (5th floor, Sanger Hall)
(equal to 2 years output from cap sequencer!!)
www.roche-applied-science.com
![Page 15: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/15.jpg)
Based on Pyrosequencing…..
![Page 16: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/16.jpg)
Roche 454 FLX Technologies
- Based on Pyrosequencing
- Pyrosequencing video
- Roche 454 FLX workflow video
![Page 17: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/17.jpg)
http://454.com/products-solutions/how-it-works/sequencing-chemistry.asp
Pyrosequencing – 454/Roche
![Page 18: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/18.jpg)
Roche 454 FLX Output:
- Based on Pyrosequencing
- Currently:
  - ~400 base maximum read length
  - ~1 x 10^6 reads
  - ~400 x 10^6 bases per run
  - 1 run ~ 8 hours (1 day)
  - [compare to 200 x 10^6 / year for capillary sequencing]
- Good for de novo sequencing, assembly
- Cost: $10,000 per run
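The 454 numbers can be cross-checked against each other and against the capillary figure quoted on the same slide. A small sketch using only the slide's values (variable names are for illustration):

```python
READS_PER_RUN = 1_000_000
MAX_READ_LENGTH = 400                    # bases
BASES_PER_RUN = 400_000_000
COST_PER_RUN_USD = 10_000
CAPILLARY_BASES_PER_YEAR = 200_000_000   # comparison figure from the slide

# Upper bound on output if every read reached the maximum length.
upper_bound_bases = READS_PER_RUN * MAX_READ_LENGTH

# One 454 run expressed in years of capillary sequencer output.
capillary_years_per_run = BASES_PER_RUN / CAPILLARY_BASES_PER_YEAR

# Reagent cost per million bases.
cost_per_megabase = COST_PER_RUN_USD / (BASES_PER_RUN / 1_000_000)

print(upper_bound_bases, capillary_years_per_run, cost_per_megabase)
```

One run corresponds to two years of capillary output at about $25 per million bases, which matches the "2 years output" remark on the earlier slide.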
![Page 19: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/19.jpg)
Current Market Leader: Illumina Genome Analyzer
Solexa/Illumina
![Page 20: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/20.jpg)
Reversible Terminator Chemistry
[Figure: reversible terminator nucleotide with a cleavable fluor and a 3' block. Cycle: incorporation, detection, then deblock and fluor removal, leaving a free 3' end (OH) for the next cycle]
• All 4 labeled nucleotides in 1 reaction
Solexa/Illumina
![Page 21: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/21.jpg)
[Figure: template strands shown 5' to 3', with the first base incorporated]
Cycle 1: Add sequencing reagents
Remove unincorporated bases
Detect signal
Cycle 2-n: Add sequencing reagents and repeat
• All four labelled nucleotides in one reaction
• High accuracy
• Base-by-base sequencing
• No problems with homopolymer repeats
Solexa/Illumina
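The cycle described above (add reagents, remove unincorporated bases, detect, repeat) can be sketched as a loop in which the 3' block limits each cycle to exactly one base. This is a toy model with no errors or signal decay; `sequence_by_synthesis` is a hypothetical name.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_synthesis(template, n_cycles):
    """Return the read (5'->3') from a template written 3'->5'."""
    read = []
    for cycle in range(min(n_cycles, len(template))):
        # Incorporation: all four labelled, 3'-blocked nucleotides are present,
        # but only the one complementary to the template base is added, and
        # the 3' block stops extension after exactly one base.
        incorporated = COMPLEMENT[template[cycle]]
        # Detection: unincorporated nucleotides are washed away and the fluor
        # on the incorporated base is imaged.
        read.append(incorporated)
        # Deblock: the fluor and the 3' block are removed, freeing the 3' end
        # for the next cycle.
    return "".join(read)

# A homopolymer stretch (TTTT in the template) is read one base per cycle.
print(sequence_by_synthesis("TTTTGCA", 7))  # AAAACGT
```

Since every cycle adds one base regardless of its neighbours, a run of identical bases produces one call per cycle, which is why homopolymer repeats cause no special difficulty.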
![Page 22: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/22.jpg)
Base Calling
1 2 3 4 5 6 7 8 9
T T T T T T T G T …
T G C T A C G A T …
The identity of each base of a cluster is read off from sequential images
(sequencing genomes with the Illumina video)
Solexa/Illumina
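Reading a base off each sequential image amounts to picking the brightest of four channels per cycle. A minimal sketch with made-up intensity values; production base callers also correct for dye crosstalk and phasing, which this ignores.

```python
CHANNELS = "ACGT"

def call_bases(intensities):
    """Call one base per cycle from (A, C, G, T) intensity tuples for one cluster."""
    read = []
    for cycle_signal in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle_signal[i])
        read.append(CHANNELS[brightest])
    return "".join(read)

# Hypothetical intensities for one cluster over three cycles.
cluster = [
    (0.1, 0.0, 0.1, 0.9),  # cycle 1: T channel brightest
    (0.0, 0.1, 0.8, 0.1),  # cycle 2: G channel brightest
    (0.1, 0.9, 0.0, 0.0),  # cycle 3: C channel brightest
]
print(call_bases(cluster))  # TGC
```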
![Page 23: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/23.jpg)
Illumina/Solexa Technologies
- Based on Sequencing by Synthesis (bridge PCR)
- Currently:
  - ~100 base maximum read length
  - ~500 x 10^6 reads/run
  - ~50 x 10^9 bases per run (100 x 10^9 in paired end reads)
  - 1 run ~ 10 days
- Good for re-sequencing, ChIP-seq, RNA-seq
- Cost: $10,000 - $20,000 per run
New Illumina HiSeq2000: 200 x 10^9 bases/run
- ~10 x 10^12 bases/year
- 100,000 fold increase over fluor. chain termination seq
- >3,000 human genomes!
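The HiSeq2000 comparisons are order-of-magnitude arithmetic. The sketch below uses the ">150 million bases/year" capillary figure from the ABI 3700 slide as the baseline and treats a "genome" as a single 3 x 10^9 base pass; the 30x coverage value is an assumption added for contrast, not a number from the slides.

```python
HUMAN_GENOME_BP = 3e9
HISEQ_BASES_PER_RUN = 200e9
HISEQ_BASES_PER_YEAR = 10e12
CAPILLARY_BASES_PER_YEAR = 150e6   # lower bound quoted for the ABI 3700

def fold_increase(new_rate, old_rate):
    """Ratio of two yearly outputs."""
    return new_rate / old_rate

def genomes_per_year(bases_per_year, coverage=1.0, genome_size=HUMAN_GENOME_BP):
    """Human genome equivalents per year at a given redundancy (coverage)."""
    return bases_per_year / (genome_size * coverage)

runs_per_year = HISEQ_BASES_PER_YEAR / HISEQ_BASES_PER_RUN               # 50 runs
fold = fold_increase(HISEQ_BASES_PER_YEAR, CAPILLARY_BASES_PER_YEAR)     # ~67,000
single_pass = genomes_per_year(HISEQ_BASES_PER_YEAR)                     # ~3,300
at_30x = genomes_per_year(HISEQ_BASES_PER_YEAR, coverage=30)             # ~111

print(runs_per_year, round(fold), int(single_pass), int(at_30x))
```

With this baseline the ratio comes out near 67,000-fold, the same order of magnitude as the slide's 100,000-fold, and the ">3,000 human genomes" figure corresponds to single-pass (1x) output.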
![Page 24: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/24.jpg)
Applied Biosystems: SOLiD 4/HQ, Sequencing by Ligation…
![Page 25: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/25.jpg)
http://marketing.appliedbiosystems.com/images/Product_Microsites/Solid_Knowledge_MS/video/SOLiD_video_final.wmv
See video…..
![Page 26: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/26.jpg)
SOLiD Sequencing Technology
Currently:
- 50 base reads (75?)
- Up to 400 gigabases (billion bases) per run
- >20 x 10^12 bases per year (~2X Illumina)
- Reduced costs (<50%/base cost)
Best for applications where short reads are sufficient:
ChIP-seq, RNA-seq... (not de novo sequencing)
![Page 27: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/27.jpg)
Single Molecule Technologies
Holy Grail:
- No bias (due to replication, amplification)
- Should work with limiting amounts of template
- Long reads: for de novo sequencing
Contenders:
- true Single-Molecule Sequencing (tSMS): Helicos
- SMRT (Single Molecule, Real Time Sequencing): Pacific Biosciences
see videos
![Page 28: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/28.jpg)
Single Molecule Technologies
Advantages:
- No amplification, cloning biases
- Use small quantities of substrate (DNA)
- Fast (rate of replication)
Challenges:
- Signal to noise ratios
- Sensitivity
- Error rates
To date: still largely experimental:
- Short reads (HeliScope)
- Low output; e.g., < 100,000 reads/run (Pac Bio)
![Page 29: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/29.jpg)
Other Technologies
Looming:
- Ion Torrent: based on release of H+ ions
  - Requires emulsion PCR
  - Inherent biases
  - Current read length < 100 bp; high error rate
- Oxford Nanopore Technologies: passage of bases through a nanopore in a lipid bilayer
  - No data available
- Others coming
![Page 30: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/30.jpg)
http://www.iontorrent.com/technology/
![Page 31: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/31.jpg)
http://www.iontorrent.com/technology/
![Page 33: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/33.jpg)
![Page 34: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/34.jpg)
Wish list:
- Longer reads
  - Today: 25 - 800 bases
  - Looming: 1 - 20 Kbases?
  - Ideal: entire chromosome [metagenomics]
- Low amounts DNA required
  - No amplification bias
  - No replication bias
  - Can sequence hard to get DNA
- High accuracy and fidelity
- Rapid (currently over a week per run)
- Lower cost ($100/human genome?)
![Page 35: Gregory A. Buck, Ph.D. Director, Center for the Study of Biological Complexity](https://reader036.fdocuments.us/reader036/viewer/2022062410/56815b61550346895dc9482a/html5/thumbnails/35.jpg)
Thank you!