Introduction to DNA Sequencing Technology. Dideoxy Sequencing (Sanger Sequencing, Chain Terminator...
Date posted: 22-Dec-2015
Introduction to DNA Sequencing Technology
Dideoxy Sequencing (Sanger Sequencing, Chain Terminator method).
• Clone the fragments to be sequenced into the virus M13.
• Why M13?
• The clones that are isolated are single-stranded DNA.
M13
. . . . . TGATGTCGAGCGAGTCGTACGGT-----^^^
Primer
Fragment to be deciphered
DNA sequencing reaction:
1) DNA fragment to be sequenced, cloned into the vector M13
2) DNA polymerase
3) “Universal” primer
4) All 4 DNA building blocks
5) One ddNTP tagged with a radioactive tracer
The most popular technique is based on the dideoxynucleotide.
Purine
• Pyrimidine
Set up 4 separate reactions. Each reaction contains one of the 4 ddNTPs. Each ddNTP is tagged with a radioactive tracer.
Fragment lengths (in nucleotides) produced in each reaction:
A reaction (with ddA): 21, 26, 29, . . . .
T reaction (with ddT): 25, 31, 35, . . . .
C reaction (with ddC): 22, 23, 27, . . . .
G reaction (with ddG): ??
M13
. . . . . TGATGTCGAGCGAGTCGTACGGT-----^^^
Primer (20 nt.) (3’ end of primer)
• Each reaction generates a set of unique fragment lengths.
• All fragment lengths are represented (from 21 - > 1,000 nucleotides).
• None of the fragments are present in more than one reaction.
• DNA sequencing technology requires a gel electrophoresis system that can separate DNA fragments differing in length by a single nucleotide.
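The four-reaction scheme above can be sketched in a few lines of Python. This is only an illustration, not the lecturer's material: it assumes the template shown in the figure is written 5’ to 3’, that the 20 nt primer anneals just to its right, and that the polymerase therefore copies the template from right to left. The function names `chain_termination_lengths` and `read_gel` are made up for this sketch.

```python
# Toy model of dideoxy (chain terminator) sequencing.
# Assumption: the template is written 5'->3' and the primer sits at its
# right-hand end, so synthesis proceeds from right to left along it.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def chain_termination_lengths(template, primer_len=20):
    """Return {ddNTP base: [fragment lengths]} for the four reactions."""
    lengths = {base: [] for base in "ACGT"}
    for offset, template_base in enumerate(reversed(template), start=1):
        added = COMPLEMENT[template_base]          # base the polymerase adds
        lengths[added].append(primer_len + offset)  # terminated fragment size
    return lengths

def read_gel(lengths):
    """Read the new strand by sorting all bands from shortest to longest."""
    bands = sorted((n, base) for base, sizes in lengths.items() for n in sizes)
    return "".join(base for _, base in bands)

template = "TGATGTCGAGCGAGTCGTACGGT"   # region shown next to the primer
reactions = chain_termination_lengths(template)
for base in "ATCG":
    print(f"{base} reaction (with dd{base}):", reactions[base])
print("Sequence read from the gel:", read_gel(reactions))
```

Under these assumptions the sketch reproduces the A, T, and C fragment lengths listed above and gives 24, 28, 33 as the first three lengths in the G reaction. Every length appears in exactly one reaction, which is why reading the bands in size order spells out the sequence.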
DNA sequencing, as performed manually in the 1980s, was slow and labor intensive.
• NCBI HomePage
~1988- First big change in DNA sequencing technology:
• Introduction of ‘automated DNA sequencing’:
• This technique uses 4 fluorescent labels (red, yellow, blue, green) rather than one radioactive tag.
• The bases are read by a laser/detector rather than by humans.
• York University
? Questions ?
Newest Innovations in DNA Sequencing
Technology
• 1) Capillary Electrophoresis
• 2) Robotics
• CE Theory
Capillary Gel Electrophoresis:
“The capillaries we typically use in CE are inexpensive and commercially available. We use capillaries that range about 30 to 50 centimeters in length, 0.150 to 0.375 millimeters in outer diameter, and a 0.010 to 0.075 millimeter diameter channel down the center.”
DNA sequencing with CE
# of capillary tubes/machine:
Initially: one (introduced ~1998)
State of the art, 2000: 96-tube CE (cost $300k)
Today: 384-tube CE (cost of one unit: $500k)
• DOE Joint Genome Institute
HUMAN GENOME PROJECT (HGP)
• The ultimate goal of the HGP is to decipher the 3.3 billion b.p. of the human genome.
• When the project was initiated, it was technologically unfeasible.
Genomic Sequencing
Organisms sequenced
Year    # genomes sequenced
1994    0
1995    2
1996    4
1997    8 (est.)
1998    30 (est.)
2001    ~75
Genomics Research Funding (selected programs; $ millions)
PROGRAM                  1998    2000
NHGRI (U.S.)              211     326
WELLCOME TRUST (U.K.)      61     121
STA (JAPAN)                39     115
ENERGY (U.S.)              85      89
GHGP                       19      79
SWEDEN                      5      35
Why such a sudden increase in funding??
• It became apparent that if the public agencies didn’t get their act together, an upstart organization might sequence the HG before they did (despite their ~ 8 year head start).
Sequencing the human genome suddenly had become a race.
• The competitors:
• Publicly funded genome centers, scattered throughout the U.S., Europe, and Japan.
• Celera, the private company directed by J. Craig Venter.
The story of how J. Craig Venter brought about a paradigm shift
in genomic sequencing has now entered the mythology of
science.
Craig Venter: Scientist of the Year
• from Time Magazine: What was perhaps the most important scientific event of the past century occurred this year when scientists announced the cracking of the human genetic code. And what everyone, including his numerous critics, acknowledges is that the brash and impatient Venter is the man who made it happen years before it would have otherwise by throwing computing power at the traditional, laborious process of manually examining every bit of human DNA to find the genes within.
Why did Craig Venter and his new company Celera threaten the
established genome sequencers?
• Venter’s new company had 300 state-of-the-art sequencing machines (at $300k each) and an $80 million supercomputer.
• Venter suggested Celera could sequence the genome in just 3 years at a cost of $300 million.
Venter’s first company, TIGR, pioneered the ‘shotgun
sequencing’ approach to sequencing a genome:
• 1) Shear the DNA into thousands of random pieces.
• 2) Sequence the DNA of each fragment.
• 3) Use a computer to align the overlapping fragments to produce a single, contiguous DNA sequence of the entire organism.
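Step 3 can be illustrated with a toy greedy assembler in Python. This is a minimal sketch, not the software TIGR or Celera actually used: it assumes error-free reads from a single strand and a sequence without repeats, and the names `overlap` and `greedy_assemble`, the example sequence, and the reads are all invented for illustration.

```python
# Toy shotgun assembly: repeatedly merge the two reads with the longest overlap.
# Assumptions: error-free reads, one strand only, no repeated regions.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads greedily; returns a list of contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left: remaining reads are separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# Hypothetical 22-base "genome" sheared into four overlapping reads.
genome = "GATTACAGGCTCATGCCTAAGT"
reads = ["GATTACAGGC", "CAGGCTCATG", "TCATGCCTAA", "GCCTAAGT"]
contigs = greedy_assemble(reads)
print(contigs)
```

Real genomes break both simplifying assumptions, which is why the approach needs heavy over-sequencing and powerful alignment software, and why repeated regions can be hard to finish.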
Advantages/Disadvantages of the ‘shotgun approach’:
Disadvantages-
• Requires significant over-sequencing
• Requires powerful alignment software
• There may be problems ‘finishing’ certain regions
Advantages-
• Eliminates the need for mapping
Sequencing of Archaeoglobus fulgidus:
• 29,000 sequencing reactions
• 500 bp average ‘read’
• 14,500,000 bases aligned into a 2,178,400 bp genome
• 6.7- fold sequence coverage
(14,500,000 / 2,178,400 = 6.7)
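The coverage arithmetic above can be written as a short Python helper. The function name `fold_coverage` is just a label chosen for this sketch; the numbers are the ones given on the slide.

```python
# Fold coverage = total bases sequenced / genome length.

def fold_coverage(num_reads, avg_read_len, genome_len):
    """Average number of times each base of the genome was sequenced."""
    return num_reads * avg_read_len / genome_len

# Archaeoglobus fulgidus figures from the slide.
total_bases = 29_000 * 500
coverage = fold_coverage(29_000, 500, 2_178_400)
print(f"{total_bases:,} bases sequenced, {coverage:.1f}-fold coverage")
```

The same helper shows the cost of over-sequencing: holding read length fixed, doubling the target coverage doubles the number of sequencing reactions.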
Even with remarkable success sequencing bacterial genomes,
skeptics doubted a whole genome random sequencing approach would
work with a eukaryotic genome. Why?
2 Reasons-
• Eukaryotic genomes are much larger.
• Eukaryotic genomes carry significant amounts of repetitive DNA.
Who won the race?
• With much fanfare, the race for the rough draft of the human genome was ‘declared’ a draw. Both Celera and the various public agencies shared credit for the rough draft (announced June 2000; published Feb. 2001).
Insert Video (10’)
What is meant by the term mapping?
• Mapping to a geneticist means the same as it does to a non-scientist:
• A drawing showing the spatial relationship between a series of points.
Traditional map: Western U.S.
Seattle-
Portland-
S.F.-
L.A.-
Gene map: Human Chromosome #11
Hemoglobin-
Insulin-
Albinism-
Parathyroid Hormone-
Mouse Clickable Cytogenetic Map (Chromosome X is selected)
Restriction Enzyme Map
HindIII EcoRI HindIII HindIII
• ____|__________|________|_________|_
Construction of various maps has been a major goal of genetic
research. Why?
• Maps serve as navigational tools. They are useful in finding genes or other genetic features and ordering fragments of DNA.
• There is a direct correlation between the usefulness of a map, and the number of points on the map. Analogy??
The STS map:
• STS = sequence-tagged site.
• STS are short, unique fragments of DNA generated by PCR.
• Verification of a human STS: PCR amplification of the human genome generates one small fragment, i.e., a unique landmark.
Usefulness of STSs
• STSs are used to find overlaps between fragments of genomic DNA.
• Finding overlaps allows ordering of fragments (see handout).
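The idea of using STSs to detect overlaps can be sketched in Python. This is an illustration only: the clone names, the STS names, and the function `shared_sts` are hypothetical, and the sketch assumes each clone has already been PCR-tested for every STS.

```python
# Two genomic fragments (clones) overlap if they contain the same STS.
from itertools import combinations

# Hypothetical PCR results: which STSs amplify from which clone.
sts_content = {
    "cloneA": {"STS1", "STS2"},
    "cloneB": {"STS2", "STS3"},
    "cloneC": {"STS3", "STS4"},
    "cloneD": {"STS5"},
}

def shared_sts(content):
    """Return {(clone1, clone2): shared STSs} for every overlapping pair."""
    overlaps = {}
    for a, b in combinations(sorted(content), 2):
        common = content[a] & content[b]
        if common:
            overlaps[(a, b)] = common
    return overlaps

for pair, markers in shared_sts(sts_content).items():
    print(pair, "share", sorted(markers))
```

In this made-up data set, cloneA overlaps cloneB and cloneB overlaps cloneC, so the three can be ordered A-B-C, while cloneD shares no STS with the others and cannot yet be placed.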
Expressed Sequence Tags (ESTs)
• As of June 2000, the 4.6 million EST records comprised 62% of the sequences in GenBank. Although the original ESTs were of human origin, NCBI’s EST database (dbEST) now contains ESTs from over 250 organisms.
What is an EST?
Short DNA sequence representing a gene expressed in a particular tissue. A given EST often represents a fraction of the gene.
ESTs are often produced by sequencing the ends of a cDNA (complementary DNA).
What is the value of ESTs?
• Rapid identification of genes.
Feb. 1992- Craig Venter and 14 co-workers published the partial DNA sequence of 2,375 genes expressed in the human brain. This represented about half of the total human genes known at the time.
How to sequence a genome???
• 1) Quickly- focus on the genes and their regulatory regions and human polymorphisms.
• 2) Thoroughly and completely- every nucleotide with 99.99% accuracy.
Extra Slides
• Does completion of the HGP mean identification of all disease genes?
• A Timeline of The Human Genome
YEAR    # human genes mapped to a definite chromosome location    # years it would take to sequence the human genome
1967    none                    sequencing not possible yet
1977    3 genes mapped          4,000,000 years to finish at 1977 rate
1987    12 genes mapped         1,000 years to finish at 1987 rate
1997    30,000 genes mapped     50 years to finish at present rate
First Sequenced Genome:
• May 1995, TIGR researchers led by Robert Fleischmann closed the last gaps in the Haemophilus genome. In total, 26,708 sequences had been assembled to span the 1,830,137 base pair genome of the bacterium. The genome was published in July. (Fleischmann, et al, Science, 269: 496-512, 1995).
• DNALC: Cycle Sequencing
In the February 16, 2001, issue of Science, Venter et al. announce the sequencing of the euchromatic portion of the human genome by a whole-genome shotgun sequencing approach. The sequencing was accomplished by Celera Genomics in nine months in a factory-scale project involving 300 automatic sequencing machines producing 175,000 sequence reads per day. The company generated 14.8 gigabases (Gb) of DNA sequence and combined it with data from the public GenBank database to generate a 2.91 Gb consensus sequence (94% coverage of the genome), representing over eight-fold sequence coverage.