Carbon meets Silicon (& the $1000 human genome) Oct 9, 2002 HBS
gggatttagctcagttgggagagcgccagactgaa gatttg gaggtcctgtgttcgatccacagaattcgcacca
Post-300 genomes & 3D structures
Commercial Advisory Roles & Technology-transfer
Genome Pharmaceuticals 98-02; Caliper Technologies 94-02; CodonCode 96-02; GenProfile AG 97-02; Gendaq 00-1; EngeneOS 00-2; BeyondGenomics 00-2; Newcogen & Flagship 00-2; Longenity 01-2; Xeotron 01-02; Genomatica 01-2; Genome Therapeutics 89-94; Biogen 84-5; Tecan/Gamera 98-00; FamilyGenetix 00-1;
Biorad-Sadtler 79-81; Affymetrix 90-02; Millipore 89-90; Lynx 00-02; Pyrosequencing 01-2; Bruker Daltonics 93-7; Mosaic Technologies 93-01; Agilent 01-2; Aventis '98-01; MJ Research Inc. 86-02; Hamilton Co. 86-90; Intelligent Automation 92-6; Eli Lilly 98; Dupont 82-4
Famous human mutations
PKU (preventable mental retardation)
HbS (Malaria resistance)
ApoE4 (dementia resistance)
CCR5Δ32 (HIV resistance)
Pharmacogenomics
Gene/Enzyme | Drug | Quantitative effect
CYP2C9 | Tolbutamide, warfarin, phenytoin, nonsteroidal anti-inflammatories | Anticoagulant effect of warfarin
CYP2D6 | Beta blockers, antidepressants, antipsychotics, codeine, debrisoquin, dextromethorphan, encainide, flecainide, guanoxan, methoxyamphetamine, N-propylajmaline, perhexiline, phenacetin, phenformin, propafenone, sparteine | Tardive dyskinesia from antipsychotics; narcotic side effects, efficacy, and dependence; imipramine dose requirement; beta-blocker effect
Dihydropyrimidine dehydrogenase | Fluorouracil | Fluorouracil neurotoxicity
ACE | Enalapril, lisinopril, captopril | Renoprotective effects, cardiac indices, blood pressure, immunoglobulin A nephropathy
Thiopurine methyltransferase | Mercaptopurine, thioguanine, azathioprine | Thiopurine toxicity and efficacy; risk of second cancers
Potassium channels:
HERG | Quinidine | Drug-induced long QT syndrome
 | Cisapride | Drug-induced torsade de pointes
KvLQT1 | Terfenadine, disopyramide, mefloquine | Drug-induced long QT syndrome
hKCNE2 | Clarithromycin | Drug-induced arrhythmia
Examples of clinically relevant genetic polymorphisms influencing drug metabolism and effects. Additional data
2-Oct-2002 Boston GSAC Panel Discussion: "The Future of Sequencing Technology: Advancing Toward the $1,000 Genome"
Moderators:
• J. Craig Venter, Ph.D., The Center for Advancement of Genomics
• Gerald Rubin, Ph.D., Howard Hughes Medical Institute
Speakers:
• George Church, Ph.D., Harvard University
• Eugene Chen, Ph.D., US Genomics
• Tony Smith, Ph.D., Solexa
• Trevor Hawkins, Ph.D., Amersham Biosciences Corporation
• Susan Hardin, Ph.D., VisiGen Biotechnologies, Inc.
• Michael P. Weiner, 454 Corporation
• Daniel H. Densham, Mobious Genomics, Ltd
The impact of new technologies
Digital computers & Networks 1968-93
WWW 1993-94
Recombinant DNA 1976-1986
Genome Project 1985-2002
Stem cells 1983-2002
Nanotechnology 1984-2002
Bionano-machines
Types of biomodels.
Discrete, e.g. conversion stoichiometry
Rates/probabilities of interactions
Modules vs “extensively coupled networks”
Maniatis & Reed Nature 416, 499 - 506 (2002)
Steeper than exponential growth
[Figure: log-scale plots. Panels: $GDP/person (W. Europe), years 1000-2000; bp/$, years 1970-2010; log(IPS/$K) and log(bits/sec transmit), years 1830-2010. Fit annotations: Quadratic; R2 = 0.985; R2 = 0.992.]
http://www.faughnan.com/poverty.html
http://www.kurzweilai.net/meme/frame.html?main=/articles/art0184.html
Moore's law of ICs 1965
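On a log scale, exponential growth is a straight line, so "steeper than exponential" shows up as a positive quadratic coefficient when log10(value) is fitted against year. The sketch below is one way to check that, assuming a plain least-squares fit. The function names (fit_polynomial, r_squared, growth_fits) and the synthetic series in the usage lines are illustrative assumptions, not data taken from the plots.

```python
import math

def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns coefficients c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    # Normal equations: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i)
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = b[i] - sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / A[i][i]
    return coeffs

def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the polynomial given by coeffs."""
    mean = sum(ys) / len(ys)
    ss_tot = sum((y - mean) ** 2 for y in ys)
    ss_res = sum((y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
                 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot

def growth_fits(years, values):
    """Compare linear (exponential) and quadratic fits of log10(values) against year."""
    center = sum(years) / len(years)      # center the years for numerical stability
    xs = [y - center for y in years]
    logs = [math.log10(v) for v in values]
    linear = fit_polynomial(xs, logs, 1)
    quadratic = fit_polynomial(xs, logs, 2)
    return {
        "linear_r2": r_squared(xs, logs, linear),
        "quadratic_r2": r_squared(xs, logs, quadratic),
        "curvature": quadratic[2],        # > 0 suggests steeper than exponential
    }

# Synthetic series (an assumption for illustration only)
years = list(range(1970, 2011, 5))
values = [10 ** (0.002 * (y - 1970) ** 2) for y in years]
print(growth_fits(years, values))
```

A curvature near zero with a high linear R2 would indicate ordinary exponential growth instead.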
How to do single DNA molecule manipulations?
Important alleles occur in “noncoding” non-conserved regions
Lesch KP, et al. Science 274:1527-31. Association of anxiety-related traits with a polymorphism in the serotonin transporter gene regulatory region
Piedrafita FJ, et al. JBC 271: 14412. Alu repeat SNP near the human Myeloperoxidase gene: “severalfold less transcriptional activity”; "-463 G creates a stronger SP1 binding site ... overrepresented in acute promyelocytic leukemia"
The issue is not speed, but hidden costs (e.g. accuracy & integration)
Sub-microliter scale: 1 µm = femtoliter (10^-15)
Instruments <$100K per CPU.
Why low-cost, high quality sequencing? & how much?
Human genotypes: 10^19 bp
Immune B&T cell receptor spectra: 10^10 bp (per year)
Environment & pathogen monitoring: ?
RNA splicing in situ: 10^12 bits/mm^3
Compact storage: 10^5 now to 10^17 bits/mm^3 with DNA
& How?
Projected costs greatly affect our priorities
Year | bp/$ | $/genome | Method
1977 | 0.1 | 30B | manual (pBR322)
1985 | 1 | 3B | HGP goal
2002 | 10 | 300M | de novo high-quality sequencing
2002 | 300 | 10M | dd-polyphred raw-reseq
2002 | 2K | 2M | Perlegen, Lynx
2002 | 3M | 1K | per diploid? de novo? This session!
2002 | 10^13 | .0003 | other data types (e.g. video)
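The two cost columns are related by a single division: $/genome is the genome size divided by bp/$. The sketch below works through that, assuming a genome size of about 3e9 bp, which is inferred from the "1 bp/$ = 3B" row and not stated on the slide. The name dollars_per_genome is illustrative.

```python
GENOME_BP = 3e9  # assumption: ~3e9 bp per genome, implied by the 1 bp/$ -> $3B row

def dollars_per_genome(bp_per_dollar, genome_bp=GENOME_BP):
    """Convert sequencing throughput per dollar into cost per genome."""
    return genome_bp / bp_per_dollar

# (year, bp/$) pairs as listed on the slide
rows = [(1977, 0.1), (1985, 1), (2002, 10), (2002, 300),
        (2002, 2e3), (2002, 3e6), (2002, 1e13)]

for year, bp_per_dollar in rows:
    cost = dollars_per_genome(bp_per_dollar)
    print(f"{year}  {bp_per_dollar:.3g} bp/$  ->  ${cost:.3g} per genome")
```

Under this assumption the 2K bp/$ row comes out at $1.5M, which the slide appears to round to 2M.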
New sequencing approaches in commercial R&D
Method | liter/bp | Length | Error | Test-set | $/device | bp/hr
Capil fluidics | e-6 | 600 | <0.1% | 1e11 | 350k | 80k
  ABI, Amersham, GenoMEMS, Caliper*, RTS*
SeqByHyb | e-12 | 1 | <5% | 1e9 | 200k | 1M
  Perlegen-Affymetrix*, Xeotron*
Mass Spectrometry
  Sequenom, Bruker*
Single molecule | >e-24 | >>40 | ? | >80 | 30k-1M | 180k
  Pore (Agilent*), Fluor (USGenomics, Solexa), FRET (VisiGen, Mobious)
In vitro DNA-Amplification (e.g. Polonies) -- Multiplex cycles:
Lynx* | e-15 | 20 | <3% | 1e7 | ? | 1M
Pyroseq.* | e-6 | >40 | <1% | 1e6 | 100k | 5k
CisTran* | e-13 <1% 40 90k >1M?
  ParAllele, 454, RTS*
*GMC has a potential financial interest (or Harvard license)
$1K per diploid human sequence
Input: buccal cells, blood, or forensic samples. Output: prioritized list of deviant bps (e.g. non-conservative).
Raw data rate: 16 pixels/bp, 1 Mpixel per 6 sec/CPU = 24 CPU days.
Amortization: 5 yr for camera/CPU/transport @ $50K total = $200 per 10^11 bp
Overhead: $200/sq ft/yr * 40 sq. ft (400 cu. ft) = $40
Reagents: At 20 µm per (5 µm) polony and 40 bp reads means 10000 cm^2 area, 800 ml of fluor dNTP, $100/mg = $40; 5 ml PCR reactions = $200
Disposables: 500 slides = $50
Electricity: 2 kwatts * 24 hr * 24 days * 0.13 $/kwatt-hr = $150
Labor for repair: 10% of instrument cost = $10
Labor for operation: Slide PCR, slide dips, scans, etc. = $20
R&D: Initially NIH grants (i.e. 0% of this unbalanced budget).
Total per genome: $710
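The total can be re-derived by summing the dollar figures quoted per line item, and two of the items can be checked directly from their stated inputs. The sketch below does that. The dictionary keys and function names are illustrative, and the polony count assumes a square grid at 20 µm pitch, which the slide does not spell out.

```python
# Dollar figures as quoted per line item on the slide
line_items = {
    "amortization": 200,
    "overhead": 40,
    "reagents_fluor_dntp": 40,
    "reagents_pcr": 200,
    "disposables": 50,
    "electricity": 150,
    "labor_repair": 10,
    "labor_operation": 20,
}

total_per_genome = sum(line_items.values())

def electricity_cost(kilowatts, hours_per_day, days, dollars_per_kwh):
    """Energy cost for a run: power * time * unit price."""
    return kilowatts * hours_per_day * days * dollars_per_kwh

def bases_on_slides(area_cm2, pitch_um, read_length_bp):
    """Bases read from an area tiled with polonies.

    Assumption: one polony per (pitch_um x pitch_um) square.
    """
    area_um2 = area_cm2 * 1e8            # 1 cm^2 = 1e8 um^2
    polonies = area_um2 / pitch_um ** 2
    return polonies * read_length_bp

print("total per genome: $", total_per_genome)
print("electricity: $", round(electricity_cost(2, 24, 24, 0.13), 2))
print("bases read:", f"{bases_on_slides(10000, 20, 40):.3g}")
```

With these inputs the electricity item comes to about $150 and the slide area yields on the order of 10^11 bp, matching the amortization line's unit.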
Long-range continuity inspired by DNA-Fiber Fluorescent In Situ Hybridization
300 kb = 100 microns
http://allserv.rug.ac.be/~fspelema/neubla/content/images_r.htm
Polony amplification & sequencing
Human DNA: Cystic Fibrosis CFTR gene, 45 kbp
Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams, David Housman, Hitomi Hutzell
[Figure: DNA strands with primer sites A and B through successive amplification steps]
Single Molecule (library or natural A,B tags)
Primer is Extended by Polymerase
Polymerase colony (polony): In situ amplification (PCR, RCA, etc.)
Primer A has 5' immobilizing Acrydite
Mitra & Church Nucleic Acids Res. 27: e34
Sequence polonies by sequential, fluorescent single-base extensions
1. Remove 1 strand of DNA.
2. Hybridize Universal Primer.
3. Add Red (Cy3) dTTP.
4. Wash; Scan Red Channel
   [Figure: two polonies, ends labelled B, strands 3' to 5'. Template AGT..: extended by T. Template GCG..: not extended.]
5. Add Green (FITC) dCTP
6. Wash; Scan Green Channel
   [Figure: same two polonies. Template AGT..: extended to TC. Template GCG..: extended by C.]
Base added (parentheses = base not incorporated):
(C) A G T (C)
(A) G (T) C (A)
(G) T C A
Template: 3' TCACGAGT 5'; extended strand: 5' AGTGCTCA 3'
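The cycle table can be reproduced with a small simulation: one nucleotide is added per cycle in a fixed order, and the primer extends only when the added base pairs with the next template position. The sketch below assumes the C, A, G, T order read from the "Base added" rows. It also assumes that a run of identical template bases is filled in a single cycle, which is a modelling choice and not something the slide states. The function name is illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_extension(template_3to5, flow_order="CAGT", max_cycles=100):
    """Simulate sequential single-base extension on one template.

    template_3to5: template strand written 3'->5' (the direction the primer reads it).
    Returns (flows, read): flows is a list of (base_added, n_incorporated),
    read is the extended strand written 5'->3'.
    """
    position = 0
    flows = []
    read = []
    cycle = 0
    while position < len(template_3to5) and cycle < max_cycles:
        base = flow_order[cycle % len(flow_order)]
        incorporated = 0
        # Extend while the added base pairs with the next template base
        # (assumption: homopolymer runs are filled within one cycle).
        while (position < len(template_3to5)
               and COMPLEMENT[template_3to5[position]] == base):
            position += 1
            incorporated += 1
        flows.append((base, incorporated))
        read.extend(base * incorporated)
        cycle += 1
    return flows, "".join(read)

flows, read = sequence_by_extension("TCACGAGT")
# Print the cycles in the slide's notation: parentheses mark cycles with no signal
print(" ".join(b if n else f"({b})" for b, n in flows))
print(read)
```

Run on the slide's template, the cycles with no incorporation are the ones the slide shows in parentheses.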
Sequencing multiple polonies
Mitra & Shendure
Alignment precision 0.4 pixel
Polony exclusion principle & Single pixel sequences
Mitra & Shendure
Inexpensive, off-the-shelf equipment
MJR in situ cycler
Histology slide rack
Microarray scanner
Polony in situ Sequencing Summary
• Integrated!: (purify), amplify, sequence, (separate)
• Femtoliter (1 µm) scale
• Off-the-shelf equipment
• Chromosome haplotyping & RNA splice-typing
• In situ tissue compatible
Types of phenotypic effects of mutations
PKU
Trisomy 21
HbS