Greedy Algorithms in the Libraries of Biology
17-Apr-2008 3:30-3:45 PM, Avogadro-Scale Computing, MIT Bartos E15
Thanks to:
PGP
Is biology optimal?
              Human past     Present
Locomotion    50 km/h        26720 km/h
Ocean depth   75 m           4500 m
Visible       0.4-0.7 µm     pm-Mm
Cold          0 °C           3 K
Memory        20 yr          2000 yr
3 Exponential technologies: 1 to 18 month doubling times
[Figure: log-scale plot, 1E-4 to 1E+14, years 1840 to 2020. Series: Daltons synth (synthetic chemistry: urea, B12, tRNA); Bits/sec (computation & communication: telegraph, Gb chips); Seq bp/$ (analytic: tRNA, human)]
Shendure J, Mitra R, Varma C, Church GM, 2004. Carlson 2003; Kurzweil 2002; Moore 1965.
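To make the doubling-time range concrete, here is a minimal sketch that converts a fixed doubling time into a fold increase. The function name and the 10-year horizon are illustrative assumptions, not taken from the slide.

```python
def fold_increase(years: float, doubling_months: float) -> float:
    """Fold increase after `years`, assuming a constant doubling time."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Compare the two ends of the slide's range (1 and 18 months) plus 12 months.
for months in (1, 12, 18):
    growth = fold_increase(10, months)
    print(f"{months:>2}-month doubling: {growth:.3g}x per decade")
```

The constant doubling time is itself a simplification; real technology curves change slope over time.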
Avogadro scale, >>Yottaflops (from CMOS to sea moss)
Ultra-parallel: 10^38 units (lab libraries: 10^8 to 10^15 25mers)
Adaptable: Evolution (years), Immune (days), Neural (seconds)
Thermodynamic limit: 2x10^19 op/J (irreversible); 3x10^20 for polymerase (10^10 for current computers)
Memory density: Neural: (10^12 op/s & 10^6 bits)/mm^3; DNA: (10^3 op/s & 1 bit)/nm^3
Error rate: DNA: 10^-9; RNA/protein: 10^-4
Biofuel: 4x10^7 J/kg (~=$) Adleman 1994
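For comparison with the op/J figures above, the following sketch computes the Landauer bound, 1/(kT ln 2) irreversible bit operations per joule. The 300 K temperature is an assumption, and this is not necessarily how the slide's figures were derived.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_ops_per_joule(temperature_k: float = 300.0) -> float:
    """Maximum irreversible bit operations per joule at the given temperature."""
    return 1.0 / (BOLTZMANN_J_PER_K * temperature_k * math.log(2))


# Figures as read from the slide (op/J); the labels are ours.
SLIDE_FIGURES = {"polymerase": 3e20, "current computers": 1e10}

bound = landauer_ops_per_joule()
print(f"Landauer bound at 300 K: {bound:.2e} op/J")
for name, ops in SLIDE_FIGURES.items():
    gap = math.log10(bound / ops)
    print(f"{name}: {ops:.0e} op/J, {gap:.1f} orders of magnitude below the bound")
```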
DNA error rates
DNA Replication Fork
1. Incorporation 5' to 3'
2. Proofreading exonuclease 3' to 5'
3. Mismatch repair
Ellis et al. PNAS 2001; Costantino & Court PNAS 2003
Bionano – Inorganic-microfab interfaces
• Metal-oxide-semiconductors (sponge silicateins for Ti & Ga oxides)
• Magnetic components (magnetosomes in magnetotactic bacteria)
• Optical fibers & lenses (e.g. Venus basket sponge)
• Bacterial reduction of salts to metals (e.g. Se, Au, Ag)
• Reading and writing DNA
Reading DNA: open-source hardware, software, wetware (Polonator G007)
~10 to $400/Gbp 1E-6 @ >3X redundancy
Synthetic Biology: augmentation & combinatorics (not minimization)
1. Synthetic DNA: 1Mbp per month (Codon Devices)
2. New polymers in vitro – affinity selection (Vanderbilt)
3. Hydrocarbon & other chemical syntheses in E. coli (LS9)
4. Bacterial & stem cell therapies (SynBERC & MGH)
5. New codes: Virus-resistant cells & new amino acids (MIT)
6. Synthetic Ecosystems – Evolve secretion & signaling
7. Interfaces of Genomics & Society
Hierarchical, modular, evolvable
DNA origami: highly predictable 3D nanostructures (Rothemund, Nature '06)
DNA-nanotube-induced alignment of membrane proteins for NMR structure determination (Douglas et al., PNAS '07)
10 Mbp of DNA / $300 chip
Spatially patterned chemistry:
8K    Atactic/Xeotron/Invitrogen: photo-generated acid
12K   Combimatrix: electrolytic
44K   Agilent: ink-jet, standard reagents
380K  Nimblegen/GA: photolabile 5' protection
Amplify pools of 50mers using flanking universal PCR primers & 3 paths to 10X error correction
Tian et al. Nature 432:1050; Carr & Jacobson 2004 NAR; Smith & Modrich 1997 PNAS
Mirror world: resistant to enzymes, parasites, predators
Mirror aptamers, ribozymes, etc. require mirror polymerases
352-amino-acid-long Dpo4 (Sulfolobus DNA polymerase IV): 347 peptide bonds done; 4 to go.
L-amino acids, D-nucleotides (current biosphere)
D-amino acids, L-nucleotides (mirror biopolymers)
• Molecular biology central dogma: DNA > RNA > Protein (PCR, T7 RNA pol, in vitro translation)
• Production of devices larger than or toxic to cells
• Directed evolution of drugs & affinity agents
• Mirror-image proteins
Tony Forster (Vanderbilt)
Duhee Bang (HMS)
Why synthesize (minimal) in vitro self-replication?
Pure in vitro translating & replicating system: 113 kbp DNA, 151 genes; ideal for comprehensive atomic, ODE & stochastic models
Forster & Church MSB '05, Genome Res. '06; Shimizu, Ueda et al. '01
Genome engineering CAD
Scale: 70 b, 15 Kb, 5 Mb, 250 Mb
Methods: chemical synthesis; polymerase in vitro; recombination in vivo in E. coli, Bacterial (Artificial) Chromosomes (BACs); recombination in human cells, Human (Artificial) Chromosomes (HACs)
Error rates: chemical synthesis 1E-2; error correction (MutS) 1E-4; sequencing 1E-7
Isaacs, Carr, Emig, Gong, Tian, Reppas, Jacobson, Church
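The per-base error rates above matter because they compound with construct length. A minimal sketch follows, assuming errors are independent per base. That independence is our simplification, and the pairing of rates with lengths is illustrative only.

```python
def p_error_free(length_bp: int, error_rate: float) -> float:
    """Probability that a construct of `length_bp` bases contains no errors,
    assuming independent errors at `error_rate` per base."""
    return (1.0 - error_rate) ** length_bp


# Per-base error rates as listed on the slide.
ERROR_RATES = {
    "chemical synthesis": 1e-2,
    "MutS error correction": 1e-4,
    "sequencing": 1e-7,
}

for name, rate in ERROR_RATES.items():
    for length in (70, 15_000):
        p = p_error_free(length, rate)
        print(f"{name:>22} @ {length:>6} bp: P(error-free) = {p:.3g}")
```

Under this model, uncorrected chemical synthesis yields a perfect 70-mer only about half the time, which is why error correction precedes assembly into longer constructs.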
Native DNA computing: Lab Evolution
Reppas/Lin: Trp/Tyr exchange
Tolonen: Ethanol resistance
Lenski: Citrate utilization
Palsson: Glycerol utilization
Edwards: Radiation resistance
Ingram: Lactate production
Marliere: Thermotolerance
J&J: Diarylquinoline resistance (TB)
DuPont: 1,3-propanediol production
About 3 serial additive changes per 30 days vs 2^30 exhaustive search
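The contrast between serial additive changes and a 2^30 exhaustive search can be sketched as a greedy hill climb. The model below is a toy: 30 binary sites with independent, additive fitness effects. The additivity, the random effect values, and every name in the code are assumptions for illustration, not data from the experiments listed.

```python
import random


def greedy_climb(effects, max_steps):
    """At each step, fix the single most beneficial remaining change.

    Returns the final genotype (list of 0/1) and the number of
    single-change evaluations performed."""
    genotype = [0] * len(effects)
    evaluations = 0
    for _ in range(max_steps):
        candidates = [i for i, g in enumerate(genotype) if g == 0]
        if not candidates:
            break
        evaluations += len(candidates)
        best = max(candidates, key=lambda i: effects[i])
        if effects[best] <= 0:  # no remaining change improves fitness
            break
        genotype[best] = 1
    return genotype, evaluations


random.seed(0)
N_SITES = 30
effects = [random.uniform(-1.0, 1.0) for _ in range(N_SITES)]  # hypothetical effects

genotype, evaluations = greedy_climb(effects, max_steps=N_SITES)
print(f"greedy: {sum(genotype)} changes fixed after {evaluations} evaluations")
print(f"exhaustive search over {N_SITES} binary sites: {2 ** N_SITES} genotypes")
```

With additive effects the greedy path reaches the optimum after at most a few hundred evaluations. With epistasis (interacting sites) it can stall at a local optimum, which is the usual caveat for greedy algorithms.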
rE.coli Strategy #3: ss-Oligonucleotide Repair
Obtain 25% recombination efficiency in E. coli strains lacking mismatch repair genes (mutH, mutL, mutS, uvrD, dam)
Ellis et al. PNAS 2001; Costantino & Court PNAS 2003
DNA Replication Fork
Improved recombination frequency: 10^-4 to 0.25 (> 3 log increase!)
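As a quick check of the "> 3 log increase" figure, the sketch below computes the log10 fold change between the two recombination frequencies quoted on the slide. The function name is ours.

```python
import math


def log10_fold_change(before: float, after: float) -> float:
    """Orders of magnitude gained going from `before` to `after`."""
    return math.log10(after / before)


# Frequencies from the slide: 10^-4 baseline, 0.25 in mismatch-repair-deficient strains.
gain = log10_fold_change(1e-4, 0.25)
print(f"{0.25 / 1e-4:.0f}-fold, i.e. {gain:.2f} logs")
```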
Multiplex Automated Genome Engineering (MAGE)
[Figure: MAGE cycle. Labels: Wash with water & DNA pool (50); Concentrate, electroporate; Resuspend, bubble, select; Concentrate; O-ring membrane]
Wang, Isaacs, Terry
GEMASS Prototype
H. Wang, Church Lab, Harvard, 2008
Recombination-Cycling for Combinatorial Accelerated Evolution
[Figure: histograms of frequency vs. # mutations/clone. Panels: "Mutation Distribution: 11 oligos, 15 cycles" and "Mutation Distribution: 54 oligos, 45 cycles"]

Oligo pool   # cycles   Best clone (98 %tile)   Fraction of mutated sites   Time*
11           15         7                       7/11                        3 days
54           45         23                      23/54                       9 days
* Continuous cycling
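One way to reason about mutation distributions like those in the table is a simple binomial model: each targeted site in a clone is converted independently with a fixed probability per cycle. This is a sketch under that assumption. The per-cycle probability below is a hypothetical placeholder, not a value fitted to the data, and real MAGE efficiencies vary by oligo.

```python
from math import comb


def mutation_distribution(n_sites: int, n_cycles: int, p_per_cycle: float):
    """P(k mutated sites per clone) for k = 0..n_sites, assuming each site is
    converted independently with probability `p_per_cycle` in every cycle."""
    q = 1.0 - (1.0 - p_per_cycle) ** n_cycles  # P(site converted by the end)
    return [comb(n_sites, k) * q**k * (1.0 - q) ** (n_sites - k)
            for k in range(n_sites + 1)]


def percentile(dist, fraction: float) -> int:
    """Smallest k whose cumulative probability reaches `fraction`."""
    cumulative = 0.0
    for k, pk in enumerate(dist):
        cumulative += pk
        if cumulative >= fraction:
            return k
    return len(dist) - 1


P_PER_CYCLE = 0.02  # hypothetical per-site, per-cycle conversion probability

for n_sites, n_cycles in ((11, 15), (54, 45)):
    dist = mutation_distribution(n_sites, n_cycles, P_PER_CYCLE)
    mean = sum(k * pk for k, pk in enumerate(dist))
    print(f"{n_sites} oligos, {n_cycles} cycles: mean {mean:.1f} mutations/clone, "
          f"98th percentile {percentile(dist, 0.98)}")
```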
Scaling & Automation Increase Efficiency of Recombination
Wang, Isaacs, Carr, Jacobson, Church
Multiplex Automated Genome Engineering (MAGE)
syringe pump
electrically actuated valves
electroporation cuvette w/ membrane filter
OD sensor
data acquisition system / computer communication
Wang, Isaacs, Terry
Fab vs. Bio-fab
+ Plays well with digital computers     - No habla C++
- Doesn't get DNA                       + DNA is its native digital media
- Needs us to replicate                 + We need them
- Needs expensive Fab (e.g. ICs)        + Simple or complex inputs
- Intelligent Design                    + Evolution
Cross-feeding symbiotic systems: aphids & Buchnera
• obligate mutualism
• nutritional interactions: amino acids & vitamins
• established 200-250 million years ago
• close relative of E. coli with tiny genome (618~641 kb)
Aphids
http://buchnera.gsc.riken.go.jp
MILKFTWV HR
Shigenobu et al. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81-86 (2000).
Pink = enzymes apparently missing in Buchnera
trp/tyrA pair of genomes shows best co-growth
Reppas, Lin et al.; Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome, Science 2005
First Passage
Second Passage
Synthetic genome pair evolution
Co-evolution of mutual biosensors/biosynthesis, sequenced across time & within each time-point
Independent lines of Trp & Tyr co-culture:
5 OmpF (pore: large, hydrophilic > small): 42 R->G,L,C; 113 D->V; 117 E->A
2 Promoter (cis-regulator): -12 A->C; -35 C->A
5 Lrp (trans-regulator): 1b, 9b, 8b, IS2 insert, R->L in DBD
Heterogeneity within each time-point.
[Figure axis: positions -12 to -6]
Reppas, Shendure, Porreca
At late times Tyr- becomes prototroph!
Reducing costs of open-source hardware & wetware
Factor
• 30 Equipment speed: from 1 up to 30 Mpixels/sec camera
• 4 Equipment cost: from $500K down to $150K (Danaher Inc)
• 36 Parallelism: 36 flow-cells per camera, 2 billion beads
------------------
• 75 Flow cell volume: 1.5 mm down to 0.0085 mm thin
• 40 Kit costs: $2000 down to $50 at standard enzyme costs
• 10 Enzymes: $4000/mg down to <$400 (Enzymatics Inc)
• 50 Genomic subset (Exome – 1% genome)
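The slide does not state how these factors combine. Assuming they are independent and multiply, a rough combined figure can be sketched as below. The multiplicative assumption and the "hardware"/"wetware" group labels for the two halves of the list are ours.

```python
from math import prod

# Factors as listed on the slide, split at the dashed line.
# The group names are our labels, not the slide's.
HARDWARE_FACTORS = {"equipment speed": 30, "equipment cost": 4, "parallelism": 36}
WETWARE_FACTORS = {
    "flow cell volume": 75,
    "kit costs": 40,
    "enzymes": 10,
    "genomic subset": 50,
}


def combined_factor(factors: dict) -> int:
    """Product of the individual factors, assuming they are independent."""
    return prod(factors.values())


hardware = combined_factor(HARDWARE_FACTORS)
wetware = combined_factor(WETWARE_FACTORS)
print(f"hardware: {hardware:,}x  wetware: {wetware:,}x  combined: {hardware * wetware:,}x")
```

If some factors overlap (for example, kit costs already reflecting enzyme prices), the true combined improvement would be smaller than this product.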