Seminar - Intro to DNA Computing (Two Lectures Combined)
8/14/2019 Seminar - Intro to DNA Computing (Two Lectures Combined)
Seminar: Introduction to DNA Computing
Introduction
The field of DNA computing began in 1994, when L. Adleman solved a small instance of the Hamiltonian Path Problem (HPP) using only:
  DNA molecules to encode the problem (data);
  Operations from biotechnology (program).
Although simple, this was the first instance of true massive parallelism: roughly 10^15 DNA "processors", working together to solve the problem by search.
It was also the first instance of a feasible alternative to silicon technology, capable, in principle, of competing with existing silicon technology:
  Superior overall processor speed;
  Superior information storage potential;
  Near-optimal energetic efficiency;
  Etc.
Since then, many methods for computing with DNA have been developed, including Whiplash PCR (my main research field).
The Promise of DNA Computing
DNA computing holds many promises for mankind's future:
  Massively parallel computing:
    > 10^18 processors, working together in solution;
    Promise for applications to intractable problems (e.g., HPP).
  Seamless integration with biological systems:
    Inputs can be biological or molecular signals;
    Promise for bypassing the solution of unsolved problems (e.g., protein folding).
  Smart medical therapeutics:
    Developed DNA computers could be applied in vivo;
    Directly computing the solutions (cures) to medical problems;
    Promise for new cures for diseases, smart immune systems, etc.
  Programmable nanotechnology:
    The nano-machinery for DNA information processing is already well developed in nature:
      Information encoding: DNA bases; the DNA triplet code;
      Information processing: enzymes for making/breaking DNAs, etc.
    Thus, DNA nanotech competes well with other nanotechnologies.
Required: architectures, algorithms, and tools, including new software for predicting DNA computer behavior!
Outline (Two Lectures)
Part I - DNA Basics
  A. DNA Structure
  B. Basic DNA Operations: Synthesis, Hybridization, etc.
  C. The Polymerase Chain Reaction
Part II - Introduction to DNA Computing
  A. The Adleman-Lipton Paradigm (End of Lecture 1)
  B. Satisfiability by Protection and Digestion (Start of Lecture 2)
Part III - Design and Error Estimation
  A. DNA Strand Design Problem (DSD)
  B. Using Equilibrium Chemistry:
    Tm-based Analysis
    General Equilibrium Models
Part I - DNA Basics: Structure and Biosteps
Nucleotides
The monomer building blocks of nucleic acids are nucleotides. All have a D-stereoisomeric configuration. Each nucleotide consists of:
  a phosphate (PO4-):
    attached to the 5' carbon = 5'-nucleotide;
    attached to the 3' carbon = 3'-nucleotide;
  a 5-membered sugar ring;
  a nucleobase, attached to the 1' carbon.
There are two major classes of nucleotides, distinguished by the sugar, i.e., by the group, X, attached to the 2' carbon:
  RNA contains a ribose sugar (X = OH);
  DNA contains a 2'-deoxyribose sugar (X = H). This is our focus.
The Nucleobases of DNA
Nucleotides in DNA contain 4 types of nucleobases:
  2 purines (2-ring bases): Adenine (A), Guanine (G);
  2 pyrimidines (1-ring bases): Thymine (T), Cytosine (C).
All are planar, and thus achiral. R indicates the point of attachment to the 1' C of 2'-deoxyribose.
DNA Primary Structure
Each DNA strand is a linear chain of nucleotides:
  linked by 5',3' phosphodiester bonds;
  the chain forms a negatively charged (hydrophilic) backbone.
Each chain has a definite polarity:
  two chemically distinct ends: 5' end (top), 3' end (bottom);
  by convention, oriented 5' to 3'.
Primary structure: the sequence of nucleobases, 5' to 3':
  nucleobases are hydrophobic;
  e.g., 5'-TAGC-3' is written TAGC.
Helix Formation in DNA
In genomic DNA, helices are usually formed by 2 polymers:
  double-stranded DNA (dsDNA);
  shown conceptually in the figure, with the helical structure omitted.
Strands are oriented anti-parallel: 5'-3' vs. 3'-5':
  each pair of bases is aligned and H-bonded (Watson-Crick base pairing);
  base pairing is intermolecular.
The unit behaves as a single polymer:
  described in terms of its number of base-pairs.
Watson-Crick Base Pairing
Base-pairing in natural DNA is Watson-Crick:
  dG is paired with dC (3 H-bonds);
  dT is paired with dA (2 H-bonds);
  the 2 strands are thus related by sequence, referred to as Watson-Crick complementarity.
Many pairs can form H-bonds, so why these 2 base-pairs?
  the points of attachment to the backbones are equally spaced;
  this allows a regular helix;
  it will define a uniformly wide major groove.
The B-Helix of Watson and Crick
The standard helix for DNA:
  right-handed, anti-parallel double helix;
  favored by high-humidity conditions.
The B-helix has 10_1 screw symmetry:
  motif = 1 base-pair (monomer);
  helical repeat, c = 10 base-pairs/turn (actually, varies from 10 to 10.5 bps/turn).
Parameters:
  rise, h = 0.34 nm/base-pair;
  tilt = 1° (bps almost perpendicular to the axis).
Torsion angles:
  nucleotides in the anti conformation;
  sugars primarily C2'-endo.
B-Helix (cont.)
Two gross features:
  Major groove: this is where the bases are exposed;
    wide and quite deep;
    involved in protein recognition.
  Minor groove: narrow and also quite deep;
    lined by a permanent spine of H2O molecules.
The B-helix is not adopted by RNA:
  due to steric hindrance between each 2'-OH and the adjacent 5'-phosphate;
  even a single ribonucleotide causes DNA to shift to the A-form.
Part II: Intro to DNA Biotechnology
Now, let's learn about basic DNA operations. These biosteps are used to compute with DNA.
There are seven basic operations, or bio-steps:
  Synthesis: making DNA;
  Hybridization/Annealing: DNA-to-DNA recognition;
  Ligation: joining DNA;
  Restriction: cutting DNA;
  Polymerization: copying DNA;
  Electrophoresis: DNA separation by length;
  Extraction: DNA separation by sequence.
And the work-horse of biotechnology:
  The Polymerase Chain Reaction: sequence-specific DNA amplification.
These will be useful for DNA-based computing.
Making DNA: Synthesis
Oligonucleotide synthesis via phosphoramidite chemistry:
  Resin-anchored strands are grown at their 5' ends, in parallel, one residue at a time.
Basic procedure (automated):
  All strands begin 1 base in length;
  Each round consists of 3 steps:
    Coupling (addition of an activated monomer);
    Oxidation to PO4 (iodine);
    Removal of the protecting DMT group (dichloroacetic acid);
  One base is added per round (up to ~100 bases).
Recognizing DNA: Hybridization
Def.: Sequence-specific annealing of 2 or more DNAs:
  in specific proportions (in terms of the strands);
  forming a dsDNA product.
The sequence-recognition property is useful for DNA computing: hybridization = computation.
For modeling, we note three aspects:
  Energetics: what duplexes/loops will form?
    B-DNA helices generally assumed;
  Chemistry: how many strands are involved?
    Bi-molecular (2 strands); e.g.: DNA annealing (Fig., left-hand process);
    Multi-molecular (3+ strands); e.g.: Adleman's algorithm;
    Uni-molecular (1 strand); e.g.: hairpin formation is self-hybridization;
  Equilibrium: does the process attain it?
Each strongly influences the process characteristics:
  e.g., the ratio of product concentrations; the Tm of products.
Joining DNA: Ligation
Ligation = covalent linkage of 2 adjacent DNA backbones:
  5' end of strand A + 3' end of strand B.
Splinted ligation (shown):
  Process assisted by a 3rd strand, C;
  Imposes sequence-dependence on the process.
Generally implemented via a DNA ligase, e.g.: T4 DNA ligase.
B-helical substrate required:
  Strands must form a B-helix;
  Note: some ligases allow blunt ends; also, quite mismatch tolerant;
  A and B must be adjacent: no gap;
  Strand A must have a 5' PO4.
Energy required: ATP (or NAD+).
Cutting DNA: Restriction
The DNA backbone is cut by restriction endonucleases.
The cut-site (restriction site) is sequence-dependent:
  4 common sites are shown in the figure;
  Cuts often form sticky ends, useful for directing later annealing/ligation.
Most sequence-specific endonucleases:
  Type IIR-M: cut at the restriction site (shown);
  Also: Type IIS: cut away from the restriction site.
Restriction sites have C2 symmetry:
  Thus, they are (fully or partially) palindromic;
  Often 6 bps in length;
  The enzyme cuts both backbones symmetrically.
Cytosine methylation protects the site:
  Animal DNA: 2-7% of Cs methylated;
  Allows restriction-based cellular defense.
Copying DNA: Polymerization
DNA polymerase:
  Implements a 5' to 3' copying operation:
    The 3' end of a primer strand is extended;
    No de novo synthesis;
    Also: no 3' to 5' copying ever observed.
  Note the characteristic "hand" shape.
Substrate requirements:
  Two DNA strands required:
    Primer strand: to be 3'-extended;
    Template strand: to be copied;
    Basically a gapped helix.
  Incoming dNTP monomers:
    Both base and energy source.
Polymerase fills in the substrate helix:
  The copy operation is thus Watson-Crick:
    A copied to T, G copied to C, etc.
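The copying rule above can be illustrated in silico. The following is a minimal sketch, not a model of real polymerase behavior: it assumes the primer anneals only at an exactly complementary site, and the names `reverse_complement` and `extend_primer`, as well as the example sequences, are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_primer(template: str, primer: str) -> str:
    """Extend `primer` (5'->3') along `template` (5'->3') up to the template's 5' end.

    Idealized assumption: the primer anneals only at an exact complementary site.
    """
    site = reverse_complement(primer)  # template region the primer binds to
    start = template.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to template")
    # The polymerase copies the template bases lying 5' of the binding site,
    # adding their complements to the primer's 3' end.
    return primer + reverse_complement(template[:start])


template = "AAGGCTTACG"   # hypothetical template
primer = "CGTAA"          # complementary to the template's 3' end
product = extend_primer(template, primer)
print(product)
```

Because the primer here binds at the very 3' end of the template, the extended product is simply the full reverse complement of the template.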
Amplifying DNA: PCR
The Polymerase Chain Reaction (PCR; K. Mullis):
  Amplifies a target dsDNA sequence, T;
  Requirements: two short ssDNA primers, flanking T.
Each PCR round = melting + primer-annealing + extension.
This simple procedure is applied recursively via thermal cycling:
  adding primer + dNTP each round.
Exponential: n rounds of PCR produce 2^n copies of T.
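As a quick worked illustration of the exponential growth, the sketch below computes the ideal copy number after n rounds. The `efficiency` parameter is an added assumption (it is not part of the slide): 1.0 reproduces the ideal 2^n doubling, and smaller values model rounds in which only a fraction of the targets is copied.

```python
def pcr_copies(initial_copies: float, rounds: int, efficiency: float = 1.0) -> float:
    """Copies of the target after `rounds` PCR cycles.

    efficiency = 1.0 is ideal doubling (2**rounds per starting copy);
    efficiency < 1.0 is a hypothetical per-round yield, added for illustration.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    return initial_copies * (1.0 + efficiency) ** rounds


for n in (10, 20, 30):
    print(n, pcr_copies(1, n))
```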
Separating DNA: Electrophoresis
Size fractionation of dsDNA:
  Mobility in a gel matrix is length-dependent;
  Migration is faster for smaller DNAs;
  This property can be exploited to segregate a DNA mixture by size.
Gel electrophoresis:
  DNA mixture loaded onto the gel w/ buffer:
    Polyacrylamide gel (10 - 500 bps);
    Agarose gel: longer DNAs (> 500 bps);
  E-field applied, parallel to the gel matrix;
  Since DNA is poly-anionic, strands migrate towards the anode;
  Longer strands move more slowly;
  Provides logarithmic separation w/ length.
Separating DNA: Extraction
ssDNAs can also be segregated by sequence, by exploiting the specificity of hybridization.
"Fishing" procedure on DNA mixture T:
  Objective: remove the subset, TS, of strands in T containing sub-sequence S;
    Figure: S = AGCATA;
  Prepare biotinylated strands, F, with S*;
    * denotes Watson-Crick complementation;
  Conjugate F to streptavidin-coated magnetic beads;
  Mix F with mixture T:
    F hybridizes to strands in T containing S;
  Remove F magnetically from T:
    Also removes the hybridized subset of T (bound to S*);
    TS recovered by melting/washing F.
Overall operation = Extract(S, T).
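The Extract(S, T) operation can be mimicked on strings. This is a hedged sketch under an idealized assumption: a strand is captured if and only if it contains S exactly (no mismatched hybridization, no losses). The function name `extract` and the example mixture are hypothetical.

```python
def extract(s: str, t: list[str]) -> tuple[list[str], list[str]]:
    """Split mixture `t` into (strands containing `s`, remaining strands).

    Idealized: capture happens exactly when `s` occurs as a substring.
    """
    captured = [strand for strand in t if s in strand]
    remaining = [strand for strand in t if s not in strand]
    return captured, remaining


mixture = ["TTAGCATACC", "GGGTTTAAAC", "AGCATA", "CCAGCATTAG"]  # hypothetical strands
captured, remaining = extract("AGCATA", mixture)
print(captured)
print(remaining)
```

Returning both subsets reflects the wet-lab situation, where either the extracted strands or the depleted mixture may be carried forward.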
Part II: Intro. to DNA Computing
Approaches to DNA Computing
Many architectures have been proposed; here, we can cover only a few.
Broadly, classifiable into 2 categories:
1. Single-Instruction, Multiple-Data (SIMD):
  The DNA mixture is data-parallel, but executes 1 set of instructions.
  The Adleman-Lipton paradigm:
    First successful application (Adleman): Hamiltonian Path;
    Modified version (Lipton): SATISFIABILITY.
  Chip-based DNA computing (Liu, Wood, etc.):
    Solution of SAT instances.
  Many other important architectures:
    e.g., computation via self-assembly (Seeman, Winfree, etc.).
2. Multiple-Instruction, Multiple-Data (MIMD):
  Whiplash PCR (Hagiya, Sakamoto, Rose, etc.):
    Each strand executes its own program;
    Particularly useful for evolutionary programming.
Hamiltonian Path Problem (HPP)
An instance of HPP is a problem on a directed graph, G:
  Set of vertices, V = {Vi};
  Set of 1-way directed edges, eij, connecting pairs (Vi, Vj) of vertices in V;
  Distinguished vertices: start (Vin), finish (Vout).
Example instance:
  7 vertices; 12 edges; Vin = 0; Vout = 6;
  A very simple instance.
HPP asks the decision question:
  Does a path through G exist which passes through each vertex in V exactly once (not necessarily in order)?
  Usually constrained: the path should be between Vin and Vout;
  HPP is NP-complete (no known efficient algorithm);
  Solution TIME scales exponentially with |V|: HARD!
For our instance, the answer is yes (satisfying path shown in the figure).
DNA computing algorithm for HPP: A. Adleman's Algorithm.
A. Adleman's Algorithm
Consider our instance graph.
Encoding [O(|V|^2) biosteps]:
  Synthesize a DNA strand, Si, for each vertex Vi in V;
  Following these, synthesize a splinting strand for each edge, eij, in G.
1. Path Generation (1 biostep): anneal and ligate all strands:
  Result: parallel production of a ssDNA for each path in G;
  Ligation makes path molecules permanent.
Path Screening:
2. Gel electrophoresis (1 biostep):
  Keep only solution-length paths.
3. PCR amplify, using primers for Vin and Vout (1 biostep):
  Result: amplifies only paths beginning at Vin and ending at Vout.
4. Affinity extract recursively on T, for each Vi in V (O(|V|) biosteps):
  Each time, keep only the extracted paths.
5. Check Answer: detect via UV spectroscopy (1 biostep):
  If DNA remains, it must encode a satisfying path (YES); otherwise: NO.
  Note: in practice, we may also sequence the DNA result (if YES).
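The generate-and-filter logic of these steps can be sketched in software. The code below is only an in-silico analogy under stated assumptions: the 4-vertex graph is hypothetical (it is not the 7-vertex instance of the slides), path generation is replaced by exhaustive enumeration of walks that are already of solution length, and each screening biostep becomes a list filter. The function name `adleman_hpp` is invented for illustration.

```python
from itertools import product


def adleman_hpp(vertices, edges, v_in, v_out):
    """Return all Hamiltonian paths from v_in to v_out by generate-and-filter."""
    n = len(vertices)
    edge_set = set(edges)

    # Path generation + gel step: enumerate every walk of exactly n vertices
    # that follows directed edges (the DNA mixture also holds other lengths,
    # which the gel would discard, so they are not generated here).
    walks = [w for w in product(vertices, repeat=n)
             if all((a, b) in edge_set for a, b in zip(w, w[1:]))]

    # PCR step: keep only walks that begin at v_in and end at v_out.
    walks = [w for w in walks if w[0] == v_in and w[-1] == v_out]

    # Affinity extraction: one pass per vertex, keeping walks that contain it.
    for v in vertices:
        walks = [w for w in walks if v in w]

    # Check answer: any survivor encodes a Hamiltonian path.
    return walks


vertices = [0, 1, 2, 3]                                      # hypothetical graph
edges = [(0, 1), (0, 2), (1, 2), (2, 1), (1, 3), (2, 3)]
paths = adleman_hpp(vertices, edges, v_in=0, v_out=3)
print("YES" if paths else "NO", paths)
```

The enumeration costs |V|^|V| candidate walks, so this sketch is usable only for tiny graphs; the point of the wet-lab version is that the candidate walks are formed in parallel in a single biostep.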
Adleman's Algorithm
B. Chip-Based SATISFIABILITY
CNF-SAT instance: Boolean expression in conjunctive normal form:
  e.g.: $S = (x \lor y) \land (\lnot y \lor z) \land (\lnot x \lor y)$
  3 variables (x, y, z);
  3 clauses, each expressed in terms of $\lor$, the logical OR;
  Clauses connected by $\land$, the logical AND.
SAT asks the decision question:
  Does a variable assignment exist that simultaneously satisfies all clauses (and thus S)?
  NP-complete.
For our instance, the answer is yes. Two satisfying assignments:
  (x,y,z) = {(011), (111)}.
DNA chip-based algorithm for SAT: Liu, et al.: Proc. DNA 2; Nature, 2000.
Liu, et al. DNA chip-based algorithm:
  Species for all variable assignments are attached to the DNA chip;
    The array is not indexed.
  For each clause in S (TIME complexity polynomial):
    MARK: protect satisfying ssDNAs with a ssDNA probe;
    DESTROY: digest non-satisfying (unmarked) ssDNAs via exonuclease.
  If any DNAs remain, the answer is yes.
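The MARK/DESTROY cycle can be mimicked with sets. This is a sketch under assumptions, not the published protocol: the clause set below is an assumed reading of the 3-variable example instance, chosen because it reproduces the two satisfying assignments stated for it, and the names `CLAUSES` and `mark_and_destroy` are hypothetical.

```python
from itertools import product

# Each clause is a list of literals (variable index, required value).
# Assumed reading of the example instance over (x, y, z):
CLAUSES = [
    [(0, 1), (1, 1)],  # x OR y
    [(1, 0), (2, 1)],  # NOT y OR z
    [(0, 0), (1, 1)],  # NOT x OR y
]


def mark_and_destroy(n_vars, clauses):
    """Return the assignments that survive one MARK/DESTROY cycle per clause."""
    # The chip starts with one species per possible variable assignment.
    surface = set(product((0, 1), repeat=n_vars))
    for clause in clauses:
        # MARK: protect every strand that satisfies at least one literal.
        marked = {s for s in surface if any(s[i] == v for i, v in clause)}
        # DESTROY: unmarked strands are digested; only marked ones remain.
        surface = marked
    return sorted(surface)


survivors = mark_and_destroy(3, CLAUSES)
print("YES" if survivors else "NO", survivors)
```

The number of cycles grows with the number of clauses, while the number of surface-bound species grows as 2^n in the number of variables, which is where the space cost of the approach lies.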
Part III: DNA Design and Error Estimation
Hybridization Fidelity
DNA-based computing is stochastic:
  Many potential sources of error during annealing, ligation, polymerization, etc.
Focus: analysis/design of hybridization error.
Three classes of models:
1. Sequence similarity-based models:
  Combinatorial measures.
2. Equilibrium chemistry measures/methods:
  Simple: consider only isolated equilibria;
  General: treat as a problem in structural prediction.
3. Away from equilibrium:
  Kinetic models (not dealt with here).
The DNA Strand Design Problem
Instance: X = (S, R, C, $\varepsilon_t$), where:
  S = set of ssDNA strands:
    each to be encoded as a 5' to 3' string over {A,T,G,C};
  R = set of hybridization rules:
    each a mode of annealing between strands in S;
  C = set of encoding constraints:
    rules in C impose relations on the encodings in S;
    External: some strands in S may encode biological targets;
    Internal: some words may be repeated in several strands.
DNA Strand Design (DSD) on X (decision): given constraints C, may S be encoded to anneal in accordance with R, with per-duplex probability > $p_t = 1 - \varepsilon_t$?
More usual is the optimization version: encode to minimize $\varepsilon_t$.
-
8/14/2019 Seminar - Intro to DNA Computing (Two Lectures Combined)
34/57
Example Instance
Consider a (very) simple DSD instance:
  S = {x, y}; 2 strands, with |x| = |y| = 8 bases;
  R = {R1}; R1 = [(x, y); {(5,8), (6,7), (7,6), (8,5)}],
  i.e., 4 H-bonded base-pairs between x and y:

5'-x1 x2 x3 x4 x5 x6 x7 x8-3'
               |  |  |  |
            3'-y8 y7 y6 y5 y4 y3 y2 y1-5'

Real instances are much more difficult.
Generally, we use a STOCHASTIC method:
  Guess and check;
  Apply an optimization algorithm (e.g., GA, greedy algorithm).
For this, we need a measure of "goodness".
Example Trial Solution
Let's try the encoding: J = {x, y} = {TGCTGCAC, AGCAGTGC}.
J satisfies the 1 rule specified by R1 (easy):

5'-TGCTGCAC-3'
       ||||
    3'-CGTGACGA-5'

However, J may also form a large error duplex, e.g.:

 5'-TGCTGCAC-3'
      ||||
3'-CGTGACGA-5'

Clearly, J is not a good solution (encoding).
Two approaches to evaluate "goodness":
  Combinatoric measures;
  Equilibrium chemistry-based analysis/measures (our focus).
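The check carried out by eye on the trial encoding can be automated. The sketch below slides y (read 3' to 5') along x and counts Watson-Crick pairs in every antiparallel frame; it is a simple counting model (no wobble pairs, bulges, or energetics), and the name `duplex_frames` and the `shift` convention are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_frames(x: str, y: str) -> list[tuple[int, int]]:
    """Count Watson-Crick pairs for every antiparallel alignment of x and y.

    Both strands are given 5'->3'. `shift` is the index of x that sits above
    the 3'-most base of y (negative when y overhangs x's 5' end).
    """
    y_rev = y[::-1]  # y read 3'->5', so it can be laid under x
    frames = []
    for shift in range(-(len(y_rev) - 1), len(x)):
        pairs = 0
        for i, base in enumerate(x):
            j = i - shift
            if 0 <= j < len(y_rev) and COMPLEMENT[base] == y_rev[j]:
                pairs += 1
        frames.append((shift, pairs))
    return frames


frames = duplex_frames("TGCTGCAC", "AGCAGTGC")
best = max(pairs for _, pairs in frames)
print([shift for shift, pairs in frames if pairs == best])
```

For this encoding the planned frame (shift 4, pairing x5..x8 with y8..y5) is not the only frame that reaches the maximum count, which is exactly why J is a poor encoding.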
Methods of Solving DSD
Most current methodologies are stochastic:
  Phase I - Express the mixture as an instance of DSD:
    i.e., a quadruple, X = (S, R, C, $\varepsilon_t$).
  Phase II - Generate an initial population:
    Each member encodes S;
    Each obeys constraints, C, and rules, R.
  Phase III - Apply a tool for encoding analysis:
    Analysis: assign a value, $\varepsilon$, to each member;
    If a member satisfies our goodness criterion ($\varepsilon < \varepsilon_t$): select and halt. Else:
  Phase IV - Apply a stochastic optimization method:
    Genetic algorithm, greedy algorithm, etc.
Note: we still need a goodness criterion.
Approach 1: Combinatorics
Idea: minimize unplanned sequence similarity.
Simplest: Hamming distance, d(X,Y):
  computed between each pair, {X,Y}, in S;
  let Y* denote Y's Watson-Crick reverse-complement;
  X and Y* are assumed perfectly aligned, with no bulges;
  d(X,Y) = # of Watson-Crick mismatches b/w X and Y*;
  R. Deaton, et al.: Proc. DNA 2 (1996), Phys. Rev. Lett. (1998).
Definition of reliability:
  S is error-free if d(X,Y) ≥ dmin for all undesired pairs, {X,Y}, in S;
  Strategy: co-maximize d(X,Y) for all unplanned pairs.
Fundamental assumption:
  Occupancy of conformations with > dmin mismatches is negligible.
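A minimal sketch of this measure follows, assuming equal-length strands and the perfectly aligned, bulge-free duplex between X and Y*. The function names `duplex_mismatches` and `is_error_free` are hypothetical, and the test applied is simply "every pair of distinct strands has at least dmin mismatches".

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_mismatches(x: str, y: str) -> int:
    """d(X, Y): Watson-Crick mismatches in the aligned duplex of X and Y*."""
    y_star = reverse_complement(y)  # Y*, written 5'->3'
    if len(x) != len(y_star):
        raise ValueError("strands must have equal length")
    # In the antiparallel duplex, base i of X faces base (n-1-i) of Y*.
    return sum(COMPLEMENT[a] != b for a, b in zip(x, reversed(y_star)))


def is_error_free(strands: list[str], d_min: int) -> bool:
    """True if every pair of distinct strands has d(X, Y) >= d_min."""
    return all(duplex_mismatches(x, y) >= d_min
               for x, y in combinations(strands, 2))


print(duplex_mismatches("ACGTAC", "ACGTTT"))
print(is_error_free(["AAAA", "TTTT", "CCCC"], d_min=4))
```

Since Y* is the reverse complement of Y, the count reduces to the classic Hamming distance between X and Y; the explicit duplex construction is kept only to mirror the definition.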
Hamming Encoding Expanded:
Condon, et al., J. Comp. Biol. (1999). Three flavors of Hamming encoding defined:
  The Hamming constraint: H(X,Y) ≥ dmin;
    H(X,Y) is the classic Hamming distance.
  (Standard) The reverse-complement constraint: H(X^C, Y^R) ≥ dmin;
    X^C, X^R = complement, reverse of X, respectively.
  The reverse constraint: H(X, Y^R) ≥ dmin.
  Together, these relax many interaction-type approximations.
Garzon, et al., Proc. DNA 5 (1999):
  H-measure of {X,Y} = minimum d(X,Y), over all frames;
  Prevention of misaligned hybridization.
Arita, et al., New Gen. Comp. 20 (2002):
  The Template method: non-stochastic design;
  Encodes S so that H(X^C, Y^R) ≥ length/3 for all pairs, frameshifts, and concatenations of S.
Approach 2: Equilibrium Analysis
Consider the coupled equilibrium shown in the figure:
  Two ssDNA species (A, B);
  Two dsDNA species:
    ABp - full-length planned duplex;
    ABe - a shorter error duplex.
Two approaches for modeling error, i.e., the occupancy of ABe:
1. Melting-temperature (Tm) analysis:
  e.g.: treat in terms of the Tms of each isolated duplex;
  Idea: completely ignore the coupling.
2. General equilibrium analysis:
  First: estimate the equilibrium concentrations of all species;
  Then: use these to compute the average error probability, $\varepsilon$.
Equilibrium Strategy I: Tm Analysis
Basic idea:
  Melting curves are determined for isolated equilibria;
  The coupled equilibrium is modeled in terms of these;
  Coupling is ignored: weak coupling is argued (e.g., error Keqs are small).
Overall strategy:
  Write expressions for the isolated equilibria;
  Compute the Tms of planned and unplanned structures:
    $T_m = \Delta H^\circ / [\Delta S^\circ + R \ln(C_{tot}/4)]$ (distinct strands, A and B);
  Carry out reactions at a stringent temperature:
    beneath the Tm of the (isolated) planned structure(s);
    above the Tm of all (isolated) unwanted structures.
Basic expectation:
  Unwanted structures are unstable at stringent Trxs;
  Hope: minimal occupancy of unwanted conformations.
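The Tm expression and the stringency window can be evaluated numerically. In the sketch below, the thermodynamic values are hypothetical placeholders (not measured or nearest-neighbor parameters), the units are assumed to be kcal/mol for enthalpy and kcal/(mol K) for entropy, and the names `melting_temperature` and `is_stringent` are invented for illustration.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)


def melting_temperature(dh: float, ds: float, c_tot: float) -> float:
    """Tm in kelvin for two distinct strands (all-or-none model).

    dh: duplex formation enthalpy, kcal/mol (negative);
    ds: duplex formation entropy, kcal/(mol K) (negative);
    c_tot: total strand concentration, mol/L.
    """
    return dh / (ds + R * math.log(c_tot / 4.0))


def is_stringent(t_rx: float, tm_planned: float, tm_unwanted: list[float]) -> bool:
    """True if t_rx lies below the planned Tm and above every unwanted Tm."""
    return max(tm_unwanted) < t_rx < tm_planned


# Hypothetical values for a planned duplex and one shorter error duplex.
tm_planned = melting_temperature(dh=-150.0, ds=-0.40, c_tot=1e-6)
tm_error = melting_temperature(dh=-70.0, ds=-0.19, c_tot=1e-6)
print(round(tm_planned, 1), round(tm_error, 1))
print(is_stringent(330.0, tm_planned, [tm_error]))
```

Note that this treats each duplex as an isolated equilibrium, which is precisely the approximation the Tm-based strategy makes.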
Duplex Melting Temperature
For each species of duplex:
  Assume a simple, isolated equilibrium.
  Obtain $\theta_{ext}$ via a simple equilibrium analysis, as before:
    Mass action: $K_D = C_A C_B / C_{AB} = 1/K_{assoc}$
    Conservation of ssDNAs: $C_A^o = C_A + (1 + \delta_{AB}) C_{AB}$
    Combine with $\theta_{ext} = 2 C_{AB} / C_{tot}$, and solve for $\theta_{ext}$:
      $\theta_{ext} = [1 + (a C_{tot} K_{assoc})^{-1}] - \{[1 + (a C_{tot} K_{assoc})^{-1}]^2 - b\}^{1/2}$
    Identical A and B: $\delta_{AB} = 1$; $a = 4$, $b = 1$.
    Distinct A and B: $\delta_{AB} = 0$; $a = 1$, $b = 4 C_A^o C_B^o / C_{tot}^2$.
Choose a melting temperature model:
  Full model: Tm = temperature at which $\theta = \theta_{ext} \theta_{int} = 1/2$.
Example: Mean-energy 20-mer
For a 20-mer with mean stacking energetics:
  Note: substantial width, especially for short oligos;
    i.e.: $\Delta T \approx 10\,^\circ$C for 10-mers.
All-or-none assumption usually made:
  Formally: Tm = the Trx at which $\theta_{ext} = 1/2$.
  Resulting expression:
    $T_m = \Delta H^\circ / [\Delta S^\circ + R \ln(C_{tot}/4)]$ (A, B distinct)
Eq. Strategy II: Coupled model
Error rate = ratio of equilibrium [dsDNA]s.
Generally, we will need to re-express it in terms of:
  Equilibrium constants, Keq;
  Total strand concentrations, [A]o and [B]o.
Simple tools (as before):
  Law of mass action (for each component equilibrium):
    e.g., [ABe] = [A] [B] Ke;
  Strand conservation (for each ssDNA):
    e.g., [A]o = [A] + [ABe] + [ABp];
  Statistical weighting (for each Keq):
    e.g., net $K_{eq}(AB) = \sum_k K_{eq}(k)$;
    k indexes all conformations b/w A and B.
Combine, approximate if necessary, and solve.
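For the two-duplex case these tools can be combined in closed form. The sketch below is an illustration under assumptions: only ABp and ABe compete (no AA, BB, or hairpin species), the error rate is taken to be [ABe] / ([ABe] + [ABp]), and the function name `coupled_equilibrium` and the numerical values are hypothetical.

```python
import math


def coupled_equilibrium(k_p: float, k_e: float, a0: float, b0: float) -> dict:
    """Solve mass action + strand conservation for A + B <-> ABp and A + B <-> ABe."""
    k = k_p + k_e                      # both duplexes draw on the same free strands
    s = k * (a0 + b0) + 1.0
    # Total duplex d solves k*(a0 - d)*(b0 - d) = d; take the physical (smaller) root.
    d = (s - math.sqrt(s * s - 4.0 * k * k * a0 * b0)) / (2.0 * k)
    ab_p = d * k_p / k
    ab_e = d * k_e / k
    return {
        "A": a0 - d,
        "B": b0 - d,
        "ABp": ab_p,
        "ABe": ab_e,
        "error": ab_e / (ab_e + ab_p),
    }


result = coupled_equilibrium(k_p=1e8, k_e=1e5, a0=1e-6, b0=2e-6)
print(result["error"])
```

Under these assumptions the error rate is independent of the strand concentrations: the free-strand product [A][B] cancels, leaving Ke / (Ke + Kp).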
Solution: Directed Dimer Formation
For our simple case, the solution seems easy:
  It quickly reduces to a ratio of Keqs.
  Note: for A = B, the result is exact.
However, for B != A, our approximation is too severe:
  We should have accounted for [AA] and [BB] when defining $\varepsilon$.
  We now need to solve two coupled, non-linear equations:
    the strand conservation eqs. for A and B. Not so easy.
For a real problem, there are many competing species:
  This requires solving a larger system of coupled quadratics.
Approximate General Treatment
One approach: assume uniform strand-saturations:
  $\theta_i(ext) = (C_i^o - C_i)/C_i^o = \theta_j(ext)$, for all i, j;
  J. Rose, et al., Proc. DNA 6 (1999), Natural Computing, 2004.
  Assumes intelligent system biasing during design/operation.
Solution:
  Note: yields the previous solution for [A] = [B].
Problems:
  Expected to fail given large excesses in the $\theta_i$s.
  Monotonic temperature dependence:
    Not expected via a Tm analysis.
Real example: The TAT System
Recent application: DNA computing-based gene-expression profiling
  Suyama, et al. (U. Tokyo).
Design Problem: TAT Fidelity
Goal: given a TAT encoding, assess the fidelity of the hybridization process.
  Occupancy of interest: error TAT hybrids.
  Let $\varepsilon$ = equilibrium error probability per hybridized Tag.
  Let equilibrium constants be denoted by Keq.
Notation:
  $C_i$, $C_{j^*}$ = equilibrium concentrations of Tag i, Antitag j*;
  $K^e_{ij^*}$ = total Keq of error duplex formation for i and j*;
  $K_{ij^*}$ = total Keq of duplex formation b/w i and j*;
  $K^{hp}_i$, $K^{hp}_{j^*}$ = total Keqs of folding.
  A TAT pair is matching when j* = i*; i.e., tag 2 and antitag 2* are matching.
Basic equilibrium expression:
  $\varepsilon = \sum_i \sum_{j^*} C_{ij^*}(\mathrm{error}) / \sum_i \sum_{j^*} C_{ij^*}$
Apply Mass Action
Decompose the complex equilibrium: apply mass action to each simple equilibrium.
Hairpin formation:
  each Tag species, i: $C_i^{hp} = C_i K^{hp}_i$
  each Antitag species, j*: $C_{j^*}^{hp} = C_{j^*} K^{hp}_{j^*}$
Duplex formation:
  each Tag-Antitag pair, {i, j*}:
    (total) $C_{ij^*} = C_i C_{j^*} K_{ij^*}$
    (error) $C_{ij^*}(\mathrm{error}) = C_i C_{j^*} K^e_{ij^*}$
  each Tag-Tag pair, {i, i}: $C_{ii} = C_i C_i K_{ii}$
Group appropriate equilibria:
  parallel equilibria are grouped for convenience;
  Keqs are then sums over many related conformations.
Apply Approximations
Starting point:
  $\varepsilon = \sum_i \sum_{j^*} C_i C_{j^*} K^e_{ij^*} / \sum_i \sum_{j^*} C_i C_{j^*} K_{ij^*}$
Approximations:
1. All Antitags (bound): equal, excess concentration, $C_a$.
  We assume a dilute, multi-tag input.
2. Negligible Tag-Tag interaction:
  $K_{ii^*} \gg K_{ij}$, for all i, j.
3. Relatively low error rate ("weak orthogonality"):
  $K_{ii^*} \gg K_{ij^*}$, for all i, $j^* \neq i^*$.
  Then, $C_{j^*} = C_a (1 + K^{hp}_{j^*})^{-1}$.
4. Matching hairpins equivalent:
  $K^{hp}_i \approx K^{hp}_{i^*}$, for each i.
Tag-Antitag System Fidelity
Error probability per hybridized Tag:
  Dilute input (Rose, et al., Proc. DNA 7, 2001):
    For the mean, multi-tag input:
  Non-dilute input: Rose, et al., Proc. CEC (2003).
  Combined model in submission, J. Comp. Biol.
    Allows a comparison with the Tm-based model (next slide).
Design strategy: encode to minimize $\varepsilon_w$, via a stochastic search method.
Model Comparison
Predictions for a small TAT system:
  Antitag[1] = 5'-AACCGACTACGTCACCAA-3'
  Antitag[2] = 5'-TTGGGACTACGTCAGGTT-3'
  Input of only Tag[1]: error duplex = 10/18 bps.
[top] Coupled, $\varepsilon$-based model:
  Full model:
    Red curve = excess input (10x);
    Blue curve = dilute input (0.1x);
  Uniform sat. approx. = dashed curve.
[bottom] Uncoupled, Tm-based model:
  Red, blue = excess, dilute melting;
  Isolated (PLANNED) and (error) duplexes.
Approx. pictures model opposing limits:
  Excess input (10x): agrees with the Tm-model (gray lines);
  Dilute input (0.1x): agrees with the unif-saturation, $\varepsilon$-based model.
The Inverse Problem: Design
$\varepsilon$-Based Method for TAT System Design
Evolution via a standard genetic algorithm:
  Basic idea: minimize the mean, excess single-tag error, $\varepsilon$.
Target performance:
1. Minimized mean error rate, $\varepsilon$:
  25 °C, 1.0 M [Na+], pH 7.0;
  Excess input (worst-case).
2. Uniform target TAT Keqs:
  Within about +/- 30%.
3. Negligible folding.
4. Negligible Tag-Tag interaction.
5. No Quad-G or Quad-C motifs.
Evolved system (shown in the figure):
  Hi-fidelity, 100-strand TAT system;
  Antitags illustrated; Tags = Watson-Crick complements.
Performance: Evolved System
Predicted system performance ($\varepsilon$ values, on a log scale):
  [Left panel] Mean over single-tag inputs:
    Designed system (25 °C, excess): -4.43 +/- 0.55;
    Random encodings (25 °C, excess): -2.47 +/- 0.21;
    Good improvement!: > 9 standard deviations.
  [Right panel] Dilute, multi-tag input.
Forward
In my 3rd Year seminar, our survey of molecular computing will be continued with:
3. An overview of Whiplash PCR:
  In which each strand computes autonomously.
  We examine a problem: back-hybridization:
    Reduces computational efficiency;
    Analysis: pseudo-equilibrium approach to modeling efficiency;
    One proposed solution: PNA-mediated Whiplash PCR.
4. In vitro evolutionary computing:
  As an alternative to "generate-and-search";
  Generally, via WPCR/PWPCR:
    Poker (in vitro co-evolution of player/dealer strategies);
    In vitro evolution of custom proteins.
Development of a full-featured software tool for DNA computing.