DNA Computing and Assembly


http://en.wikipedia.org/wiki/File:ADN_animation.gif

Molecular Computation of Solutions to Combinatorial Problems (Adleman, 1994)

Used molecular biology to solve a small instance of the directed Hamiltonian path problem. The Hamiltonian path problem is NP-complete.

Scientific American August 1998

Algorithm
1. Generate random paths through the graph
2. Keep only those paths that start with v_in and end with v_out
3. If the graph has n vertices, then keep only those paths that enter exactly n vertices
4. Keep only those paths that enter all of the vertices of the graph at least once
5. If any paths remain, say “Yes”; otherwise, say “No.”

     0  1  2  3  4  5  6
0    0  1  0  1  0  0  1
1    0  0  1  1  0  0  0
2    0  1  0  1  0  0  0
3    0  0  1  0  1  0  0
4    0  1  0  0  0  1  0
5    0  1  1  0  0  0  1
6    0  0  0  0  0  0  0

Adjacency matrix
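The five steps above can be sketched in software as a generate-and-filter search over the adjacency matrix. This is only an illustration of the logic, not Adleman's laboratory procedure: the function names, the random-walk generator, the 10% stopping chance, and the number of trials are all assumptions made for the example. Row 0 of the matrix is taken to encode the edges 0→1, 0→3 and 0→6, and entry [i][j] = 1 is read as an edge from i to j.

```python
import random

# Adjacency matrix from the slide: ADJ[i][j] == 1 means a directed edge i -> j.
# Row 0 is an assumption (edges 0->1, 0->3, 0->6).
ADJ = [
    [0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]


def random_path(adj, rng, max_len):
    """Step 1: a random walk that starts anywhere and follows directed edges."""
    n = len(adj)
    path = [rng.randrange(n)]
    while len(path) < max_len:
        successors = [j for j in range(n) if adj[path[-1]][j]]
        # Stop at a dead end, or at random (hypothetical 10% chance per step).
        if not successors or rng.random() < 0.1:
            break
        path.append(rng.choice(successors))
    return tuple(path)


def filter_paths(adj, v_in, v_out, trials=50_000, seed=0):
    """Apply steps 1-4 and return the surviving paths (step 5 inspects them)."""
    rng = random.Random(seed)
    n = len(adj)
    paths = {random_path(adj, rng, 2 * n) for _ in range(trials)}   # step 1
    paths = {p for p in paths if p[0] == v_in and p[-1] == v_out}  # step 2
    paths = {p for p in paths if len(p) == n}                      # step 3
    paths = {p for p in paths if set(p) == set(range(n))}          # step 4
    return paths


result = filter_paths(ADJ, v_in=0, v_out=6)
print("Yes" if result else "No", sorted(result))                   # step 5
```

As in the wet-lab version, a "No" answer from this sketch is only probabilistic: a valid path may simply never have been generated in step 1.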

Algorithm as implemented

1. Generate random paths through the graph
Associate each vertex with a random sequence of DNA (O denotes oligonucleotide)

O2 TATCGGATCGGTATATCCGA
O3 GCTATTCGAGCTTAAAGCTA
O4 GGCTAGGTACCAGCATGCTT

Associate each edge with a molecule that preserves edge orientation (create edge molecule from the two 10-mer ends of the corresponding vertex molecules)

O23 GTATATCCGAGCTATTCGAG
O34 CTTAAAGCTAGGCTAGGTAC
Preserves edge orientation (e.g. O23 and O32 are different)

Mix the edge molecules together in a chemical soup (containing complementary sequences) to effect the binding of molecules representing compatible edges

Ō3 CGATAAGCTCGAATTTCGAT (binds Ox3 and O3y; Ō3 is the complementary sequence to O3)

For example, the binding of O23 and O34 is accomplished as:
GTATATCCGAGCTATTCGAG|CTTAAAGCTAGGCTAGGTAC
          CGATAAGCTC|GAATTTCGAT
(where | is for visualization purposes only)

Result is a collection of DNA molecules representing the required random paths
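The encoding in step 1 can be checked with a few lines of string manipulation. The sketch below uses the three vertex sequences quoted above; the helper names `edge_oligo` and `complement` are made up for this example, and the complement is written base by base in the same left-to-right order as on the slide (not as a reverse complement).

```python
# Vertex oligonucleotides quoted on the slide (20-mers).
VERTEX = {
    2: "TATCGGATCGGTATATCCGA",
    3: "GCTATTCGAGCTTAAAGCTA",
    4: "GGCTAGGTACCAGCATGCTT",
}
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def edge_oligo(i, j):
    """Edge i -> j: last 10 bases of O_i followed by the first 10 bases of O_j."""
    return VERTEX[i][10:] + VERTEX[j][:10]


def complement(seq):
    """Watson-Crick complement, base by base, in the slide's written order."""
    return "".join(PAIR[base] for base in seq)


o23 = edge_oligo(2, 3)
o34 = edge_oligo(3, 4)
o3_bar = complement(VERTEX[3])

print("O23   ", o23)
print("O34   ", o34)
print("O3-bar", o3_bar)

# The complement of O3 pairs with the junction between O23 and O34,
# which is what holds the two edge molecules together for ligation.
junction = o23[10:] + o34[:10]
print("junction pairs with O3-bar:", complement(junction) == o3_bar)
```

Because an edge molecule takes the second half of its source vertex and the first half of its target vertex, `edge_oligo(2, 3)` and `edge_oligo(3, 2)` differ, which is how orientation is preserved.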




2. Keep only those paths that start with v_in and end with v_out

Amplify product of step 1 via a polymerase chain reaction using primers that amplify only those molecules encoding paths starting and ending at the proper vertices

3. If the graph has n vertices, then keep only those paths that enter exactly n vertices
Use a biomolecular technique on the product of step 2 to select molecules that encode paths that enter exactly 7 vertices (i.e., select molecules with 140 base pairs: 7 vertices at 20 base pairs each)

4. Keep only those paths that enter all of the vertices of the graph at least once
Purify the product of step 3. This involves producing single-strand DNA from the double-strand output of step 3, then, for each vertex, reacting the single-strand DNA with a specific oligonucleotide that selects for that vertex (e.g., react Ō1 with the single-strand DNA molecules to select paths that enter vertex 1 at least once; repeat with the remaining product for the remaining vertices).

5. If any paths remain, say “Yes”; otherwise, say “No.”
Amplify the product of step 4 (via PCR) and examine

Discussion

Adleman could probably have used substantially smaller quantities of oligonucleotides for the given graph, or solved a more complicated graph with the quantities used.
How well does this scale?
It took Adleman 7 days of lab work to solve a problem that could be done visually in a matter of seconds and via traditional computing in a matter of microseconds or less. What is the advantage of his approach?

Using the stated algorithm, Adleman expects that:

1. Number of procedures should grow linearly with the number of vertices

2. Number of different oligonucleotides should grow linearly with the number of edges

3. Quantity of each oligonucleotide should grow exponentially with the number of vertices

4. For more complex graphs, the possibility of errors will need to be carefully examined

5. When the computation is complete, one should confirm that an alleged path actually exists
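The last point, confirming that an alleged path actually exists, is cheap to do on a conventional computer. A minimal sketch follows, assuming the path read out of the DNA has already been decoded into a list of vertex numbers; the function name and the matrix layout (entry [i][j] = 1 for an edge i→j, with row 0 assumed to hold edges to 1, 3 and 6) are choices made for this example.

```python
# Adjacency matrix of the example graph; row 0 is an assumption
# (edges 0->1, 0->3, 0->6).
ADJ = [
    [0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]


def is_hamiltonian_path(adj, path, v_in, v_out):
    """Check an alleged path in time polynomial in the number of vertices."""
    n = len(adj)
    return (
        len(path) == n                                        # exactly n vertices
        and path[0] == v_in                                   # correct start
        and path[-1] == v_out                                 # correct end
        and sorted(path) == list(range(n))                    # each vertex once
        and all(adj[a][b] for a, b in zip(path, path[1:]))    # every hop is an edge
    )


print(is_hamiltonian_path(ADJ, [0, 1, 2, 3, 4, 5, 6], 0, 6))
print(is_hamiltonian_path(ADJ, [0, 3, 4, 5, 1, 2, 6], 0, 6))
```

The second call fails only at the final hop, since the graph has no edge from 2 to 6; this is the kind of error a spurious ligation product could introduce.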

Discussion (continued)
Potential power of Adleman’s method

1. Speed: standard computers @ 10^6 to 10^13 operations per second; defining the concatenation (ligation) of two DNA molecules as an operation, then 10^14 to 10^20 operations may be feasible (using larger quantities of source molecules).

2. Energy: 2 x 10^19 operations per joule for the molecular approach (compare to a theoretical maximum of 34 x 10^19); supercomputers do perhaps as many as 10^9 operations per joule.

3. Storage: information density of 1 bit per cubic nanometer using DNA; 1 bit per 10^12 nm^3 for video tapes.

4. May offer a viable approach to problems that are ill suited to traditional von Neumann processors (i.e., problems requiring massively parallel search)
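To put the energy and storage figures side by side, the ratios can be worked out directly. The sketch below uses only the numbers quoted in the list above; the variable names are made up for the example.

```python
# Figures quoted in the list above.
dna_ops_per_joule = 2e19
limit_ops_per_joule = 34e19           # theoretical maximum quoted above
supercomputer_ops_per_joule = 1e9

dna_nm3_per_bit = 1.0                 # 1 bit per cubic nanometer
tape_nm3_per_bit = 1e12               # 1 bit per 10^12 nm^3

energy_advantage = dna_ops_per_joule / supercomputer_ops_per_joule
fraction_of_limit = dna_ops_per_joule / limit_ops_per_joule
density_advantage = tape_nm3_per_bit / dna_nm3_per_bit

print(f"energy advantage over supercomputers: {energy_advantage:.1e}x")
print(f"fraction of theoretical maximum:      {fraction_of_limit:.1%}")
print(f"storage density advantage over tape:  {density_advantage:.1e}x")
```

On these figures the molecular approach is roughly ten orders of magnitude more energy efficient than the quoted supercomputers, while still reaching only a few percent of the theoretical limit.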

Discussion (continued)

Consider how to solve the traveling salesman problem via this technique…
Evaluate this approach as a search technique…
Certain problems may be ill-suited for molecular computation (e.g., multiplying two 100-digit numbers). Adleman notes that, “It is a research problem of considerable interest to elucidate the kinds of algorithms that are possible with the use of molecular methods and the kinds of problems that these algorithms can efficiently solve.”
One could probably encode the instantaneous description of a Turing machine in a DNA molecule and use current protocols and enzymes to effect sequence modifications corresponding to the operation of the machine. (But is this a reasonable use of the technology? The key is probably not to think of one TM but of a whole collection, operating in parallel…)

DNA-based computers                                       | Micro-chip based computers
Slow individual operations                                | Fast individual operations
Billions of simultaneous operations                       | Substantially fewer simultaneous operations
Huge memory in small space                                | Smaller memory
Setting up problem may involve considerable preparations  | Setting up a problem only requires keyboard input
DNA is sensitive to chemical deterioration                | Electronic data are vulnerable but can be backed up easily

Comparison of DNA and conventional computers

(from Gramss et al., 1998)

Sources of error in DNA computation

Algorithm can fail if:

1. A good strand is damaged or lost

2. A bad strand is not removed and many are left at the end

3. Every operation (e.g., extraction) can introduce an error. Extraction is not perfect (usually 95% of the strands match the desired pattern). In addition, strands that do not match will sometimes be removed anyway (rates typically 1 part in 10^6).

4. DNA has a half-life, and decays at a finite rate. If an algorithm takes months, good solutions will dissolve away.

5. DNA replication is naturally prone to error. It is usually necessary to eliminate (or reduce below some threshold) the errors associated with solutions (or steps toward solutions) in a problem. On the other hand, error can be good (e.g., exploration via, perhaps, mutation).

(from Omair Quraishi: http://pages.cpsc.ucalgary.ca/~jacob/Courses/Winter2003/CPSC601-73/Slides/05-DNA-Computing-Apps.pdf)
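Extraction errors compound over repeated steps. The sketch below assumes, as one reading of the 95% figure above, that each extraction retains a correct strand with probability 0.95 and that extractions are independent; both are modelling assumptions, and the function names are invented for the example.

```python
def survival_probability(extractions, retain_rate=0.95):
    """Chance that one correct strand survives every extraction step."""
    return retain_rate ** extractions


def expected_survivors(copies, extractions, retain_rate=0.95):
    """Expected number of correct strands left after all extractions."""
    return copies * survival_probability(extractions, retain_rate)


# One extraction per vertex for the 7-vertex example, then longer protocols.
for steps in (7, 20, 100):
    print(steps, round(survival_probability(steps), 4))
```

Under these assumptions about 70% of correct strands survive seven extractions, but well under 1% survive a hundred, which is why starting quantities and the number of steps both matter.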

Intractable problems have no polynomial-time algorithms.
P is the collection of all sets recognizable by deterministic Turing machines in polynomial time.
NP is the collection of all sets recognizable by non-deterministic Turing machines in polynomial time (NP = Non-deterministic Polynomial).
P ⊆ NP, but is P = NP or P ≠ NP?
NP-complete problems are a subset of the problems in NP such that, if a polynomial-time solution is ever found for any one of them, then NP would be equal to P.
Note: even our definitions of complexity are based on the concept of sequential processing!

I. van Rooij/Cognitive Science 32 (2008)

Computational Complexity

And Finally…

A key aspect of DNA computing is to keep the number of biochemical steps linear with respect to the size of the problem.
Adleman’s solution: produce all possible guesses (hopefully) and check them in polynomial time using a massively parallel check.
Note that, for large problem spaces, all possible solutions cannot be formed with certainty. Furthermore, even if they could, since the number of possible candidates grows exponentially (or factorially) for NP problems, there are practical ramifications to be considered.
For example, consider the number of strands of DNA that would be required to encode a graph of 200 nodes. The amount of DNA required to solve the Hamiltonian path problem via Adleman’s method has been computed by Hartmanis at 3 x 10^25 kg (the mass of the Earth is only 6 x 10^24 kg).
Re-examine Adleman’s algorithm and consider it from the standpoint of a mathematical proof…
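The factorial growth mentioned above is easy to make concrete. The sketch below counts the vertex orderings with fixed start and end vertices, which is the number of candidate Hamiltonian paths in a complete directed graph and so an upper bound for any graph of that size. It does not reproduce the Hartmanis mass estimate, whose assumptions are not given here; the function name is made up for the example.

```python
import math


def candidate_orderings(n):
    """Orderings of n vertices with the first and last vertex fixed: (n - 2)!"""
    return math.factorial(n - 2)


for n in (7, 20, 50, 200):
    count = candidate_orderings(n)
    print(f"n = {n:3d}: about 10^{len(str(count)) - 1} candidate orderings")
```

For the 7-vertex example there are only 120 orderings to cover, whereas at 200 vertices the count has hundreds of digits, so no physical quantity of DNA could represent every candidate.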

DNA Self-Assembly

http://en.wikipedia.org/wiki/DNA

http://www.blc.arizona.edu/Molecular_Graphics/DNA_Structure/DNA_Tutorial.HTML

http://www.mun.ca/biology/scarr/Base_pairing_rules.html

DNA Self-Assembly

Self-Assembly: The Science of Things That Put Themselves Together (Pelesko)

DNA Self-Assembly

http://en.wikipedia.org/wiki/DNA

Holliday Junction


http://seemanlab4.chem.nyu.edu/cross.html

Double Cross-Over Molecules

Improve rigidity of DNA structures…

http://www.rsc.org/delivery/_ArticleLinking/DisplayArticleForFree.cfm?doi=b605208h&JournalCode=OB

Self-Assembled Nano-structures

“Combinatorial self-assembly of DNA nanostructures,” Lund, Liu, Yan, Org. Biomol. Chem., 2006, 4, 3402–3403

Fig. 2 AFM images. A. Square. B. Chair. C. Line. The image sizes are 45 nm × 45 nm for A, 80 nm × 80 nm for B, and 95 nm × 108 nm for C.

Fig. 1 A. The two kinds of tiles used in construction. i. A cross-shaped tile acts as a 4-way junction. ii. A Holliday junction tile acts as a 2-way linker. B. A square from four cross-shaped tiles (T1–T4, in light purple) and four linker tiles (L1–L4, in dark purple). The unique sticky ends are labeled as short segments in different colors. C. A chair from four cross-shaped tiles and three linker tiles (L2, L3 and L5). D. A short line from four cross-shaped tiles and three linker tiles (L1, L2 and L6).

DNA Tiles


Parents:  1 1 1 1 0 0 0 0
          1 1 0 0 1 1 0 0
          1 0 1 0 1 0 1 0
          ---------------
Children: 0 1 1 1 1 1 1 0

1=white; 0=black

Sierpinski Gasket realized via a 1D cellular automaton evolved over time
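The parent/child table above can be run directly as a lookup table. In the sketch below each column of the table (three parent cells read top to bottom as left, centre, right) maps to its child value; treating cells beyond the row ends as 0, starting from a single 1, and printing 1 as '#' are assumptions made for the example.

```python
# Each three-cell parent neighbourhood mapped to its child, column by column.
RULE = dict(zip(
    ["111", "110", "101", "100", "011", "010", "001", "000"],
    "01111110",
))


def step(cells):
    """Compute the next generation of a row given as a string of '0'/'1'."""
    padded = "0" + cells + "0"   # cells outside the row are treated as 0
    return "".join(RULE[padded[i - 1:i + 2]] for i in range(1, len(padded) - 1))


def evolve(width=31, generations=16):
    """Start from a single 1 in the middle and return all generations."""
    row = "0" * (width // 2) + "1" + "0" * (width // 2)
    rows = [row]
    for _ in range(generations - 1):
        row = step(row)
        rows.append(row)
    return rows


for row in evolve():
    print(row.replace("1", "#").replace("0", "."))
```

Printed one generation per line, the rows trace out the nested-triangle pattern of the Sierpinski gasket.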

DNA Tiles

Evolution of a 1D cellular automaton implemented as an XOR operator
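The XOR version can be sketched in the same way. The slide does not state the exact neighbourhood, so the example below assumes each new cell is the XOR of its left and right neighbours in the previous row, with cells outside the row treated as 0; `xor_step` is a name invented for the example.

```python
def xor_step(row):
    """Next row: each cell is the XOR of its two neighbours in the row above."""
    padded = [0] + row + [0]   # cells outside the row are treated as 0
    return [padded[i - 1] ^ padded[i + 1] for i in range(1, len(padded) - 1)]


width = 31
row = [0] * width
row[width // 2] = 1            # a single seed cell in the middle

for _ in range(16):
    print("".join("#" if cell else "." for cell in row))
    row = xor_step(row)
```

This is the same computation a tile performs in algorithmic self-assembly: it reads two input bits from the layer below and presents their XOR to the layer above.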


DNA Tiles (actually…)

Evolution of a 1D cellular automaton implemented as an XOR operator


“Algorithmic self-assembly of DNA Sierpinski triangles” (Rothemund, Papadakis, and Winfree)

Two tile versions


DNA Tiles

“Synthesis of crystals with a programmable kinetic barrier to nucleation,” Rebecca Schulman and Erik Winfree, PNAS, September 25, 2007, vol. 104, no. 39
http://www.pnas.org/content/104/39/15236.full.pdf+html

DNA Barcodes

DNA hairpin loop


DNA Barcodes

“Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices,” Yan, LaBean, Feng, Reif, PNAS, July 8, 2003, vol. 100, no. 14, 8103–8108

http://www.cs.duke.edu/~thl/papers/Barcode.PNAS.pdf

Fig. 1. Self-assembly of 01101 barcode lattice around scaffold DNA strand.
(a) (Upper) DAE tile, one type of antiparallel DNA DX tile. The tile drawing shows the five strands (three black and two red). The two red strands are continuous strands going through the tile in opposite directions (arrowheads mark 3′ ends). There are two crossover points connecting the two domains. There are two helical turns between the two crossover points. (Lower) DAE 2J tile. This tile type has two hairpin loops protruding out of the central helix region of the DAE complex; one loop (thick line) is coming out of the plane and the other (thinner line) into the plane. The hairpin loops serve as topographic markers in AFM imaging of the lattices.
(b) Schematic of self-assembly of barcode lattice layers based on DAE tiles around a scaffold strand. (Left) A five-tile crenellated horizontal layer is shown with an input scaffold strand running through the layer (red). The scaffold strand is required for the tiles to assemble. (Right) A lattice of four layers is illustrated (note that sticky ends are still available on the upper and lower layers for addition of more layers). The sticky ends are represented by different colored pads matching one another. The barcode information (01101) is represented by either the presence (designated 1) or the absence (designated 0) of a stem loop (shown as a black circle) protruding out of the tile plane.
(c) Strand structure of one barcode layer. This layer represents barcode information of 01101. The red strand is the scaffold strand required for the tile assembly. The distance between adjacent hairpin loops is indicated by the number of helical turns.
(d) AFM visualization of DNA barcode lattice (01101). The scale of each image is indicated in its lower right corner. Up to 24 layers of DNA have been self-assembled; the desired stripe pattern is clearly visible. Each layer contains five DX tiles and is 75 nm wide. The distance between the two closer adjacent stripes is 16 nm. The distance between the two further adjacent stripes is 31 nm. See Fig. 9, which is published as supporting information on the PNAS web site, for a large-area scan AFM image.


Fig. 2. Inverse pattern barcode lattice (10010).
(a) Schematic drawing of the barcode lattice representing the bit sequence 10010, which is the inverse of the first barcode of 01101. A single layer is shown (Left), and a four-layer lattice fragment is given (Right).
(b) Strand structure of one barcode layer that represents barcode information of 01101. The red strand is the scaffold strand required for the tile assembly. The distance between adjacent hairpins is indicated in helical turns.
(c) An AFM image at a scale of 400 × 400 nm. Each layer contains five DX tiles, 75 nm wide, and the distance between the two stripes (designated 1) is 45 nm, as expected. See Fig. 9 for a large-area scan AFM image.
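The stripe spacings quoted in the two captions follow from the bit patterns and the tile width. The sketch below assumes the five tiles share the 75 nm layer width equally (15 nm each) and that a stripe sits at the same position within every tile carrying a hairpin; the function names are made up for the example.

```python
TILE_WIDTH_NM = 75 / 5   # five DX tiles per 75 nm layer, from the captions


def invert(bits):
    """Swap hairpin presence (1) and absence (0) at every tile position."""
    return "".join("1" if b == "0" else "0" for b in bits)


def stripe_gaps(bits):
    """Distances in nm between neighbouring hairpin stripes in one layer."""
    positions = [i for i, b in enumerate(bits) if b == "1"]
    return [(b - a) * TILE_WIDTH_NM for a, b in zip(positions, positions[1:])]


for pattern in ("01101", invert("01101")):
    print(pattern, stripe_gaps(pattern))
```

The idealized gaps of 15 nm and 30 nm for 01101 sit close to the measured 16 nm and 31 nm, and the single 45 nm gap for 10010 matches the value reported for the inverse lattice.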

DNA Origami

http://www.dna.caltech.edu/~pwkr/

Figure 1 Design of DNA origami. a, A shape (red) approximated by parallel double helices joined by periodic crossovers (blue). b, A scaffold (black) runs through every helix and forms more crossovers (red). c, As first designed, most staples bind two helices and are 16-mers. d, Similar to c with strands drawn as helices. Red triangles point to scaffold crossovers, black triangles to periodic crossovers with minor grooves on the top face of the shape, blue triangles to periodic crossovers with minor grooves on bottom. Cross-sections of crossovers (1, 2, viewed from left) indicate backbone positions with coloured lines, and major/minor grooves by large/small angles between them. Arrows in c point to nicks sealed to create green strands in d. Yellow diamonds in c and d indicate a position at which staples may be cut and resealed to bridge the seam. e, A finished design after merges and rearrangements along the seam. Most staples are 32-mers spanning three helices. Insets show a dumbbell hairpin (d) and a 4-T loop (e), modifications used in Fig. 3.

“Folding DNA to create nanoscale shapes and patterns,” Rothemund, Nature, Vol 440, 16 March 2006 http://www.dna.caltech.edu/Papers/DNAorigami-nature.pdf


http://www.nyu.edu/public.affairs/videos/qtime/biped_movie.mov

Autonomous DNA Walker(Seeman et al.)

http://www.nature.com/nchem/journal/v3/n2/fig_tab/nchem.957_F6.html

Looking Ahead…

http://www.niac.usra.edu/files/studies/final_report/806Mavroidis.pdf


FIGURE 3: 0-10 Years: Understanding of basic biological components and controlling their functions as robotic components. From left we have: DNA which will be used in a variety of ways such as a structural element and a power source; hemagglutinin virus used as a VPL motor; bacteriorhodopsin as a sensor and power source.

FIGURE 4: 10-20 Years: The biological elements once in study will now be used to fabricate robotic systems. A vision of a nano-organism: carbon nanotubes form the main body; peptide limbs can be used for locomotion and object manipulation; a biomolecular motor located at the head can propel the device in various environments.

Nanotechnology

Nanotechnology Roadmap
http://www.foresight.org/roadmaps/Nanotech_Roadmap_2007_main.pdf

10^24    yotta-
10^21    zetta-
10^18    exa-
10^15    peta-
10^12    tera-
10^9     giga-
10^6     mega-
10^3     kilo-
10^2     hecto-
10^1     deca-
10^0
10^-1    deci-
10^-2    centi-
10^-3    milli-
10^-6    micro-
10^-9    nano-
10^-12   pico-
10^-15   femto-
10^-18   atto-
10^-21   zepto-
10^-24   yocto-
