Molecular Computations Using Self-Assembled DNA Nanostructures and Autonomous Motors John H. Reif...

Post on 31-Dec-2015

213 views 0 download

Tags:

Transcript of Molecular Computations Using Self-Assembled DNA Nanostructures and Autonomous Motors John H. Reif...

Molecular Computations Using Self-Assembled DNA Nanostructures and Autonomous Motors

John H. Reif

Department of Computer Science, Duke University

1

DNA Nanostructures: DNA tiles: composed of a few strands of DNA that self-assemble

(via DNA annealing) into a roughly rectangular shape.

TX tiles: 3 double stranded DNA with Holiday junctions

Approx 20 Angstroms wide

TAO35 TAE 44

Strand topology traces of TX tiles.Strand topology traces of TX tiles. •‘‘T’ denotes three DNA helicesT’ denotes three DNA helices•‘‘A’ denotes anti-parallel crossovers A’ denotes anti-parallel crossovers (strand changes direction of propagation at crossover points)(strand changes direction of propagation at crossover points)•‘‘O’ and ‘E’ are odd and even, respectively:O’ and ‘E’ are odd and even, respectively: refer to the number of helical half-turns between adjacent crossover points.refer to the number of helical half-turns between adjacent crossover points.

GTTCAGCCTTAGT CCACAGTCACGGATGG ACTCGATAGCCAA

CAAGTCGGAATCA GGTGTCAGTGCCTACC TGAGCTATCGGTT

T

T

T

T

T

T

T

T

TCTGG ACTCC TGGCATCTCATTCGCA GGACA GGTAG

AGACC TGAGG ACCGTAGAGTAAGCGT CCTGT CCATC

CATCTCGT CCTTGCGTTTCGCCAATCCAGAAGCC TGCGAGCA

GTAGAGCA GGAACGCAAAGCGGTTAGGTCTTCGG ACGCTCGT

2

2

1

3

4 1

4

3

DNA TX tile Nanostructures: • 3 double stranded DNA with Holiday junctions• Approx 20 Angstroms wide

DNA Hybridization and Ligation Operations.

Hybridization of sticky single-strand DNA segments. Ligation: If the sticky single-strand segments that anneal abut

doubly stranded segments of DNA, you can use an enzymic reaction known as ligation to concatenate these segments.

Unique Sticky Ends on DNA tiles. Input layers can be assembled via unique sticky-ends at each tile joint thereby

requiring one tile type for each position in the input layer.

Tiling self-assembly: proceeds by the selective hybridization of the pads of distinct tiles, which allows tiles to compose together to form a controlled tiling lattice (these pads determine the form of the tiling that self-assembles).

Technical Challenge: Optimal Design of strands composing

tiles.

Atomic Force Microscope Image

Bands Generated by B* Tileswith Attached Beads

2D DNA Self-Assembled Tilings:Rendering Simple Banded Images

B* Tiles with Loops

Major Goals of Duke DNA Nanotechnology (Reif’s Group)

1) Patterned DNA Arrays.• Set of specific tiles which form patterns.• Assembly around scaffold strands.• Molecular fabric.

2) Computation via DNA Self-Assembly• Reporter strand output (requires ligation).• Microscopic readout (via AFM, TEM, SEM, etc.).

3) Applications of DNA-Based Assemblies.• Molecular and nano-scale electronics.• Molecular motors and actuators.

4) Software for Design and Simulation of DNA Assemblies.

Background Literature on DNA Self-Assembled Tiling Lattices.

First Paper:

[Winfree and Seeman,98] The first experimental demonstration of self-assembly of DNA to construct 2D lattices consisting of up to a hundred thousand DNA tiles.

Recent Review papers:

• [Reif, LaBean, Seeman, 2000] Challenges and Applications for Self-Assembled DNA Nanostructures.

• [J. H. Reif, IEEE Computer and Scientific Engineering Magazine, 2002] DNA Lattices: A Method for Molecular Scale Patterning and Computation.

• [LaBean, Yan, Park, Feng, Yin, Li, Ahn, Liu, Guan, and Reif, Information Sciences, 2004] Overview of New Structures for DNA-Based Nanofabrication and Computation.

Reif Group’s Papers on DNA Self-Assembled Tiling Lattices.

• [LaBean, Winfree, Reif, & Seeman, J. Am. Chem. Soc. 2000] Improved DNA nanostructures: TX tiles that have multiple DNA strands running them.

• [Mao, LaBean, Reif, Seeman, Nature 2000] Experimentally demonstrated for the first time a computation using self-assembled DNA lattices of TX tiles that self-assembled around input strands running through the tiles.

• [Yan, Feng, LaBean, and Reif, JACS, 2003] DNA Nanotubes, Parallel Molecular Computation of Pair-Wise XOR Using DNA String Tile.

• [Yan, LaBean, Feng, and Reif, PNAS, 2003] Directed Nucleation Assembly of Barcode Patterned DNA Lattices.

• [Yan, Park, Finkelstein, Reif, & LaBean, Science, 2003] DNA-Templated Self-Assembly of Protein Arrays and Highly Conductive Nanowires.

• [Feng, Park, Reif, and Yan, Angewandte Chemie, 2003] A Two State DNA Lattice Actuated by DNA Motors.

• [Li, Park, Reif, LaBean, Yan, JACS 2004] DNA Templated Self-Assembly of Protein and Nanoparticle Linear Arrays.

• [Liu, Reif, LaBean, PNAS 2004] Improved DNA nanostructures: DNA nanotubes self-assembled from triple-crossover tiles as templates for conductive nanowires.

Molecular Pattern Formation using Scaffold Strands for Directed Nucleation:

• Multiple tiles of an input layer can be assembled around a single, long DNA strand we refer to as a scaffold strand (shown as black lines in the figures).

A

B

C

o Examples of Arrangements of Scaffold Strands : – (A) Diagonal TAO layer which partially defines binding slots for tiles of the next successive layer. – (B) Horizonal layer of alternating TAE and DAE tiles.– (C) crenellated horizontal layer which could be comprised of TAE or DAE tiles.

Structures in B and C completely define binding slot for tiles on next layers.

Directed Nucleation Assembly of Barcode Patterned DNA Lattices (PNAS 2003)Hao Yan, Thomas H. LaBean, Liping Feng, John H. Reif

Aperiodic patterned DNA lattice (Barcode Lattice) by directed nucleation self-assembly of DNA tiles around a scaffold DNA strand.

A first step toward implementation of a visual readout system capable of converting information encoded on a 1-D DNA strand into a 2-D form readable by advanced microscopic techniques.

TileSoft: Sequence Optimization Software for Designing DNA Secondary Structures

P. Yin*, B. Guo*, C. Belmore*, W. Palmeri*, E. Winfree†, T. H. LaBean* and J. H. Reif*

* Department of Computer Science, Duke University† Department of Computer Science, Caltech

12

DNA is a crucial construction material for molecular scale objects with nano-scale features. Diverse synthetic DNA objects hold great potential for applications such as nano-fabrication, nano-robotics, nano-computing, and nano-electronics. The construction of DNA objects is generally carried out via self-assembly. During self-assembly, DNA strands are guided by their sequence information into secondary structures to maximize Watson-Crick pairing of their bases and thus minimize the free energy of the resultant structures. A crucial computational problem in constructing DNA objects is the design of DNA sequences that can correctly assemble into desired DNA secondary structures. However, existing software packages only provide unintuitive text-line interfaces and generally require the user to step through the entire sequence selection process, which could be time-consuming and tedious. In contrast, TileSoft described here deliver the following features: • Its graphical user interface renders the molecular architect the ability to define DNA

secondary structure and accompanying designing constraints directly on the interface as well as the ability to view the optimized sequence information pictorially.

• Its fully automatic optimization module relieves the user of the drudgery of manually dictating the sequence selection process, and its evolutionary algorithm produces satisfactory results efficiently.

• Its graphical user interface and its optimization module are smoothly integrated from user's perspective, while they are at the same time well separated in terms of software architecture, making each amenable to future improvements without negatively affecting the other.

Abstract

13

Motivation: Designing DNA tiles

14

DX lattice Rhombus lattice TX lattice 4x 4 lattice Barcode lattice

• DNA as nano-construction material

• Self-assembly as bottom-up nano-construction method

• DNA lattices made of DNA tiles, i.e. smaller DNA secondary nanostructure units

• Design process:1. Template design

2. Sequence selection (optimization)Tile template to be optimized

Prior work v.s. TileSoft

15

• DNA word design software:• Produce a pool of DNA sequences such that each sequence is of maximal difference from

others

• Sequin:• Generate DNA sequences that uniquely assemble into desired secondary structures

• Text line interface for inputting template and displaying optimized sequences

• Sequence optimization process only semi-automated

• TileSoft: Available at http://www.cs.duke.edu/~py/dnaTileSoft/. • Generate DNA sequences that uniquely assemble into desired secondary structures

• Graphical interface for inputting template and displaying optimized structure (GUI Module)

• Sequence optimization process fully automated (Optimization Module)

• GUI Module and Optimization Module smoothly integrated for end users, yet well separated in software architecture

GUI: Default window

16

GUI: Define Crossover

17

• The user can define crossovers between helices, by clicking sequentially the two bases to be connected in 5' to 3' order.

GUI: Set 5’ end; set 3’ end

18

Set 5’ end Set 3’ end

• By setting the 5' end and 3' end of a DNA strand, the user specifies the length of the strand, and the unused segment of the strand is deleted automatically (showed in color gray).

GUI: Edit base

19

• The user can directly input the base values for a strand by typing; Typing more than one character edits consecutive bases in the 5' to 3' direction along the strand

GUI: Set non-Waston-Crick base pairing

20

• The user can define the subsequences that are not required to be Watson-Crick base paired by clicking on the starting and ending bases of the subsequences.

GUI: Set and show EQ constraint

21

Set EQ constraint Show EQ constraint

• Set EQ constraint: Clicking on two bases in one strand defines the starting and ending points of the first sub-sequence, and a click on another base delineates the second sub-sequence with the same length and direction as the first one.

• Show EQ Constraint: A small window is brought up that contains multiple buttons, with each representing a set of equal sub-sequences. When one of these buttons is clicked, the corresponding sub-sequences will be highlighted in purple.

• The optimization module employs an evolutionary algorithm to find the best solution to the optimization objective function.

• Evolutionary algorithm:• An evolutionary algorithm maintains a population of DNA sequences, which are

generated randomly during initialization. During selection, the fittest DNA sequences are chosen for reproduction, based on their score according to the objective function. These individuals are used to generate new individuals via mutations and crossovers, and the newly produced individuals are reinserted into the population. The process is repeated until meeting some termination condition.

• Objective function:• The objective function consists of two weighted factors, the count of unwanted

complementary sequences spurious matches (as the sequence-symmetry minimization algorithm used in Sequin) and the count of to-be-avoided sub-sequences, e.g. long AT runs.

Optimization module

22

• GUI:• Geometrically more flexible structures• Make the number of helices and number of bases per helix user specifiable

• Optimization:• Multiple parallel traces of optimization process with different starting points • A pre-optimized library• Incorporate parameters such as hybridization temperature and software modules

such as BIND• A new heuristic that performs optimization based on existing pre-optimized duplex

libraries

• Other:• Curvature analyzer (S. H. Park, Duke Physics )• Make the software more robust

Future work

23

4 x 4 Tile

A Lattice of 4 x 4 Tiles

QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.

Uncorrugated 4 x 4 tileUncorrugated 4 x 4 tile

QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.

Software for Designing and Simulating

DNA Nanorobotical Devices

John Reif, Sudheer Sahu, Peng Yin

Department of Computer Science, Duke University

27

DNA-Based Nano-Engineering: DNA and its Enzymes as the Engines of Creation at the Molecular Scale

John H. Reif

Department of Computer Science, Duke University

28

Motivation

29

DNA based nanorobotics devices

(Mao et al 99) (Yurke et al 00) (Simmel et al 01) (Simmel et al 02)

(Yan et al 02) (Li et al 02) (Alberti et al 03) (Feng et al 03)

Rotation

Rotation

Open/close Open/close Open/close

Extension/contraction Extension/contraction Extension/contraction

MotivationMotivation-Device I-Device II-Device III-Conclusion

Motivation

30

DNA nanorobotics

(R. Cross Lab)

Kinesin

Synthetic unidirectional DNA walker that moves autonomously

along a linear route over a macroscopic structure ?

(Recent work: non-autonomous DNA walking device by Seeman’s group,

autonomous DNA tweezer by Mao’s group)

Rotation, open/close

extension/contraction

mediated by

environmental changes

Autonomous, unidirectional motion along an extended linear trackAutonomous, unidirectional motion along an extended linear track

MotivationMotivation-Device I-Device II-Device III-Conclusion

Structural Components

• Design Part

• Sequence Optimization Part

• Simulation Part

31

I/O Specification for Design

• Input: – Symbolic relation between various parts of

DNA sequences

• Output:– Suitable restriction enzymes, sequences and

optimal values of experimental conditions

Design Sequence Optimization Simulation

32

Design Part

• 3-parameter model for restriction enzymes– r,d,l

Design Sequence Optimization Simulation

33

Design Part (continued….)

• All restriction enzymes can be expressed via these parameters.

• The cleaving actions on the DNA strands in the device are translated into these parameters, and as output we get a list of all restriction enzymes that can be used for the purpose.

Design Sequence Optimization Simulation

34

Example

• r=6, d=16, l=14

• Restriction enzymes that can be used– Acu I– Bpm I– BpuE I– Bsg I

Design Sequence Optimization Simulation

35

I/O for Sequence Optimization

• Input– Topologies of system in various conformations– Template sequences and constraints on them

• Output– Exact optimized sequences for these template

sequences

Design Sequence Optimization Simulation

36

Procedure

• Find out the constraints for WC matches, for the given conformation.

• Assign bases randomly to degenerate bases.

• Calculate SCORE on the basis of:

– Count of bad oligos like GGG, GGGG, TTTT, AAAA

– Counts of complementary match regions (with and without one mismatch) at intra-molecular, inter-molecular intra-complex and inter-complex regions.

• The goal is to find an optimal structure that minimizes the score.

Design Sequence Optimization Simulation

37

Issues

• We can mutate one base at a time and evaluate the new structure.

– Local minima.

• An evolutionary algorithm can be used for that with a fitness function similar to the score function defined earlier.

Design Sequence Optimization Simulation

38

Dynamic case

• Number of conformations small

– Easy to extend this method (just add few more constraints)

• Number of conformations large

– Difficult problem

– Consider one (or more) basic conformation(s) and the local changes in it (them) and apply the method on this group as a whole

• C0 is initial conformation

Ci is local change in conformation in going from Ci-1 to Ci

• Consider the group C0,C1,C2,…,Cn for the optimization as a whole.

Design Sequence Optimization Simulation

39

I/O for Simulation

• Input– Initial conformation of the system– Various external factors like

• temperature• Na+, Mg++ concentrations

• Output– Graphical display of the simulation

Design Sequence Optimization Simulation

40

Simulation Software

• DNA molecules represented as – A string representing the sequence of 1st strand from 5’

to 3’ end.

– Another string representing the sequence of 2nd strand from 3’ to 5’ end.

– Offset between the two sequences.

• With each DNA molecule we associate the numerical value of its concentration.

Design Sequence Optimization Simulation

41

Assumptions

• Discrete time events.• Restriction requires an exact match of the

recognition sites.• For simplification, we assume that ligation

requires an exact compliment of the sticky end.

Design Sequence Optimization Simulation

42

Ligation Events

• For every DNA molecule, calculate the probability of it ligating with any other molecule, based upon the concentration of the two molecules (only if they have complementary sticky ends).

• Allow that ligation to occur with that probability.

Design Sequence Optimization Simulation

43

Restriction Events

• In every molecule, find out if there is any restriction site, then depending upon the activity of the restriction enzyme, find the probability of that restriction.

• Allow that restriction to occur with that probability.

Design Sequence Optimization Simulation

44

Simulation Steps

1. Start with the initial configuration of the system.

2. Calculate the probability that next event will be a ligation or a restriction.

3. Calculate the probabilities of various ligations and restrictions.

4. Perform those events with the calculated probabilities.

5. Update the configuration of the system and the concentrations of the molecules and goto step 2.

45

Design Sequence Optimization Simulation

Example Design of DNA Walker

46

B C D A

Track

AnchorageA

Walker*

LigasePflM I

BstAP I

Restriction enzymes

Motivation-Device I-Device II-Device IIIDevice III-Conclusion

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

DNA Walker: Operation

47

B C D A

Track

AnchorageA

Walker*

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

A*BAC D

DNA Walker: Operation

48

Ligase

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

A*BAC D

DNA Walker: Operation

49

Ligase

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

A*BAC D

DNA Walker: Operation

50

PflM I

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

B*

A C D A

DNA Walker: Operation

51

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

B*CA

AD

DNA Walker: Operation

52

Ligase

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

B*CA

AD

DNA Walker: Operation

53

Ligase

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

B*CA

AD

DNA Walker: Operation

54

BstAP I

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

C*

A B D A

DNA Walker: Operation

55

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

D*ACA B

DNA Walker: Operation

56

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

C*D AA B

DNA Walker: Operation

57

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

D*

A B C A

DNA Walker: Operation

58

• Valid hybridization:

A* + B = A + B* => A*B B* + C = B + C* => B*C

C* + D = C + D* => C*D D* + A = D + A* => D*A• Valid cut:

A*B => A + B* B*C => B + C*

C*D => C + D* D*A => D + A*

A*

A B C D

DNA Walker: Operation

59

60

DNA Walker: Experimental Design

61

Autonomous Motion of the Walker

DNA Turing Machine: Structure

62

Turing machine

Transitional rules: Rule molecules

Turing head: Head molecules

Data tape: Symbol molecules

Autonomous universal DNA Turing machine: 2 states, 5 colorsAutonomous universal DNA Turing machine: 2 states, 5 colors