DNA Structure and Manipulation 58
CHAPTER 3
DNA STRUCTURE AND MANIPULATION
Ever since ancient Greek times, man has suspected that the features of one generation
are passed on to the next. It was not until Mendel's work on garden peas was
recognised (see [38, 75]) that scientists accepted that both parents contribute material
that determines the characteristics of their offspring. In the early 20th century, it was
discovered that chromosomes make up this material. Chemical analysis of
chromosomes revealed that they are composed of both protein and deoxyribonucleic
acid, or DNA. The question was, which substance carries the genetic information? For
many years, scientists favoured protein, because of its greater complexity relative to
that of DNA. Nobody believed that a molecule as simple as DNA, composed of only
four subunits (compared to 20 for protein) could carry complex genetic information.
It was not until the early 1950s that most biologists accepted the evidence showing
that it is in fact DNA that carries the genetic code. However, the physical structure of
the molecule and the hereditary mechanism were still far from clear. In 1951, the
biologist James Watson moved to Cambridge to work with a physicist, Francis Crick.
Using data collected by Rosalind Franklin and Maurice Wilkins at King's College,
London, they began to decipher the structure of DNA. They worked with models
made out of wire and sheet metal in an attempt to construct something that fitted the
available data. Once satisfied with their model, they published the paper [78] (also see
[77]) that would eventually earn them (and Wilkins) the Nobel Prize for Physiology
or Medicine in 1962.
3.1 THE STRUCTURE AND MANIPULATION OF DNA
DNA (deoxyribonucleic acid) [1, 76] encodes the genetic information of cellular
organisms. It consists of polymer chains, commonly referred to as DNA strands. Each
strand may be viewed as a chain of nucleotides, or bases, attached to a sugar phosphate
"backbone". An n-letter sequence of consecutive bases is known as an oligonucleotide
of length n. The four DNA nucleotides are adenine, guanine, cytosine and thymine,
commonly abbreviated to A, G, C and T respectively. Each strand has, according to chemical
convention, a 5' and a 3' end, thus any single strand has a natural orientation. This
orientation (and, therefore, the notation used) is due to the fact that one end of the single
strand has a free (i.e., unattached to another nucleotide) 5' phosphate group, and the
other has a free 3' deoxyribose hydroxyl group. The classical double helix of DNA is
formed when two separate strands bond. Bonding occurs by the pairwise attraction of
bases; A bonds with T and G bonds with C. The pairs (A,T) and (G,C) are therefore
known as complementary base pairs. The bases in each pair are joined by hydrogen
bonds: two between A and T, and three between G and C.
In what follows we adopt the following convention: if x denotes an oligonucleotide,
then x̄ denotes the complement of x. The bonding process, known as annealing, is
fundamental to our implementation. A strand will only anneal to its complement if
they have opposite polarities. Therefore, one strand of the double helix extends from
5' to 3', and the other from 3' to 5'.
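The complement convention and the opposite-polarity rule can be illustrated with a minimal Python sketch. The function names are hypothetical, and strands are modelled as plain strings written 5' to 3'; partial overlaps and mismatches are ignored.

```python
# Watson-Crick pairing: A with T, G with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Base-by-base complement, kept in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Complement re-read in the 5'->3' direction (opposite polarity)."""
    return complement(strand)[::-1]

def can_anneal(s1: str, s2: str) -> bool:
    """True if two strands, both written 5'->3', are fully complementary
    when aligned with opposite polarities (simplifying assumption: equal
    length and no mismatches)."""
    return s2 == reverse_complement(s1)
```

For example, `can_anneal("AAGC", "GCTT")` holds, whereas `can_anneal("AAGC", "TTCG")` does not, because the second strand has the right bases but the wrong polarity.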
3.2 OPERATIONS ON DNA
The main idea behind DNA computing is to adopt a biological (wet) technique as an
efficient computing vehicle, where data are represented using strands of DNA. Even
though a DNA reaction is much slower than the cycle time of a silicon-based
computer, the inherently parallel processing offered by the DNA process plays an
important role. This massive parallelism of DNA processing is of particular interest in
solving NP-complete or NP-hard problems. It is not uncommon to encounter
molecular biological experiments which involve 6 × 10^16 DNA molecules per ml.
This means that we can effectively realize 60,000 TeraBytes of memory, assuming
that each string of a DNA molecule expresses one character. The total execution
speed of a DNA computer can outshine that of a conventional electronic computer,
even though the execution time of a single DNA molecule reaction is relatively slow.
A DNA computer is thus suited to problems such as the analysis of genome
information, and the functional design of molecules (where molecules constitute the
input data).
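The memory figure quoted above can be checked with a short calculation. This is only a sketch of the arithmetic: it assumes one byte per character, decimal terabytes (10^12 bytes) and a one-millilitre sample, none of which is stated explicitly in the text.

```python
def capacity_terabytes(molecules: int, bytes_per_molecule: int = 1) -> int:
    """Storage in decimal terabytes if every molecule holds one character."""
    total_bytes = molecules * bytes_per_molecule
    return total_bytes // 10**12  # 1 TB taken as 10^12 bytes (assumption)

molecules_in_one_ml = 6 * 10**16
terabytes = capacity_terabytes(molecules_in_one_ml)
```

Under these assumptions `terabytes` evaluates to 60000, in agreement with the 60,000 TeraBytes stated in the text.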
DNA is built from four bases, named adenine (A), guanine (G),
cytosine (C) and thymine (T). Moreover, constraints apply to connections between
these bases: more specifically, A can connect only with T, and G only with C. This
connecting rule is referred to as 'Watson-Crick complementarity'. This property is
essential to realize the separation operation. In other words, it is possible to separate
out strands containing a partial string of characters, say 'ad': a DNA sequence
complementary to the DNA denoting 'ad' is marked and added to the test tube, where it
hybridizes with matching strands to form double-stranded DNA, which is then extracted.
Further, this property enables us to randomly create a
set of character strings according to some rule.
Since [1] described a method for solving a directed Hamiltonian path problem with 7
cities using DNA molecules, researchers have pursued theoretical studies to realize
general computation using DNA molecules (for example, [23]). [2] has developed a
computational model to realize – via experimental treatment of DNA molecules –
operations on multiple sets of character strings, following the encoding of finite
alphabet characters onto DNA molecules.
As previously mentioned, DNA molecules can be used as information storage media.
Usually, DNA sequences of around 8-20 base-pairs are used to represent bits, and
numerous methods have been developed to manipulate and evaluate these. In order to
manipulate a wet technology to perform computations, one or more of the following
techniques are used as computational operators for copying, sorting, splitting or
concatenating the information contained within DNA molecules:
ligation,
hybridization,
polymerase chain reaction (PCR),
gel electrophoresis, and
enzyme reaction.
In the following lines we briefly describe the specific biochemical processes. A DNA
computer performs wet computation based on the highly specific molecular
recognition that occurs in reactions among DNA molecules. Molecular computation was
first reported in [1], where it was observed that a DNA polymerase, an enzyme that
copies DNA, behaves in a way that is very similar to a Turing
machine. DNA polymerase builds the complementary DNA molecule using a
single strand of a DNA molecule as its template. Moreover, if a large
number of DNA molecules are mixed in a test tube, then reactions among them occur
simultaneously. Therefore, when DNA molecules representing data or code react
with other DNA molecules, this corresponds to massively parallel processing and/or a
huge amount of memory in comparison with a conventional (electronic) computer.
All models of DNA computation apply a specific sequence of biological operations to
a set of strands. These operations are commonly used by molecular biologists. Some
operations are specific to certain models of DNA computation.
3.2.1 Synthesis
Background
Deoxyribonucleic acid (DNA) synthesis is a process by which copies of nucleic acid
strands are made. In nature, DNA synthesis takes place in cells by a mechanism
known as DNA replication. Using genetic engineering and enzyme chemistry,
scientists have developed man-made methods for synthesizing DNA. The most
important of these is the polymerase chain reaction (PCR). First developed in the early
1980s, PCR has become a multi-billion dollar industry, with the original patent being
sold for $300 million.
History
The structure of DNA was worked out in the early 1950s by Francis Crick and James Watson.
Using X-ray crystallography data generated by Rosalind Franklin and Maurice Wilkins, they
determined that the structure of DNA was that of a double helix. For this work,
Watson, Crick, and Wilkins received the Nobel Prize in Physiology or Medicine in
1962. Over the years, scientists worked with DNA trying to figure out the "code of
life." They found that DNA served as the instruction code for protein sequences. They
also found that every organism has a unique DNA sequence and it could be used for
screening, diagnostic, and identification purposes. One thing that proved limiting in
these studies was the amount of DNA available from a single source.
After the nature of DNA was determined, scientists were able to examine the
composition of the cellular genes. A gene is a specific sequence of DNA base pairs
that provides the code for the construction of a protein. These proteins determine the
traits of an organism, such as eye color or blood type. When a certain gene was
isolated, it became desirable to synthesize copies of that molecule. One of the first
ways in which a large amount of a specific DNA was synthesized was through genetic
engineering.
Genetic engineering begins by combining a gene of interest with a bacterial plasmid.
A plasmid is a small stretch of DNA that is found in many bacteria. The resulting
hybrid DNA is called recombinant DNA. This new recombinant DNA plasmid is then
injected into bacterial cells. The cells are then cloned by allowing them to grow and
multiply in a culture. As the cells multiply so do copies of the inserted gene. When the
bacteria have multiplied enough, the multiple copies of the inserted gene can then be
isolated. This method of DNA synthesis can produce billions of copies of a gene in a
couple of weeks.
In 1983, the time required to produce copies of DNA was significantly reduced when
Kary Mullis developed a process for synthesizing DNA called polymerase chain
reaction (PCR). This method is much faster than previously known methods, producing
billions of copies of a DNA strand in just a few hours. It begins by putting a small
section of double stranded DNA in a solution containing DNA polymerase,
nucleotides and primers. The solution is heated to separate the DNA strands. When it
is cooled, the polymerase creates a copy of each strand. The process is repeated every
five minutes until the desired amount of DNA is produced. In 1993, Mullis's
development of PCR earned him the Nobel Prize in Chemistry. Today, PCR has
revolutionized the fields of medical diagnostics, forensics, and microbiology. It is said
to be one of the most important developments in genetic research.
The key to understanding DNA synthesis is understanding its structure. DNA is a long
chain polymer made up of chemical units called nucleotides. Also known as genetic
material, DNA is the molecule that carries information that dictates protein synthesis
in most living organisms. Typically, DNA exists as two chains of chemically linked
nucleotides. These links follow specific patterns dictated by the base pairing rules.
Each nucleotide is made up of a deoxyribose sugar molecule, a phosphate group, and
one of four nitrogen containing bases. The bases include the pyrimidines thymine (T)
and cytosine (C) and the purines adenine (A) and guanine (G). In DNA, adenine
generally links with thymine and guanine with cytosine. The molecule is arranged in a
structure called a double helix which can be imagined by picturing a twisted ladder or
spiral staircase. The bases make up the rungs of the ladder while the sugar and
phosphate portions make up the ladder sides. The order in which the nucleotides are
linked, called the sequence, is determined by a process known as DNA sequencing.
In a eukaryotic cell, DNA synthesis occurs just prior to cell division through a process
called replication. When replication begins the two strands of DNA are separated by a
variety of enzymes. Thus opened, each strand serves as a template for producing new
strands. This whole process is catalyzed by an enzyme called DNA polymerase. This
molecule brings corresponding, or complementary, nucleotides in line with each of
the DNA strands. The nucleotides are then chemically linked to form new DNA
strands which are exact copies of the original strand. These copies, called the daughter
strands, contain half of the parent DNA molecule and half of a whole new molecule.
Replication by this method is known as semiconservative replication. The process of
replication is important because it provides a method for cells to transfer an exact
duplicate of their genetic material from one generation of cell to the next.
Raw Materials
The primary raw materials used for DNA synthesis include DNA starting materials,
Taq DNA polymerase, primers, nucleotides, and the buffer solution. Each of these plays
an important role in the production of millions of DNA molecules.
Controlled DNA synthesis begins by identifying a small segment of DNA to copy.
This is typically a specific sequence of DNA that contains the code for a desired
protein. Called template DNA, this material is needed in amounts of about 0.1-1
micrograms. It must be highly purified because even trace amounts of the
compounds used in DNA purification can inhibit the PCR process. One method for
purifying a DNA strand is treating it with 70% ethanol.
While the process of DNA replication was known before 1980, PCR was not possible
because there were no known heat-stable DNA polymerases. DNA polymerase is the
enzyme that catalyzes the reactions involved in DNA synthesis. The solution came from
bacteria found living in natural hot springs. It turned out that these
organisms, called Thermus aquaticus, had a DNA polymerase that was stable and
functional at extreme levels of heat. This Taq DNA polymerase became the
cornerstone for modern DNA synthesis techniques. During a typical PCR process, 2-3
micrograms of Taq DNA polymerase is needed. If too much is used, however,
unwanted, nonspecific DNA sequences can result.
The polymerase builds the DNA strands by combining corresponding nucleotides on
each DNA strand. Chemically speaking, nucleotides are made up of three types of
molecular groups: a sugar structure, a phosphate group, and a cyclic base.
The sugar portion provides the primary structure for all nucleotides. In general, the
sugars are composed of five carbon atoms with a number of hydroxyl (-OH) groups
attached. For DNA, the sugar is 2-deoxy-D-ribose. The defining part of a nucleotide is
the heterocyclic base that is covalently bound to the sugar. These bases are either
pyrimidine or purine groups, and they form the basis for the nucleic acid code. Two
types of purine bases are found: adenine and guanine. In DNA, two types of
pyrimidine bases are present, thymine and cytosine. A phosphate group makes up the
final portion of a nucleotide. This group is derived from phosphoric acid and is
covalently bonded to the sugar structure on the fifth carbon.
To initiate DNA synthesis, short primer sections of DNA must be used. These primer
sections, called oligo fragments, are about 18-25 nucleotides in length and correspond
to a section on the template DNA. They typically have a C and G nucleotide
concentration of about 60% with even distribution. This provides the maximum
efficiency in the synthesis process.
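The primer guidelines above (18-25 nucleotides, C and G content of about 60%) can be turned into a simple check. The sketch below is illustrative only: the function names and the `tolerance` parameter are assumptions, and the "even distribution" requirement is not checked.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def check_primer(seq: str, min_len: int = 18, max_len: int = 25,
                 target_gc: float = 0.60, tolerance: float = 0.10) -> list:
    """Return a list of problems; an empty list means the primer passes.

    The length range and target GC content come from the text; the
    tolerance around the target is a hypothetical choice.
    """
    problems = []
    if not min_len <= len(seq) <= max_len:
        problems.append(f"length {len(seq)} outside {min_len}-{max_len}")
    gc = gc_fraction(seq)
    if abs(gc - target_gc) > tolerance:
        problems.append(f"GC content {gc:.0%} too far from {target_gc:.0%}")
    return problems
```

A 20-base primer such as `GCGCATATGCGCATATGCGC` (12 of 20 bases are G or C) passes both checks, while a short A/T-only sequence fails both.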
The buffer solution provides the medium in which DNA synthesis can occur. This is
an aqueous solution which contains MgCl2, HCl, EDTA, and KCl. The MgCl2
concentration is important because the Mg2+ ions interact with the DNA and the
primers creating crucial complexes for DNA synthesis. The recommended
concentration is one to four micromoles. The pH of this system is critical so it may
also be buffered with ammonium sulphate. To energize the reaction, various energy
molecules are added such as ATP, GTP, and NTP. These compounds are the same
ones that living organisms use to power metabolic reactions.
Other materials that may be used in the process include mineral oil or paraffin wax.
After DNA synthesis is complete, the DNA is typically isolated and purified. Some
common reagents used in this process include phenol, EDTA and Proteinase K.
The Manufacturing Process
DNA synthesis is typically done on a small scale in laboratories. It involves three
distinct processes: sample preparation, the DNA synthesis reaction cycle and
DNA isolation. These manufacturing steps are typically done in separate areas to
avoid contamination. Following these procedures scientists are able to convert a few
strands of DNA into millions and millions of exact copies.
Preparation of the samples
To begin DNA synthesis, the various solutions are prepared. This is typically done in
a laminar flow cabinet equipped with a UV lamp to minimize contamination.
Scientists use fresh gloves during each production step for similar reasons. Typically,
all of the starting solutions except the primers, polymerases and the dNTPs are put in
an autoclave to kill off any contaminating organism. Two separate solutions are made.
One contains the buffer, primers and the polymerase. The other contains the MgCl2
and the template DNA. These solutions are all put into small tubes to begin the
reaction.
Kary Banks Mullis was born in Lenoir, North Carolina, in 1944. Upon graduation
from Georgia Tech in 1966 with a B.S. in chemistry, Mullis entered the biochemistry
doctoral program at the University of California, Berkeley. Earning his Ph.D. in 1973,
he accepted a teaching position at the University of Kansas Medical School in Kansas
City. In 1977, he assumed a postdoctoral fellowship at the University of California,
San Francisco.
Mullis accepted a position as a research scientist in 1979 with a growing biotech firm,
Cetus Corporation, in Emeryville, California, that synthesized chemicals used by other
scientists in genetic cloning. While there, he designed the polymerase chain reaction
(PCR), a fast and effective technique for reproducing specific genes or DNA
(deoxyribonucleic acid) fragments that can create billions of copies in a few hours.
Until then, the most effective way to reproduce DNA was by cloning, but it was problematic. It
took time to convince Mullis's colleagues of the importance of this discovery, but soon
PCR became the focus of intensive research. Scientists at Cetus developed a
commercial version of the process and a machine called the Thermal Cycler (with the
addition of the chemical building blocks of DNA [nucleotides] and a biochemical
catalyst [polymerase], the machine would perform the process automatically on a
target piece of DNA).
Cetus awarded Mullis $10,000 for developing the PCR patent, then sold it for $300
million. Leaving Cetus in 1986, Mullis became a private biochemical research
consultant and was awarded the Nobel Prize in 1993.
DNA synthesis cycle
o After the reacting solutions are prepared, the PCR cycle is started. The first phase
involves the denaturation of DNA. One of the most important initial steps is the
complete denaturation of the DNA template. Denaturation of the DNA essentially
means breaking apart the double-stranded molecule. This "opening up" of the DNA
molecule provides the template from which the next DNA molecule is
produced. An incomplete denaturation will result in an inefficient copy in the first
cycle which negatively impacts each subsequent cycle. The initial denaturation is
done by heating up the DNA template solution to 203°F (95°C) over one to three
minutes. The total time depends on the template composition. In repeat cycles, the
denaturation step lasts about two minutes and involves heating the solution to
201°F (94°C). Additional materials may be added to the solution to facilitate
DNA denaturation such as glycerol, DMSO, or formamide.
o With the DNA split into separate strands, the temperature is lowered to 122-149°F
(50-65°C). This is known as the primer annealing step and lasts for about two
minutes. At this point, the left and right primers match up and chemically link
with their complementary bases on the template DNA.
o The next phase involves the extending step. This part of the reaction is when most
of the DNA strand gets copied. The temperature of the system is raised to about
162°F (72°C) and held there for a time that depends on the length of DNA to copy. At this
stage, the DNA polymerase interacts with the strands and adds complementary
nucleotides along the entire length. The time required at this phase is about one
minute for every 1,000 base pairs.
o After this first cycle, the DNA synthesis cycle is repeated. The number of cycles
depends on the amount of initial DNA and the amount of DNA desired. If fewer than
10 copies of the template DNA are available, 40 cycles are needed. With more
initial DNA, 25-30 cycles are sufficient.
o During the last cycle the sample is held at 162°F (72°C) for about 15 minutes.
This allows the filling in (with nucleotides) of any protruding ends of a new DNA
strand. At this stage, the polymerase adds extra A nucleotides on one end of the
DNA strands.
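The cycle described above can be written down as a simple temperature schedule. The following Python sketch uses the temperatures and durations quoted in the steps; the function names, the default annealing temperature of 55°C, the three-minute initial denaturation and the modelling of the 15-minute hold as one extra step after the last cycle are assumptions made for illustration.

```python
def pcr_program(template_bp: int, cycles: int = 30, anneal_c: float = 55.0) -> list:
    """Return the schedule as a list of (step, temperature in C, minutes)."""
    if not 50.0 <= anneal_c <= 65.0:
        raise ValueError("annealing temperature should lie between 50 and 65 C")
    extend_min = template_bp / 1000.0  # about one minute per 1,000 base pairs
    steps = [("initial denaturation", 95.0, 3.0)]  # text: one to three minutes
    for _ in range(cycles):
        steps.append(("denaturation", 94.0, 2.0))
        steps.append(("primer annealing", anneal_c, 2.0))
        steps.append(("extension", 72.0, extend_min))
    steps.append(("final extension", 72.0, 15.0))
    return steps

def total_minutes(steps: list) -> float:
    """Total programmed time, ignoring the time needed to change temperature."""
    return sum(minutes for _, _, minutes in steps)
```

For a 1,000 base pair template and 30 cycles, `total_minutes(pcr_program(1000))` gives 168 minutes, which is of the same order as the "few hours" quoted for PCR elsewhere in this chapter.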
DNA isolation
When the reactions are complete, the DNA is isolated from the PCR reacting
materials such as the DNA polymerase, MgCl2 and the primers. This is done by
adding compounds like phenol, EDTA and Proteinase K. Centrifugation is also
helpful in this regard.
A desired strand of DNA can be synthesized [6] in the lab. This is possible for strands up
to a certain length. Longer 'random' strands are available. They consist of DNA
sequences that have been cloned from many different organisms. The synthesizer is
supplied with the four nucleotide bases in solution, which are combined according to
a sequence entered by the user. The instrument makes millions of copies of the
required oligonucleotides and places them in solution in a small vial.
3.2.2 Denaturing, annealing and ligation
Double-stranded DNA may be dissolved into single strands (or denatured [6]) by
heating the solution to a temperature determined by the composition of the strand.
Heating breaks the hydrogen bonds between complementary strands (Fig.3.1). Since
the hydrogen bonds between strands are much weaker than the covalent bonds within
strands, the strands remain undamaged by this process. Since a G-C pair is joined by
three hydrogen bonds, the temperature required to break it is slightly higher than that
for an A-T pair, joined by only two hydrogen bonds. This factor was taken into
account when designing sequences to represent computational elements.
Figure 3.1: Detailed structure of double-stranded DNA
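The dependence of melting temperature on composition can be illustrated by counting hydrogen bonds (two per A-T pair, three per G-C pair, as stated above). The second function uses the Wallace rule, a common rule of thumb for short oligonucleotides that is not taken from this chapter and is included only as an assumption-laden estimate. Function names are hypothetical.

```python
def hydrogen_bonds(strand: str) -> int:
    """Hydrogen bonds holding a strand to its full complement."""
    at = strand.count("A") + strand.count("T")
    gc = strand.count("G") + strand.count("C")
    return 2 * at + 3 * gc

def wallace_tm_celsius(strand: str) -> int:
    """Rough melting temperature for short oligonucleotides (Wallace rule):
    2 degrees C per A/T base plus 4 degrees C per G/C base. This is an
    external rule of thumb, not a formula given in the text."""
    at = strand.count("A") + strand.count("T")
    gc = strand.count("G") + strand.count("C")
    return 2 * at + 4 * gc
```

Two strands of the same length can therefore differ noticeably: `GGGG` is held by 12 hydrogen bonds whereas `AAAA` is held by 8, which is why composition matters when designing sequences to represent computational elements.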
Annealing is the reverse of melting: a solution of single strands is cooled,
allowing complementary strands to bind together (Fig. 3.2). In double-stranded
DNA, if one of the single strands contains a discontinuity (i.e., one nucleotide is not
bonded to its neighbour), then this may be repaired by DNA ligase, an
enzyme that helps join two DNA strands or pairs of nucleotides.
Figure 3.2: DNA melting and annealing
Hybridization separation
Separation by hybridization [6] is an operation often used in DNA computation, and
involves the extraction from a test tube of any single strands containing a specific
short sequence (e.g., extract all strands containing the sequence TAGACT). If we
want to extract single strands containing the sequence X, we first create many copies
of its complement. We attach to these oligonucleotides a biotin molecule which binds
in turn to a fixed matrix. If we pour the contents of the test tube over this matrix,
strands containing X will anneal to the anchored complementary strands. Washing the
matrix removes all strands that do not anneal, leaving only strands containing X.
These may then be removed from the matrix.
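Separation by hybridization can be mimicked in software as a partition of the tube. In the sketch below the probe is the reverse complement of X, and a strand is treated as captured when it contains X exactly. All names are hypothetical, and imperfect annealing and strands lost during washing are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Watson-Crick complement, re-read 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(strand))

def hybridization_extract(tube: list, x: str) -> tuple:
    """Split a tube (list of 5'->3' strings) into (probe, captured, washed)."""
    probe = reverse_complement(x)          # anchored to the matrix via biotin
    target = reverse_complement(probe)     # the site the probe anneals to (equals x)
    captured = [s for s in tube if target in s]
    washed = [s for s in tube if target not in s]
    return probe, captured, washed
```

With the example sequence from the text, `hybridization_extract(["GGTAGACTCC", "AAAA"], "TAGACT")` uses the probe `AGTCTA`, captures the first strand and washes away the second.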
3.2.3 Gel-Electrophoresis
Gel electrophoresis [6] is an important technique for sorting DNA strands by size.
Electrophoresis is the movement of charged molecules in an electric field. Since DNA
molecules carry negative charge, when placed in an electrical field they tend to
migrate towards the positive pole. The rate of migration of a molecule in an aqueous
solution depends on its shape and electrical charge. Since DNA molecules have the
same charge per unit length, they all migrate at the same speed in an aqueous solution.
However, if electrophoresis is carried out in a gel (usually made of agarose,
polyacrylamide or a combination of the two) the migration rate of a molecule is also
affected by its size. This is due to the fact that the gel is a dense network of pores
through which the molecules must travel. Smaller molecules therefore migrate faster
through the gel, thus sorting them according to size. A simplified representation of gel
electrophoresis is depicted in Fig. 3.3. The DNA is placed in a well cut out of the
gel, and a voltage is applied. Once the gel has been run (usually overnight), it is
necessary to visualize the results. This is achieved by staining the DNA with the
fluorescent dye ethidium bromide and then viewing the gel under ultraviolet light. At
this stage the gel is usually photographed for convenience. One such photograph is
depicted in Fig. 3.4. Gels are interpreted as follows: each lane corresponds to one
particular sample of DNA.
Figure 3.3: Gel electrophoresis process
We can therefore run several tubes on the same gel for the purposes of comparison.
Lane 7 is known as the marker lane; this contains various DNA fragments of known
length, for the purposes of calibration. DNA fragments of the same length cluster to
form visible horizontal bands, the longest fragments forming bands at the top of the
picture, and the shortest at the bottom. The brightness of a particular band depends on
the amount of DNA of the corresponding length present in the sample. Larger
concentrations of DNA absorb more dye, and therefore appear brighter. One
advantage of this technique is its sensitivity: as little as 0.05 µg of DNA in one band
can be detected as visible fluorescence. The size of fragments at various bands is
shown to the right of the marker lane, and is measured in base pairs (bp).
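The way a lane is read can be sketched as grouping strands by length. The model below is deliberately crude: it assumes perfect separation by length, and it uses the number of strands in a band as a stand-in for its brightness. The function name is hypothetical.

```python
from collections import Counter

def run_gel(lane: list) -> list:
    """Return the bands of one lane as (length in bases, strand count),
    ordered from the top of the gel (longest) to the bottom (shortest)."""
    counts = Counter(len(strand) for strand in lane)
    return sorted(counts.items(), reverse=True)
```

For instance, `run_gel(["AAAA", "CC", "GGGG", "T"])` yields `[(4, 2), (2, 1), (1, 1)]`: the two longest strands cluster into a single, brighter band at the top.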
Figure 3.4: Gel electrophoresis photograph
3.2.4 Primer extension and PCR
The DNA polymerases perform several functions, including the repair and duplication
of DNA. Given a short primer oligonucleotide, p, in the presence of nucleotide
triphosphates, the polymerase extends p if and only if p is bound to a longer template
oligonucleotide, t. For example, in Fig. 3.5(a), p is the oligonucleotide TCA, which is
bound to t, ATAGAGTT. In the presence of the polymerase, p is extended from its 3' end by
complementary bases up to the 5' end of t (Fig. 3.5(b)).
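Primer extension can be sketched as follows. The orientation of the sequences in Fig. 3.5 is not recoverable from the text, so the sketch assumes that t is written 5' to 3' and that TCA is the primer as it appears aligned against t, i.e. ACT when read 5' to 3'. Function names are hypothetical, and only the first binding site is used.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Watson-Crick complement, re-read 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(strand))

def extend_primer(primer: str, template: str) -> str:
    """Extend a primer bound to a template (both written 5'->3').

    The primer binds where its reverse complement occurs in the template;
    the polymerase then adds bases to the primer's 3' end until the 5' end
    of the template is reached. An unbound primer is returned unchanged.
    """
    site = reverse_complement(primer)
    start = template.find(site)  # first binding site only (simplification)
    if start == -1:
        return primer
    return reverse_complement(template[:start + len(site)])
```

Under these assumptions `extend_primer("ACT", "ATAGAGTT")` returns `ACTCTAT`, while a primer with no binding site, such as `GGG`, is left as it is.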
Another useful method of manipulating DNA is the Polymerase Chain Reaction, or
PCR [6]. PCR is a process that quickly amplifies the amount of a specific molecule of
DNA in a given solution using primer extension by polymerase. Each cycle of the
reaction doubles the quantity of this molecule, giving an exponential growth in the
number of strands. In order to target specific molecules we need to know their "start"
and "end" sections. A common problem in DNA computation is how to read out the
final solution to a problem encoded as a DNA strand, as the laboratory steps carried
out may result in a very dilute solution. PCR solves this problem: if a sought-after
molecule is present in the solution, then it will be hugely (exponentially) multiplied so
that the amount of it in the solution will "visibly" grow. This in turn solves the Detection
problem.
Figure 3.5: Primer extension by polymerase
According to the PCR resource, the cycle can be repeated more than thirty times.
With thirty cycles taking just three hours, and each cycle doubling the count, one
strand of DNA can in principle turn into about a billion (2^30) strands.
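The exponential growth can be checked numerically. The sketch below assumes ideal doubling by default; the `efficiency` parameter (the fraction of strands copied per cycle) and the function names are hypothetical additions for illustration.

```python
import math

def copies_after(cycles: int, initial: int = 1, efficiency: float = 1.0) -> float:
    """Strand count after a number of cycles; efficiency 1.0 is perfect doubling."""
    return initial * (1.0 + efficiency) ** cycles

def cycles_needed(target_copies: float, initial: int = 1) -> int:
    """Smallest number of ideal doubling cycles that reaches target_copies."""
    return math.ceil(math.log2(target_copies / initial))

minutes_per_cycle = 3 * 60 / 30  # thirty cycles in three hours
```

With perfect doubling, `copies_after(30)` is 2^30, roughly 1.07 billion, and a million copies are already reached after `cycles_needed(1e6)`, which is 20 cycles; each cycle takes about six minutes at the quoted pace.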
Extraction
Given a test tube T1 and a strand S, it is possible to extract all the strands in T1 that
contain S as a subsequence and to separate them from those that do not contain it.
Union
Given two or more test tubes, say T1, T2, ..., Tn, it is possible to put in a new test
tube the union [6] of all the strands contained in T1, T2, ..., Tn.
Detection
Confirm the presence or absence of DNA in a given test tube.
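The three operations above can be sketched with a test tube modelled as a multiset of strands. This is an abstraction under stated assumptions: a tube is a `collections.Counter` mapping each strand to its number of copies, "contains S" is read as "contains S as a contiguous substring", and all operations are error-free. Function names are hypothetical.

```python
from collections import Counter

def extract(tube: Counter, s: str) -> tuple:
    """Split a tube into (strands containing s, strands not containing s)."""
    with_s = Counter({strand: n for strand, n in tube.items() if s in strand})
    without_s = tube - with_s  # multiset difference keeps the remaining strands
    return with_s, without_s

def union(*tubes: Counter) -> Counter:
    """Pour the contents of several tubes into a new one."""
    merged = Counter()
    for tube in tubes:
        merged.update(tube)
    return merged

def detect(tube: Counter) -> bool:
    """True if the tube contains at least one strand."""
    return sum(tube.values()) > 0
```

A typical use chains the operations: extract the strands containing a pattern, then call `detect` on the result to learn whether any candidate solution survived.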
3.2.5 Restriction Enzymes
There are many kinds of enzymes, which are molecules capable of operating on other
molecules. Among them are the restriction enzymes, which cut DNA double strands
wherever a specific subsequence appears. Any double-stranded DNA that contains the
restriction site within its sequence is cut by the enzyme [6] at that point. For example,
the double-stranded DNA in Fig. 3.6(a) is cut by the restriction enzyme Sau3AI, which
recognizes the restriction site GATC. The resulting DNA is depicted in Fig. 3.6(b). The
cleavage here generates sticky (cohesive) ends. Such ends are important in DNA
manipulation, because they allow catenation of DNA molecules if they have
complementary sticky ends.
Figure 3.6 (a) Restriction Enzymes
(b) Double stranded DNA being cut by Sau3AI
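Cutting at a restriction site can be sketched on a single string that stands for the top strand of the double-stranded molecule. The position of the cut within the site is an assumption here: the default `cut_offset=0` places the cut immediately before the G of GATC, so that each downstream fragment begins with the GATC overhang. The function name and parameters are hypothetical, and the bottom strand is not modelled.

```python
def digest(top_strand: str, site: str = "GATC", cut_offset: int = 0) -> list:
    """Cut the top strand (5'->3') at every occurrence of the restriction site.

    cut_offset is the cut position measured from the start of the site
    (assumed to be 0, i.e. just before the first base of the site).
    """
    fragments, start = [], 0
    pos = top_strand.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(top_strand[start:cut])
        start = cut
        pos = top_strand.find(site, pos + 1)
    fragments.append(top_strand[start:])
    return [f for f in fragments if f]  # drop empty pieces at the ends
```

For example, `digest("AAGATCTTGATCCC")` gives `["AA", "GATCTT", "GATCCC"]`; joining the fragments back together restores the original strand, which is what ligation of compatible sticky ends would do.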
DNA Fundamentals
DNA (deoxyribonucleic acid) is a double stranded sequence of four nucleotides; the
four nucleotides that compose a strand of DNA are as follows: adenine (A), guanine
(G), cytosine (C), and thymine (T); they are often called bases. DNA supports two
key functions for life:
coding for the production of proteins,
self-replication.
Each deoxyribonucleotide consists of three components:
a sugar — deoxyribose
o five carbon atoms: 1´ to 5´
o hydroxyl group (OH) attached to 3´ carbon
a phosphate group
a nitrogenous base.
The chemical structure of DNA consists of a particular bond of two linear sequences
of bases. This bond follows a property of Complementarity: adenine bonds with
thymine (A-T) and vice versa (T-A), cytosine bonds with guanine (C-G) and vice
versa (G-C). This is known as Watson-Crick complementarity.
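The DNA monomers can link in two ways: by a phosphodiester bond (between neighbouring
nucleotides along one strand) or by hydrogen bonds (between complementary bases on
opposite strands).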
The four nucleotides adenine (A), guanine (G), cytosine (C), and thymine (T)
compose a strand of DNA. Each DNA strand has two different ends that determine its
polarity: the 3' end and the 5' end. The double helix is an anti-parallel (two strands of
opposite polarity) bonding of two complementary strands.
3.3 PRINCIPLES OF DNA COMPUTING
DNA is the major information storage molecule in living cells, and billions of years of
evolution have tested and refined both this wonderful informational molecule and
highly specific enzymes that can either duplicate the information in DNA molecules
or transmit this information to other DNA molecules. Instead of using electrical
impulses to represent bits of information, the DNA computer uses the chemical
properties of these molecules by examining the patterns of combination or growth of
the molecules or strings. DNA can do this through the manufacture of enzymes,
which are biological catalysts that could be called the 'software' used to execute the
desired calculation.
A single strand of DNA is similar to a string consisting of a combination of four
different symbols: A, G, C and T. Mathematically this means we have at our disposal a
four-letter alphabet, Σ = {A, G, C, T}, to encode information, which is more than enough
considering that an electronic computer needs only two digits for the same
purpose. In a DNA computer, computation takes place in test tubes. The input and
output are both strands of DNA, whose genetic sequences encode certain information.
A program on a DNA computer is executed as a series of biochemical operations,
which have the effect of synthesizing, extracting, modifying and cloning the DNA
strands.
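Because the alphabet has four letters, each base can in principle carry two bits. The sketch below shows one possible encoding; the particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are arbitrary assumptions, and practical schemes, as noted earlier, typically use longer sequences of 8-20 base pairs per bit.

```python
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}  # hypothetical mapping
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Inverse of encode; the strand length must be a multiple of four."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

With this mapping the single byte `b"A"` (binary 01000001) becomes the strand `CAAC`, and `decode(encode(data))` returns the original bytes.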
As for the operations that can be performed on DNA strands, the proposed
models of DNA computation are based on various combinations of the following
primitive bio-operations:
Synthesizing a desired polynomial-length strand used in all models.
Figure 3.7 DNA Synthesis
Mixing: combine the contents of two test tubes into a third one to achieve
union.
Annealing: bond together two single-stranded complementary DNA sequences
by cooling the solution. Annealing in vitro is known as hybridization.
Melting: break apart a double-stranded DNA into its single-stranded
complementary components by heating the solution. Melting in vitro is also
known under the name of denaturation.
Figure 3.8 DNA Melting and Annealing
Amplifying (copying): make copies of DNA strands by using the Polymerase
Chain Reaction (PCR). DNA polymerase enzymes perform several functions,
including the replication of DNA. The replication reaction requires a guiding
single strand of DNA, called the template, and a shorter oligonucleotide, called a primer,
that is annealed to it.
Figure 3.9 DNA Replication
Separating: separate the strands by length using a technique called gel
electrophoresis.
Figure 3.10 Gel Electrophoresis
Extracting: extract those strands that contain a given pattern as a substring by using
affinity purification.
DNA Affinity Purification
Cutting: cut DNA double strands at specific sites by using commercially available
restriction enzymes. One class of enzymes, called restriction endonucleases,
recognizes a specific short sequence of DNA, known as a restriction site. Any
double-stranded DNA that contains the restriction site within its sequence is cut
by the enzyme at that location.
Figure 3.11 DNA Cutting
Ligating: paste DNA strands with compatible sticky ends by using DNA
ligases. The enzyme DNA ligase will bond together, or
"ligate", the end of one DNA strand to another strand.
Substituting: substitute, insert or delete DNA sequences by using PCR site-
specific oligonucleotide mutagenesis.
Figure 3.12 DNA Substitution
Marking single strands by hybridization: complementary sequences are
attached to the strands, making them double-stranded. The reverse operation is
unmarking of the double-strands by denaturing, that is, by detaching the
complementary strands. The marked sequences will be double-stranded while
the unmarked ones will be single-stranded.
Destroying the marked strands by using exonucleases, or by cutting all the
marked strands with a restriction enzyme and removing all the intact strands by
gel electrophoresis. (By using enzymes called exonucleases, either double-
stranded or single-stranded DNA molecules may be selectively destroyed. The
exonucleases chew up DNA molecules from the end inward, and exist with
specificity to either single-stranded or double-stranded form.)
Detecting and Reading: given the contents of a tube, say "yes" if it contains at
least one DNA strand, and "no" otherwise. PCR may be used to amplify the
result and then a process called sequencing is used to actually read the solution.
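Several of the bio-operations listed above have simple string analogues, which the following Python sketch illustrates. It is a deliberately crude model under stated assumptions: a test tube is a plain list of single strands written as strings, double strands and reaction errors are ignored, and all function names are hypothetical.

```python
# A test tube is modelled as a list of strings (a multiset of single strands).

def mix(tube1, tube2):
    """Mixing: pour two tubes together (multiset union)."""
    return tube1 + tube2


def amplify(tube, copies=2):
    """Amplifying: stand-in for PCR, every strand is copied `copies` times."""
    return tube * copies


def extract(tube, pattern):
    """Extracting: keep the strands that contain `pattern` as a substring."""
    return [s for s in tube if pattern in s]


def separate_by_length(tube, length):
    """Separating: stand-in for gel electrophoresis, keep one length only."""
    return [s for s in tube if len(s) == length]


def cut(tube, site, offset):
    """Cutting: split each strand `offset` bases into every occurrence of `site`."""
    pieces = []
    for strand in tube:
        start = 0
        pos = strand.find(site)
        while pos != -1:
            pieces.append(strand[start:pos + offset])
            start = pos + offset
            pos = strand.find(site, pos + 1)
        pieces.append(strand[start:])
    return [p for p in pieces if p]


def ligate(left, right):
    """Ligating: join the end of one strand to the start of another."""
    return left + right


def detect(tube):
    """Detecting: 'yes' (True) if the tube holds at least one strand."""
    return len(tube) > 0


if __name__ == "__main__":
    tube = mix(["ACGTAC", "TTGACA"], ["GGGACT"])
    print(extract(tube, "GAC"))              # ['TTGACA', 'GGGACT']
    print(detect(extract(tube, "CCC")))      # False
    # EcoRI recognises GAATTC and cuts after the G.
    print(cut(["AAGAATTCTT"], "GAATTC", 1))  # ['AAG', 'AATTCTT']
```

Modelling the tube as a list keeps duplicate strands, which matters for amplification; a set would silently discard the copies.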
In short, DNA computers work by encoding the problem to be solved in the language
of DNA: the base-four values A, T, C and G. Using this base-four number system, the
solution to any computable problem can, in principle, be encoded along a DNA strand, as on a
Turing machine tape. Every possible sequence can be chemically created in a test tube
on trillions of different DNA strands, and the correct sequences can be filtered out
using genetic engineering tools.
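The generate-then-filter idea can be sketched in Python for a toy Boolean satisfiability problem. Everything specific here is an assumption made for illustration: the two-base codewords AT (true) and GC (false), the position-based filtering, and the example formula. In the laboratory the library would be created in parallel in a test tube rather than enumerated one strand at a time.

```python
from itertools import product

# Hypothetical codewords for the value of one Boolean variable.
TRUE, FALSE = "AT", "GC"


def library(n):
    """All 2**n strands, one per truth assignment of n variables."""
    return ["".join(words) for words in product((TRUE, FALSE), repeat=n)]


def satisfy_clause(tube, clause):
    """Keep the strands that satisfy at least one literal of `clause`.

    A clause is a list of (variable_index, wanted_value) pairs. Variable i
    occupies bases 2*i and 2*i + 1 of the strand.
    """
    kept = []
    for strand in tube:
        for var, value in clause:
            word = TRUE if value else FALSE
            if strand[2 * var:2 * var + 2] == word:
                kept.append(strand)
                break
    return kept


def solve(n, clauses):
    """Generate every candidate strand, then filter once per clause."""
    tube = library(n)
    for clause in clauses:
        tube = satisfy_clause(tube, clause)
    return tube


if __name__ == "__main__":
    # (x0 or x1) and (not x0 or x2) and (not x1 or not x2)
    clauses = [
        [(0, True), (1, True)],
        [(0, False), (2, True)],
        [(1, False), (2, False)],
    ]
    print(solve(3, clauses))  # ['ATGCAT', 'GCATGC']
```

The two surviving strands decode to the assignments (true, false, true) and (false, true, false). The number of filtering steps grows with the number of clauses, while the size of the initial library grows as 2^n, which is where the massive parallelism of a test tube is meant to help.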
We have described here the basic structure of DNA and the methods by which it may be
manipulated in the laboratory. These techniques owe their origin to, and are being
constantly improved by, the wide interests of molecular biologists working in modern
areas such as the Human Genome Project and genetic engineering. In Chapter 4 we
show how these techniques allow us to implement the various DNA computational
models described in the following chapter. Adleman used a small subset of these
techniques (hybridisation extraction, PCR and gel electrophoresis) in [2]. Although
other molecules (such as proteins) may be used as a computational substrate in the
future, the benefit of using DNA is that this wide range of manipulation techniques is
already available.