Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat...

7
Uncovering pathways in DNA oligonucleotide hybridization via transition state analysis E. J. Sambriski a,b , D. C. Schwartz c , and J. J. de Pablo a,1 a Department of Chemical and Biological Engineering and b Department of Chemistry and Biochemistry, Delaware Valley College, Doylestown, PA 18901; and c Laboratory for Molecular and Computational Genomics, Department of Chemistry, and Laboratory for Genetics, University of Wisconsin, Madison, WI 53706 Edited by Mark A. Ratner, Northwestern University, Evanston, IL, and approved August 31, 2009 (received for review May 1, 2009) DNA hybridization plays a central role in biology and, increasingly, in materials science. Yet, there is no precedent for examining the pathways by which specific single-stranded DNA sequences inter- act to assemble into a double helix. A detailed model of DNA is adopted in this work to examine such pathways and to determine the role of sequence, if any, on DNA hybridization. Transition path sampling simulations reveal that DNA rehybridization is prompted by a distinct nucleation event involving molecular sites with approximately four bases pairing with partners slightly offset from those involved in ideal duplexation. Nucleation is promoted in regions with repetitive base pair sequence motifs, which yield multiple possibilities for finding complementary base partners. Repetitive sequences follow a nonspecific pathway to renaturation consistent with a molecular ‘‘slithering’’ mechanism, whereas ran- dom sequences favor a restrictive pathway involving the formation of key base pairs before renaturation fully ensues. mesoscale modeling nucleic acids self-assembly T he renaturation or hybridization of DNA, i.e., the formation of duplexed strands (dsDNA) from complementary single strands (ssDNA), is of fundamental importance. Many natural processes depend on the systematic denaturation and rezipper- ing of DNA, including replication during cell division, transcrip- tion in the synthesis of RNA, and gene expression or regulation (1). In the laboratory, amplification of DNA sequences via PCR relies on thermal cycling (2). From an engineering perspective, self-assembly processes that are based on nucleic base comple- mentarity are gaining widespread use for the production of organized, hierarchical nanostructures (3, 4). A better under- standing of the molecular mechanisms involved in DNA melting and rehybridization could shed light on relevant biochemical processes and could help establish design principles for DNA- based nanoscale assembly. Double-stranded DNA arises from specific interactions between nitrogenous bases (A, C, G, and T); these include intrastrand base-stacking and interstrand Watson–Crick (AT or CG) base pairing. Such interactions can be thermally perturbed to yield ssDNA by denaturation, resulting in the so-called ‘‘melting’’ (or ssDNA– dsDNA) transition. With a proper annealing protocol, it is possible to renature or hybridize DNA, i.e., to reverse the effects of melting. The transition can be traced with the extent of reaction, , an order parameter defined as the fraction of any Watson–Crick base pairs from the n ideal duplex base pairs. Experimentally, is followed by measuring changes in UV absorbance or solution viscosity of the system (5). The melting transition of DNA has been perceived to be a first-order phase transition (6–8), and the mechanism has traditionally been assumed to occur via a two-state process: one being ssDNA ( 0), and the other being dsDNA ( 1). Known as the ‘‘all-or-none’’ simplification, the two-state as- sumption is often invoked in theoretical constructs to treat experimental data. Many cases exist, however, where this mech- anism is at odds with available experimental data (9–12). The origin of this observation, especially among similar DNA se- quences, is unclear. Thus, approaches that elucidate the rena- turation of DNA given an arbitrary sequence continue to be of great interest for their application in the design and optimization of self-assembly protocols. In this report, we present a molecular study on the pathways that DNA follows as it undergoes melting and renaturation. The picture that emerges is one where nucleotide sequence plays a significant role in the ssDNA–dsDNA transition. When a nu- cleotide sequence consists of a repetitive motif in ssDNA, the pathway for reassociation tends to be nonspecific and involves multiple molecular intermediate species in the transition state. On the other hand, when a sequence is random, with an irregular pattern of bases on ssDNA, the process is more restrictive, requiring the formation of key base pairs for hybridization to proceed. This results in a pathway whose transition state is populated by intermediate molecular species that share common paired bases. Model and Methods To approach the problem of the ssDNA–dsDNA transition, we use an off-lattice, coarse grain model originally developed by Knotts et al. (13) and subsequently improved by Sambriski et al. (14) to account for molecular renaturation. This construct maps the three chemical moieties in a nucleotide (sugar, phosphate, and nucleic base) onto three interaction sites, thus providing resolution at the base-pair level. The solvent is treated implicitly by using Langevin Dynamics (LD) with an effective friction coefficient. In this study, electrostatic interactions are treated at the level of Debye–Hu ¨ckel theory, justified here by the weak electrolytic conditions used. The force field includes both intra- and interstrand contributions, the details of which are provided in refs. 13 and 14. The model captures both major and minor grooves of DNA, has been shown to reproduce quantitatively experimental denaturation temperatures as a function of base composition as well as ionic strength, exhibits the correct dependence of DNA persistence length with ionic strength, and resolves a DNA nucleotide into three chemical moieties while enabling nanoscale resolution. Collectively, these features are highly relevant to the studies of DNA hybridization described in what follows. DNA duplexation can be viewed as a transient, rare event hindered by large entropic barriers. To investigate the ssDNA– dsDNA transition, an approach that relies on importance sam- pling of trajectory space must be adopted. In this work, Tran- sition Path Sampling (TPS) (15) is used. We begin by ‘‘mining’’ for a reactive trajectory, or primitive path; this is an initial trajectory that spans two states of interest [here, dsDNA (state A) and ssDNA (state B)]. New paths are created through Author contributions: E.J.S., D.C.S., and J.J.d.P. designed research; E.J.S. performed re- search; E.J.S. analyzed data; and E.J.S. and J.J.d.P. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. Freely available online through the PNAS open access option. 1 To whom correspondence should be addressed. E-mail: [email protected]. www.pnas.orgcgidoi10.1073pnas.0904721106 PNAS October 27, 2009 vol. 106 no. 43 18125–18130 BIOPHYSICS AND COMPUTATIONAL BIOLOGY CHEMISTRY Downloaded by guest on March 26, 2020 Downloaded by guest on March 26, 2020 Downloaded by guest on March 26, 2020

Transcript of Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat...

Page 1: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

Uncovering pathways in DNA oligonucleotidehybridization via transition state analysisE. J. Sambriskia,b, D. C. Schwartzc, and J. J. de Pabloa,1

aDepartment of Chemical and Biological Engineering and bDepartment of Chemistry and Biochemistry, Delaware Valley College, Doylestown, PA 18901;and cLaboratory for Molecular and Computational Genomics, Department of Chemistry, and Laboratory for Genetics, University of Wisconsin,Madison, WI 53706

Edited by Mark A. Ratner, Northwestern University, Evanston, IL, and approved August 31, 2009 (received for review May 1, 2009)

DNA hybridization plays a central role in biology and, increasingly,in materials science. Yet, there is no precedent for examining thepathways by which specific single-stranded DNA sequences inter-act to assemble into a double helix. A detailed model of DNA isadopted in this work to examine such pathways and to determinethe role of sequence, if any, on DNA hybridization. Transition pathsampling simulations reveal that DNA rehybridization is promptedby a distinct nucleation event involving molecular sites withapproximately four bases pairing with partners slightly offset fromthose involved in ideal duplexation. Nucleation is promoted inregions with repetitive base pair sequence motifs, which yieldmultiple possibilities for finding complementary base partners.Repetitive sequences follow a nonspecific pathway to renaturationconsistent with a molecular ‘‘slithering’’ mechanism, whereas ran-dom sequences favor a restrictive pathway involving the formationof key base pairs before renaturation fully ensues.

mesoscale modeling � nucleic acids � self-assembly

The renaturation or hybridization of DNA, i.e., the formationof duplexed strands (dsDNA) from complementary single

strands (ssDNA), is of fundamental importance. Many naturalprocesses depend on the systematic denaturation and rezipper-ing of DNA, including replication during cell division, transcrip-tion in the synthesis of RNA, and gene expression or regulation(1). In the laboratory, amplification of DNA sequences via PCRrelies on thermal cycling (2). From an engineering perspective,self-assembly processes that are based on nucleic base comple-mentarity are gaining widespread use for the production oforganized, hierarchical nanostructures (3, 4). A better under-standing of the molecular mechanisms involved in DNA meltingand rehybridization could shed light on relevant biochemicalprocesses and could help establish design principles for DNA-based nanoscale assembly.

Double-stranded DNA arises from specific interactions betweennitrogenous bases (A, C, G, and T); these include intrastrandbase-stacking and interstrand Watson–Crick (AT or CG) basepairing. Such interactions can be thermally perturbed to yieldssDNA by denaturation, resulting in the so-called ‘‘melting’’ (orssDNA–dsDNA) transition. With a proper annealing protocol, it ispossible to renature or hybridize DNA, i.e., to reverse the effects ofmelting. The transition can be traced with the extent of reaction, �,an order parameter defined as the fraction of any Watson–Crickbase pairs from the n ideal duplex base pairs. Experimentally, � isfollowed by measuring changes in UV absorbance or solutionviscosity of the system (5).

The melting transition of DNA has been perceived to be afirst-order phase transition (6–8), and the mechanism hastraditionally been assumed to occur via a two-state process: onebeing ssDNA (� � 0), and the other being dsDNA (� � 1).Known as the ‘‘all-or-none’’ simplification, the two-state as-sumption is often invoked in theoretical constructs to treatexperimental data. Many cases exist, however, where this mech-anism is at odds with available experimental data (9–12). Theorigin of this observation, especially among similar DNA se-

quences, is unclear. Thus, approaches that elucidate the rena-turation of DNA given an arbitrary sequence continue to be ofgreat interest for their application in the design and optimizationof self-assembly protocols.

In this report, we present a molecular study on the pathwaysthat DNA follows as it undergoes melting and renaturation. Thepicture that emerges is one where nucleotide sequence plays asignificant role in the ssDNA–dsDNA transition. When a nu-cleotide sequence consists of a repetitive motif in ssDNA, thepathway for reassociation tends to be nonspecific and involvesmultiple molecular intermediate species in the transition state.On the other hand, when a sequence is random, with an irregularpattern of bases on ssDNA, the process is more restrictive,requiring the formation of key base pairs for hybridization toproceed. This results in a pathway whose transition state ispopulated by intermediate molecular species that share commonpaired bases.

Model and MethodsTo approach the problem of the ssDNA–dsDNA transition, weuse an off-lattice, coarse grain model originally developed byKnotts et al. (13) and subsequently improved by Sambriski et al.(14) to account for molecular renaturation. This construct mapsthe three chemical moieties in a nucleotide (sugar, phosphate,and nucleic base) onto three interaction sites, thus providingresolution at the base-pair level. The solvent is treated implicitlyby using Langevin Dynamics (LD) with an effective frictioncoefficient. In this study, electrostatic interactions are treated atthe level of Debye–Huckel theory, justified here by the weakelectrolytic conditions used. The force field includes both intra-and interstrand contributions, the details of which are providedin refs. 13 and 14. The model captures both major and minorgrooves of DNA, has been shown to reproduce quantitativelyexperimental denaturation temperatures as a function of basecomposition as well as ionic strength, exhibits the correctdependence of DNA persistence length with ionic strength, andresolves a DNA nucleotide into three chemical moieties whileenabling nanoscale resolution. Collectively, these features arehighly relevant to the studies of DNA hybridization described inwhat follows.

DNA duplexation can be viewed as a transient, rare eventhindered by large entropic barriers. To investigate the ssDNA–dsDNA transition, an approach that relies on importance sam-pling of trajectory space must be adopted. In this work, Tran-sition Path Sampling (TPS) (15) is used. We begin by ‘‘mining’’for a reactive trajectory, or primitive path; this is an initialtrajectory that spans two states of interest [here, dsDNA (stateA) and ssDNA (state B)]. New paths are created through

Author contributions: E.J.S., D.C.S., and J.J.d.P. designed research; E.J.S. performed re-search; E.J.S. analyzed data; and E.J.S. and J.J.d.P. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

1To whom correspondence should be addressed. E-mail: [email protected].

www.pnas.org�cgi�doi�10.1073�pnas.0904721106 PNAS � October 27, 2009 � vol. 106 � no. 43 � 18125–18130

BIO

PHYS

ICS

AN

DCO

MPU

TATI

ON

AL

BIO

LOG

YCH

EMIS

TRY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 2: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

shooting moves, in which the primitive path is perturbed alongsome time frame. In our case, the time frame is perturbed byreassigning momenta, which coupled with an LD integrator,yields uncorrelated trajectories. From a chosen time frame,integration is performed forward and backward in time toacquire a complete trajectory. If a trial leads to an unreactivepath (a path that does not span states A and B), the trajectoryis rejected, and a new frame is selected at random. Otherwise,the trajectory is accepted, and a new shooting move is attempted.An ensemble of reactive paths, the Transition Path Ensemble(TPE), is thereby generated and analyzed.

Analysis of the TPE yields the Transition State Ensemble(TSE), the set of configurations that are equally likely to end ineither of states A or B when subjected to a series of fleetingtrajectories. These configurations are sampled by randomlychoosing time points from a path in the TPE. A fleetingtrajectory is initialized by assigning momenta at random from aMaxwell–Boltzmann distribution at the temperature of interest,and then subjecting it to LD.

A configuration is considered to be part of the TSE byevaluating its committor probability (with respect to state B),given by PB � (1/Nf)�i�1

Nf hBi , where Nf denotes the total number

of fleeting trajectories, whereas hBi represents a counting func-

tion of configurations in state B of the ith trajectory. If thefleeting trajectory of a configuration with some value of � � 1successfully reaches state B, then we set hB

i � 1; otherwise we sethB

i � 0. A configuration is classified as belonging to the TSEwhen PB � 0.5, within statistical tolerance. In our studies, Nf �100 and, at the 95% confidence interval, a tolerance of 0.4 � PB� 0.6 is used.

To study a phase transition, an order parameter that distin-guishes between the relevant phases of the system must beidentified. Here, we focus on the ssDNA–dsDNA phase transi-tion; we therefore adopt a definition of � in which it denotes thefraction of any Watson–Crick interstrand base pairs from the npossible ideal base pairs that a molecule can form. The DNA‘‘reaction’’ is envisioned as a nonconstrained progression of thesystem from ssDNA (� � 0) to dsDNA (� � 1). In classifyingconfigurations of the TSE, the system is considered to be in stateB if � � 0.8 during 25% of the fleeting trajectory (and for whichwe assign hB

i � 1).

Results and DiscussionThe systems considered in this study, which have been previouslyinvestigated experimentally (16), are listed in Table 1. This setof oligonucleotides enables us to examine the effects of strandlength and nucleotide sequence. The systems are further differ-entiated by the fraction of CG contacts, fCG. For brevity, whendiscussing different DNA molecules, we refer to each accordingto the nomenclature appearing in the first column of Table 1. Forthe short, repetitive sequence, we did not consider the n � 15case, since it is less interesting because of symmetry consider-ations. The melting temperature, Tm, for each molecule isprovided in Table 2 (all systems were studied under identicalionic strength conditions). All TPS simulations were performedat �5 K below Tm, a range that expedites the emergence ofmultiple reassociation events.

The TPE for each system was acquired from LD trajectoriesspanning 20 ns and 100 ns for n � 14,15 and n � 30, respectively.Fleeting trajectories were 10% of the duration of the TPEtrajectories, a time scale sufficiently long to enable molecules tocommit to a given state. Once the TSE was collected, it wascharacterized by calculating the probability of observing anextent of reaction, P(�). Results for P(�) are shown in Fig. 1; thecorresponding ensemble sizes of the TPE and TSE are listed inthe last two columns of Table 2. It can be observed that, forlarger chains (30-base pair systems), a bimodal distributionarises, consisting of nucleation events with a small and anintermediate number of base pair contacts. For both short andlong chains, random sequences tend to display a sharper distri-bution for the extent of reaction in the transition state. Forrepetitive sequences, the corresponding distribution is muchbroader. These observations support the notion that a preferred

Table 1. Series of oligonucleotides

System n fCG Sequence (5� to 3�)

REP14 14 0.0 ATA TAT ATA TAT ATRAN15 15 0.2 TAC TAA CAT TAA CTAREP30 30 0.0 ATA TAT ATA TAT ATA TAT ATA TAT ATA TATMIX30 30 0.2 TTA TGT ATT AAG TTA TAT AGT AGT AGT AGTRAN30 30 0.5 AGT CTG GTC TGG ATC TGA GAA CTT CAG GCT

Table 2. Ensemble properties for oligonucleotides

System* Tm obs, K TPE (count) TSE (count)

REP14 291.2† 120 1,042RAN15 308.4‡ 120 935REP30 312.9‡ 200 2,337MIX30 323.8† 200 1,856RAN30 339.4† 200 2,028

*All systems were investigated in [Na�] � 0.069 M.†Data are from ref. 16.‡Data are from www.idtdna.com/analyzer/Applications/OligoAnalyzer.

0.0 0.2 0.4 0.6 0.8 1.0ξ

0.00

0.05

0.10

0.15

0.20

0.25

P(ξ

)

0.0 0.2 0.4 0.6 0.8 1.0ξ

0.00

0.05

0.10

0.15

0.20

0.25

P(ξ

)

Fig. 1. Probability distribution for � in the TSE, for n � 14,15 (top) and n �30 (bottom). Data are shown for REP14, REP30 (solid lines); RAN15, RAN30(dashed lines); and MIX30 (dot–dashed line). The REP systems have theirrespective standard deviations shown, which were determined by randomlypooling 50% of the ensemble and performing a dozen replicates. These errorsare representative of the other systems.

18126 � www.pnas.org�cgi�doi�10.1073�pnas.0904721106 Sambriski et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 3: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

pathway for renaturation is in fact encoded in an oligonucleotidesequence: Repetitive sequences favor a less specific or morediffuse pathway to renaturation, whereas random sequencesfollow a restrictive pathway in which a nucleation event mustoccur for rehybridization to proceed.

Further evidence for the presence of restrictive pathways andnucleation events for renaturation can be gleaned from theprobability of base pair formation, �ij, shown in Fig. 2. Thisfunction describes the probability of observing a complementary(Watson–Crick) base pair in the TSE, formed between the ithbase of the sense strand and the jth base of the antisense strand.Entries along the diagonal of each distribution (black line)represent ideal base pair contacts for dsDNA. Note that forREP14 and REP30, weak ‘‘filtering’’ occurs in the formation ofbase pairs on renaturation, resulting in the broad appearance ofP(�). On the other hand, RAN15 and RAN30 clearly exhibit apreference for specific base pairs. The MIX30 system is aninteresting example because it contains both a repetitive domain(note the set of four AGT triplets on the 3� end of the sensestrand) and a random domain (on the opposite end). The profileof �ij exhibits preferential association along the repetitive do-main of the chain. Plausibly, this observation is due to acombinatorial effect: the presence of adjacent, repeating motifsallows for interconversion of molecular arrangements of similarstability with minimal entropic penalties. In the case of a randomdomain, base pair ‘‘filtering’’ is more stringent due to entropiccosts and different stabilities in molecular arrangements, result-ing in preferred base pair contacts.

Insight into the process of DNA renaturation can be garneredby studying configurations in the TSE for which � � 0.0. To thisend, we extracted configurations from the TSE in which only afew base pairs have formed, ensuring a sample of �5% of theTSE size (see Table 2). In the case of REP14 and RAN15,configurations with one base pair (n� � 1) were collected,whereas for REP30, MIX30, and RAN30, configurations withfour (n� � 4) base pairs were selected. Results for the corre-sponding �ij are shown in Fig. 2 Right, which suggest theimpressive feature that for small �, sequence effects dominatethe pathway for renaturation. In all cases, nucleation preferen-tially involves contacts in domains where multiple complemen-tary base pairs can form (i.e., in regions with a repetitive motif)and is biased toward the center of the molecule. For the MIX30case, the latter preference is less pronounced because therepetitive motif at the 3� end of the sense strand biases a higherprobability of favorable contacts. This effect appears to besilenced in REP14 and REP30.

In Fig. 3, base pair contacts, appearing with the highestprobability as shown in Fig. 2, are presented schematically. Theresults suggest that in the TSE, configurations in which chainsare offset by approximately four base pairs are prevalent. Amolecular arrangement in which complementary strands areslightly offset with respect to each other appears to be entropi-cally driven: The dangling ends of DNA strands allow the systemto sample molecular arrangements of similar stability withoutlosing too much entropy on full reassociation.

The possibility that complementary strands of DNA slide, or‘‘slither,’’ past each other en route to duplexation is hinted at bythe presence of dangling ends. Evidence for this mechanism canbe found by inspecting �ij at different values of PB, a procedureakin to uncovering the binding preferences that a strand exhibitsas it crosses a free energy barrier (or the separatrix, in TPSterminology) (15). Relevant data are shown in Fig. 4 for theREP30 and RAN30 systems. The REP30 system systematicallymatches the periodicity of the complementary sequence as PBincreases, an effect that can arise from ‘‘molecular slithering.’’ Inthe RAN30 case, this trend is not evident and, upon approachinghigher values of PB, the general features of �ij remain qualita-tively similar, suggesting that the renewal of base pairs is not

T A T A T A T A T A T A T A

A

T

A

T

A

T

A

T

A

T

A

T

A

T

RE

P14

(al

l nξ)

T A T A T A T A T A T A T A

A

T

A

T

A

T

A

T

A

T

A

T

A

T

RE

P14

(nξ

= 1

)

A T G A T T G T A A T T G A T

T

A

C

T

A

A

C

A

T

T

A

A

C

T

A

RA

N15

(al

l nξ)

A T G A T T G T A A T T G A T

T

A

C

T

A

A

C

A

T

T

A

A

C

T

A

RA

N15

(nξ

= 1

)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(al

l nξ)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(nξ

= 4

)

A A T A C A T A A T T C A A T A T A T C A T C A T C A T CA

TTATGTATTAAGTTATATAGTAGTAGTAGT

MIX

30 (

all n

ξ)

A A T A C A T A A T T C A A T A T A T C A T C A T C A T CA

TTATGTATTAAGTTATATAGTAGTAGTAGT

MIX

30 (

nξ =

4)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(al

l nξ)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(nξ

= 4

)

0.0 0.2 0.4 0.6 0.8 1.0

Fig. 2. Probability of base pair contacts, �ij, renormalized with respect to thehighest value for each system. The ordinate axes denote the system, the valuesof n� shown, and the sequence of the sense strand. The 5� end of each strandis indicated by the nucleic base set in red. Ideal duplex base pairs fall along thediagonal (black line). The color scale (bottom) corresponds to renormalizedvalues of �ij.

Sambriski et al. PNAS � October 27, 2009 � vol. 106 � no. 43 � 18127

BIO

PHYS

ICS

AN

DCO

MPU

TATI

ON

AL

BIO

LOG

YCH

EMIS

TRY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 4: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

strongly favored. A plausible scenario for RAN30 is one in whicha strand forms key contacts with its partner, allowing the systemto then ‘‘snap’’ into a duplexed state. Collectively, these obser-vations also hold for the shorter oligonucleotides.

Another characterization of the TSE was obtained by plottingthe potential energy of the system as a function of the extent ofreaction, U(�), shown in Fig. 5. A significant amount of overlapbetween energy levels of �0.5 kBT among neighboring � statesis observed. This is consistent with prior studies that characterizethe ssDNA–dsDNA transition in the context of an ExpandedEnsemble calculation (see ref. 17). Moreover, as the ‘‘random-ness’’ of an oligonucleotide increases, the extent of ‘‘filtering’’ ofU(�) is stronger (e.g., compare the n � 30 systems in Fig. 5), aneffect that is also coupled to the appearance of P(�) and �ij. Thesubtle decrease in potential energy with increasing �, as shownin Fig. 5, is due to interstrand base pairs and gives way torenaturation intermediates of similar molecular stability. Theseresults indicate that the potential energy of the system is not adiscerning order parameter to describe the ssDNA–dsDNAtransition, and assert the interconversion of molecular arrange-ments of similar stability.

The approach presented here identifies molecular intermediatesinvolved in the reassociation of specific DNA oligonucleotides,without resorting to a priori assumptions on the mechanisticpathway. An important example where this ancillary informationcan be used involves melting profiles of oligonucleotides obtained

Fig. 3. Identification of DNA base pairs (gray bands) with a renormalizedvalue of �ij � 0.8 in the TSE, as shown in Fig. 2 Left (systems are similarlyarranged from top to bottom). The sense strand is the top sequence of eachduplex, and the 5� end is denoted by an oblique bar. Sites shown representbase moieties (backbone sites are not shown for clarity): A (magenta circles),C (green squares), G (orange squares), and T (blue circles).

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(P B

= 0

.9)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(P B

= 0

.9)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(P B

= 0

.7)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(P B

= 0

.7)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(P B

= 0

.5)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(P B

= 0

.5)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(P B

= 0

.3)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(P B

= 0

.3)

T A T A T A T A T A T A T A T A T A T A T A T A T A T A T A

ATATATATATATATATATATATATATATAT

RE

P30

(P B

= 0

.1)

T C A G A C C A G A C C T A G A C T C T T G A A G T C C GA

AGTCTGGTCTGGATCTGAGAACTTCAGGCT

RA

N30

(P B

= 0

.1)

Fig. 4. Plots of �ij for all n�, in which evaluations were performed at thecommittor probabilities, PB, indicated on the ordinate axes. Shown are results forthe REP30 (Left) and RAN30 (Right) systems. Each plot is labeled as in Fig. 2, withcorresponding color scale. For a given system, plots are arranged such that PB

decreases from top to bottom. For PB � 0.5, the size of the TSE is of O (500).

18128 � www.pnas.org�cgi�doi�10.1073�pnas.0904721106 Sambriski et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 5: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

from UV absorbance or solution viscosity, which need not yield thesame denaturation temperature (5). These measurements are dom-inated by small-chain effects, in which the ssDNA–dsDNA transi-tion spans temperature intervals of 10–30 K (16, 18), and may bedismissed as an order–disorder (continuous) transition. However,the treatment of data typically invokes an all-or-none, infinitelycooperative model (19) that is largely characteristic of first-order(discontinuous) transitions (20). This incompatibility is difficult toreconcile and has challenged our understanding of the nature of thetransition. However, by sampling molecular configurations in thetransition state with TPS simulations, as is done here, the transitioncan be characterized for specific sequences by exploring the distri-bution of P(�). Sequences that display a narrow distribution mayfavor an all-or-none pathway, where the cooperativity arising fromkey base pair contacts is needed before hybridization can occur.Sequences with a broad distribution may favor a diffusive pathway,where a variety of molecular intermediates are almost equally likelyto contribute in a system with a continuous rehybridization tran-sition. This information is especially useful when systems sharesimilar melting profiles but have distinct renaturation pathways. Acase in point is the comparison between n� � 1 configurationsfound in the TSE of the REP14 and RAN15 sequences. Accountingonly for complementary interstrand contacts, there are 98 possiblearrangements in the former, whereas the latter has 83 such possi-bilities. Remarkably, in the transition state, only three specific basepairs are prevalent in RAN15, whereas many more base pairs occurin REP14 (see Fig. 2).

ConclusionTwo qualitatively different behaviors, which depend on basepair sequence, emerge in DNA hybridization. Complementarystrands composed entirely of a random sequence assemble intodsDNA by forming key contacts. Such contacts tend to occurwhen a few adjacent Watson–Crick bases exist on the com-plementary strand for a given base pair. This pairing favorsarrangements in which DNA strands are slightly offset fromideal duplexation. A nucleation event involving these keycontacts is necessary and sufficient to send the molecule downa rehybridization funnel that favors a specific pathway and thusrestricts the molecular species found in the TSE [e.g., byfavoring a narrow P(�) distribution]. In contrast, repetitivemotifs relieve those restrictions by posing multiple possibilitiesfor complementary base pairs. When present only in a local-ized region of the molecule, these motifs offer a control to biasthe molecular species found in the TSE (i.e., those configu-rations that exhibit base pairing in this region). When the motifextends over the entire molecule, the pathway to renaturationis nonspecific, as evidenced by a broad P(�) distribution and adiverse population of molecular intermediates in the TSE. Theinformation from the present analysis offers initial designprinciples to optimize molecular reassociation through theinterplay of these two behaviors. For example, a randomsequence tailored for a unique molecular recognition eventcan be prompted by f lanking either side with a short concate-mer (i.e., a sequence consisting of a repetitive motif). Suchprinciples could be useful in addressing problems that requireoptimization in the selectivity and sensitivity of DNA reasso-ciation and/or hybridization, including microarray-based bio-assays (18, 21) and the assembly of nanoscale structures (4, 22).

ACKNOWLEDGMENTS. The University of Wisconsin-Madison Center for HighThroughput Computing is gratefully acknowledged for providing resources.E.J.S. and D.C.S. acknowledge support from the National Human Genome Re-search Institute through Training Grant 5T32HG002760 to the Genomic SciencesTraining Program and Research Grant R01HG000225, respectively. This work wassupported by the National Science Foundation through the University of Wis-consin-Madison Nanoscale Science and Engineering Center.

0.0 0.2 0.4 0.6 0.8 1.0ξ

-0.5

0.0

0.5

1.0

1.5

βU(ξ

)/N

0.0 0.2 0.4 0.6 0.8 1.0ξ

-0.5

0.0

0.5

1.0

1.5

βU(ξ

)/N

0.0 0.2 0.4 0.6 0.8 1.0ξ

-0.5

0.0

0.5

1.0

1.5

βU(ξ

)/N

0.0 0.2 0.4 0.6 0.8 1.0ξ

-0.5

0.0

0.5

1.0

1.5

βU(ξ

)/N

0.0 0.2 0.4 0.6 0.8 1.0ξ

-0.5

0.0

0.5

1.0

1.5

βU(ξ

)/N

Fig. 5. Plot of U(�) normalized by the inverse system thermal energy, � �(kBT)1, and scaled by the total number of interaction sites in the system, N(systems are similarly arranged from top to bottom as in Fig. 2).

Sambriski et al. PNAS � October 27, 2009 � vol. 106 � no. 43 � 18129

BIO

PHYS

ICS

AN

DCO

MPU

TATI

ON

AL

BIO

LOG

YCH

EMIS

TRY

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 6: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

1. Voet D, Voet JG, Pratt CW (1999) Fundamentals of Biochemistry (Wiley, New York), pp751–930.

2. Saiki RK, et al. (1988) Primer-directed enzymatic amplification of DNA with a thermo-stable DNA polymerase. Science 239:487–491.

3. Rothemund PWK (2006) Folding DNA to create nanoscale shapes and patterns. Nature404:297–302.

4. Yan H, LaBean TH, Feng L, Reif JH (2003) Directed nucleation assembly of DNA tilecomplexes for barcode-patterned lattices. Proc Natl Acad Sci USA 100:8103–8108.

5. Freifelder D (1985) Principles of Physical Chemistry with Applications to the BiologicalSciences (Jones and Bartlett, Boston), pp 513–517.

6. Garel T, Monthus C, Orland H (2001) A simple model for DNA denaturation. EurphysLett 55:132–138.

7. Carlon E, Orlandini E, Stella AL (2002) Role of stiffness and excluded volume in DNAdenaturation. Phys Rev Lett 88:198101.

8. Kafri Y, Mukamel D, Peliti L (2000) Why is the DNA denaturation transition first-order?Phys Rev Lett 85:4988–4991.

9. Mikulecky PJ, Feig AL (2006) Heat capacity changes associated with DNA duplexformation: salt- and sequence-dependent effects. Biochemistry 45:604–616.

10. Montrichok A, Gruner G, Zocchi G (2003) Trapping intermediates in the meltingtransition of DNA oligomers. Europhys Lett 62:452–458.

11. Ma H, Wan C, Wu A, Zewail AH (2007) DNA folding and melting observed in real timeredefine the energy landscape. Proc Natl Acad Sci USA 104:712–716.

12. Woodside MT, et al. (2006) Direct measurement of the full, sequence-dependentfolding landscape of a nucleic acid. Science 314:1001–1004.

13. Knotts TA, IV, Rathore N, Schwartz DC, de Pablo JJ (2007) A coarse grain model for DNA.J Chem Phys 126:084901.

14. Sambriski EJ, Schwartz DC, de Pablo JJ (2009) A mesoscale model for DNA and itsrenaturation. Biophys J 96:1675–1690.

15. Dellago C, Bolhuis PG, Geissler PL (2002) Transition path sampling. Adv Chem Phys123:1–78.

16. Owczarzy R, et al. (2004) Effects of sodium ions on DNA duplex oligomers: improvedpredictions of melting temperatures. Biochemistry 43:3537–3554.

17. Sambriski EJ, Ortiz V, de Pablo JJ (2009) Sequence effects in the melting and renatur-ation of short DNA oligonucleotides: Structure and mechanistic pathways. J PhysCondens Matter 21:034105.

18. Anderson MLM (1999) Nucleic Acid Hybridization (Bios Scientific, Oxford).19. Marky LA, Breslauer KJ (1987) Calculating thermodynamic data for transitions of any

molecularity from equilibrium melting curves. Biopolymers 26:1601–1620.20. Turner DH (2000) in Nucleic Acids: Structures, Properties and Functions, eds Blookfield

VA, Crothers DM, Tinoco I, Jr (University Science, Sausalito, CA), pp 259–334.21. Zhang L, Miles MF, Aldape KD (2003) A model of molecular interactions on short

oligonucleotide microarrays. Nat Biotechnol 21:818–821.22. Seeman NC (2003) DNA in a material world. Nature 421:427–431.

18130 � www.pnas.org�cgi�doi�10.1073�pnas.0904721106 Sambriski et al.

Dow

nloa

ded

by g

uest

on

Mar

ch 2

6, 2

020

Page 7: Correction - PNAS · ran15 15 0.2 tac taa cat taa cta rep30 30 0.0 ata tat ata tat ata tat ata tat ata tat mix30 30 0.2 tta tgt att aag tta tat agt agt agt agt ran30 30 0.5 agt ctg

Correction

CHEMISTRY, BIOPHYSICS AND COMPUTATIONAL BIOLOGYCorrection for ‘‘Uncovering pathways in DNA oligonucleotidehybridization via transition state analysis,’’ by E. J. Sambriski,D. C. Schwartz, and J. J. de Pablo, which appeared in issue 43,October 27, 2009, of Proc Natl Acad Sci USA (106:18125–18130;first published October 8, 2009; 10.1073/pnas.0904721106).

The authors note that, due to a printer’s error, the affiliationfor E. J. Sambriski and J. J. de Pablo appeared incorrectly inpart. Instead of Delaware Valley College, Doylestown, PA, theDepartment of Chemical and Biological Engineering shouldhave been associated with the University of Wisconsin-Madison,Madison, WI 53706. The corrected author and affiliation linesappear below.

E. J. Sambriskia,b, D. C. Schwartzc, and J. J. de Pabloa

aDepartment of Chemical and Biological Engineering, and cLaboratory forMolecular and Computational Genomics, Department of Chemistry, andLaboratory for Genetics, University of Wisconsin-Madison, Madison, WI53706; and bDepartment of Chemistry and Biochemistry, Delaware ValleyCollege, Doylestown, PA 18901

www.pnas.org/cgi/doi/10.1073/pnas.0912578106

www.pnas.org PNAS � December 8, 2009 � vol. 106 � no. 49 � 21007–21007

CORR

ECTI

ON