Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally...

6
Enhanced, targeted sampling of high-dimensional free- energy landscapes using variationally enhanced sampling, with an application to chignolin Patrick Shaffer a , Omar Valsson a,b , and Michele Parrinello a,b,1 a Department of Chemistry and Applied Biosciences, Eidgenössische Technische Hochschule Zurich and Facoltà di Informatica, Instituto di Scienze Computazionali, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland; and b National Center for Computational Design and Discovery of Novel Materials MARVEL, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland Contributed by Michele Parrinello, December 17, 2015 (sent for review October 5, 2015; reviewed by Aaron R. Dinner and Jim Pfaendtner) The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments there are still many problems which remain out of reach for these methods which has led to a vigorous effort in this area. One of the most important problems that remains unsolved is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free- energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin. enhanced sampling | protein folding | free-energy calculation | biomolecular simulation M olecular simulation has become an increasingly useful tool for studying complex transformations in chemistry, biology, and materials science. Such simulations provide extremely high spatial and temporal resolution but suffer from inherent time-scale limitations. Molecular dynamics simulations of biomolecules per- formed on standard hardware struggle to exceed the microsecond time scale. Modern, purpose-built hardware can reach the milli- second time scale (1), but these tools are not generally available and they still prove insufficient for many important problems (2). Widely used enhanced sampling methods also allow one to simulate processes with a time scale of milliseconds (3) and there is signifi- cant interest in extending these methods. One important class of enhanced sampling methods involves applying an external bias to one or more collective variables (CVs) of the system. Umbrella sampling and metadynamics fall into this category (4, 5), but there are many other examples (613). These methods are most effective when the transformations of interest are in some sense collective and can be described by a very small number of CVs. When the choice of CV is not obvious or the free-energy land- scape cannot be projected meaningfully on a low-dimensional manifold, new strategies are needed. Here we describe an enhanced sampling strategy that allows one to bias high-dimensional free- energy landscapes by exploiting the flexibility inherent in the re- cently introduced variationally enhanced sampling (VES) method (14). This method casts free-energy computation as a functional minimization problem, and it solves two important problems asso- ciated with high-dimensional free-energy computation. The first problem is that when exploring a high-dimensional space a system might never visit the regions of interest, but the variational principle introduced in ref. 14 permits us to specifically target the region we are interested in through the use of a so-called target distribution.The second problem is that one must adopt an approximation for the high-dimensional bias and the variational principle allows us to do this in an optimal way. Furthermore, as we show here, for the first time to our knowledge, choosing an intelligent target distri- bution, which includes any available a priori information, allows us to compensate for some of the variational flexibility that is lost when we adopt an approximate form for the bias. We apply this method to study the folding and unfolding of chignolin and show that we can accurately estimate the folding free energy in a small fraction of the computational time used by other methods. Background To put our method in context we will provide some background on the methods that have been explored for overcoming barriers in high-dimensional CV spaces. Most of these methods involve applying an external bias to some CVs of the system. Standard methods like metadynamics and umbrella sampling are effective for low-dimensional CV spaces and they achieve the dual goal of allowing the system to explore the CV space and reconstructing the free-energy surface accurately. Methods that attempt to sample high-dimensional CV spaces often sacrifice the second goal. Bias-exchange metadynamics (15), for example, biases a different CV in each of several replicas and then allows exchange moves between replicas. The output of such a calculation is several one-dimensional representations of the free-energy sur- face on each of the biased CVs. A related method is the recently introduced parallel bias metadynamics, which also produces many low-dimensional free-energy surfaces but eliminates the exchange moves (16). Methods like replica exchange with CV tempering (17) and biasing potential replica exchange molecular dynamics (18) rely on a replica exchange scheme to collect sta- tistics from an unbiased replica, and they use either a predefined or an iteratively constructed bias to help the system explore in biased replicas which can exchange with the unbiased replica. An alternative approach is to use some machine learning procedure to identify the important basins in a high-dimensional space, as is Significance The problem of sampling complex systems characterized by metastable states separated by kinetic bottlenecks is a universal one that has received much attention. Popular strategies involve applying a bias to a small number of collective variables. This strategy fails when the choice of collective variable is not clear or when the systems free energy cannot be projected meaningfully on a low-dimensional manifold. We use a variational principle that allows one to construct approximate but effective multidi- mensional biases. We show in a practical case that the method is much more efficient than alternative methods. We expect this method to be useful for a wide variety of sampling problems in chemistry, biology, and materials science. Author contributions: P.S., O.V., and M.P. designed research; P.S. performed research; P.S. analyzed data; and P.S., O.V., and M.P. wrote the paper. Reviewers: A.R.D., The University of Chicago; and J.P., University of Washington. The authors declare no conflict of interest. 1 To whom correspondence should be addressed. Email: [email protected]. This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1519712113/-/DCSupplemental. 11501155 | PNAS | February 2, 2016 | vol. 113 | no. 5 www.pnas.org/cgi/doi/10.1073/pnas.1519712113

Transcript of Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally...

Page 1: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhancedsampling, with an application to chignolinPatrick Shaffera, Omar Valssona,b, and Michele Parrinelloa,b,1

aDepartment of Chemistry and Applied Biosciences, Eidgenössische Technische Hochschule Zurich and Facoltà di Informatica, Instituto di ScienzeComputazionali, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland; and bNational Center for Computational Design and Discovery of NovelMaterials MARVEL, Università della Svizzera Italiana, CH-6900 Lugano, Switzerland

Contributed by Michele Parrinello, December 17, 2015 (sent for review October 5, 2015; reviewed by Aaron R. Dinner and Jim Pfaendtner)

The capabilities of molecular simulations have been greatly extendedby a number of widely used enhanced sampling methods thatfacilitate escaping frommetastable states and crossing large barriers.Despite these developments there are still many problems whichremain out of reach for these methods which has led to a vigorouseffort in this area. One of the most important problems that remainsunsolved is sampling high-dimensional free-energy landscapes andsystems that are not easily described by a small number of collectivevariables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previouslyintroduced variationally enhanced sampling, and we apply it to theminiprotein chignolin.

enhanced sampling | protein folding | free-energy calculation |biomolecular simulation

Molecular simulation has become an increasingly useful toolfor studying complex transformations in chemistry, biology,

and materials science. Such simulations provide extremely highspatial and temporal resolution but suffer from inherent time-scalelimitations. Molecular dynamics simulations of biomolecules per-formed on standard hardware struggle to exceed the microsecondtime scale. Modern, purpose-built hardware can reach the milli-second time scale (1), but these tools are not generally available andthey still prove insufficient for many important problems (2).Widely used enhanced sampling methods also allow one to simulateprocesses with a time scale of milliseconds (3) and there is signifi-cant interest in extending these methods. One important class ofenhanced sampling methods involves applying an external bias toone or more collective variables (CVs) of the system. Umbrellasampling and metadynamics fall into this category (4, 5), but thereare many other examples (6–13). These methods are most effectivewhen the transformations of interest are in some sense collectiveand can be described by a very small number of CVs.When the choice of CV is not obvious or the free-energy land-

scape cannot be projected meaningfully on a low-dimensionalmanifold, new strategies are needed. Here we describe an enhancedsampling strategy that allows one to bias high-dimensional free-energy landscapes by exploiting the flexibility inherent in the re-cently introduced variationally enhanced sampling (VES) method(14). This method casts free-energy computation as a functionalminimization problem, and it solves two important problems asso-ciated with high-dimensional free-energy computation. The firstproblem is that when exploring a high-dimensional space a systemmight never visit the regions of interest, but the variational principleintroduced in ref. 14 permits us to specifically target the region weare interested in through the use of a so-called “target distribution.”The second problem is that one must adopt an approximation forthe high-dimensional bias and the variational principle allows us todo this in an optimal way. Furthermore, as we show here, for thefirst time to our knowledge, choosing an intelligent target distri-bution, which includes any available a priori information, allows usto compensate for some of the variational flexibility that is lost when

we adopt an approximate form for the bias. We apply this methodto study the folding and unfolding of chignolin and show that wecan accurately estimate the folding free energy in a small fraction ofthe computational time used by other methods.

BackgroundTo put our method in context we will provide some backgroundon the methods that have been explored for overcoming barriersin high-dimensional CV spaces. Most of these methods involveapplying an external bias to some CVs of the system. Standardmethods like metadynamics and umbrella sampling are effectivefor low-dimensional CV spaces and they achieve the dual goal ofallowing the system to explore the CV space and reconstructingthe free-energy surface accurately. Methods that attempt tosample high-dimensional CV spaces often sacrifice the secondgoal. Bias-exchange metadynamics (15), for example, biases adifferent CV in each of several replicas and then allows exchangemoves between replicas. The output of such a calculation isseveral one-dimensional representations of the free-energy sur-face on each of the biased CVs. A related method is the recentlyintroduced parallel bias metadynamics, which also producesmany low-dimensional free-energy surfaces but eliminates theexchange moves (16). Methods like replica exchange with CVtempering (17) and biasing potential replica exchange moleculardynamics (18) rely on a replica exchange scheme to collect sta-tistics from an unbiased replica, and they use either a predefinedor an iteratively constructed bias to help the system explore inbiased replicas which can exchange with the unbiased replica. Analternative approach is to use some machine learning procedureto identify the important basins in a high-dimensional space, as is

Significance

The problem of sampling complex systems characterized bymetastable states separated by kinetic bottlenecks is a universalone that has received much attention. Popular strategies involveapplying a bias to a small number of collective variables. Thisstrategy fails when the choice of collective variable is not clear orwhen the system’s free energy cannot be projected meaningfullyon a low-dimensional manifold. We use a variational principlethat allows one to construct approximate but effective multidi-mensional biases. We show in a practical case that the method ismuch more efficient than alternative methods. We expect thismethod to be useful for a wide variety of sampling problems inchemistry, biology, and materials science.

Author contributions: P.S., O.V., and M.P. designed research; P.S. performed research; P.S.analyzed data; and P.S., O.V., and M.P. wrote the paper.

Reviewers: A.R.D., The University of Chicago; and J.P., University of Washington.

The authors declare no conflict of interest.1To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1519712113/-/DCSupplemental.

1150–1155 | PNAS | February 2, 2016 | vol. 113 | no. 5 www.pnas.org/cgi/doi/10.1073/pnas.1519712113

Page 2: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

done in reconnaissance metadynamics (19) or in ref. 20. Othermethods do without an external bias altogether, like temperature-accelerated molecular dynamics (MD) (21), in which an artificiallylarge friction is applied to a set of CVs to ensure that they evolveslowly compared with the atomic motions. This list is by no meansexhaustive; for a more thorough review see for example ref. 22, butthe sheer number of attempts to solve this problem testifies toits importance.

AlgorithmThe VES approach was introduced by Valsson and Parrinello inref. 14 and extensions have been described in refs. 23, 24. In thisapproach the system is biased by a potential V ðsÞ, which acts on aset of CVs s= ðs1, . . . , sdÞ and which is updated over the courseof the simulation to minimize a functional Ω½V �:

Ω½V �= 1βlog

Rds  e−β½FðsÞ+V ðsÞ�R

ds  e−βFðsÞ+

Zds  pðsÞV ðsÞ, [1]

where β= 1=kBT and FðsÞ is the free-energy surface correspondingto the set of CVs. pðsÞ in this expression is known as the targetdistribution and it can be either predefined or determined by someiterative procedure during the simulation (23, 24). As we will see,the target distribution is the distribution that the biased systemsamples in the long time limit and thus it allows one to emphasizewhich states one wishes to sample. This is distinct, however, fromchoosing a particular configuration and biasing the system towardthat configuration. The V ðsÞ that minimizes this functional obeysthe relationship

V ðsÞ=−FðsÞ− 1βlog  pðsÞ+C, [2]

where C is an unimportant constant. In an ergodic simulation bi-ased by V ðsÞ, the CVs sample the distribution PV ðsÞ∝ e−β½FðsÞ+V ðsÞ�and we see that PV ðsÞ= pðsÞ when the minimum is reached. Thefunctional Ω½V � is convex with respect to V ðsÞ which means thatthis is the global minimum. Minimizing this functional is equivalentto minimizing the Kullback–Leibler divergence between the sam-pled distribution and the target distribution (25–28). Eqs. 1 and 2define the variational principle that we make use of in this paper.To perform the minimization of Ω½V � it is generally most

convenient to expand V ðsÞ in a set of basis functions

V ðs;αÞ=Xk

αk fkðsÞ, [3]

for example, plane waves or products of Chebyshev polynomials, anduse the set of expansion coefficients αk as variational parameters.The approach we use to minimize Ω with respect to these parame-ters is identical to the approach described in ref. 14, and it is basedon the stochastic optimization method from ref. 29. We provide adetailed description in the Supporting Information.As a prototypical example of a high-dimensional free-energy

surface, we will consider the backbone dihedral angles of a protein.In particular, we will study chignolin, a small 10-residue-long protein.It must be said, however, that our method is completely general andafter appropriate adaptation it could be applied to many otherproblems such as, for instance, nucleation and ligand binding. Theadvantage of studying chignolin is that its experimental structure isknown (30) and it has been simulated for 100 μs using a specialpurpose machine (1), in addition to many other simulation studies(e.g., refs. 31–33), which provides us with good reference data. In ref.1 chignolin was studied very close to the folding temperature, and afolding time of ≈0.6 μs and an unfolding time of ≈2.2 μs weremeasured. These time scales put the study of this protein outside therange of ordinary MD simulations on standard hardware.

A complete characterization of the ensemble of conformationsof chignolin would require calculating the free energy as a functionof all of its backbone dihedral angles. In general there are twohighly flexible dihedral angles per residue, ϕ and ψ , with a fewexceptions like proline in which only the ψ angle can truly rotate,and the N-terminal and C-terminal residues which only have oneflexible dihedral angle. For simplicity we will refer to the set of Nbiased dihedral angles as θi. In total there are 17 dihedral anglesthat determine the structure of chignolin, which means that astraightforward calculation of the free-energy surface is clearlyimpossible and also finding a restricted number of CVs capable ofdescribing the conformational manifold is challenging. Thus, tosolve this multidimensional problem we shall use an approximatebut computationally tractable expression for V ðsÞ, where s is the setof 17 dihedral angles. This is in spirit very similar to what is done inquantum chemistry when deriving the Hartree–Fock equations(34). Namely, it is first assumed that the wave function has the formof a single Slater determinant. Then use is made of the Rayleigh–Ritz variational principle to derive the equations whose solutionwill give the optimal determinant.In our case however we can add to the power of the variational

principle the flexibility that comes from the freedom of choosingthe target distribution pðsÞ. To better understand this point let usconsider the case in which pðsÞ is the equilibrium distributionpðsÞ= exp½−βFðsÞ�=Z. In such a case the functional is minimumfor V ðsÞ= 0 and FðsÞ=−ð1=βÞlog pðsÞ if we ignore irrelevantconstants. Of course FðsÞ is the very quantity we want to evaluateand it is not available but we can imagine having an approxi-mation FoðsÞ=−ð1=βÞlog pðsÞ and in this case at the minimumthe variational principle gives

FðsÞ=−V ðsÞ+FoðsÞ. [4]

This equation is equivalent to Eq. 2 but it has been recast in aform that allows one to view the bias V ðsÞ as the correction to theinitial estimate. There is a conceptual as well as a technical con-sequence of this observation. From a conceptual point of view,sampling can be greatly accelerated with an adroit choice of pðsÞthat captures as much as possible any a priori information we haveon the system. From a technical point of view the requirements onthe variational flexibility are much less stringent because V ðsÞ onlyneeds to correct the errors in FoðsÞ, especially those errors aroundthe minimum. Thus, even an approximate form of V ðsÞ can still becapable of correcting FoðsÞ.

Fig. 1. Time series of the Cα RMSD from the reference structure forchignolin during the optimization process. We see several transitions be-tween the folded and the unfolded state of the protein.

Shaffer et al. PNAS | February 2, 2016 | vol. 113 | no. 5 | 1151

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Page 3: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

Let us illustrate this point on chignolin. This small protein free-energy landscape is characterized by two basins, the folded stateand the unfolded one. In the first, the protein is folded in astructure that can be described by a set of dihedral angles (θi)about which the protein oscillates. The other basin is the unfoldedensemble in which no definite stable structure can be identified.Thus, to assist sampling we construct a pðsÞ that reflects these twofeatures. We write pðsÞ as a sum of two contributions:

pðsÞ=wf pf ðsÞ+wupuðsÞ, [5]

where the positive weights, wf and wu, are such that wf +wu = 1, andpf ðsÞ and puðsÞ are distributions that describe the folded state andthe unfolded state, respectively. In building pf ðsÞ and puðsÞ we haveto keep in mind that it is computationally advantageous to be able toeasily calculate the term hfiðsÞipðsÞ which appears on the right-handside of Eq. S1. Possibly the simplest pf ðsÞ= pf ðθ1, . . . , θNÞ that sat-isfies these requirements is obtained by writing pf ðθ1, . . . , θNÞ as aproduct of functions localized around the reference angles θi. Thevon Mises function, which is the periodic analog of the Gaussianfunction, lends itself to this use. This leads to

pf ðsÞ= pf ðθ1, . . . , θNÞ=Πieκ cosðθi−θiÞ2πI0ðκÞ , [6]

where I0ðκÞ is the modified Bessel function of order zero and eachfunction in the product is localized around θi and has variance 1=κ.We have chosen κ= 5, which corresponds roughly to the magnitudeof fluctuations in the folded state. To assist sampling of the un-folded state we simply use the uniform distribution:

puðθ1, . . . , θNÞ= 1

ð2πÞN . [7]

If the relative size of the two basins is known one could choosethe weights accordingly. For instance, based on the results fromref. 1 one could have chosen wf = 2=3 and wu = 1=3 to betterreflect their results. However, to show the robustness of ourprocedure we have made the blind choice wf =wu = 1=2. In whatfollows, we will discuss briefly how different choices of wf and wucan affect the rate of convergence and the sampling.The next step is to simplify the expression for the multidimen-

sional V ðsÞ. In this work we express the bias as the sum of two bodybiases where only dihedrals that are next-nearest neighbors alongthe peptide chain are explicitly correlated:

V ðθ1, . . . , θNÞ|V1ðθ1, θ2Þ+V2ðθ2, θ3Þ+ . . .VN−1ðθN−1, θNÞ, [8]

where each term is expanded as in Eq. 2. This form is inspired bythe Ising model and takes advantage of the linear structure of theprotein. The long-range correlations that are not explicitly takeninto account are introduced indirectly through our choice of pðsÞ.This model does assume that long-range correlations are negligiblewhen the protein is not near the folded state. This may seem like asevere approximation but recent work has demonstrated that anoptimal causality model for the Ala-10 peptide only includes cor-relations that are local in time and space (35). The computationaladvantages of this form are evident. If we let M be the size of thebasis set that is needed to express the angular dependence of eachindividual dihedral angle, then the number of basis functions re-quired to represent the left-hand side above is M17 whereas thenumber of basis functions required to represent the approximateright-hand side is reduced to manageable 16M2.

ResultsFig. 1 shows a time series of the root-mean-square displacement(RMSD) from the folded state reference structure for chignolin.

This time series was taken from a simulation in which the bias wasbeing optimized to sample the target distribution described aboveand it shows the protein going back and forth between folded andunfolded states. The protein was initiated in an extended state butin less than 300 ns we see at least four transitions between theunfolded state and the folded state. The ability to generate thesetransitions underlies any study of the folding transition, whetherthey be studies of thermodynamics or kinetics. In the SupportingInformation we demonstrate that the bias converges over this timescale, indicating that the functional Ω½V � has been minimized withrespect to the variational parameters. This behavior is somewhatsensitive to the exact choice of wf and wu. In fact, one of the mainresults of the paper is to illuminate the role that pðsÞ can have inassisting and guiding sampling. If we choose wf = 0 (which is justtargeting a uniform distribution in the dihedral angles), we find thatthe system explores the unfolded state thoroughly but never findsthe folded state. Higher values of wf can assist the system in findingthe folded state faster. In the Supporting Information we show a timeseries for wf = 0.85 and we see that in this case the rate of foldingis faster.This approach gives us various ways to study the physics of

chignolin. One interesting possibility involves examining the high-dimensional free-energy surface itself. In the present study, how-ever, we will focus on extracting information from the unbiasedensemble on CVs that we have not directly biased. Recovering thisinformation directly from the optimization simulations can be adifficult reweighting problem. One simple way to sidestep thisproblem is to use a replica exchange (RE) scheme. RE, in particularparallel tempering, is an extremely popular method for enhancingsampling in simulations (36, 37). Conventional parallel tempering isplagued by a few important problems that limit its applicability. Thegoal in parallel tempering is to have a sufficiently high-temperaturereplica where the system can explore ergodically and easily sur-mount the important barriers, and then have a ladder of replicasthat allow exchanges between low-temperature and high-tem-perature systems. If the barriers between the states of interest are

Fig. 2. Estimates of the folding free energy as a function of simulation timefrom VES-RE and unbiased MD. The quantity ΔF is calculated according toEq. S1 as described in the Supporting Information. The red line is from the100-μs unbiased MD trajectory from ref. 1, the blue line is from VES-RE, andthe horizontal black line is the converged value from the unbiased MDsimulation for reference. The time scale for the unbiased MD simulationfrom ref. 1 is displayed on the top x axis and the time scale for VES-RE isdisplayed on the bottom x axis; arrows point to the appropriate axes. Thefirst 50 ns of the VES-RE simulation was considered an equilibration periodand was not used to estimate the folding free energy. To make an exactcomparison of the computational resources used, one must consider thatthere were 12 replicas required for VES-RE.

1152 | www.pnas.org/cgi/doi/10.1073/pnas.1519712113 Shaffer et al.

Page 4: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

primarily energetic this can be a very useful approach, but if thebarriers are largely entropic then the height of these barriers canincrease as the temperature increases and sampling can becomeeven less efficient than it is at the reference temperature; further-more, some of the basins may diminish or even disappear at thehigher temperature. This problem of interesting basins gettingwashed out is also a potential problem for RE with CV tempering(17) and biasing potential RE MD (18), which we discussed above.Ultimately, for an RE procedure to truly improve upon the effi-ciency of a standard simulation with comparable computationalresources, it must lower the roundtrip time for exploring the basinsof interest; this is a criterion which is often not demonstrated forsuch simulations (38). In the VES-RE approach, this is not aproblem because under the combined action of V ðsÞ and pðsÞ wecan target specifically the conformational changes of interest.The details of the RE we use are described in Materials and

Methods. We will refer to this approach as VES-RE. In Fig. 2 weshow how an estimate of the folding free energy obtained from thelowest, unbiased replica converges over simulation time. The waywe obtain this estimate is described in the Supporting Information.For comparison we also show the same quantity calculated fromthe 100-μs unbiased MD trajectory from ref. 1. Results from

VES-RE are shown for a set of replicas which were initiated in thefolded state. Within 300 ns per replica the folding free energy hasconverged almost exactly to the result obtained from the unbiasedMD trajectory, with small fluctuations on the order of ∼0.5 kJ/mol.For the unbiased MD simulation it takes nearly the entire 100 μsto obtain a converged result. This is an order of magnitudeimprovement in efficiency and it is very likely that it could beimproved further by, for example, using information from the biasedreplicas. One of the most important tests of convergence for asimulation that samples a complex transformation is how sensitivethe results are to initial conditions. To rule out such a dependenceon initial conditions we also did a calculation in which all replicaswere initiated in an extended state and we find that the estimate ofthe folding free energy from this calculation agrees very well with theresult in Fig. 2. These results are presented in Fig. 3A. For com-parison with a standard method, we have also performed a paralleltempering simulation in the well-tempered ensemble (PT-WTE) (39,40), which we describe in the Supporting Information. These resultsare shown in Fig. 3B, where it is clear that 300 ns per replica is notsufficient to obtain a converged estimate of the folding free energyfrom PT-WTE. In the Supporting Information we show thattransitions between the folded and the unfolded state areextremely rare for continuous trajectories from PT-WTE, i.e.,those that are reconstructed from jumps between the differentreplicas, which is crucial for efficient RE simulations.We can also measure free-energy surfaces of any order parameter

we like from the unbiased replica in VES-RE. These are calculatedby using a histogram to estimate the probability distribution of

A

B

Fig. 3. Convergence of the estimate of the folding free energy from VES-RE(A) and PT-WTE (B). We show results from simulations in which all replicaswere initiated in the folded state (solid lines) and simulations in which allreplicas were initiated in the unfolded state (dashed lines). The horizontalblack line indicates the converged result from 100-μs unbiased MD (1). Within300 ns of simulation time per replica, both of the VES-RE calculations and theresult from 100-μs unbiased MD agree with each other within 0.5 kJ/mol,indicating that we can accurately estimate the folding free energy this wayregardless of initial conditions. The PT-WTE calculations clearly have notconverged in this time scale.

A

B

Fig. 4. Comparison between free-energy surfaces as a function of the CαRMSD from the reference structure and the radius of gyration of theα-carbon atoms obtained from 100-μs unbiased MD trajectory from ref. 1(A) and VES-RE (B). The folded state minimum in these surfaces is set to zeroand we only display features up to 20 kJ/mol.

Shaffer et al. PNAS | February 2, 2016 | vol. 113 | no. 5 | 1153

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY

Page 5: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

the order parameter (nðsÞ), and then using the relationshipFðsÞ=−ð1=βÞlog nðsÞ. In Fig. 4 we show the free-energy surfaceas a function of the Cα RMSD from the reference structure andthe radius of gyration of the α-carbon atoms. For comparison weshow the same free-energy surface estimated from the 100-μsunbiased MD trajectory. We find excellent agreement betweenthe two methods. In particular, all of the regions below ∼15 kJ/molare very well sampled. The small discrepancies in the transitionstate regions are due to difficulties that RE methods have insampling the very low probability regions around barriers.

ConclusionsWe have presented results that illustrate some of the uniquecapabilities of the VES method introduced in ref. 14. We haveshown that we can bias high-dimensional systems by choosing anappropriate, simplified representation of the bias and by makingan intelligent choice of target distribution. This makes it possibleto generate biasing potentials that efficiently and reversibly drivethe system back and forth between the folded state and a dis-ordered state of a small protein. This capability is extremelyimportant for any study of the folding transition, and we haveshown that it forms the basis for an efficient RE procedure thatallows one to obtain statistics of the unbiased ensemble. Statisticsgenerated in this way reproduce those obtained from 100-μs MDsimulations on specialized hardware with extremely high accuracy ata fraction of the computational cost. The tradeoff for this dramaticreduction in computational resources is that we must make use ofsome information about the equilibrium structure in designing thetarget distribution, and that all dynamical information is lost in theRE procedure. Recent improvements in extracting rate constantsfrom biased simulations (24, 41) may make it possible to overcomethe latter limitation.Foreseeable extensions of the methods we have described here

have many exciting and interesting applications. In particular, theidea of using the target distribution to sample transitions between alocalized state and a disordered state could be extremely useful forproblems like ligand binding and unbinding. Accurately estimatingthe binding free energy of ligands is an extremely important andstill very difficult calculation. We also imagine combining thismethod with tools like Rosetta which can predict protein structurebased on the primary sequence (42). Another avenue we have notexplored in this work is studying the high-dimensional free-energysurface itself. Accurate, high-dimensional free-energy surfacescould be an ideal way to construct coarse-grained models, and theycould form the basis for novel studies of protein physics. The free-energy landscape of a protein must have several unique propertiesso that the protein does not get trapped in metastable states, and sothat it can find the folded state in minimal time. Much of ourcurrent understanding of protein-folding free-energy landscapes is

informed by studies of simplified, lattice-based models (43), andthe approach we describe here might complement such studies.

Materials and MethodsMD Simulations. All simulations of chignolin (sequence TYR-TYR-ASP-PRO-GLU-THR-GLY-THR-TRP-TYR) were conducted with GROMACS 4.6.5 (44) with amodified version of the PLUMED 2 plugin (45). The CHARMM22* force field (46)and the three-site transferrable intermolecular potential (TIP3P) water model (47)were used to make direct comparisons with ref. 1. ASP and GLU residues weresimulated in their charged states, as were the N- and C-terminal amino acids.A timestep of 2 fs was used for all systems and constant temperature of 340 K(in agreement with ref. 1) was maintained by the velocity rescaling thermostatof Bussi et al. (48). All bonds involving H atoms were constrained with the linearconstraint solver (LINCS) algorithm (49). Electrostatic interactions were calcu-lated with the particle mesh Ewald scheme using a Fourier spacing of 0.12 nm anda real space cutoff of 0.95 nm (50). The same real space cutoff was alsoapplied to van der Waals interactions. Initial configurations were generatedby running a short 1-ns constant pressure (NPT) simulation with the proteinin the folded state to allow the box size to equilibrate using the Parrinello–Rahman barostat (51). Subsequent runs were all conducted with fixed boxsize. The final cubic box had a side length of 4.044 nm and it had 2,122 watermolecules, along with 2 sodium ions to neutralize the charge.

RE Procedure with VES Optimized Bias. The RE procedure used is similar in spiritto Hamiltonian RE, solute tempering, and RE with CV tempering (17, 52, 53).In each replica the system is biased by the converged bias potential from theoptimization run, scaled by a constant factor. The bias applied on replica i isð1− ð1=γiÞÞVðsÞ, where γi determines the strength of the bias applied. In thelowest replica, γi =1 and the system is unbiased. γi then increases geo-metrically as one goes up the ladder of replicas. For the chignolin systemstudied here we used 12 replicas and the highest replica had γ = 18. Ex-change moves were attempted every 20 ps and accepted or rejectedaccording to a Metropolis criterion. To satisfy detailed balance an exchangemove must have an acceptance probability:

acc=min�1, e−Δ

�, [9]

where

Δ= β�Vi�sj�+VjðsiÞ−ViðsiÞ+Vj

�sj��, [10]

where si is the value of the order parameters in ensemble i and Vi is the biasused in ensemble i. In principle one could also use information extractedfrom the biased replicas to speed up the convergence of the free-energysurfaces shown in Fig. 4. One could also imagine optimizing the choices of γithe way that Trebst et al. optimize the choice of temperatures for standardparallel tempering (54) but neither of these improvements was necessary forthe present study.

ACKNOWLEDGMENTS. We acknowledge D. E. Shaw Research for sharing datafrom the simulations of chignolin. P.S., O.V., and M.P. acknowledge fundingfrom the National Center for Computational Design and Discovery of NovelMaterials MARVEL and the European Union Grant ERC-2009-AdG-247075.

1. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) How fast-folding proteins fold.

Science 334(6055):517–520.2. Pan AC, Borhani DW, Dror RO, Shaw DE (2013) Molecular determinants of drug-

receptor binding kinetics. Drug Discov Today 18(13-14):667–673.3. Tiwary P, Limongelli V, Salvalaglio M, Parrinello M (2015) Kinetics of protein-ligand

unbinding: Predicting pathways, rates, and rate-limiting steps. Proc Natl Acad Sci USA

112(5):E386–E391.4. Laio A, Parrinello M (2002) Escaping free-energy minima. Proc Natl Acad Sci USA

99(20):12562–12566.5. Torrie G, Valleau J (1977) Nonphysical sampling distributions in Monte Carlo free-

energy estimation: Umbrella sampling. J Comput Phys 23(2):187–199.6. Huber T, Torda AE, van Gunsteren WF (1994) Local elevation: A method for improving

the searching properties of molecular dynamics simulation. J Comput Aided Mol Des

8(6):695–708.7. Darve E, Pohorille A (2001) Calculating free energies using average force. J Chem Phys

115(20):9169–9183.8. Hansmann UHE, Wille LT (2002) Global optimization by energy landscape paving. Phys

Rev Lett 88(6):068105.9. Wang F, Landau DP (2001) Efficient, multiple-range random walk algorithm to cal-

culate the density of states. Phys Rev Lett 86(10):2050–2053.

10. Barducci A, Bussi G, Parrinello M (2008) Well-tempered metadynamics: A smoothly

converging and tunable free-energy method. Phys Rev Lett 100(2):020603.11. Maragakis P, van der Vaart A, Karplus M (2009) Gaussian-mixture umbrella sampling.

J Phys Chem B 113(14):4664–4673.12. Whitmer JK, et al. (2015) Sculpting bespoke mountains: Determining free energies

with basis expansions. J Chem Phys 143(4):044101.13. Bartels C, Karplus M (1997) Multidimensional adaptive umbrella sampling: Applica-

tions to main chain and side chain peptide conformations. J Comput Chem 18(12):

1450–1462.14. Valsson O, Parrinello M (2014) Variational approach to enhanced sampling and free

energy calculations. Phys Rev Lett 113(9):090601.15. Piana S, Laio A (2007) A bias-exchange approach to protein folding. J Phys Chem B

111(17):4553–4559.16. Pfaendtner J, Bonomi M (2015) Efficient sampling of high-dimensional free-energy

landscapes with parallel bias metadynamics. J Chem Theory Comput 11(11):5062–5067.17. Gil-Ley A, Bussi G (2015) Enhanced conformational sampling using replica exchange

with collective-variable tempering. J Chem Theory Comput 11(3):1077–1085.18. Kannan S, Zacharias M (2007) Enhanced sampling of peptide and protein con-

formations using replica exchange simulations with a peptide backbone biasing-

potential. Proteins 66(3):697–706.

1154 | www.pnas.org/cgi/doi/10.1073/pnas.1519712113 Shaffer et al.

Page 6: Enhanced, targeted sampling of high-dimensional free ... · energy landscapes using variationally enhanced ... we adopt an approximate form for ... free-energy landscapes using variationally

19. Tribello GA, Ceriotti M, Parrinello M (2010) A self-learning algorithm for biasedmolecular dynamics. Proc Natl Acad Sci USA 107(41):17509–17514.

20. Chen M, Yu TQ, Tuckerman ME (2015) Locating landmarks on high-dimensional freeenergy surfaces. Proc Natl Acad Sci USA 112(11):3235–3240.

21. Abrams CF, Vanden-Eijnden E (2010) Large-scale conformational sampling of proteinsusing temperature-accelerated molecular dynamics. Proc Natl Acad Sci USA 107(11):4961–4966.

22. Abrams C, Bussi G (2014) Enhanced sampling in molecular dynamics using metady-namics, replica-exchange, and temperature-acceleration. Entropy (Basel) 16(1):163–199.

23. Valsson O, Parrinello M (2015) Well-tempered variational approach to enhancedsampling. J Chem Theory Comput 11(5):1996–2002.

24. McCarty J, Valsson O, Tiwary P, Parrinello M (2015) Variationally optimized free-energy flooding for rate calculation. Phys Rev Lett 115(7):070601.

25. de Boer PT, Kroese DP, Mannor S, Rubinstein RY (2005) A tutorial on the cross-entropymethod. Ann Oper Res 134(1):19–67.

26. Chaimovich A, Shell MS (2011) Coarse-graining errors and numerical optimizationusing a relative entropy framework. J Chem Phys 134(9):094112.

27. Bilionis I, Koutsourelakis PS (2012) Free energy computations by minimization ofKullback-Leibler divergence: An efficient adaptive biasing potential method forsparse representations. J Comput Phys 231(9):3849–3870.

28. Zhang W, Wang H, Hartmann C, Weber M, Schutte C (2014) Applications of the cross-entropy method to importance sampling and the optimal control of diffusions. SIAMJ Sci Comput 36(6):2654–2672.

29. Bach F, Moulines E (2013) Non-strongly-convex smooth stochastic approximation withconvergence rate O (1/n). Advances in Neural Information Processing Systems 26 (NIPS2013), eds Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (CurranAssociates, Inc., Red Hook, NY), pp 773–781.

30. Honda S, et al. (2008) Crystal structure of a ten-amino acid protein. J Am Chem Soc130(46):15327–15331.

31. Miao Y, Feixas F, Eun C, McCammon JA (2015) Accelerated molecular dynamics sim-ulations of protein folding. J Comput Chem 36(20):1536–1549.

32. Zhang T, Nguyen PH, Nasica-Labouze J, Mu Y, Derreumaux P (2015) Folding atomisticproteins in explicit solvent using simulated tempering. J Phys Chem B 119(23):6941–6951.

33. Kührová P, De Simone A, Otyepka M, Best RB (2012) Force-field dependence ofchignolin folding and misfolding: Comparison with experiment and redesign. BiophysJ 102(8):1897–1906.

34. Szabo A, Ostlund N (1982) Modern Quantum Chemistry: Introduction to AdvancedElectronic Structure Theory (Macmillan, New York).

35. Gerber S, Horenko I (2014) On inference of causality for discrete state models in amultiscale context. Proc Natl Acad Sci USA 111(41):14651–14656.

36. Hansmann UHE (1997) Parallel tempering algorithm for conformational studies ofbiological molecules. Chem Phys Lett 281:140–150.

37. Sugita Y, Okamoto Y (1999) Replica exchange molecular dynamics method for proteinfolding simulation. Chem Phys Lett 314:141–151.

38. Rosta E, Hummer G (2009) Error and efficiency of replica exchange molecular dy-

namics simulations. J Chem Phys 131(16):165102.39. Bonomi M, Parrinello M (2010) Enhanced sampling in the well-tempered ensemble.

Phys Rev Lett 104(19):190601.40. Deighan M, Bonomi M, Pfaendtner J (2012) Efficient simulation of explicitly solvated

proteins in the well-tempered ensemble. J Chem Theory Comput 8(7):2189–2192.41. Tiwary P, Parrinello M (2013) From metadynamics to dynamics. Phys Rev Lett 111(23):

230602.42. Rohl CA, Strauss CEM, Misura KMS, Baker D (2004) Protein structure prediction using

Rosetta. Methods Enzymol 383:66–93.43. Dinner AR, �Sali A, Smith LJ, Dobson CM, Karplus M (2000) Understanding protein

folding via free-energy surfaces from theory and experiment. Trends Biochem Sci

25(7):331–339.44. Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) GRGMACS 4: Algorithms for

highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory

Comput 4(3):435–447.45. Tribello GA, Bonomi M, Branduardi D, Camilloni C, Bussi G (2014) PLUMED 2: New

feathers for an old bird. Comput Phys Commun 185(2):604–613.46. Piana S, Lindorff-Larsen K, Shaw DE (2011) How robust are protein folding simula-

tions with respect to force field parameterization? Biophys J 100(9):L47–L49.47. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison

of simple potential functions for simulating liquid water. J Chem Phys 79(2):926.48. Bussi G, Donadio D, Parrinello M (2007) Canonical sampling through velocity rescal-

ing. J Chem Phys 126(1):014101.49. Hess B, Bekker H, Berendsen HJC, Fraaije JGEM (1997) LINCS: A linear constraint solver

for molecular simulations. J Comput Chem 18(12):1463–1472.50. Essmann U, et al. (1995) A smooth particle mesh Ewald method. J Chem Phys 103:

8577–8593.51. Parrinello M (1981) Polymorphic transitions in single crystals: A new molecular dy-

namics method. J Appl Phys 52(12):7182.52. Liu P, Kim B, Friesner RA, Berne BJ (2005) Replica exchange with solute tempering: A

method for sampling biological systems in explicit water. Proc Natl Acad Sci USA

102(39):13749–13754.53. Sugita Y, Okamoto Y (2000) Replica-exchange multicanonical algorithm and multi-

canonical replica-exchange method for simulating systems with rough energy land-

scape. Chem Phys Lett 329(October):261–270.54. Trebst S, Troyer M, Hansmann UHE (2006) Optimized parallel tempering simulations

of proteins. J Chem Phys 124(17):174903.55. Raiteri P, Laio A, Gervasio FL, Micheletti C, Parrinello M (2006) Efficient reconstruction

of complex free energy landscapes by multiple walkers metadynamics. J Phys Chem B

110(8):3533–3539.56. Prakash MK, Barducci A, Parrinello M (2011) Replica temperatures for uniform ex-

change and efficient roundtrip times in explicit solvent parallel tempering simula-

tions. J Chem Theory Comput 7(7):2025–2027.

Shaffer et al. PNAS | February 2, 2016 | vol. 113 | no. 5 | 1155

CHEM

ISTR

YBIOPH

YSICSAND

COMPU

TATIONALBIOLO

GY