Force mode atomic force microscopy as a tool for protein folding studies


Analytica Chimica Acta 479 (2003) 87–105


Robert B. Best a,1, David J. Brockwell b,1, José L. Toca-Herrera a,2, Anthony W. Blake b, D. Alastair Smith c, Sheena E. Radford b, Jane Clarke a,∗

a MRC Centre for Protein Engineering, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

b School of Biochemistry and Molecular Biology, University of Leeds, Leeds LS2 9JT, UK

c Department of Physics and Astronomy, University of Leeds, Leeds LS2 9JT, UK

Received 4 October 2002; accepted 16 October 2002

Abstract

The advent of a new class of force microscopes designed specifically to “pull” biomolecules has allowed non-specialists to use force microscopy as a tool to study single-molecule protein unfolding. This powerful new technique has the potential to explore regions of the protein energy landscape that are not accessible in conventional bulk studies. It has the added advantage of allowing direct comparison with single-molecule simulation experiments. However, as with any new technique, there is currently no well described consensus for carrying out these experiments. Adoption of standard schemes of data selection and analysis will facilitate comparison of data from different laboratories and on different proteins. In this review, some guidelines and principles, which have been adopted by our laboratories, are suggested. The issues associated with collecting sufficient high quality data and the analysis of those data are discussed. In single-molecule studies, there is an added complication since an element of judgement has to be applied in selecting data to analyse; we propose criteria to make this process more objective. The principal sources of error are identified and standardised methods of selecting and analysing the data are proposed. The errors associated with the kinetic parameters obtained from such experiments are evaluated. The information that can be obtained from dynamic force experiments is compared, both quantitatively and qualitatively to that derived from conventional protein folding studies. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: AFM; Biomolecules; Protein folding; Titin; Mechanical unfolding

1. Introduction

In general terms, the mechanism by which proteins fold to their native state is now quite well understood. The process is known to be highly cooperative, and protein engineering studies, in combination with conventional kinetic experiments, allow the characterisation of intermediates and transition states along the folding pathway. Due to the enormous number of degrees of freedom of the polypeptide chain, folding is understood to occur on a complex multidimensional energy landscape which is biased towards the native state (the energy minimum). However, conventional methods such as stopped–flow kinetics have disadvantages in terms of characterising this landscape. Because these are bulk experiments, only ensemble averages can be observed, so that possible alternative pathways can only be inferred indirectly from the kinetics. Furthermore, using chemical denaturants to perturb the system may only explore a part of the energy landscape; in some cases this may not be the most functionally relevant region. Therefore, single-molecule experiments, such as the recently developed force mode atomic force microscopy (AFM) technique, have an important and complementary role to play in the study of protein folding and unfolding.

∗ Corresponding author. E-mail address: [email protected] (J. Clarke).

1 These two authors contributed equally to this work.

2 Present address: Center for Ultrastructure Research, Universität für Bodenkultur Wien, Gregor Mendel Str. 33, A-1180 Vienna, Austria.

0003-2670/03/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. doi:10.1016/S0003-2670(02)01572-6

The development of the atomic force microscope [1] has permitted the measurement of forces in the pN–nN range with high resolution in a realistic solvent environment. The key component is a micro-fabricated cantilever between 20 and 300 µm in length, at the end of which is a sharpened tip (radius approximately 10 nm). Interaction between the tip and sample gives rise to a force which can be measured from the resultant cantilever deflection. The “force mode” version of the instrument measures interaction forces involving single molecules: the tip is moved vertically at constant velocity, mapping the force exerted by the interaction as a function of vertical displacement [2]. Such an approach has been used in a number of biological systems including the binding of antibodies to their antigens [3], ligands to receptors [4], the binding forces of complementary strands of DNA [5] and conformational changes in biological polymers [6]. More recently, however, force mode AFM has been used to measure the response of folded proteins to mechanical stress in so-called mechanical unfolding experiments. In these experiments, a piezoelectric positioner (allowing sub-nanometre distance resolution) is used to drive the tip of the cantilever onto a surface on which small amounts of protein have been immobilised (Fig. 1). The tip is then retracted at a set rate (‘pulling speed’), exerting an increasing force on protein molecules that adhere to the tip. At a certain force the protein will unfold. The rapid increase in distance obtained by unfolding a globular protein causes the cantilever to spring back close to its resting position. Repeating this for n protein units or ‘domains’ in series in a polymer thus gives rise to the now familiar ‘saw-toothed’ pattern, typical of mechanical unfolding of polyprotein polymers (Fig. 1).

Fig. 1. Cartoon showing unfolding of a single polyprotein molecule: (bottom) the protein is immobilised on the gold surface and (top) non-specifically adsorbed onto the cantilever tip. Upon retraction of the tip the polyprotein is extended causing the force to rise from a value close to zero (A) to one at which a domain suddenly unfolds (B). The sudden increase in length rapidly reduces the measured force (C). Further retraction causes the unfolded domain to extend and the force increases gradually (D) up to a point where the next domain unfolds. The process is repeated until all the domains are unfolded and the protein–tip interaction is broken. Reprinted from Smith and Radford, Protein folding: pulling back the frontiers, Curr. Biol. 10 (2000) R662–R664, with permission from Elsevier Science.

Force mode AFM previously required the construction of dedicated instruments or extensive hardware modifications to scanning probe AFM instruments to obtain sufficient force sensitivity, reduce optical interference and allow spring constant calibration. As a consequence, these experiments were mostly the preserve of specialist microscopists. Recently, however, dedicated force probes have become commercially available, making this technique accessible to the non-specialist. At the same time, the ever expanding toolbox available to molecular biologists has facilitated the production of biomolecules tailor-made for mechanical unfolding studies, resulting in a rapid growth in applications of force mode AFM to biological problems. Additionally, there are other methods of measuring force, such as the biomembrane force probe (BFP) in which force is measured by the distortion of a membrane capsule [7], optical tweezers where the applied force is measured from the change in momentum of light refracted by the bead in the optical trap or, alternatively, from the displacement of the bead from the centre of the trap [8], and magnetic tweezers, where the force is found from the magnetic field required to move a paramagnetic bead [9]. The use of these techniques to map the strength of interactions under different loading conditions is often referred to as (dynamic) force spectroscopy. Some of these techniques have advantages over force mode AFM, particularly in terms of their sensitivity and signal to noise ratio; however, these technologies are still in their infancy and not available to the non-specialist since there are no commercially available instruments at the time of writing.

The appeal of force spectroscopy, when applied to the study of proteins involved in muscle contraction and cell–cell interactions, is self-evident, as force plays a central role in these functions [10–15]. At a more fundamental level, the effect of mechanical deformation on protein structure and function is beginning to reveal new insights into the elasticity of proteins as polymers and the origins of the mechanical resistance of proteins.

In this article, we review the new and exciting field of single-molecule mechanical unfolding of protein polymers. Our aim is two-fold. Firstly, we describe methods and protocols, together with experimental strategies and recommendations, developed in our laboratories, that simplify the construction of multimeric proteins and the measurements of their mechanical properties using the AFM. Secondly, we set down a set of criteria that we consider should be met in obtaining, manipulating and quantifying data obtained by this method. The processes of obtaining force–distance curves, measuring the unfolding force and forming frequency histograms are described. Issues such as hit-rate and quality of the data, criteria for selecting peaks and the choice of bin and sample sizes are discussed. We also suggest possible sources of error and the best way to analyse data obtained on different days using different cantilevers. By setting out these benchmarks we hope to enable the comparison of data obtained by mechanical unfolding experiments in different laboratories and by different experimentalists to be as robust as that obtained using traditional chemical unfolding–refolding methods. Excellent reviews of the results obtained using force mode AFM to date and the implications of these data for understanding the mechanical properties of proteins have been published previously [16,17].

2. Engineering tailor-made molecules for mechanical unfolding studies

The first mechanical unfolding studies of a protein using an AFM were attempted by attaching a single thiol-derivatised protein between the tip and surface, both of which were coated with gold [18]. Whilst unfolding events were observed, the traces were difficult to interpret: firstly, at high concentrations of protein many molecules were caught between the tip and surface, leading to a degeneracy of peaks and, secondly, complications in the data arose from tip–surface and tip–protein interactions. Later refinements of the experiment thus included steps that allowed the protein of interest to be held away from the surface. In general, this can be achieved either by placing a single protein module between synthetic linkers and using the latter to attach the protein to both the surface and the tip [19,20], by using naturally occurring chains of protein domains, such as titin, tenascin or spectrin [11,21,22], or by creating tailor-made polyproteins of choice using molecular biological approaches [23–26]. Whilst natural polyproteins are fascinating biological entities [27], these proteins usually comprise a series of different protein domains linked in tandem which vary greatly in their biophysical properties [28]. For example, the muscle protein titin contains more than 300 different, albeit structurally related, protein domains [29]. Detailed analysis of the mechanical strength of an individual domain thus requires a simpler homopolymer to be constructed from a single domain type. Such an approach was first adopted by Fernandez and co-workers, in which a multi-modular protein was created from a single domain of titin (TI I27) by concatemerisation of a suitably engineered TI I27 gene [23]. As well as containing suitable linker regions, this construct was also designed to contain an N-terminal hexahistidine tag to facilitate purification and two C-terminal cysteine residues, to allow attachment to the gold coverslip. When this construct was analysed using a force mode AFM, each domain unfolded at a similar force and each gave rise to a highly reproducible increase in chain length that produces the saw-toothed pattern of the force–distance traces of this protein (Fig. 1). Analysing the repeating pattern of unfolding forces due to a single domain allowed the mechanical properties of this domain to be investigated [23]. Since that time, more than seven protein polymers have been constructed, including different homopolymers of wild-type or mutant TI I27 [23,30–32], I28 [33], T4 lysozyme [34], C2A [16], spectrin [24] and calmodulin [16], and heteropolymers of TI I27 mutants [26] and TI I27–I28 [33], TI I27–barnase [25] and TI I27–CI2 chimeras (Best and Clarke, unpublished data). The plethora of data resulting from these polyproteins has allowed new insights into a number of issues, including the role played by stability and topology in determining the mechanical properties of proteins.

A number of different approaches have now been used successfully to engineer polyproteins. In the engineered multi-modular TI I27 described earlier, two methods were used [23]. The modules were cloned either using a BamH I–Bgl II system or a single Ava I restriction site. These methods allow long constructs to be made easily and relatively rapidly, but suffer some disadvantages. The BamH I–Bgl II system utilises the fact that whilst these restriction enzymes have different recognition sequences they produce compatible cohesive ends. Ligation of these cohesive ends results in a sequence that neither enzyme can cut. A concatamer is thus built up by ligating two single copies into a dimer, two dimers into a tetramer, etc. In the second protocol, the non-palindromic restriction site for Ava I was used. The ligation of a DNA fragment with an Ava I site at each end into a plasmid pre-cut with Ava I is thus directional and successful ligations result in a broad range of products. This requires extensive screening of ligation reactions to find a clone with a sufficiently high number of inserts. More importantly, both of these methods are effectively ‘single use’. Each protein domain that one wishes to study has to be re-concatenated from scratch and insertion of a different domain into a pre-existing homopolymer is not possible. In an ingenious experiment, T4 lysozyme polyproteins were assembled by engineering cysteines into positions in the protein that formed inter-molecular contacts in the crystal structure, allowing the formation of chains of proteins by oxidising crystals of the cysteine mutant [34]. Whilst simple and effective, this method can only be used to produce homopolymers and the polymerisation reaction is relatively uncontrolled. In addition, it relies on the ability of the protein of interest to crystallise in a manner suitable for the cross-linking reaction, and the availability of a suitable crystal structure.
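The logic of the BamH I–Bgl II strategy can be illustrated with a short Python sketch. It relies only on the standard recognition sequences of the two enzymes (GGATCC and AGATCT, each cut after the first base) and the standard genetic code; the function names are hypothetical and chosen for illustration, and the sketch does not attempt to reproduce the cloning protocol of [23]. It shows that a hybrid junction is resistant to both enzymes and that one of the two possible junctions reads, in frame, as an Arg–Ser dipeptide.

```python
# Recognition sequences (5'->3'); both enzymes cut after the first base,
# leaving the same 5'-GATC overhang, so their cohesive ends are compatible.
BAMHI = "GGATCC"
BGLII = "AGATCT"

# Only the four codons needed for this illustration (standard genetic code).
CODONS = {"AGA": "Arg", "TCC": "Ser", "GGA": "Gly", "TCT": "Ser"}


def ligate(upstream_site, downstream_site):
    """Sequence formed when the upstream half of one cut site is joined
    to the downstream half of the other via the shared GATC overhang."""
    assert upstream_site[1:5] == downstream_site[1:5] == "GATC"
    return upstream_site[0] + downstream_site[1:]


def translate(site):
    """In-frame translation of a 6-base junction into a two-residue linker."""
    return "-".join(CODONS[site[i:i + 3]] for i in range(0, len(site), 3))


def copies_after_rounds(rounds):
    """Copy number after repeated doubling (monomer, dimer, tetramer, ...)."""
    return 2 ** rounds


hybrid = ligate(BGLII, BAMHI)
print(hybrid, "cut by either enzyme:", hybrid in (BAMHI, BGLII))
print("linker:", translate(hybrid))
print("copies per round:", [copies_after_rounds(n) for n in range(4)])
```

Because the junction is no longer a substrate for either enzyme, the outer sites of the product can be reused in the next round, which is what makes the doubling scheme possible.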

To circumvent these problems and to produce a more efficient and flexible method of producing protein polymers, we have adopted a different approach (Fig. 2) in which a number of unique restriction sites are specifically placed at the 5′ and 3′ ends of each gene copy. The resulting genes can then be specifically ligated in a defined order, creating the ability to make designer tailor-made polyproteins at will. In this manner, polymers containing either five or eight copies of a single TI I27 domain linked by specific linker sequences have been created [26,35,36]. The method of construction of a pentameric protein is shown schematically in Fig. 2. The method can be used to concatenate essentially any protein, although care must be taken in designing the linker regions and selecting the domain boundaries. These caveats apply to any such method and are discussed later. The length of the linker region can be chosen at will, although limitations arise from the fact that different restriction sites must be chosen at each linker site. The shortest linkers involve the placement of a 6-base restriction enzyme site (or, when translated, a two-residue linker) between the domains. In the case of the BamH I–Bgl II system the same amino acid linkers (Arg–Ser) were added between each domain [23]. In the multi-restriction enzyme case shown in Fig. 2, a different four- or six-residue linker containing a unique restriction site separates each domain.

The modular construction method has several advantages. First, it allows modules to be cut and pasted in any order into a polyprotein. Heterogeneous polyproteins can be constructed that usefully retain “reporter domains” as an internal standard [25]. Single modules or module pairs, with natural or artificial linker sequences, can be used. Whilst the cassette method is more laborious and complex to set up, its modularity makes it highly versatile. As a copy of each cassette is held in the shuttle vector, mutations can easily be introduced in a controlled manner and the resulting domains then used to replace one or more copies in the polyprotein. This system, therefore, both speeds up the construction of new polyproteins and gives more control over the construction process.


Fig. 2. Flow diagram showing the construction of a pentameric polyprotein [26]. The polymerase chain reaction (PCR) is used to introduce a unique restriction site (black rectangle) and a synthetic multiple cloning site (containing five unique restriction sites, represented by shaded blocks) 5′ and 3′ to the DNA encoding ‘Your Favourite Protein’ (white rectangle), respectively. This cassette is then ligated directly into the expression vector (such as pET) and sequenced. The other cassettes are generated by four separate PCR reactions using pairs of primers with 5′ and 3′ overhangs that code for the pairs of unique restriction sites that define each cassette. Since Taq polymerase adds a single adenine base, the PCR products are ligated into a T-vector (such as pGEM, Promega), obviating the need to digest each product and host plasmid with separate pairs of restriction endonucleases. The use of such a ‘shuttle vector’ allows successful ligations to be rapidly identified and their sequence confirmed by DNA sequencing. These cassettes are then sequentially ligated into the expression vector (at least three cassettes can be added simultaneously). Note: to limit the probability of homologous recombination of these cassettes, the expression vector is always handled in a rec− strain. After transformation into a suitable rec− expression host (such as BLR[DE3]pLysS), expression yields a polyprotein containing five copies of the protein joined together by the linkers shown. The residues in the linker which correspond to the restriction endonuclease recognition site are underlined. Another such system is described in detail in [36].

Irrespective of the construction technique, there are caveats to the polyprotein method. Care has to be taken to ensure that the inserted protein domain is not truncated (aberrantly shortened domains can have very different biophysical properties from their full-length counterparts [28,37–39]) and that the linkers are engineered such that the individual domains do not interact. It is also important to ensure that the protein maintains its structural, thermodynamic and kinetic properties in a multi-modular array. The thermodynamic stability and folding–unfolding kinetics of proteins in concatamers can be directly measured by chemical equilibrium and stopped–flow techniques. These parameters can then be compared with their monomeric counterparts, so that the effect of polymerisation on the thermodynamic and kinetic properties of the domains can be assessed [25,26]. This is also possible where the polyprotein contains more than one domain type, as long as the domains in the polyprotein have properties that allow them to be distinguished from each other. If this is not possible, an alternative method is to use NMR chemical shifts as a sensitive test of structural similarity. For example, a three module TI I27–barnase–TI I27 construct has been used in NMR experiments to demonstrate directly that the structures of the domains are unaffected by their inclusion in a multimeric construct [25]. Not all proteins, however, will be unaffected by placement in a polyprotein. Chymotrypsin inhibitor 2 (CI2) is significantly destabilised when polymerised using (Gly–Ser)3 linkers (Best and Clarke, unpublished data). This may be ascribed to the loss of the salt bridge involving the C-terminal carboxylate, and demonstrates that the construct must be carefully designed and tested before embarking on protein concatamerisation for mechanical unfolding studies.

3. Optimising data acquisition, quality and quantity

The force exerted on the cantilever by the protein is the product of the cantilever deflection and spring constant (kc). For a given force, therefore, a softer spring results in a larger signal and higher sensitivity. However, the softness of the cantilever that can be used is limited by thermally induced oscillations, the amplitude of which increases as the cantilever spring constant is reduced. It is easy to show, using the equipartition theorem, that the rms amplitude of the thermal bending motion for a cantilever of spring constant 50 pN nm−1 is 2.9 Å at room temperature. This equates to an rms noise on the force measurement of about 14 pN. For the purposes of systems with soft elastic linkages (such as that used in mechanical protein unfolding experiments), use of a stiffer cantilever simplifies the analysis by ensuring that the macromolecule is the dominant elastic component of the studied system. This latter point is a principal assumption in Monte Carlo simulations (see Section 6). Depending on the manufacturer, each chip comes with a choice of cantilever with a broad range of spring constants, so cantilever choice can be tailored to the unfolding forces being measured (unfolding of mechanically-resistant proteins usually occurs at ∼50–200 pN, although other domains may well unfold at forces much lower than these [16]). Empirically, standard silicon nitride cantilevers used in different laboratories have spring constants that vary between 30 and 100 pN nm−1, and all result in adequate signal to noise.
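The equipartition estimate quoted above can be checked numerically. The following Python sketch assumes a room temperature of 298 K (the text specifies only "room temperature") and treats the cantilever as a single harmonic mode, so that (1/2) kc <x^2> = (1/2) kB T; the function names are illustrative only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_rms_deflection_nm(k_pn_per_nm, temperature_k=298.0):
    """RMS thermal deflection (nm) from (1/2) k <x^2> = (1/2) kB T."""
    k_si = k_pn_per_nm * 1e-3  # 1 pN/nm = 1e-3 N/m
    return math.sqrt(K_B * temperature_k / k_si) * 1e9  # m -> nm


def thermal_rms_force_pn(k_pn_per_nm, temperature_k=298.0):
    """RMS force noise (pN): spring constant times RMS deflection."""
    return k_pn_per_nm * thermal_rms_deflection_nm(k_pn_per_nm, temperature_k)


for k in (30.0, 50.0, 100.0):  # range of spring constants quoted in the text
    x = thermal_rms_deflection_nm(k)
    f = thermal_rms_force_pn(k)
    print(f"kc = {k:5.1f} pN/nm: rms deflection = {10 * x:.1f} A, rms force = {f:.1f} pN")
```

For kc = 50 pN nm−1 this gives a deflection of roughly 2.9 Å and a force noise of roughly 14 pN, in agreement with the values stated in the text.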

An important source of error in mechanical unfolding experiments is the calibration of the spring constant: this is necessary, as the actual spring constants of individual cantilevers can differ appreciably from the approximate values given by manufacturers. Several methods can be used [40]. An elegant procedure, and the most practicable, consists of measuring the spring constant from the power spectral density of the cantilever fluctuations due to the thermal noise [41]. This method gives values which are within 20% of those obtained using other methods [42]. To a lesser extent, the measured spring constant of an individual cantilever can vary within the day. This variation introduces an experimental error in the force measurement that contributes to the spread of the data (see Section 5). The variation of the spring constant of an individual cantilever can be minimised by calibrating the spring constant in liquid after the system has reached thermal equilibrium (about an hour after switching on the laser). It is informative to record the spectral density of cantilever fluctuations at the end of the experiment, to make sure that no adhesion of mass to the tip has taken place: any additional mass would lower the resonant frequency of the cantilever. Finally, it should be noted that, especially for soft cantilevers, the cantilever displacement can be a significant fraction of the vertical piezo displacement, and should be subtracted from this to obtain the true molecular length (tip–surface distance).
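The final correction can be written out explicitly. In this minimal Python sketch the cantilever deflection is taken to be the measured force divided by the spring constant, and is subtracted from the piezo displacement; the function name and the example numbers are hypothetical and serve only to indicate the size of the effect.

```python
def tip_surface_distance_nm(piezo_displacement_nm, force_pn, k_pn_per_nm):
    """Tip-surface distance: piezo displacement minus cantilever deflection.

    The deflection is obtained from Hooke's law, deflection = force / kc.
    """
    deflection_nm = force_pn / k_pn_per_nm
    return piezo_displacement_nm - deflection_nm


# Illustrative values only: a 200 pN unfolding peak at 50 nm piezo travel.
for k in (30.0, 100.0):
    d = tip_surface_distance_nm(50.0, 200.0, k)
    print(f"kc = {k:5.1f} pN/nm: tip-surface distance = {d:.1f} nm")
```

With these numbers the correction is about 6.7 nm for a 30 pN nm−1 cantilever but only 2 nm for a 100 pN nm−1 cantilever, which illustrates why the correction matters most for soft cantilevers.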

The concentration of the sample under study greatly affects the hit-rate and quality of the force–distance curves obtained. Optimisation of this variable is important to minimise the amount of denatured protein on the surface [43], to minimise the probability of picking up more than one protein on the tip and yet still obtain data at a practicable rate [35]. The problem of protein denaturation is partially solved by creating concatamers with terminal cysteine residues and performing the experiments on a gold covered surface. This promotes specific immobilisation of the protein via the strong Au–S bond (∼1.5 nN [44]) and also acts as a further purification step as proteins without surface accessible cysteine side chains will not be immobilised on the surface. (Incidentally, isolation of these polyproteins is usually undertaken by an affinity chromatography purification procedure which results in an approximately 95% pure sample. Whilst ion exchange chromatography removes most other contaminants, truncated versions of the polyprotein itself are still present. We have found, therefore, that if a second purification step is necessary, size exclusion chromatography results in a homogeneous sample [26].) Protein concentrations used in these experiments vary widely, with different groups utilising protein concentrations ranging from 0.5 to 150 µM, depending on the precise nature of the experiment. When using a low concentration of protein, 150 µl of protein solution is applied onto a freshly evaporated gold slide and mechanical unfolding experiments started directly after thermal equilibration. The low protein concentration results in a low probability (∼4% [26]) of attaching a protein molecule to the tip per approach–retract cycle (or ‘hit rate’) but yields excellent force–distance curves that show little sign of non-specific tip–surface or tip–denatured protein interactions. These artefacts, should they occur, are manifested close to the surface (Fig. 3). Experiments on short constructs (≤5 domains) and polyproteins in which features of interest occur at short extensions [30] are thus best carried out at low protein concentration. However, the improvement in the quality and number of usable unfolding peaks that the low hit-rate confers can be offset, at low retraction speeds, by the significantly increased time needed to obtain a reasonable dataset. An alternative procedure uses a much higher concentration of protein (∼100 µM), with a different preparation protocol. Approximately 600 µl of protein solution is placed on a freshly evaporated gold slide. After a 15-min incubation, the slide is washed with excess buffer to remove unbound protein. The mechanical unfolding experiment is then commenced after placing the slide on the microscope and adding buffer. No matter which protocol is used to immobilise the protein onto the surface, the non-specificity of the tip–protein interaction causes ambiguities. When the tip approaches the surface, some protein, picked randomly, is adsorbed to it by non-specific interactions at some point along the protein’s length. When the distance from the tip to the surface exceeds the protein length, the protein detaches from the tip as this interaction is weaker than the covalent Au–S bond. Thus, it is generally unknown precisely which domain is attached to the tip and, as a result, force–distance curves show a variable number of unfolding events. Only in the case when a full complement of unfolding peaks is observed is the location of attachment (i.e. to the N- and C-terminal regions) known, an issue that is discussed in Section 4.
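The trade-off between hit rate and data quality can be made semi-quantitative with a simple estimate. The Python sketch below assumes that the number of molecules picked up per approach–retract cycle follows a Poisson distribution; this is an assumption introduced here for illustration and is not stated in the text. Under that assumption, the observed hit rate fixes the mean number of attachments per cycle, from which the fraction of successful cycles involving more than one molecule follows. The function names are hypothetical.

```python
import math


def multi_attachment_fraction(hit_rate):
    """Fraction of hits involving two or more molecules, assuming Poisson pick-up.

    hit_rate is the probability of at least one attachment per cycle,
    so the Poisson mean is lam = -ln(1 - hit_rate).
    """
    lam = -math.log(1.0 - hit_rate)
    p_exactly_one = lam * math.exp(-lam)
    return (hit_rate - p_exactly_one) / hit_rate


def expected_cycles(n_traces, hit_rate):
    """Mean number of approach-retract cycles needed to record n_traces hits."""
    return n_traces / hit_rate


for rate in (0.04, 0.20, 0.50):
    frac = multi_attachment_fraction(rate)
    cycles = expected_cycles(100, rate)
    print(f"hit rate {rate:.0%}: {frac:.1%} of hits are multiple, "
          f"about {cycles:.0f} cycles per 100 traces")
```

At the ∼4% hit rate quoted above, only about 2% of hits would be expected to involve more than one molecule under this assumption, at the cost of roughly 2500 cycles for every 100 traces.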

Temperature is an important parameter that is difficult to control in single-molecule experiments. The temperature of the sample rises typically by about 3 °C after 8 h of experimental work, due to the heat generated by the system. This small temperature rise does not affect the measured value of the spring constant significantly. However, the heat generated by the system causes another effect: the evaporation of the solvent. This can be a problem when working with simple electrolyte buffers because the ionic strength changes with time. When using chemicals that modulate protein stability (urea, guanidinium chloride or sodium sulphate) this effect severely limits the time available for the data collection. The use of a closed experimental cell would reduce evaporation and allow the exchange of solvent, keeping its concentration constant, but will increase the inner temperature, since there is little heat exchange with the surroundings (indeed several such cells are now commercially available). However, for experiments performed without closed cells, the results obtained for proteins in solutions at constant pH show that the variation of temperature of about 3 °C does not have an observable effect on the unfolding force [25].

The mechanical unfolding of proteins is a kinetic process. By varying the unfolding rate of the protein it is possible to gain information on the height and position of the unfolding barrier relative to the folded native state of the protein [45]. Compared to the BFP, for example, the AFM has a rather limited dynamic range and problems can arise in obtaining data at the limits of this range. At very low tip retraction rates (below 100 nm s−1) thermal drift may affect the measurement, while at very fast speeds (7000–10000 nm s−1), the viscosity of the solvent and the sampling rate of the hardware become limiting. The effect of this limited sampling range on the data analysis is considered later. This problem will possibly be overcome with more stable instruments and small cantilevers that suffer less viscous drag [46].

Fig. 3. Representative unfolding traces recorded for TI I27. The force–extension data for the approach of the cantilever to the surface are shown in grey and the retraction data in black. (a) An "ideal" trace in which seven of the eight modules are seen to unfold in series, as would be expected if the eighth were involved in adhesion to the tip; no unusual behaviour is seen and the data are entirely consistent with the unfolding of a single multi-modular protein. (b) In this example, the first peak is exceptionally large, probably due to interactions with proteins on the surface. (c) An example in which two proteins have become attached to the tip. The first part of the trace contains slightly higher peaks, spaced about half the distance of those in trace (a), following which normal peaks at the usual spacing are observed. This is interpreted as initial pulling on two proteins which have a phase difference of about half the difference in contour length between a folded and unfolded module. After one protein detaches in the middle of the trace, the familiar pattern of a single protein is observed again. (d) Here, a protein has not been detached from the tip at the end of the previous pull, causing the slight curvature at the beginning of the approach. This is observed again at the end of the retract, despite the peak at 240 pN which suggests detachment of the protein whose unfolding is shown. An explanation may be that a second unfolded protein remains attached to the tip for the duration of the measurement, and the unfolding peaks are from a protein which was picked up on contact with the surface in this trace.

There is a variation in the actual pulling speed from trace to trace. However, this can be neglected since the speed variation is <10% and the force depends on the logarithm of the pulling speed. Experiments carried out in a more viscous solution (2.5 M urea) show similar variations and the increased viscosity does not change the measured pulling speed (Toca-Herrera, Best and Clarke, unpublished data).

4. Variety of traces—picking peaks

A significant difference between single-molecule and ensemble measurements of protein unfolding is the need, in single-molecule measurements, to choose which traces to analyse. Ideally, for a protein with $n$ domains, every approach and retract cycle should result in a saw-toothed pattern with between 1 and $n$ unfolding events. In reality this is rarely the case, as there is a trade-off between hit-rate and the quality of the data. The process of obtaining force–distance curves, measuring the unfolding force and forming frequency histograms could thus lead to potential systematic errors, depending on which peaks are analysed and combined in the dataset. In particular, one is faced with the dilemma of deciding whether unusual traces represent "rare events" or "alternative pathways" that it is hoped can be detected in single-molecule experiments, or whether they are artefacts arising from the uncontrolled nature of the protein adhesion to the tip and surface.

The large variety in traces, illustrated in Fig. 3, means that each trace has to be evaluated. How do we then decide which is a "good" trace and a "bad" trace? In Fig. 3a, we show an "ideal" trace against which others may be judged. It is not generally possible to collect data only from such "ideal" traces; shorter, or less "clean" traces have to be included to allow a reasonable data accumulation rate. How do we select these? The difficulty is that there is no agreed standard, so it is hard to compare data from user to user or laboratory to laboratory. We suggest a number of criteria for choosing traces, and for deciding which peaks in the trace to measure. Applying strict consistent criteria will facilitate comparison of data and remove the judgement of an individual as a variable. As an illustration, we use peaks from TI I27, which has been analysed in a number of laboratories [23,25,26] and may be considered the "gold standard" against which other proteins can be assessed.

In general, therefore, to be included in a dataset:

1. Each trace should have peaks that are separated by a reproducible distance, accounted for by calculating the difference in length between the folded and unfolded domains [34]. In TI I27, the peak to peak distance is ∼250 ± 20 Å, whilst in the fibronectin type III domain from human tenascin (TNfn3), this distance is slightly larger, ∼260 Å [36]. Since the different modules unfold at slightly different forces, the apparent peak to peak distance can vary. However, a more precise measurement of protein length can be made by fitting an analytical model for the force dependence to the data. One such model which fits well to many proteins is the worm-like chain model. Fitting this model to the rising edge of each peak allows the protein length to be determined from the difference in fitted contour length between adjacent peaks [11]. There are instances where proteins may unfold via an intermediate, as has been proposed for spectrin domains [24], but a similar distance criterion can still be set. A protein for which this type of analysis has proved to be impossible is barnase, which unfolded at low forces "in parts". In this case, barnase had been cloned using TI I27 as an internal standard as described earlier, so it was possible to identify the barnase unfolding peaks by comparison [25].

2. The base of each successive peak should be higher than that of the preceding peak. The baselines do not return to zero as the unfolded protein exerts a residual entropic force on the cantilever. If a peak results from detachment of a protein molecule from the cantilever, the peak base will be lower than that of the previous peak.

3. Each trace should have at least three clearly resolved peaks that satisfy points 1 and 2. This makes it possible to evaluate the inter-peak distances. The first peak should only be counted if there is a clean pull-off from the surface. As shown in Fig. 3b, features close to the surface are sometimes obscured by large force peaks in the nanoNewton range. It is usually fairly easy to distinguish between these events and authentic domain unfolding events as tip–surface, non-specific tip–protein or protein–protein interactions give rise to large forces and to irregular unfolding distances. Such traces are rejected. When using longer polyproteins (e.g. eight modules), all first peaks are routinely disregarded [25].

4. The trace should report on the mechanical unfolding of a single protein attached to the cantilever. It is possible for more than one protein to adhere to the tip. Again, this gives rise to an easily recognisable force–distance curve: regular peaks that are more closely spaced than can be accounted for by the complete unfolding of a single protein. In this case, typically two proteins are attached to the tip and the distance between alternate peaks is equivalent to the expected unfolding distance (Fig. 3c). After some distance one protein becomes detached (characterised by a drop of the baseline to a lower level) and then the remainder of the trace shows the unfolding of a single protein. These traces are rejected.

5. Sometimes, a protein immobilised away from the surface is picked up by the tip. As a result, the pre-set retraction distance is not long enough to allow the protein to detach and the protein remains attached to the cantilever. This is manifested by no "pull-off" peak and a curved approach trace (Fig. 3d). Although increasing the data set size by taking repeated measurements on a single molecule would be ideal, baseline drift in the AFM makes it difficult to acquire repeated unfolding traces on a single polyprotein molecule. Each peak has to be evaluated against its own post pull-off baseline (as shown later). Where the protein has not detached from the tip it is not possible to be confident of the baseline and so these traces are not used.

After selecting traces according to the above criteria, every peak (defined as the maximum force before a sharp drop) except the final "pull-off" peak (usually >0.5 nN, representing the detachment of the protein from the cantilever tip) is measured and included in the dataset without further bias or judgement.
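The first three criteria lend themselves to a simple automated filter. The sketch below is illustrative only: it assumes that peaks have already been located and that each is described by its extension, peak force and base force. The `Peak` record and the function `trace_passes` are hypothetical names introduced here, and the default spacing window is the TI I27 value of 250 ± 20 Å quoted in criterion 1. Criteria 4 and 5 (more than one molecule on the tip, no pull-off) are only partly captured by the spacing and baseline checks and would still need inspection.

```python
from typing import NamedTuple, Sequence


class Peak(NamedTuple):
    extension_A: float  # tip-surface distance at the peak (Angstrom)
    force_pN: float     # peak force (pN)
    base_pN: float      # force at the foot of the peak's rising edge (pN)


def trace_passes(peaks: Sequence[Peak],
                 spacing_A: float = 250.0,
                 tol_A: float = 20.0,
                 min_peaks: int = 3) -> bool:
    """Apply criteria 1-3 to the unfolding peaks of one trace.

    `peaks` must be ordered by extension and must exclude the final
    pull-off peak.
    """
    if len(peaks) < min_peaks:                    # criterion 3
        return False
    for prev, cur in zip(peaks, peaks[1:]):
        gap = cur.extension_A - prev.extension_A
        if abs(gap - spacing_A) > tol_A:          # criterion 1
            return False
        if cur.base_pN <= prev.base_pN:           # criterion 2
            return False
    return True
```

A trace with two molecules attached, as in Fig. 3c, would fail the spacing test because its peaks are separated by roughly half the expected distance.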

5. Quantifying the unfolding forces

Having chosen which traces to use, the next step is to measure the forces at which unfolding occurs and the distribution of these forces. In principle, the order of peaks in a trace has an effect on the unfolding force measured. This is caused by two opposing effects: firstly, as more modules become unfolded, the chance of another domain unfolding per unit time is reduced. As a result, the unfolding force for the later peaks in a trace is likely to be higher (this effect has implications when estimating the intrinsic unfolding rate constant ($k_u^0$) and is discussed further in Section 6). On the other hand, the earlier peaks in a trace are subjected to a much higher rate of force loading (because as more domains are unfolded the compliance of the total system is increased) and so unfold at higher forces. Ideally, the forces would be counted separately by peak (recent work from our laboratory has shown that the effects described earlier result in a small, but measurable, dependence of the unfolding force on event number [47]). However, due to the limited number of force–extension traces usually measured, most studies thus far have pooled the data from all peaks in a trace.

Any measurement of force by the AFM must be relative to the baseline signal expected for zero force. The most obvious source for this would be the forces measured on approach to the surface, when there is no protein attached, as these can be used to provide a zero force measurement at the same displacement from the surface as the unfolding events. However, due to thermal drift in the microscope (e.g. thermal expansion or contraction of the head assembly components), the zero force baseline at the beginning of the approach is often significantly displaced from that at the end of the retraction, despite being at the same distance from the surface with no protein attached; this is especially true for traces acquired at slow pulling speeds. In practice, the best choice of baseline is to use the "zero force" data closest to the unfolding forces (in time), that is, the force measured immediately after the protein detaches from the tip. The force baseline is established by extrapolating a linear fit to a section of data after detachment back into the region where unfolding takes place. A correct baseline is then verified by a force value within the noise at zero displacement.
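The baseline procedure can be sketched in a few lines. This is a minimal illustration, assuming the retraction data are held in arrays of extension and force and that the index of the detachment point is already known; the function name, the `detach_index` argument and the default fit length `n_fit` are hypothetical choices, not part of the original protocol.

```python
import numpy as np


def baseline_corrected_force(extension, force, detach_index, n_fit=200):
    """Subtract a linear baseline fitted to data recorded after detachment.

    The straight line is fitted to `n_fit` points following `detach_index`
    and extrapolated back over the whole trace.
    """
    ext = np.asarray(extension, dtype=float)
    f = np.asarray(force, dtype=float)
    section = slice(detach_index, detach_index + n_fit)
    slope, intercept = np.polyfit(ext[section], f[section], 1)
    return f - (slope * ext + intercept)
```

After correction, the force near zero displacement should lie within the noise, which provides the check described above.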

The most common method of picking peak heights is to count only the absolute maximum of each unfolding event. Clearly, however, there is some bias introduced by this, since the thermal noise in the force measurements may be of the order of 15 pN. A potential improvement would be to estimate the unfolding force from a curve fit close to the maximum; however, direct picking has the advantage of simplicity, speed and robustness.

The data sets collected at a particular pulling speed form a broad distribution of forces; apart from the sources of error mentioned in the previous sections, this is due to the thermal distribution of energy available to each single module prior to unfolding. The distribution of measured unfolding forces at a particular tip retraction speed can be assessed by the construction of frequency histograms. These act as an important quality control on the data as they quickly reveal large systematic errors between datasets and outlying data points. In order to form a histogram, from which a mean or mode unfolding force can be measured, it is necessary to record a sufficient number of unfolding events at each pulling speed. It is found in practice that collecting 40 to 50 peaks per pulling speed is usually sufficient for this purpose. The force at unfolding of each domain is then 'binned' into groups.

Fig. 4. Sample histograms of the mechanical unfolding forces of a TI I27 heteropolymer (five domains long [26]) obtained at: 70 nm s−1 (solid grey bars), 600 nm s−1 (up diagonals) and 4000 nm s−1 (down diagonals). Fits of a Gaussian distribution to each histogram are shown in bold lines. Reprinted from [26] with permission from the authors and the Biophysical Society.

A bin size of 10 pN is usually satisfactory as it gives enough data in each bin whilst forming enough bins to produce a meaningful distribution. Examples of typical histograms of the maximum unfolding forces (in this case of mutant TI I27 [26]) at different pulling speeds are shown in Fig. 4. While theory predicts that the histograms should be skewed towards lower force [48,49] (see Section 6), this is only observed for very large datasets [23].
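As a minimal sketch of the binning step, the function below builds a frequency histogram with bin edges aligned to multiples of the bin width. The function name is hypothetical and the alignment of the edges is an assumption made here for reproducibility; only the 10 pN default comes from the text.

```python
import numpy as np


def force_histogram(forces_pN, bin_width=10.0):
    """Bin unfolding forces into fixed-width bins aligned to multiples of bin_width."""
    forces = np.asarray(forces_pN, dtype=float)
    lo = np.floor(forces.min() / bin_width) * bin_width
    hi = np.ceil(forces.max() / bin_width) * bin_width
    if hi == lo:                       # all forces fall on a single edge
        hi = lo + bin_width
    n_bins = int(round((hi - lo) / bin_width))
    edges = lo + bin_width * np.arange(n_bins + 1)
    counts, edges = np.histogram(forces, bins=edges)
    return counts, edges
```

Comparing the histograms of replicate datasets produced this way is a quick means of spotting the systematic offsets and outliers mentioned above.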

Information about the unfolding energy surface is potentially available from both the distribution of unfolding forces at any particular pulling speed and the dependence of the unfolding force on pulling speed; either of these may be fitted to simple models (see Section 6). However, both systematic and random errors broaden the histogram of force and, since it is difficult to deconvolute these from the inherent thermal spread, it is more practical to use the dependence of the unfolding force on pulling speed to determine kinetic parameters from the unfolding data. Which statistic should be used to assess a 'typical' unfolding force at a particular speed? As most theoretical analyses utilise the most probable unfolding force [49], the mode is naturally the most appealing statistic and it is also less sensitive to outlying data. However, for small datasets such as those usually obtained from the force mode AFM, the mode obtained is very sensitive to the bin size if calculated from a histogram and to the type of distribution if calculated from the fit of a skewed distribution to the data. An improved method might use a window of fixed width in forces to scan the forces for the range containing the most data points. The mean, however, has the advantage of simplicity and can also be compared directly to theoretical results as the mean is easily calculated from the Monte Carlo simulations discussed in Section 6. In practice, it often turns out that the two statistics are very similar (within error), although the mode is obviously skewed to slightly higher force [26]. To account for systematic errors, especially those in the calibration of the spring constant and optical lever sensitivity, at least three replicate datasets with different cantilevers are acquired. The unfolding force plotted at each pulling speed is, therefore, the average of either the three modes or means found from the triplicate experiments. Therefore, a typical AFM dataset needed for obtaining unfolding rate constants consists of a set of unfolding forces for each pulling speed and at least three replicate datasets for each pulling speed (as shown earlier). Fig. 5 illustrates the means for each experiment and the mean of these means for a mutated TI I27 construct. In general, as expected, the mean of the means gives a much better linear correlation than any of the data from individual tips.

Fig. 5. The dependence of unfolding force on pulling speed for a TI I27 heteropolymer. The means of three different datasets, collected with different cantilevers on different protein samples on different days (open circles) are combined to give a mean of means (filled circles). Data taken from [26].
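The two statistics discussed above can be sketched as follows. The window-scanning estimate of the mode is only suggested in the text as a possible improvement, so this is one plausible reading of it rather than an established procedure: the window is anchored at each data point in turn and the mean of the points in the most populated window is reported. Both function names and the 10 pN default width are assumptions.

```python
import numpy as np


def window_mode(forces_pN, width=10.0):
    """Mean force of the fixed-width window that contains the most data points.

    Returns (estimate, number of points in that window).
    """
    f = np.sort(np.asarray(forces_pN, dtype=float))
    # For a window starting at each point, find the index just past its upper edge.
    right = np.searchsorted(f, f + width, side="right")
    counts = right - np.arange(f.size)
    i = int(np.argmax(counts))
    return float(f[i:right[i]].mean()), int(counts[i])


def mean_of_means(datasets):
    """Mean of the replicate means and their sample standard deviation."""
    means = np.array([np.mean(d) for d in datasets], dtype=float)
    return float(means.mean()), float(means.std(ddof=1))
```

Unlike a histogram mode, the window estimate does not depend on where the bin edges happen to fall, although it still depends on the chosen window width.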

One of the most frequent tasks that will arise in future studies of forced protein unfolding is the comparison of wild-type and mutant data [26,31,32,50]. While this could be done by fitting the data from each protein to a separate kinetic model and comparing the resulting parameters (as shown later), a more direct comparison may be made by comparing the unfolding forces directly in the experimentally measured range [50]. If the effect of mutation is only to lower the energy barrier, the slope of the plot of force against the logarithm of pulling speed will not change, and the change in unfolding forces may be measured directly using an average slope. The difference in unfolding forces provides a direct measure of the difference in activation free energy for mechanical unfolding for wild-type and mutant protein. This can be used to estimate the structure of intermediates and the transition state for unfolding [50]. However, where the slope changes on mutation the comparison becomes more complex [31] (Williams, Best and Clarke, unpublished data).

6. Using two-state Monte Carlo simulations to obtain kinetic parameters

What information can force spectroscopy experiments furnish about the nature of the protein unfolding energy landscape? Since mechanical unfolding experiments are performed at a constant pulling speed and not constant force, it is not possible to calculate unfolding rate constants as a function of force directly (note, however, that a force clamp experiment has been performed on polymeric TI I27 [51] using an adapted AFM). Instead, a model is required which captures the full loading history of the protein as well as details of the energy barrier(s) to unfolding. Following Bell's consideration of unbinding [52], it is usually assumed that the unfolding barrier is lowered under force $F$ by an amount $F x_u$, where $x_u$ is the distance from the folded state to the transition state. The barrier to refolding is correspondingly raised by $F x_f$, where $x_f$ is the distance from the extended unfolded state to the transition state. Usually $x_f$ is sufficiently large that refolding under force is virtually impossible and may be neglected. Then, from transition state theory, the unfolding rate constant at force $F$, $k_u(F)$, is given by Eq. (6.1); $\nu$ and $\kappa$ are, respectively, the vibrational frequency and transmission coefficient of the transition state, $\beta$ is $1/k_B T$ as usual, $\Delta G^{\ddagger}$ the activation free energy for unfolding along this pathway at zero force, and $k_u^0$ the unfolding rate constant at zero force.

$$k_u(F) = \nu\kappa \exp\left(-\beta\left(\Delta G^{\ddagger} - F x_u\right)\right) = k_u^0 \exp\left(\beta F x_u\right) \qquad (6.1)$$

The intrinsic unfolding rate constant is of particular interest, since it can be compared with the rate constant at 0 M denaturant obtained from conventional protein folding studies. A plot of the natural logarithm of the unfolding rate constant of the protein (tip retraction rate divided by the length of the unfolded domain) against the mean (of the three datasets) of unfolding force should, therefore, yield a linear relationship with a gradient related to $x_u$ and a y-axis intercept yielding the intrinsic unfolding rate constant. However, this equation does not take the number of domains in the polyprotein into account. The number of domains in the polymer is known to affect both the measured peak and average forces for unfolding [26,53]. Calculation shows that the smaller the number of domains in a polymer, the greater the measured unfolding force. For an I27 construct, increasing the number of domains from 1 to 50 reduces the mean unfolding force by about 40 pN, due to the fact that domains unfold independently of one another. The more domains, the greater is the chance of an unfolding event in a given time interval. The number of domains affects the observed unfolding force and, therefore, the apparent intrinsic unfolding rate constant [26]. The most useful parameter that can be directly compared to the rate of unfolding at zero molar denaturant is the unfolding rate at zero force, $k_u^0$. There is no closed solution for this model, so it must be solved numerically.
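Eq. (6.1) is straightforward to evaluate numerically. The sketch below assumes forces in pN, distances in nm and a thermal energy of about 4.114 pN nm (roughly 298 K); the temperature and the function name are assumptions made for illustration.

```python
import math

KBT_PN_NM = 4.114  # k_B*T in pN nm at about 298 K (assumed temperature)


def unfolding_rate(force_pN, k0_per_s, xu_nm, kBT=KBT_PN_NM):
    """Eq. (6.1): k_u(F) = k_u^0 * exp(F * x_u / (k_B * T))."""
    return k0_per_s * math.exp(force_pN * xu_nm / kBT)
```

Because the force enters through an exponential, ln k_u(F) is linear in F with slope x_u/(k_B T), which is the linear relationship referred to above.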

By using the relation between force and unfolding rate constant (Eq. (6.1)), and the known dependence of force on extension (from the fit of the worm-like chain model to the experimental traces, for example), it is possible to simulate unfolding traces using a simple Monte Carlo procedure [54]. A single pull of a number of modules in series is started with all domains folded at zero extension. The simulation proceeds in discrete time steps, with the extension per step calculated from the pulling speed: this also permits the calculation of the force and unfolding rate for that step. The probability of unfolding for a single module is $p_u^1(F) = k_u(F)\Delta t$ for time step $\Delta t$ and for $n$ modules is approximately $p_u^n(F) = n k_u(F)\Delta t$ for sufficiently small $\Delta t$. Unfolding is determined by generating a random number $r$ between 0 and 1: if $r < p_u^n(F)$, unfolding occurs and the contour length of the whole polyprotein is increased by the difference in length between a folded and unfolded module. It is vital that a sufficiently small time step is chosen in order to integrate the forces accurately and to ensure that the probability of unfolding in a particular step remains well below 1.0.
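The procedure can be sketched compactly. The code below is a simplified illustration under stated assumptions, not the implementation used for the results in this review: it uses the standard worm-like chain interpolation formula for the force, ignores cantilever compliance (the molecular extension is taken to be pulling speed multiplied by time), and continues until every module has unfolded instead of modelling detachment. The persistence length (0.4 nm), folded length per module (4 nm), contour length gain per unfolding event (28 nm) and step size are illustrative defaults chosen here.

```python
import math
import random

KBT = 4.114  # k_B*T in pN nm at about 298 K (assumed temperature)


def wlc_force(x_nm, contour_nm, persistence_nm=0.4):
    """Worm-like chain interpolation formula; returns the force in pN."""
    r = min(x_nm / contour_nm, 0.999)  # clamp to stay clear of the singularity at r = 1
    return (KBT / persistence_nm) * (0.25 / (1.0 - r) ** 2 - 0.25 + r)


def simulate_pull(n_modules, speed_nm_s, k0_per_s, xu_nm,
                  folded_len_nm=4.0, delta_len_nm=28.0,
                  step_nm=0.01, rng=None):
    """Simulate one constant-speed pull; return the unfolding forces in pN."""
    if rng is None:
        rng = random.Random()
    dt = step_nm / speed_nm_s                 # time step set by the pulling speed
    contour = n_modules * folded_len_nm       # all modules folded at the start
    folded, x, forces = n_modules, 0.0, []
    while folded > 0:
        x += step_nm
        force = wlc_force(x, contour)
        exponent = min(force * xu_nm / KBT, 700.0)  # guard against overflow in exp
        # Probability that any one of the folded modules unfolds in this step.
        p_unfold = folded * k0_per_s * math.exp(exponent) * dt
        if rng.random() < p_unfold:
            forces.append(force)
            folded -= 1
            contour += delta_len_nm           # chain lengthens by one unfolded module
    return forces
```

Averaging the forces returned by many such pulls at each speed gives the simulated force against pulling speed relationship that is compared with experiment. The step size should be reduced if `p_unfold` is found to approach 1 at typical unfolding forces.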

The above procedure may be repeated to generate as many simulated traces as desired, over a range of pulling speeds. Calculating the average force at each pulling speed allows one to generate a plot of force against the (natural) logarithm of pulling speed just as for the experimental data. By varying the input parameters $x_u$ and $k_u$, one can iteratively improve the agreement between experimental and simulated data. The value of $x_u$ gives an indication of how sensitive the protein is to force: a protein with a large $x_u$ will be more sensitive to force and the slope of force against the logarithm of pulling speed will be shallower; $k_u$ is related to the intercept of this plot: a large $k_u$ will allow unfolding at lower forces, hence reduce the intercept.

Clearly the parameters $x_u$ and $k_u$ provide a useful, if simplified, interpretation of a protein's energy landscape, and allow one to compare the mechanical strength of different proteins [35]. Yet the strength of any such conclusions depends on having some estimate of the uncertainty in the estimated parameters. This is difficult to obtain since the fit to the data is quite time consuming in the first instance, even if automated; consequently, this has rarely been considered in the literature. We present here a short study of the relation of the goodness of fit to the input parameters $x_u$ and $k_u$.

Firstly, we show how the fit varies with systematic variation of the input parameters. By varying $x_u$ and $k_u$ on a logarithmic grid of values, a synthetic dataset of unfolding force as a function of pulling speed may be generated. The goodness of fit may be estimated by calculating $\chi^2$ for each pair of values ($x_u$, $k_u$), as defined in Eq. (6.2). In this sum, over the $N$ experimental data points, $F_{\mathrm{exp}}(\ln v_i)$ is the experimental force for pulling speed $v_i$, $F_{\mathrm{sim}}(\ln v_i; x_u, k_u)$ is the force predicted from the simulations at that speed using ($x_u$, $k_u$), and $\sigma_i$ is the standard deviation of point $i$. This standard deviation was approximated as 20 pN for all data points.

$$\chi^2(x_u, k_u) = \sum_{i=1}^{N} \left( \frac{F_{\mathrm{exp}}(\ln v_i) - F_{\mathrm{sim}}(\ln v_i; x_u, k_u)}{\sigma_i} \right)^2 \qquad (6.2)$$

If the errors are normally distributed, the probability of the dataset for a given set of parameters is given by $p(x_u, k_u) \propto \exp(-\chi^2/2)$. A plot of $\exp(-\chi^2/2)$ for a titin I27 (TI I27) dataset is shown in Fig. 6(a): large values correspond to a "good" or "likely" fit to the data. It is clear that the two Monte Carlo parameters are strongly coupled in the fit. Choosing four parameter pairs of similar likelihood and plotting the simulated data together with the experimental data gives an illustration of the uncertainty in the fit (Fig. 6(b) and (c)). In the case of TI I27, there is clearly a relatively small range of parameters which will fit the data well; however, a similar calculation for TNfn3, which has an $x_u$ of around 8 Å (data not shown), shows that the parameters which acceptably describe the data vary over a wide range (rate constants vary over orders of magnitude). Thus, the sensitivity of the fit is much poorer for proteins with a large $x_u$ (smaller slope of $F$ against $\ln v$).
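The grid evaluation of Eq. (6.2) can be sketched as below. The argument `model` is a hypothetical callable standing in for whatever returns the simulated mean force at pulling speed v for a given parameter pair (in the approach described here, an average over Monte Carlo pulls); the function names are likewise assumptions. The default standard deviation of 20 pN is the value quoted above.

```python
import numpy as np


def chi_squared(f_exp, f_sim, sigma=20.0):
    """Eq. (6.2) with the same standard deviation sigma for every data point."""
    f_exp = np.asarray(f_exp, dtype=float)
    f_sim = np.asarray(f_sim, dtype=float)
    return float(np.sum(((f_exp - f_sim) / sigma) ** 2))


def likelihood_grid(speeds, f_exp, xu_values, ku_values, model, sigma=20.0):
    """Relative likelihood exp(-chi^2 / 2) evaluated on an (x_u, k_u) grid."""
    like = np.empty((len(xu_values), len(ku_values)))
    for i, xu in enumerate(xu_values):
        for j, ku in enumerate(ku_values):
            f_sim = [model(v, xu, ku) for v in speeds]
            like[i, j] = np.exp(-0.5 * chi_squared(f_exp, f_sim, sigma))
    return like
```

Plotting the returned array against the two parameter axes gives a surface of the kind shown in Fig. 6(a); a long diagonal ridge indicates that the two parameters are strongly coupled.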

In practice, an estimate of the errors on $x_u$ and $k_u$ may be made using the data themselves: if there is a reasonable number of data points, an error estimation may be made by a Monte Carlo process using a procedure known as 'bootstrapping with replacement' (not to be confused with the simulations of AFM data described earlier). From the $N$ original data points, a number (e.g. 30–100) of synthetic datasets $D_i$ are generated by selecting $N$ data points randomly with replacement (i.e. the same data point may be selected more than once). Each such dataset could reasonably have been generated by the force mode AFM experiment. If these sets are each fitted to a pair $(x_u, k_u)_i$, then the uncertainty in the parameters may be estimated as the standard deviation of the parameters found from the synthetic sets [55].
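A minimal sketch of the resampling step follows. The argument `fit` is a hypothetical callable that takes resampled speeds and forces and returns the fitted parameters; in the procedure described above it would be the Monte Carlo fit returning ($x_u$, $k_u$), but any fitting routine can be passed in. The function name and the fixed random seed are assumptions.

```python
import numpy as np


def bootstrap_errors(speeds, forces, fit, n_sets=100, seed=0):
    """Standard deviation of fitted parameters over resampled datasets."""
    rng = np.random.default_rng(seed)
    speeds = np.asarray(speeds, dtype=float)
    forces = np.asarray(forces, dtype=float)
    n = speeds.size
    params = []
    for _ in range(n_sets):
        idx = rng.integers(0, n, size=n)   # draw N points with replacement
        params.append(fit(speeds[idx], forces[idx]))
    return np.std(np.asarray(params, dtype=float), axis=0, ddof=1)
```

Because each synthetic dataset requires a complete fit, the cost scales with the number of resampled sets, which is one reason for the modest number (30–100) suggested above.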

7. How can force spectroscopy aid our understanding of protein folding?

Atomic force microscopy is now sufficiently accessible to the non-specialist that it can be considered as a standard tool for studying protein unfolding, allowing a range of problems to be explored. Most obviously, force, rather than chemical denaturant, acts to unfold the protein, potentially probing alternative regions of the energy landscape to those sampled in conventional bulk solution studies. For many proteins which need to resist force for their function, such as the immunoglobulin (Ig) and fibronectin type III (fnIII) domains from the muscle protein titin, this may be the more relevant method for studying their unfolding kinetics. There are, however, many other questions to address, even for proteins with no physiological need for mechanical strength.

Fig. 6. Evaluating the errors in the Monte Carlo fits. (a) Likelihood function $\exp(-\chi^2/2)$ for the fit of an experimental Titin I27 dataset by a Monte Carlo simulation described by parameters ($x_u$, $k_u$). (b) Magnification showing points of similar likelihood A–D chosen for illustration. (c) Plot of best fit line, together with fits chosen from (b) to demonstrate the range of reasonable fits. Note that for TI I27 "likely" values for $k_u$ vary over ∼1 order of magnitude and values of $x_u$ by over 10%. Reproduced from [50] with permission.

Is resistance to force a special property, which exists only in proteins which need it for their function? Of the systems studied so far, it seems as though the proteins with truly no requirement to resist force (the enzymes T4 lysozyme [34] and barnase [25]) do generally unfold at lower forces than those with mechanical functions such as the Ig and fnIII domains of titin [23,33,56], and fnIII domains from tenascin [21] and fibronectin [57]. However, the cytoskeletal protein spectrin, which as a structural protein arguably needs some tensile strength, unfolds at similar forces to T4 lysozyme [22]. Thus, mechanical strength is not unique to proteins with mechanical function and in fact may exist to a similar degree in proteins with no such function.

What is noticeable from the examples given, however, is that the proteins with the highest mechanical strength tend to be all-β proteins, in which the β-sheet hydrogen bonding acts to resist mechanical unfolding. Of those unfolding at lower forces, spectrin is α-helical and T4 lysozyme is predominantly α-helical and the β-sheet of barnase does not stabilise the core of the protein against mechanical disruption. Is mechanical strength then determined mainly by the fold? This is doubtless a very important factor, but the variation in mechanical strength of proteins with a very similar fold and structure (for example, the I27 and I28 domains of titin, or the fnIII domains in fibronectin and titin) emphasises the important modulating effect of the sequence on mechanical strength.

Force mode AFM, as a single-molecule experiment, is especially suitable for comparison with simulations of unfolding, which also deal with single molecules. Since force acts along a single dimension, the unfolding potential may be described in terms of a well-defined structural coordinate, namely end-to-end length, facilitating comparison with simulations; for conventional bulk denaturation, the reaction coordinate must be indirectly inferred. This is important because molecular dynamics simulations may be able to supply details about the unfolding process not available from the experiment where the reaction co-ordinate is simply distance. The level of detail that can be obtained by molecular dynamics and other simulations is demonstrated by work on TI I27 [58–61]. Despite differences in methodology all these simulations broadly agree that mechanical resistance of TI I27 to force is in some part related to the protein's topology and the existence of a so-called 'mechanical clamp' involving the A′ and G strands. Moreover, the presence of an unfolding intermediate identified experimentally was also observed by steered molecular dynamics and was due to the peeling-off of a strand outside of the clamp region in the simulation. This simulation was supported by mechanical unfolding experiments on mutated forms of TI I27 [30,32]. The synergy between experiment and simulation is thus powerful and will be a very important feature of this work in the future.

One of the most frustrating aspects of these single-molecule unfolding experiments is that, in principle, they should allow observation of rare events or of separate populations of molecules that respond differently to force. Yet it is probable that, in trying to be consistent in picking data, traces are discarded that would give an opportunity to address these questions. Two possibilities exist to resolve this ambiguity. The first would be to develop a method to systematically analyse all traces, probably automatically, to pick up rare events. Alternatively, it should be possible to pull the same protein molecule multiple times without releasing it from the tip. Any unusual events observed would be genuine and not the result of "non-specific" interactions with other molecules on the surface. Furthermore, many more traces will have to be analysed at any given pulling speed to be certain that small populations of molecules will be detected in a histogram of forces or of distances between peaks. That this will be worthwhile has been demonstrated for TI I27, where small populations of misfolded proteins have been observed [62] and detailed analysis of the leading edge of initial unfolding events revealed the presence of an unfolding intermediate [30] (Fig. 7).

Fig. 7. Detailed analysis of force traces can reveal new features. A feature of the force–distance traces of TI I27 is the existence of a shoulder on the side of the force peaks in certain traces, causing a deviation from the fit of a worm-like chain model. The deviation is greatest for the earlier peaks in the trace; an example is shown for a trace recorded from a TI I27 octameric construct [25]. In I27, this was explained by the detachment of the A-strand from the body of the protein at approximately 100 pN, causing a slight lengthening, an observation supported by molecular dynamics simulations [30].
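For reference, the worm-like chain fit mentioned in the caption is commonly evaluated with the standard interpolation formula F(x) = (k_BT/p)[1/(4(1 − x/L)^2) − 1/4 + x/L], where p is the persistence length and L the contour length. The sketch below is an illustration only: the function names are hypothetical, and the parameter values in the example are assumptions rather than values taken from this work. Points on a shoulder would show up as systematic positive residuals against the fitted curve.

```python
K_B_T_PN_NM = 4.11  # thermal energy near 298 K, in pN nm


def wlc_force(extension_nm: float, contour_length_nm: float,
              persistence_length_nm: float, kbt: float = K_B_T_PN_NM) -> float:
    """Worm-like chain force (pN) from the standard interpolation formula."""
    if not 0.0 <= extension_nm < contour_length_nm:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    rel = extension_nm / contour_length_nm
    return (kbt / persistence_length_nm) * (0.25 / (1.0 - rel) ** 2 - 0.25 + rel)


def wlc_residuals(extensions_nm, forces_pn, contour_length_nm, persistence_length_nm):
    """Measured force minus WLC prediction at each extension (pN)."""
    return [
        f - wlc_force(x, contour_length_nm, persistence_length_nm)
        for x, f in zip(extensions_nm, forces_pn)
    ]


if __name__ == "__main__":
    # Illustrative (assumed) parameters: L = 30 nm, p = 0.4 nm.
    for x in (5.0, 15.0, 25.0):
        print(x, round(wlc_force(x, 30.0, 0.4), 2))
```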

8. Comparing traditional folding studies with those performed using force spectroscopy

How far can the data obtained from a force mode AFM experiment be compared to data obtained from more established methods? In traditional studies unfolding is typically initiated by rapidly mixing a native protein solution with a solution of denaturant. Unfolding is then followed using a sensitive probe of change in environment (usually intrinsic tryptophan fluorescence). The unfolding rate constant at any concentration of denaturant can be measured directly with high accuracy. To obtain the unfolding rate constant at 0 M denaturant it is possible to extrapolate directly. This requires neither fitting to a complex model nor Monte Carlo simulation. However, the longer the extrapolation, the greater the error associated with the extrapolated value of $k_u^0$. Unlike in an AFM experiment, though, it is possible to improve the extrapolation by varying the experimental conditions. If, for instance, the protein only unfolds at high concentrations of urea (equal to 6 M) so that the extrapolation is long and data are collected over only a small range of denaturant concentrations, it is possible to use a stronger denaturant such as guanidinium chloride to address both these problems. The other variable that can be determined from chemical unfolding data is an unfolding $m$-value, determined from the dependence of the unfolding rate constant on denaturant concentration. This can be used to determine $\beta_T$, analogous to a Tanford $\beta$-value, which gives an estimate of the position of the transition state along the folding co-ordinate, in terms of solvent exposure, relative to the native state [63]. In AFM experiments, the analogous measure of the distance between the native state and the unfolding transition state is $x_u$. It is not clear how $x_u$ and the unfolding $m$-value are related, if at all, and there is certainly not likely to be a simple relationship between them, although where these have been compared for TI I27 it is possible to say, in both cases, that the transition state is close to the native state [23]. When using force mode AFM neither $k_u^0$ nor $x_u$ can be obtained with as much confidence as the corresponding values from a set of stopped-flow experiments. For a protein like TI I27 that has a strong dependence of force on pulling speed, data are collected over a wide range of forces (>100 pN) and errors are at a minimum. However, for TNfn3 the dependence of force on pulling speed is significantly lower, leading to a significantly lower range of pulling forces (∼30 pN) with a resultant higher error on estimates of both $k_u^0$ and $x_u$ (Fig. 8).

There are important differences between these experimental techniques. In the unfolding experiments using denaturant there is an implicit understanding that the principle of microscopic reversibility can be applied. It is possible to study unfolding and refolding under a wide range of conditions and to test that this microscopic reversibility holds directly. This is


Fig. 8. Where data can only be collected over a narrow range the confidence in estimates of $k_u$ and $x_u$ is significantly lower. The AFM data are displayed as in a more traditional protein unfolding experiment, with denaturant (in this case, force) on the x-axis and a parameter that is related to the unfolding rate (pulling speed) on the y-axis. In effect, a Monte Carlo simulation extrapolates from the experimental range of forces to zero force. Just as in determining $k_u^0$ where a chemical denaturant is used, the quality of such an extrapolation depends critically on the experimental range of the data, the certainty with which the dependence of $k_u$ on denaturant (or in this case, the dependence of force on pulling speed) can be determined, and the length of the required extrapolation to zero denaturant. Thus, where there is a large dependence of force on pulling speed (e.g. TI I27, range of forces at experimental pulling speeds >100 pN, open circles) the unfolding rate at zero force (proportional to the pulling speed at F = 0) can be determined with relative confidence. However, where the range of measured forces is smaller, and close to the error in the measurements (TNfn3, range of forces at experimental pulling speeds ∼30 pN, filled circles) there is significant uncertainty in the determined values of $k_u^0$. Unfortunately, unlike denaturant unfolding experiments, the range over which data are collected cannot be easily extended in an AFM experiment.

clearly not true in an AFM unfolding experiment. It is not possible to study refolding under applied force due to the low probability of refolding occurring in the experimental range of forces. Refolding rates at zero force, while the protein remains attached to the cantilever, have been determined using force mode AFM [23]. However, it is quite probable that this refolding reaction will not be the reverse of the mechanical unfolding. A force imposes a directional bias on the pathway of unfolding and this will not be true of refolding in the absence of force. This means that free energies cannot be calculated from AFM data, whereas in traditional experiments a $\Delta G_{\mathrm{unfolding}}$ can often be determined from the ratio of unfolding and refolding rate constants. To determine the effect of mutation on the free energy of a protein, more traditional equilibrium measurements will have to be used.
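The traditional analysis described in this section can be sketched as follows. This is a minimal illustration under two assumptions that go beyond the text: that ln k_u varies linearly with denaturant concentration (so the intercept is ln k_u^0 and the slope is the kinetic m-value), and that folding is two-state, so that ΔG_unfolding = −RT ln(k_u/k_f). All function names are hypothetical. The standard error of the intercept grows with the distance between the data and 0 M, which is the sense in which a long extrapolation carries more error.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def extrapolate_ln_k(denaturant_molar, ln_k):
    """Least-squares fit of ln k = ln k0 + m*[D].

    Returns (ln_k0, m, standard_error_of_ln_k0).
    """
    n = len(denaturant_molar)
    if n < 3 or n != len(ln_k):
        raise ValueError("need at least three matched data points")
    mean_x = sum(denaturant_molar) / n
    mean_y = sum(ln_k) / n
    sxx = sum((x - mean_x) ** 2 for x in denaturant_molar)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(denaturant_molar, ln_k))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(denaturant_molar, ln_k))
    residual_variance = rss / (n - 2)
    # The intercept error grows as the data sit further from 0 M (larger mean_x)
    # and as the sampled concentration range narrows (smaller sxx).
    se_intercept = math.sqrt(residual_variance * (1.0 / n + mean_x**2 / sxx))
    return intercept, slope, se_intercept


def delta_g_unfolding(k_unfold: float, k_fold: float,
                      temperature_k: float = 298.0) -> float:
    """Two-state free energy of unfolding (kcal/mol) from the rate constants."""
    return -R_KCAL * temperature_k * math.log(k_unfold / k_fold)


if __name__ == "__main__":
    # Synthetic data (assumed): ln k0 = -8, m = 1 per molar, same scatter in both sets.
    noise = [0.05, -0.05, 0.05, -0.05, 0.05]
    narrow = [6.0, 6.5, 7.0, 7.5, 8.0]
    wide = [2.0, 3.5, 5.0, 6.5, 8.0]
    for conc in (narrow, wide):
        ln_k = [-8.0 + 1.0 * c + e for c, e in zip(conc, noise)]
        print(extrapolate_ln_k(conc, ln_k))
```

With identical scatter, the data set confined to 6–8 M gives a much larger uncertainty in ln k_u^0 than the set spanning 2–8 M, which mirrors the argument for switching to conditions that widen the accessible range.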

Perhaps the most important question that needs to be addressed is whether unfolding by force and by chemical denaturant are the same: if they are always identical, are we learning anything more from force spectroscopy studies? The evidence so far for the systems that have been studied suggests that they are probably different. Although the unfolding rates of wild-type TI I27 extrapolated to zero force and to zero denaturant are tantalisingly similar, comparison of MD simulations [58,61], a detailed study by mutagenesis [64] and AFM studies on mutants of I27 [26,32] suggests that the pathways are not the same. Force and denaturant act in completely different ways, denaturant tending to favour expanded states with greater solvent accessibility, with force being directional and tending to unravel the protein from its termini. This has also been seen for a molecular dynamics simulation of the unfolding of barnase by thermal and mechanical means [25]. Therefore, force mode AFM is indeed a complementary tool that can probe alternative reaction pathways that in some cases (such as titin) may be physiologically relevant.

9. Conclusion

Force mode AFM is a new, accessible and exciting tool for protein folders. Just as in more traditional protein folding experiments using, for example, tryptophan fluorescence studies, mechanical unfolding will only give us detailed information when combined with other techniques, such as mutation, changing solution conditions (pH, temperature and addition of salts, denaturants and ligands) and, hopefully in the near future, combined with other detection methods such as single-molecule fluorescence experiments. The usefulness of combining data from force spectroscopy measurements with those obtained from more established techniques should not be underestimated. There is also the direct opportunity to work closely with


theoreticians to analyse the dynamics of the system and to devise experiments to specifically test predictions from theory. More detailed analysis of the experimental data has the potential to allow us to investigate rare events and low populations of folded states. Hopefully, the technology will develop to allow the economical production of small, soft cantilevers and to provide more sensitive instruments with a larger accessible range of pulling speeds and forces yet maintain ease of use so that the non-specialist can make use of the technique.

This field of single-molecule mechanical unfolding is still in its infancy. It has already revealed new and exciting information about proteins and polymers. Mechanical unfolding studies using different instruments on a larger set of mutated proteins and on different protein families with different topologies coupled with the information gained from simulations are sure to reveal even more exciting information on the way proteins behave under force.

Acknowledgements

Cambridge: R.B. Best is funded by a Mandela fellowship from the Cambridge Commonwealth Trust. J.L. Toca-Herrera and J. Clarke are funded by the Wellcome Trust. We acknowledge support from the MRC. J. Clarke is a Wellcome Trust Senior Research Fellow. Leeds: We would like to thank Godfrey Beddard, Peter Olmsted, John Trinick and Rebecca Zinober at Leeds University for many helpful discussions. D.J. Brockwell is funded by the University of Leeds and BBSRC, A.W. Blake is funded by the EPSRC and S.E. Radford is a BBSRC Professorial Research Fellow. A.W. Blake, D.J. Brockwell, S.E. Radford and D.A. Smith are members of the Astbury Centre for Structural Molecular Biology which is part of the North of England Structural Biology Centre (NESBIC) and is funded by the BBSRC.

References

[1] G. Binnig, C.F. Quate, C. Gerber, Phys. Rev. Lett. 56 (1986) 930.

[2] N.A. Burnham, R.J. Colton, J. Vac. Sci. Technol. A 7 (1989) 2906.

[3] A. Raab, W.H. Han, S.J. Smith-Gill, S.M. Lindsay, H. Schindler, P. Hinterdorfer, Nat. Biotech. 17 (1999) 902.

[4] E.L. Florin, V.T. Moy, H.E. Gaub, Science 264 (1994) 415.

[5] G.U. Lee, L.A. Chrisey, R.J. Colton, Science 266 (1994) 771.

[6] M. Rief, F. Oesterhelt, B. Heymann, H.E. Gaub, Science 275 (1997) 1295.

[7] E. Evans, K. Ritchie, R. Merkel, Biophys. J. 68 (1995) 2580.

[8] A. Ashkin, J.M. Dziedzic, Science 235 (1987) 1517.

[9] F. Amblard, B. Yurke, A. Pargellis, S. Leibler, Rev. Sci. Instrum. 67 (1996) 818.

[10] L. Tskhovrebova, J. Trinick, J.A. Sleep, R.M. Simmons, Nature 387 (1997) 308.

[11] M. Rief, M. Gautel, F. Oesterhelt, J.M. Fernandez, H.E. Gaub, Science 276 (1997) 1109.

[12] M.S.Z. Kellermayer, S.B. Smith, H.L. Granzier, C. Bustamante, Science 276 (1997) 1112.

[13] J. Fritz, A.G. Katopodis, F. Kolbinger, D. Anselmetti, Proc. Natl. Acad. Sci. U.S.A. 95 (1998) 12283.

[14] P. Carl, C.H. Kwok, G. Manderson, D.W. Speicher, D.E. Discher, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 1565.

[15] E. Evans, A. Leung, D. Hammer, S. Simon, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 3784.

[16] M. Carrion-Vazquez, A.F. Oberhauser, T.E. Fisher, P.E. Marszalek, H.B. Li, J.M. Fernandez, Prog. Biophys. Mol. Biol. 74 (2000) 63.

[17] T.E. Fisher, A.F. Oberhauser, M. Carrion-Vazquez, P.E. Marszalek, J.M. Fernandez, Trends Biochem. Sci. 24 (1999) 379.

[18] K. Mitsui, M. Hara, A. Ikai, FEBS Lett. 385 (1996) 29.

[19] T. Wang, A. Ikai, Jpn. J. Appl. Phys. 1 38 (1999) 3912.

[20] A. Idiris, M.T. Alam, A. Ikai, Protein Eng. 13 (2000) 763.

[21] A.F. Oberhauser, P.E. Marszalek, H.P. Erickson, J.M. Fernandez, Nature 393 (1998) 181.

[22] M. Rief, J. Pascual, M. Saraste, H.E. Gaub, J. Mol. Biol. 286 (1999) 553.

[23] M. Carrion-Vazquez, A.F. Oberhauser, S.B. Fowler, P.E. Marszalek, S.E. Broedel, J. Clarke, J.M. Fernandez, Proc. Natl. Acad. Sci. U.S.A. 96 (1999) 3694.

[24] P.F. Lenne, A.J. Raae, S.M. Altmann, M. Saraste, J.K.H. Horber, FEBS Lett. 476 (2000) 124.

[25] R.B. Best, B. Li, A. Steward, V. Daggett, J. Clarke, Biophys. J. 81 (2001) 2344.

[26] D.J. Brockwell, G.S. Beddard, J. Clarkson, R.C. Zinober, A.W. Blake, J. Trinick, P.D. Olmsted, D.A. Smith, S.E. Radford, Biophys. J. 83 (2002) 458.

[27] J.R. Potts, I.D. Campbell, Matrix Biol. 15 (1996) 313.

[28] K.A. Scott, A. Steward, S.B. Fowler, J. Clarke, J. Mol. Biol. 315 (2002) 819.

[29] S. Labeit, B. Kolmerer, Science 270 (1995) 293.

[30] P.E. Marszalek, H. Lu, H.B. Li, M. Carrion-Vazquez, A.F. Oberhauser, K. Schulten, J.M. Fernandez, Nature 402 (1999) 100.

[31] H.B. Li, M. Carrion-Vazquez, A.F. Oberhauser, P.E. Marszalek, J.M. Fernandez, Nat. Struct. Biol. 7 (2000) 1117.

[32] S.B. Fowler, R.B. Best, J.L. Toca-Herrera, T.J. Rutherford, A. Steward, E. Paci, M. Karplus, J. Clarke, J. Mol. Biol. 322 (2002) 841.


[33] H. Li, A.F. Oberhauser, S.B. Fowler, J. Clarke, J.M. Fernandez, Proc. Natl. Acad. Sci. U.S.A. 97 (2000) 6527.

[34] G. Yang, C. Cecconi, W.A. Baase, I.R. Vetter, W.A. Breyer, J.A. Haack, B.W. Matthews, F.W. Dahlquist, C. Bustamante, Proc. Natl. Acad. Sci. U.S.A. 97 (2000) 139.

[35] R.B. Best, J. Clarke, Chem. Comm. (2002) 183.

[36] A. Steward, J.L. Toca-Herrera, J. Clarke, Protein Sci. 11 (2002) 2179.

[37] A.S. Politou, M. Gautel, C. Joseph, A. Pastore, FEBS Lett. 352 (1994) 27.

[38] S.J. Hamill, A.E. Meekhof, J. Clarke, Biochemistry 37 (1998) 8071.

[39] A.E. Meekhof, S.J. Hamill, V.L. Arcus, J. Clarke, S.M.V. Freund, J. Mol. Biol. 282 (1998) 181.

[40] B. Cappella, G. Dietler, Surf. Sci. Rep. 34 (1999) 1.

[41] J.L. Hutter, J. Bechhoefer, Rev. Sci. Instrum. 64 (1993) 1868.

[42] E.L. Florin, M. Rief, H. Lehmann, M. Ludwig, C. Dornmair, V.T. Moy, H.E. Gaub, Biosens. Bioelectron. 10 (1995) 895.

[43] V. Hlady, J. Buijs, in: M. Malmsten (Ed.), Biopolymers at Interfaces, Marcel Dekker, New York, 1998, pp. 181.

[44] M. Grandbois, M. Beyer, M. Rief, H. Clausen-Schaumann, H.E. Gaub, Science 283 (1999) 1727.

[45] R. Merkel, P. Nassoy, A. Leung, K. Ritchie, E. Evans, Nature 397 (1999) 50.

[46] M.B. Viani, T.E. Schaffer, A. Chand, M. Rief, H.E. Gaub, P.K. Hansma, J. Appl. Phys. 86 (1999) 2258.

[47] R.C. Zinober, D.J. Brockwell, G.S. Beddard, A.W. Blake, P.D. Olmsted, S.E. Radford, D.A. Smith, Protein Sci. 11 (2002) 2759.

[48] E. Evans, K. Ritchie, Biophys. J. 72 (1997) 1541.

[49] E. Evans, K. Ritchie, Biophys. J. 76 (1999) 2439.

[50] R.B. Best, S.B. Fowler, J.L. Toca-Herrera, J. Clarke, Proc. Natl. Acad. Sci. U.S.A. 99 (2002) 12143.

[51] A.F. Oberhauser, P.K. Hansma, M. Carrion-Vazquez, J.M. Fernandez, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 468.

[52] G.I. Bell, Science 200 (1978) 618.

[53] D.E. Makarov, P.K. Hansma, H. Metiu, J. Chem. Phys. 114 (2001) 9663.

[54] M. Rief, J.M. Fernandez, H.E. Gaub, Phys. Rev. Lett. 81 (1998) 4764.

[55] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C, Cambridge University Press, Cambridge, UK, 1992.

[56] M. Rief, M. Gautel, A. Schemmel, H.E. Gaub, Biophys. J. 75 (1998) 3008.

[57] Y. Oberdorfer, H. Fuchs, A. Janshoff, Langmuir 16 (2000) 9955.

[58] H. Lu, B. Isralewitz, A. Krammer, V. Vogel, K. Schulten, Biophys. J. 75 (1998) 662.

[59] H. Lu, K. Schulten, Biophys. J. 79 (2000) 51.

[60] D.K. Klimov, D. Thirumalai, Proc. Natl. Acad. Sci. U.S.A. 96 (1999) 6166.

[61] E. Paci, M. Karplus, Proc. Natl. Acad. Sci. U.S.A. 97 (2000) 6521.

[62] A.F. Oberhauser, P.E. Marszalek, M. Carrion-Vazquez, J.M. Fernandez, Nat. Struct. Biol. 6 (1999) 1025.

[63] A.R. Fersht, Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding, Freeman, New York, 1998.

[64] S.B. Fowler, J. Clarke, Structure 9 (2001) 355.