Crystal and NMR structures of a Trp-cage mini-protein benchmark for computational fold prediction

Michele Scian^a,1, Jasper C. Lin^a, Isolde Le Trong^b,c, George I. Makhatadze^d, Ronald E. Stenkamp^b,c,e,1, and Niels H. Andersen^a,1

^a Department of Chemistry, University of Washington, Seattle, WA 98195; ^b Department of Biological Structure, University of Washington, Seattle, WA 98195; ^c Biomolecular Structure Center, University of Washington, Seattle, WA 98195; ^d Department of Biology, Rensselaer Polytechnic Institute, Troy, NY 12180; and ^e Department of Biochemistry, University of Washington, Seattle, WA 98195

Edited by Alan R. Fersht, Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom, and approved June 25, 2012 (received for review December 26, 2011)

To provide high-resolution X-ray crystallographic structures of a peptide with the Trp-cage fold, we prepared a cyclized version of this motif. Cyclized Trp-cage is remarkably stable and afforded two crystal forms suitable for X-ray diffraction. The resulting higher resolution crystal structures validate the prior NMR models and provide explanations for experimental observations that could not be rationalized by NMR structural data, including the structural basis for the increase in fold stability associated with motif cyclization and the manner in which a polar serine side chain is accommodated in the hydrophobic interior. A hexameric oligomer of the cyclic peptide is found in both crystal forms and indicates that under appropriate conditions, this minimized system may also serve as a model for protein–protein interactions.

protein–protein interaction ∣ folding simulation target ∣ protein cyclization ∣ peptide oligomerization ∣ H-bonding motifs

Trp-cage miniproteins (at 18–20 residue length) are among the smallest peptides to fold into a stable protein-like structure (1, 2); as a result, the Trp-cage fold has emerged as a protein folding paradigm (3, 4). Trp-cage species have been extensively studied experimentally (4–11) and computationally (12–19), but the conclusions regarding the folding process and results drawn from these studies are not in accord at the level that should be attainable for such a small system.

The structural model used for testing computational folding methods has, in almost all cases, been that derived by NMR for the initially reported Trp-cage (TC5b), which displays a rather low fold stability, Tm = 43 °C (1). Additional Trp-cage structures based on NMR measurements are now available (2, 6, 10, 20–22). All of these appear to possess a common fold topology that is supported by an extensive set of structuring-induced shifts (CSDs, chemical shift deviations from random coil norms) that are largely the result of ring current effects. The uniformity of these CSDs suggests a well-defined common structure. However, other spectroscopic measurements of Trp-cage molecules have been interpreted as being consistent with a folded structure undergoing a greater degree of dynamic fluctuation (7, 8). In addition, “folded” structures from molecular dynamics simulations, with a few notable exceptions (18, 19), fail to predict the ring current effects observed by NMR (2). The failure of numerous calculations to reproduce the NMR structures and contrary conclusions from other experimental methods is problematic and points to outstanding issues concerning the continuing* use of a Trp-cage as a benchmark for molecular dynamics folding simulations. Further structural studies of Trp-cage species are required to validate the target and to ascertain the key interatomic interactions leading to the folded state.

Herein we report X-ray structure determinations for two crystal forms of a cyclized Trp-cage (23), cyclo-TC1 (-GDAYAQWLADGGPSSGRPPPSG-). Raising additional questions about the oligomeric state of the molecule, a Trp-cage hexamer is found in both crystal forms. However, the single chain structures observed in the solid-state (nine unique molecular structures in two crystal forms) are remarkably similar to solution-state NMR structures, validating this computational target and providing a more detailed view of the atomic interactions stabilizing the folded state. In addition, the hexameric state observed in the crystals may serve as a tractable model for computational studies of protein–protein interactions and oligomer formation.

Results

Models of Trp-cages derived from NMR measurements all display fraying at the N and C termini and considerable variability in the conformations of exposed side chains. Reasoning that this type of fluxionality might retard crystallization, we replaced Lys8 with a helix-favorable alanine residue and added a glycine residue to each terminus to set the stage for cyclization. The resulting acyclic peptide (TC1) could, analogously to an early study of BPTI (23), be cyclized using a peptide coupling agent under aqueous conditions. Cyclo-TC1 proved to be the most stable Trp-cage fold prepared to date, significantly more stable than our prior hyperstable construct (6) in which Gly10 and Gly15 had been mutated to D-Ala (TC16b, Tm = 83 °C, ΔGU^280 = 18.6 kJ/mol). Cyclo-TC1 displayed significant NH exchange protection at pD 7–8, with the largest protection factor (Gly11-HN) corresponding to ΔGU = 20.5 kJ/mol. NMR melts (see SI Appendix) support a two-state unfolding scenario with the folded state stabilized by circa 8 kJ/mol by the cyclizing loop.

Circular dichroism (CD) measurements provided an accurate determination of the melting point (Tm = 95 ± 1 °C). This corresponds to a ΔTm in excess of 35 °C versus the acyclic control (TC1, Tm = 58 °C) of identical sequence. Existing calibrations allow increases in Tm, as measured by CD melts, to be converted into ΔΔGU^280 values (2, 5, 6): ΔΔGU^cyc = 8.2 kJ/mol. The additional fold stability of cyclo-TC1 was even greater in the presence of denaturants; the CD melts for cyclo-TC1 throughout a GdmCl denaturation experiment appear in Fig. S1B in SI Appendix. The fold stabilization observed upon cyclization far exceeds that expected from entropy effects for a cyclic peptide of this size (24) and thus requires some structural rationale.
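To put the 8.2 kJ/mol stabilization in perspective, a free energy difference can be converted into a ratio of equilibrium constants through exp(ΔΔG/RT). The following is a minimal sketch under a two-state assumption at 280 K (the temperature implied by the ΔΔGU^280 notation); the function name `stability_ratio` is illustrative and is not taken from the paper.

```python
import math

# Gas constant in kJ/(mol*K).
R_KJ = 8.314462618e-3


def stability_ratio(ddg_kj_per_mol: float, temperature_k: float = 280.0) -> float:
    """Factor by which the folded/unfolded equilibrium constant changes
    for a stability difference ddg (two-state assumption)."""
    return math.exp(ddg_kj_per_mol / (R_KJ * temperature_k))


if __name__ == "__main__":
    # 8.2 kJ/mol is the cyclization stabilization quoted in the text.
    print(f"K(cyclic) / K(acyclic) at 280 K: {stability_ratio(8.2):.1f}")
```

Under these assumptions, the quoted stabilization corresponds to a folded/unfolded population ratio roughly 34 times larger for the cyclic peptide than for its acyclic control.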

Author contributions: N.H.A. designed research; M.S., J.C.L., I.L.T., and G.I.M. performed research; M.S., J.C.L., I.L.T., G.I.M., R.E.S., and N.H.A. analyzed data; and M.S., R.E.S. and N.H.A. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 3UC7, 3UC8, and 2LL5); the NMR chemical shifts have been deposited in the BioMagResBank, www.bmrb.wisc.edu (accession no. 18023).

*As detailed in the SI Appendix, at least 40 publications reporting either new experiments on Trp-cage species or the use of the Trp-cage as the target structure for computational folding or fold prediction methods appeared in 2011 and the first few months of 2012.

1 To whom correspondence may be addressed. E-mail: [email protected] or [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1121421109/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1121421109 PNAS ∣ July 31, 2012 ∣ vol. 109 ∣ no. 31 ∣ 12521–12525

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Both TC1 and cyclo-TC1 display all the diagnostic NMR CSDs of the Trp-cage fold; these were not altered upon introduction of the cyclization constraint (Fig. 1). Cyclo-TC1 also displayed the full complement of through space NOE connectivities of a Trp-cage. An NMR structure ensemble (Fig. 2B) was calculated based on the NOEs (Protein Data Bank ID code 2LL5, Biological Magnetic Resonance Bank entry 18023); it is compared to both an ensemble from the crystal structures (Fig. 2C) and a prior NMR ensemble of an acyclic Trp-cage (Fig. 2A) in Fig. 2 and Table S1 in SI Appendix. The NMR ensemble statistics appear in Table 1. Further details and the distance constraints employed appear in SI Appendix (Tables S2 and S3).

The NMR ensemble for cyclo-TC1 shows all the features identified in previous Trp-cage structures (Fig. 2A) including the packing of side chains around the tryptophan indole ring and the secondary structure units: a C-terminal poly-ProII helix, a short 3₁₀ helix (residues 12–14 have the classic ϕ/ψ values), and the N-terminal α helix. The latter appears to be significantly extended in the cyclic structure (Fig. 2B) with helical ϕ/ψ angles from Asp1 to Asp9. In acyclic structures, helical backbone values are observed only from ψ2 through ϕ9. An 8% increase in the residue molar ellipticity, −[θ]₂₂₂, confirms additional helicity. The side chains of Arg16 and Asp9 are close to each other, consistent with a potential for forming a salt-bridge interaction (1, 22). The cyclizing Gly–Gly linkage does not significantly alter the Trp-cage structure, but the NMR ensemble provided no information on the conformational preference, if any, within the connecting PSGGD loop.
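The ϕ/ψ values referred to above are backbone torsion angles, each defined by four consecutive atoms (C–N–Cα–C for ϕ and N–Cα–C–N for ψ). As an illustration of how such an angle is obtained from coordinates, here is a minimal pure-Python sketch; the helper names are hypothetical, and the input points are toy coordinates, not atoms from the deposited structures.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Remove the components of the outer bonds that lie along the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy geometry: the fourth point is rotated a quarter turn about the z axis.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For ϕ of residue i the four points would be C(i−1), N(i), Cα(i), C(i); for ψ they would be N(i), Cα(i), C(i), N(i+1).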

Two crystal forms (monoclinic and tetragonal) of cyclo-TC1 were obtained from sitting drop vapor-diffusion experiments. Due to residual acid in the peptide solutions from the HPLC purification protocol and the high peptide concentrations employed for crystallization, the pH during crystallization is estimated as 2–3; the aspartate side chain carboxyl units are likely to be protonated in the solid-state structures. The structure of the monoclinic form was obtained using one of the NMR models of cyclo-TC1 for molecular replacement calculations with Phaser (25). A trimeric structure from the monoclinic form served as a test structure for molecular replacement in the tetragonal crystal form; Table 2 contains data set statistics for both crystal forms. Both structures have been refined using Refmac-5 (26) in the CCP4 program package (27). The final refinement statistics are also given in Table 2, and coordinates and structure factors have been deposited with PDB ID codes 3UC7 and 3UC8. The resulting solid-state structure ensemble of nine crystallographically unique molecules (Fig. 2C) displays a backbone rmsd of 0.35 ± 0.10 Å over the residue 3–19 span and a similar agreement with the NMR ensemble. However, the crystal structures show less variation than do the NMR models; only five backbone torsion angles have standard deviations greater than ±10° (Table S1 in SI Appendix).
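Ensemble figures such as "0.35 ± 0.10 Å" are the mean and spread of rmsd values over all pairs of structures. The sketch below shows one way such a statistic could be assembled in plain Python. It assumes the models are already superimposed and contain the same atoms in the same order (the optimal-superposition step is omitted), and it uses the sample standard deviation, which may differ from the convention used by the authors; all names are illustrative.

```python
import itertools
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd_stats(models):
    """Mean and sample standard deviation of rmsd over all model pairs."""
    values = [rmsd(a, b) for a, b in itertools.combinations(models, 2)]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Toy two-atom "models" displaced along z; not real Trp-cage coordinates.
    toy = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
        [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
    ]
    mean, sd = pairwise_rmsd_stats(toy)
    print(f"pairwise rmsd = {mean:.2f} +/- {sd:.2f}")
```

For the nine crystallographic molecules this would involve 36 pairs, restricted to the backbone atoms of residues 3–19.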

Cyclo-TC1 exists as a dimer of trimers in both crystal forms (Fig. 3 and Fig. S2 in SI Appendix) with a number of close intermolecular contacts within the trimer units. A complete hexamer constitutes the asymmetric unit of the monoclinic crystal form. The two trimers in the hexamer are related by a noncrystallographic twofold rotation axis (Fig. 3). The two C3 axes are not aligned, so the point group symmetry of the hexamer is C2. The average solvent accessible surface area (28) for individual subunits in the absence of neighboring molecules is 1,676 Å². In isolated trimers, the accessible surface area for each subunit is 1,299 Å²; 380 Å² is buried per subunit in forming the trimer. An additional 1,572 Å² is buried in forming a hexamer from two trimers. Details of the inter-subunit contacts for the hexamers in the two crystal forms are provided in Table S4 in SI Appendix. These interactions include five inter-trimer hydrogen bonds. The subunit–subunit interfaces in the hexamer are somewhat larger than the crystal packing interfaces. In the monoclinic crystal form, each hexamer is in contact with 14 neighboring hexamers. The largest crystal-packing interface buries about 650 Å², while the next largest interface buries about 450 Å². In the tetragonal crystal form, with 13 neighboring hexamers, the largest inter-hexamer interface buries 787 Å², with 369 Å² buried in the next largest one.

Fig. 1. CSDs at 280 K for cyclo-TC1, its acyclic precursor (TC1), and TC16b (DAYAQWLADaGPASaRPPPS, a = D-Ala). The proton sites include all instances of CSDs greater than 0.4 ppm providing excellent coverage of the full sequence and including both upfield and downfield shifts. The deviations relative to TC16b at 13HN and 15HN reflect the Gly → D-Ala substitutions at Gly10 and Gly15 and the S13A mutation. The largest shifts (Gly11-Hα2, Pro18-Hα and Hβ3 and Σ(Pro19-Hδ2, Hδ3)) are employed for analyzing motif melting (2, 6).

Fig. 2. Trp-cage structure ensembles showing the complete backbones and the heavy atoms of side chains of Tyr3, Trp6, Leu7, Pro18, and Pro19. (A) The solution-state NMR ensemble (22) for TC10b (DAYAQWLKDGGPSSGRPPPS, the closest acyclic analog to cyclo-TC1 for which we have detailed structural information) is shown in gray with a representative cyclo-TC1 backbone from the NMR ensemble shown in bold black (model 20 from PDB ID code 2LL5). With the exception of a short span in the 3₁₀ helix, the cyclo-TC1 backbone lies within the acyclic structure ensemble over the common residue 3–19 span. (B) The solution-state NMR ensemble of cyclo-TC1 (intra-ensemble residue 3–19 backbone rmsds of 0.29 ± 0.10 Å). (C) Superposition of the nine independent molecules in the two crystal structures (intra-ensemble residue 3–19 backbone rmsds of 0.35 ± 0.10 Å). The inter-ensemble residue 3–19 backbone rmsds for matched numbers of structures are: (A) vs. (B), 0.45 ± 0.15 Å (versus intra-ensemble rmsds of 0.42 ± 0.14 Å for the TC10b ensemble), (B) vs. (C), 0.37 ± 0.10 Å. A complete backbone ϕ/ψ comparison for these and other Trp-cage structures appears as Table S1 in SI Appendix. Over the residue 3–19 span, the average phi and psi angles in ensemble (C) display an rmsd of 8.8° with the averages over all of the NMR structures for acyclic Trp-cages.

Table 1. NMR structure and refinement statistics

                                              Cyclo-TC1
NMR distance and dihedral constraints
  Distance constraints
    Total NOEs                                192
    Intra-residue                             79
    Inter-residue                             113
      Sequential (|i − j| = 1)                55
      Medium-range (|i − j| ≤ 4)              28
      Long-range (|i − j| ≥ 5)                30
    Intermolecular                            0
    Hydrogen bonds                            0
  Total dihedral angle restraints
    ϕ                                         0
    ψ                                         0
Structure statistics
  Violations (mean and SD)
    Distance constraints (Å)                  0.043 ± 0.036
    Dihedral angle constraints (°)            N/A
    Max. dihedral angle violation (°)         N/A
    Max. distance constraint violation (Å)    0.20
  Deviations from idealized geometry
    Bond lengths (Å)                          0.0036 ± 0.0001
    Bond angles (°)                           0.340 ± 0.014
    Impropers (°)                             0.176 ± 0.009
  Average pairwise rmsd* (Å)
    Heavy (residues 3–19)                     0.82 ± 0.16
    Backbone (residues 3–19)                  0.29 ± 0.10

*Pairwise rmsd was calculated among 33 refined structures.
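The NOE categories listed in Table 1 are defined by the sequence separation |i − j| between the two residues involved. The sketch below shows how a restraint list could be binned with those cutoffs and checks that the published counts are internally consistent. It treats the sequence as linear, which is an assumption: for a cyclic peptide such as cyclo-TC1 a contact across the ring closure could be counted differently. The function names are illustrative.

```python
from collections import Counter


def classify_restraint(i: int, j: int) -> str:
    """Bin an NOE restraint by sequence separation (Table 1 cutoffs)."""
    separation = abs(i - j)
    if separation == 0:
        return "intra-residue"
    if separation == 1:
        return "sequential"
    if separation <= 4:
        return "medium-range"
    return "long-range"


def tally_restraints(residue_pairs):
    """Count restraints per category for an iterable of (i, j) pairs."""
    return Counter(classify_restraint(i, j) for i, j in residue_pairs)


# Counts transcribed from Table 1.
TABLE_1 = {
    "total": 192,
    "intra-residue": 79,
    "inter-residue": 113,
    "sequential": 55,
    "medium-range": 28,
    "long-range": 30,
}

if __name__ == "__main__":
    inter = TABLE_1["sequential"] + TABLE_1["medium-range"] + TABLE_1["long-range"]
    print("inter-residue subtotal matches:", inter == TABLE_1["inter-residue"])
    # Hypothetical residue pairs, for illustration only.
    print(tally_restraints([(6, 6), (3, 4), (6, 10), (3, 19)]))
```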

Discussion

The monomeric state of the original Trp-cage (TC5b) in solution was confirmed (29) by analytical ultracentrifugation (AUC). The observation of oligomeric crystal structures for cyclo-TC1 prompted us to re-examine this issue for the solution-state conditions used in the NMR experiments. The radial distribution of cyclo-TC1 (as well as the acyclic TC1 control) in sedimentation equilibrium AUC experiments (see SI Appendix) could be fitted to a single species model and the calculated atomic mass corresponds to the monomeric state (30). Nonetheless, the NMR and crystal structures of cyclo-TC1 are in substantial agreement (Fig. 2).

Turning to differences between the NMR and crystal structure ensembles, perhaps the most notable one is the location of regions of structural variance. As previously noted, the least defined region in the NMR ensemble is the PSGGD loop; in the crystal structures this region is less variable with Gly-1 and Asp1 at helical ϕ/ψ values and Gly21 in a poly-ProII conformation. This confirms the helix extension suggested by the CD data and also provides a rationale for the enhanced stability of cyclo-TC1. Hydrogen bonding interactions are observed between Ala4-HN → O═C-Gly-1 and Tyr3-HN → O═C-Gly21 (Ala4-N⋯O═C-Gly-1 = 3.17 ± 0.05 Å, Tyr3-N⋯O═C-Gly21 = 2.81 ± 0.04 Å); these serve as efficient capping interactions that stabilize the α-helix and thus the entire fold (2, 5).

In all respects, the X-ray structures are in closer accord with the prior NMR ensembles for Trp-cages than with computed structures. Even though the Gly10 to Arg16 loop, including the 3₁₀ helix, is the sequence span with the greatest variability in the crystal structures, the Ser14 side chain conformation is nearly constant (χ1 = 72 ± 3°). This placement explains how this polar side chain function can be placed deep within the hydrophobic core. Two hydrogen bonding interactions (Fig. S3 in SI Appendix) are implicated by the heavy atom locations: Ser14-Hγ → O═C-Gly11 [accounting for the exchange protection previously noted (2) for this serine hydroxyl] and Ser14-Oγ←HN-Arg16 (which provides a rationale for the previously unexplained exchange protection at Arg16-HN). The observed protection factors for Arg16-HN have been somewhat larger or comparable to those observed for the 3₁₀ helix sites. The lesser degree of exchange protection for these sites, versus Gly11-HN (reflecting the Gly11-HN → O═C-Trp6 hydrogen bonding interaction) and the residue 6–8 HN sites (reflecting typical α-helical interactions), can be attributed to a more fluxional 3₁₀ helix conformation, evidenced in the crystal structure ensemble as a greater variation in ϕ/ψ values in the Gly10-Arg16 span.

The likely differences in the ionization states of Asp1 and Asp9 (vide supra) in the solution and crystal structures confound the structural roles of these two side chains, helix N-capping and a presumptive Asp9/Arg16 hydrogen bonded salt-bridge (22) noted for the pH 7 solution-state. However, some tentative conclusions can be suggested. The Asp1 side chain takes on several conformations in the solution structure but is somewhat more localized in the crystal structures (Fig. 4). The helix-stabilizing hydrogen bonding interactions provided by the Gly–Gly loop serve to lessen the need for N-capping by the Asp side chain. The Asp9 side chain takes on a single conformation in the solid-state ensemble (χ1 = −69 ± 2°). This is also the major, but less well-defined, conformer in the solution-state ensemble. In both ensembles, the Arg16 side chain is highly variable (Fig. 4). In one of the molecules in the solid-state ensemble there is no density beyond Cγ; a well-defined side-chain geometry is observed in the crystals only when the Arg side chain is oriented near the Asp9 carboxyl. We view this as evidence for a transient hydrogen bonding interaction that would be strengthened by a favorable Coulombic interaction at pH 7.

Table 2. Crystallographic data collection and refinement statistics

                        Monoclinic crystal form*    Tetragonal crystal form*
Data collection
  Space group           C2                          P4₃22
  Cell dimensions
    a, b, c (Å)         57.70, 48.75, 35.26         36.86, 36.86, 66.21
    α, β, γ (°)         90.0, 116.9, 90.0           90.0, 90.0, 90.0
  Resolution (Å)        50.0–0.99 (1.03–0.99)†      50.0–1.32 (1.37–1.32)†
  Rmerge                0.089 (>1.00)               0.032 (0.096)
  ⟨I⟩/⟨σI⟩              28.8 (0.9)                  59.5 (9.1)
  Completeness (%)      81.3 (31.1)                 86.2 (29.9)
  Redundancy            11.4 (3.7)                  10.0 (2.5)
Refinement
  Resolution (Å)        35.4–1.10                   36.9–1.33
  No. reflections       31,848                      9,316
  Rwork/Rfree           0.149/0.194                 0.136/0.198
  No. atoms
    Protein             936                         479
    Ligand/ion          2                           16
    Water               111                         104
  B-factors
    Protein             17.4                        13.1
    Ligand/ion          19.9                        24.7
    Water               25.7                        24.2
  rmsd
    Bond lengths (Å)    0.029                       0.025
    Bond angles (°)     2.42                        2.06

*One crystal for each dataset.
†Values in parentheses are for highest-resolution shell.

Fig. 3. Two cyclo-TC1 trimers (red and green outer surfaces) form a hexamer with C2 symmetry. The local C3 axes for each trimer are also shown. These do not intersect with the molecular C2 axis, making each monomer unique in each trimer.

The trimers in the crystal structures consist of subunits related by a C3 axis. A subunit–subunit interaction found in all trimers is an intermolecular hydrogen bond between Tyr-OH and O═C-Gly-1 (Fig. 5). Although the AUC studies indicate that cyclo-TC1 is monomeric in the NMR buffer over a wide range of concentrations (60 μM to greater than 25 mM), solution-state trimers or hexamers may form under high ionic strength and low pH conditions, and they may be required for crystallization. Prior to solving the structure by molecular replacement, we carried out extensive crystallization trials on the Tyr → para-bromoPhe derivative of cyclo-TC1 to introduce a strong anomalous scatterer for solving the phase problem. No crystals were observed, consistent with the loss of the proton donor for the Tyr-Gly trimer-stabilizing hydrogen bond.

The presence of specific inter-residue interactions in the monomer–monomer interfaces suggests the trimer could be a stable structure at higher protein concentrations. An intermolecular hydrophobic cluster made up of Ala4 and Leu7 side chains (Fig. 5) accounts for significant apolar surface burial at the trimer–trimer interface. The strongest evidence for the existence of the hexamer is that the same arrangement of six subunits is found in each crystal form. The packing environments for the hexamers, however, are not equivalent in the two forms, and this suggests that hexamers are the molecular species that crystallize. The small size of this hexamer could be a benefit for computational studies providing a tractable model for simulations of protein–protein interactions and oligomer formation.

It is apparent that the overall structure of a Trp-cage is not changed by cyclization or by the extensive intermolecular interactions in the crystal lattice. The higher resolution crystal structures provide answers to a number of questions that remained even after detailed examination of Trp-cage NMR structures, including that of cyclo-TC1. The solid-state ensemble has ϕ/ψ values that conform to the established norms throughout the α-, 3₁₀-, and polyProII-helices, and these structures predict the ring current shifts observed in NMR studies of the monomer. Cyclization and residue mutations result in very minor changes in the Trp-cage structure, but the key features that stabilize this protein-like fold produce a remarkably consistent structural motif that can serve to benchmark fold predictions. A previously reported Trp-cage structure (e.g., TC10b, ref. 2) or that observed for residues 3–19 in the crystal structures presented here can serve in that capacity. It remains for computational folding methods to achieve this degree of consistency in predicting the fold features and the protein–protein association interface interactions shown in Fig. 5.

Materials and Methods

Peptide Synthesis and Purification. The peptide was synthesized on an Applied Biosystems 433A synthesizer using standard Fmoc (9-fluorenylmethoxycarbonyl) solid-phase peptide synthesis methods. The linear peptide was purified by RP-HPLC, using C18 stationary phases and a water (0.1% trifluoroacetic acid, TFA)/acetonitrile (0.085% TFA) gradient as previously described (2). A Wang resin preloaded with the C-terminal amino acid was used for the synthesis. The peptide was cleaved from the resin using a 95/2.5/2.5 TFA/triisopropylsilane/water mixture.

Cyclo-Trp-cage1 (cyclo-TC1) can be formed by native chemical ligation with a preformed thioester prepared using the 4-sulfamylbutyryl AM “safety-catch” resin (31). However, folding-mediated cyclization of fully deprotected GDAYAQWLADGGPSSGRPPPSG provided a more reproducible synthesis; optimum results were obtained with 0.5 mM 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC · HCl), 75 μM N-hydroxysulfosuccinimide (Sulfo-NHS) sodium salt at room temperature over a four-day period at moderately high dilution (50 μM peptide in 25 mM 3-(N-morpholino)propanesulfonic acid (MOPS), pH 6.5). In the preparative scale reactions, a quench with 2-mercaptoethanol (5 mM) and hydroxylamine · HCl (150 μM) was essential in order to prevent Trp-cage oligomerization during the reaction work up. The cyclic product was obtained in acceptable yield (10%) and unreacted acyclic material could be recovered (HPLC) and recycled to increase the net conversion.

Circular Dichroism Spectroscopy (CD). Peptide stock solutions of approximately 200 μM were prepared using 50 mM phosphate buffer, pH 7.0. Accurate concentrations were determined by UV spectroscopy assuming standard molar absorptivities for the Trp and Tyr residues. The peptide was typically diluted to about 30 μM, and CD spectra were recorded on a Jasco J720 spectropolarimeter using 0.1 cm pathlength cells over a UV range of 190–270 nm as previously described (22). For determining the melting point for the peptide, the CD melts were performed without added denaturant and with 1–7 M guanidinium chloride (GdmCl) present.
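The UV concentration determination mentioned above follows the Beer–Lambert law, c = A / (ε · l). The sketch below illustrates the arithmetic for the cyclo-TC1 sequence, which contains one Trp and one Tyr. The per-residue absorptivities at 280 nm (5,500 and 1,490 M⁻¹ cm⁻¹) are commonly used literature values and are an assumption here, since the text does not state which values or wavelength were used; the absorbance reading in the example is hypothetical.

```python
# Assumed per-residue molar absorptivities at 280 nm, in M^-1 cm^-1.
EPSILON_TRP_280 = 5500.0
EPSILON_TYR_280 = 1490.0

CYCLO_TC1_SEQUENCE = "GDAYAQWLADGGPSSGRPPPSG"


def molar_absorptivity(sequence: str) -> float:
    """Sum the Trp (W) and Tyr (Y) contributions for a peptide sequence."""
    return (sequence.count("W") * EPSILON_TRP_280
            + sequence.count("Y") * EPSILON_TYR_280)


def concentration_molar(absorbance: float, sequence: str,
                        path_length_cm: float = 1.0) -> float:
    """Beer-Lambert concentration in mol/L."""
    epsilon = molar_absorptivity(sequence)
    if epsilon == 0.0:
        raise ValueError("sequence has no Trp or Tyr chromophore")
    return absorbance / (epsilon * path_length_cm)


if __name__ == "__main__":
    # Hypothetical A280 reading in a 1 cm cell.
    conc = concentration_molar(0.699, CYCLO_TC1_SEQUENCE)
    print(f"{conc * 1e6:.0f} micromolar")
```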

Fig. 4. Side-chain conformation comparisons between the solution and solid-state structure ensembles of cyclo-TC1. The side chains of Asp1, Gln5, Asp9, and Arg16 are shown in cyan for the NMR ensemble and green for the X-ray structures, with a representative backbone also displayed with the same color-coding. The conformations of Asp1 and Asp9 differ between the NMR and X-ray ensembles, likely due to the protonation state difference noted in the text.

Fig. 5. A more detailed view of one trimer unit. The hydrogen bonds from the Tyr hydroxyls appear at the surface closest to the reader in this view; the Tyr3Oη⋯O═C-Gly-1 distance is remarkably constant (2.68 ± 0.03 Å) in the crystal structures. Below these is the hydrophobic clustering of Ala4 and Leu7 side chains from different monomers.

12524 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1121421109 Scian et al.

NMR Spectroscopy. Samples for 2D NMR spectral studies consisted of 1–1.5 mM peptide in 50 mM phosphate buffer at pH 7.0, 10% D2O, with 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) as an internal chemical shift reference. NMR experiments were collected at 500 or 750 MHz on Bruker DRX and AV spectrometers. Full 1H spectral assignments were made by using a combination of 2D NOESY and TOCSY experiments. The resulting chemical shifts were converted to folding-associated shifts, given as chemical shift deviations (CSDs) from random coil norms, using the CSDb algorithm (32–34). The large diagnostic CSDs for cyclo-TC1 and its acyclic precursor (TC1) appear in Fig. 1 wherein they are compared to the values recorded for the prior hyperstable Trp-cage species (6). The CSDs were monitored over the 280–320 K temperature range to provide an alternative measure of melting and ΔΔGU values (SI Appendix).
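How a CSD can be turned into a stability estimate may be sketched as follows. This is not the CSDb algorithm itself: it assumes a simple two-state model in which the CSD scales linearly with the folded population, takes the fully folded reference CSD as an input, and uses hypothetical shift values throughout.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def chemical_shift_deviation(delta_obs_ppm, delta_coil_ppm):
    """CSD = observed shift minus the random-coil reference shift (ppm)."""
    return delta_obs_ppm - delta_coil_ppm


def fraction_folded(csd_obs, csd_fully_folded):
    """Assumed two-state estimate: CSD scales linearly with folded population."""
    return csd_obs / csd_fully_folded


def delta_g_unfolding(chi_f, temp_k):
    """Two-state unfolding free energy in kJ/mol: RT ln(chi_F / (1 - chi_F))."""
    return R_KJ * temp_k * math.log(chi_f / (1.0 - chi_f))


# Hypothetical values for one strongly upfield-shifted proton (not from the paper).
csd_280 = chemical_shift_deviation(delta_obs_ppm=0.95, delta_coil_ppm=3.97)
chi_280 = fraction_folded(csd_280, csd_fully_folded=-3.10)
dg_280 = delta_g_unfolding(chi_280, temp_k=280.0)
print(f"CSD = {csd_280:.2f} ppm, chi_F = {chi_280:.3f}, dG_U = {dg_280:.1f} kJ/mol")
```

Repeating the calculation at each temperature between 280 and 320 K would give a melting curve, and differences between two peptides at a common temperature would give ΔΔGU under the same assumptions.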

NH Exchange Protection. The NH exchange experiments were performed at 280 K by adding pre-cooled D2O phosphate buffer (pD = 7.0 and 8.0) to the lyophilized peptide sample in a pre-cooled NMR tube and then recording 1D proton spectra at various time intervals. Exchange rates (kobs) were obtained from 1D spectra as the slopes of plots of ln(NH signal intensity) versus time. The reference random coil rate constant for exchange (krc) was calculated using the appropriate Molday factors (35), as previously described for TC5b and TC10b (1, 2); protection factors were calculated as PF = krc/kobs. The relationship, χF = 1 − (1/PF), was employed to convert protection factors to an extent of folding measure (ΔGU). Cyclo-TC1 displayed NH exchange protection factors ranging from 10^3.52 to 10^3.81 for the hydrogen bonded sites at the C terminus of the N-terminal helix (Trp6, Leu7, Ala8, ΔGU = 19.5 ± 0.6 kJ/mol). As observed in all other Trp-cages, Gly11-HN displayed the largest protection factor providing the measure of overall fold stability, ΔGU = 20.5 kJ/mol.
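The chain from decay curve to ΔGU can be written out explicitly. The sketch below assumes a two-state model, so that ΔGU = RT ln[χF/(1 − χF)], and takes kobs as the negative of the least-squares slope of ln(intensity) versus time; the function names and the synthetic decay data are illustrative, not taken from the paper.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def fit_kobs(times, intensities):
    """Least-squares slope of ln(intensity) vs time; kobs is minus that slope."""
    ys = [math.log(i) for i in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    return -sxy / sxx


def protection_factor(k_rc, k_obs):
    """PF = krc / kobs."""
    return k_rc / k_obs


def fraction_folded(pf):
    """chi_F = 1 - 1/PF."""
    return 1.0 - 1.0 / pf


def delta_g_unfolding(pf, temp_k=280.0):
    """Assumed two-state free energy in kJ/mol: RT ln(chi_F / (1 - chi_F))."""
    chi_f = fraction_folded(pf)
    return R_KJ * temp_k * math.log(chi_f / (1.0 - chi_f))


# Synthetic single-exponential decay with k = 0.01 per minute (illustrative only).
times_min = [0.0, 30.0, 60.0, 120.0, 240.0]
signal = [math.exp(-0.01 * t) for t in times_min]
k_obs = fit_kobs(times_min, signal)

# The protection-factor range reported in the text, evaluated at 280 K.
dg_low = delta_g_unfolding(10 ** 3.52)
dg_high = delta_g_unfolding(10 ** 3.81)
print(f"kobs = {k_obs:.4f} min^-1, dG_U = {dg_low:.1f} to {dg_high:.1f} kJ/mol")
```

Under these assumptions the reported protection-factor range maps to roughly 18.9 to 20.4 kJ/mol at 280 K, which brackets the quoted 19.5 ± 0.6 kJ/mol.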

NMR Structure Ensemble Calculations. The CNS-based refinement procedure, as well as the derivation of NOE distance constraints from the NOESY intensities, followed the examples of prior Trp-cage structure determinations (2, 22, 31). Further details appear in SI Appendix. The NMR structure ensemble has been deposited with the Protein Data Bank (ID code 2LL5) and the Biological Magnetic Resonance Bank (entry 18023).

Crystallization, Crystal Structure Determination, and Refinement. Two crystal forms of cyclo-TC1 grew from sitting drop vapor-diffusion experiments. A monoclinic crystal form was obtained with 2-μL drops made by combining 1 μL of 40 mg/mL protein (in MilliQ water) and 1 μL of reservoir solution (0.16 M Tris · HCl, 0.16 M MgCl2, 1.75 M NaCl, pH 7.0). A tetragonal form resulted from changing the reservoir solution to 0.2 M Tris · HCl, 0.2 M MgCl2, 2.0 M NaCl, pH 7.0. Diffraction data for the monoclinic crystals were collected at beamline 11-1 at SSRL. Those for the tetragonal crystals were measured using a Rigaku Raxis-IV++ image plate system and processed using HKL2000 (36). Table 2 contains data set and final refinement statistics for both crystal forms.

The monoclinic form was solved using one of the NMR models of cyclo-TC1 for molecular replacement (model 1 from PDB ID code 2LL5). Phaser (25) placed five molecules in the asymmetric unit, and a sixth was added manually by fitting it in a difference electron density map. One of the trimers served as a test structure for molecular replacement in the tetragonal crystal form. Molrep (37) found a structure solution in space group P4₃22. Again, the tetragonal crystals contain an asymmetric hexamer; however, unlike the monoclinic form, the twofold axis is now a crystallographic symmetry axis.

Both structures have been refined using Refmac-5 (26) in the CCP4 program package (27). The models have been validated using MolProbity (38), and coordinates and structure factors have been deposited in the Protein Data Bank (39) (PDB ID codes 3UC7 and 3UC8). The quality of the model fit to electron density is illustrated in Fig. S3 in SI Appendix. Subunit–subunit interactions are listed in SI Appendix, Table S4 and illustrated in Fig. S4 in SI Appendix.

All structural figures were made with the PyMOL Molecular Graphics System, Version 1.3, Schrödinger LLC.

ACKNOWLEDGMENTS. We thank Maria M. Lopez for helping us collect the AUC data and Eric Larson for helpful discussion and suggestions. This work has been supported in part by National Institutes of Health (NIH) grants GM059658 and GM099889 (N.H.A., P.I.). Portions of this research were carried out at the Stanford Synchrotron Radiation Lightsource, a Directorate of SLAC National Accelerator Laboratory and an Office of Science User Facility operated for the U.S. Department of Energy Office of Science by Stanford University. The SSRL Structural Molecular Biology Program is supported by the DOE Office of Biological and Environmental Research, the NIH National Institute of General Medical Sciences (including P41GM103393), and the National Center for Research Resources (P41RR001209). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official view of NIGMS, NCRR, or NIH.

1. Neidigh JW, Fesinmeyer RM, Andersen NH (2002) Designing a 20-residue protein. Nat Struct Biol 9:425–430.

2. Barua B, et al. (2008) The Trp-cage: Optimizing the stability of a globular miniprotein. Protein Eng Des Sel 21:171–185.

3. Searle MS, Ciani B (2004) Design of beta-sheet systems for understanding the thermodynamics and kinetics of protein folding. Curr Opin Struct Biol 14:458–464.

4. Mok KH, et al. (2007) A pre-existing hydrophobic collapse in the unfolded state of an ultrafast folding protein. Nature 447:106–109.

5. Lin JC, Barua B, Andersen NH (2004) The helical alanine controversy: An (Ala)6 insertion dramatically increases helicity. J Am Chem Soc 126:13679–13684.

6. Williams DV, Barua B, Andersen NH (2008) Hyperstable miniproteins: Additive effects of D- and L-Ala mutations. Org Biomol Chem 6:4287–4289.

7. Ahmed Z, Beta IA, Mikhonin AV, Asher SA (2005) UV-resonance Raman thermal unfolding study of Trp-cage shows that it is not a simple two-state miniprotein. J Am Chem Soc 127:10943–10950.

8. Neuweiler H, Doose S, Sauer M (2005) A microscopic view of miniprotein folding: Enhanced folding efficiency through formation of an intermediate. Proc Natl Acad Sci USA 102:16650–16655.

9. Bunagan MR, Yang X, Saven JG, Gai F (2006) Ultrafast folding of a computationally designed Trp-cage mutant: Trp2-cage. J Phys Chem B 110:3759–3763.

10. Neuman RC, Jr, Gerig JT (2008) Solvent interactions with the Trp-cage peptide in 35% ethanol-water. Biopolymers 89:862–872.

11. Wafer LN, Streicher WW, Makhatadze GI (2010) Thermodynamics of the Trp-cage miniprotein unfolding in urea. Proteins 78:1376–1381.

12. Simmerling C, Strockbine B, Roitberg AE (2002) All-atom structure prediction and folding simulations of a stable protein. J Am Chem Soc 124:11258–11259.

13. Snow CD, Zagrovic B, Pande VS (2002) The Trp cage: Folding kinetics and unfolded state topology via molecular dynamics simulations. J Am Chem Soc 124:14548–14549.

14. Chowdhury S, Lee MC, Xiong G, Duan Y (2003) Ab initio folding simulation of the Trp-cage mini-protein approaches NMR resolution. J Mol Biol 327:711–717.

15. Zhou R (2003) Trp-cage: Folding free energy landscape in explicit water. Proc Natl Acad Sci USA 100:13280–13285.

16. Linhananta A, Boer J, MacKay I (2005) The equilibrium properties and folding kinetics of an all-atom Go model of the Trp-cage. J Chem Phys 122:114901.

17. Juraszek J, Bolhuis PG (2006) Sampling the multiple folding mechanisms of Trp-cage in explicit solvent. Proc Natl Acad Sci USA 103:15859–15864.

18. Xu W, Mu Y (2008) Ab initio folding simulation of Trpcage by replica exchange with hybrid Hamiltonian. Biophys Chem 137:116–125.

19. Day R, Paschek D, Garcia AE (2010) Microsecond simulations of the folding/unfolding thermodynamics of the Trp-cage miniprotein. Proteins 78:1889–1899.

20. Liu Y, Liu Z, Androphy E, Chen J, Baleja JD (2004) Design and characterization of helical peptides that inhibit the E6 protein of papillomavirus. Biochemistry 43:7421–7431.

21. Hudaky P, et al. (2008) Cooperation between a salt bridge and the hydrophobic core triggers fold stabilization in a Trp-cage miniprotein. Biochemistry 47:1007–1016.

22. Williams DV, Byrne A, Stewart J, Andersen NH (2011) Optimal salt bridge for Trp-cage stabilization. Biochemistry 50:1143–1152.

23. Goldenberg DP, Creighton TE (1983) Circular and circularly permuted forms of bovine pancreatic trypsin inhibitor. J Mol Biol 165:407–413.

24. Zhou HX (2003) Effect of backbone cyclization on protein folding stability: Chain entropies of both the unfolded and the folded states are restricted. J Mol Biol 332:257–264.

25. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40:658–674.

26. Murshudov GN, Vagin AA, Dodson EJ (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr 53:240–255.

27. Anonymous (1994) The CCP4 suite: Programs for protein crystallography. Acta Crystallogr D Biol Crystallogr 50:760–763.

28. Lee B, Richards FM (1971) The interpretation of protein structures: Estimation of static accessibility. J Mol Biol 55:379–400.

29. Streicher WW, Makhatadze GI (2007) Unfolding thermodynamics of Trp-cage, a 20 residue miniprotein, studied by differential scanning calorimetry and circular dichroism spectroscopy. Biochemistry 46:2876–2880.

30. Makhatadze GI, Medvedkin VN, Privalov PL (1990) Partial molar volumes of polypeptides and their constituent groups in aqueous solution over a broad temperature range. Biopolymers 30:1001–1010.

31. Lin JC (2007) Application of the Trp-cage motif to polypeptide folding questions. PhD Thesis (University of Washington, Seattle).

32. Fesinmeyer RM, Hudson FM, Andersen NH (2004) Enhanced hairpin stability through loop design: The case of the protein G B1 domain hairpin. J Am Chem Soc 126:7238–7243.

33. Fesinmeyer RM, et al. (2005) Chemical shifts provide fold populations and register of beta hairpins and beta sheets. J Biomol NMR 33:213–231.

34. Eidenschink L, Crabbe E, Andersen NH (2009) Terminal sidechain packing of a designed beta-hairpin influences conformation and stability. Biopolymers 91:557–564.

35. Bai Y, Milne JS, Mayne L, Englander SW (1993) Primary structure effects on peptide group hydrogen exchange. Proteins 17:75–86.

36. Otwinowski Z, Minor W (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol 276:307–326.

37. Vagin A, Teplyakov A (1997) MOLREP: An automated program for molecular replacement. J Appl Crystallogr 30:1022–1025.

38. Davis IW, et al. (2007) MolProbity: All-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 35:W375–383.

39. Berman HM, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28:235–242.

Scian et al. PNAS ∣ July 31, 2012 ∣ vol. 109 ∣ no. 31 ∣ 12525
