Investigating Diproline Segments in Proteins: Occurrences, Conformation and Classification
Indranil Saha,* Narayanaswamy Shamala
Department of Physics, Indian Institute of Science, Bangalore-560012, India
Received 3 May 2011; revised 18 July 2011; accepted 18 July 2011
Published online 6 September 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/bip.21703
This article was originally published online as an accepted
preprint. The ‘‘Published Online’’ date corresponds to the
preprint version. You can request a copy of the preprint by
emailing the Biopolymers editorial office at biopolymers@wiley.com
ABSTRACT:
The covalent linkage between the side-chain and the backbone nitrogen atom of proline leads to the formation of the five-membered pyrrolidine ring and hence restricts the backbone torsional angle φ to values of −60° ± 30° for L-proline. Diproline segments constitute a chain fragment with considerably reduced conformational choices. In the current study, the conformational states for the diproline segment (LPro-LPro) found in proteins have been investigated with an emphasis on the cis and trans states for the Pro-Pro peptide bond. The occurrence of diproline segments in turns and other secondary structures has been studied and compared to that of Xaa-Pro-Yaa segments in proteins, which gives a better understanding of the restriction imposed on other residues by the diproline segment and the single proline residue. The study indicates that PII–PII and PII–α are the most favorable conformational states for the diproline segment. The analysis of Xaa-Pro-Yaa sequences reveals that the Xaa-Pro peptide bond exists preferably as the trans conformer rather than the cis conformer. The present study may lead to a better understanding of the behavior of proline occurring in diproline segments, which can facilitate the design of various diproline-based synthetic templates for biological and structural studies. © 2011 Wiley Periodicals, Inc. Biopolymers 97: 54–64, 2012.
Keywords: diproline segments; conformational states; cis Pro-Pro peptide bond; trans Pro-Pro peptide bond; flanking residue conformation

Additional Supporting Information may be found in the online version of this article.
Correspondence to: Prof. N. Shamala; e-mail: [email protected]
*Present address: Structural Biology Laboratory, ELETTRA Synchrotron Light Laboratory, Trieste-34149, Italy; e-mail: [email protected].

INTRODUCTION
The uniqueness of proline among the 20 genetically coded amino acids lies in the covalent linkage between its side-chain and the backbone nitrogen atom. As a direct consequence, the formation of the pyrrolidine ring restricts the torsional angle φ to values of −60° ± 30° for the L-proline residue, which has been clearly exploited in the design of well structured peptides.1–6 Generally, proline exists in the following three stable conformations: (i) polyproline (PII) region (φ ≈ −60°, ψ ≈ 120°), (ii) the C7 region or γ-turn (φ ≈ −70°, ψ ≈ 70°), and (iii) right-handed α-helical region αR (φ ≈ −60°, ψ ≈ −30°). It is now well established that the presence of L-proline in proteins has a large effect on their conformation.7–10 This effect is due to the fact that proline is a part of a five-membered ring. As a consequence, proline, when linked in a peptide chain, is devoid of an amide hydrogen atom and thus can only contribute to hydrogen bonded structures through its carbonyl group.
A major consequence of the absence in proline of an NH group capable of participating in intramolecular hydrogen bonding, an interaction characteristic of polypeptide secondary structure,11–19 is that this amino acid residue occurs infrequently in regular α-helices and β-sheets. However, the presence of proline in the
N-cap region of helices is explained by the fact that the backbone torsion angles needed for α-helix formation are readily adopted by proline and by the N-terminal residues, which have solvent exposed NH groups.20–27 An earlier study also demonstrates the stabilization of α-helices by C–H···O hydrogen bonding involving the proline residue.28 In proteins, the loop regions29–32 and the turn regions1–3 are rich in prolines, which facilitate polypeptide chain reversal and thereby permit the formation of a globular structure. Several insights have emerged from this large body of published work.30,33–39 The rotational isomeric states, conformational energies, and statistical weights of the local minimum-energy conformers of di-, tri-, tetra-, and penta-L-proline, with N-terminal acetyl and C-terminal methyl ester groups, have been investigated earlier.40 The occurrence of proline at position i+2 of the β-turn facilitates the formation of the type VIa1 turn,2,41,42 with a cis peptide bond between the two prolines of the diproline segment. Diproline segments constitute a chain fragment with considerably reduced conformational choices. Proline has been recognized as playing a special role in the folding and unfolding of globular proteins7 and as having a relatively high intrinsic probability (between 0.1 and 0.3, depending on the adjacent sequence) of existing as the cis rather than the trans peptide isomer,43,44 whereas for other amino acids the probability is much smaller.45 The presence of a cis-proline residue in the α-helix of a protein or polypeptide has been suggested to cause a reversal of direction of the helix,46,47 while a trans-proline residue causes a disruption of the hydrogen bonding in four residues of the α-helix.48 The occurrence of cis peptide bonds involving proline is a noteworthy feature of proteins.49–51
The allowed regions of φ, ψ space for both L- and D-proline residues have been investigated in an earlier study,5,6 using designed peptides containing homochiral and heterochiral diproline segments. In the current analysis, the possible conformational states for the diproline segment (LPro-LPro) found in proteins taken from a non-redundant dataset have been investigated and identified, with an emphasis on the cis and trans states of the peptide bond between the two prolines. The occurrence of diproline segments in type VIa1 turns (cis Pro-Pro peptide bond) and in other regular secondary structures such as type III β-turns and α-helices has also been studied. This is followed by an analysis of the amino acid distribution flanking the diproline segment and of the conformations adopted by Xaa-Pro and Pro-Yaa segments in proteins.
MATERIALS AND METHODS
Selection of the Non-Redundant Dataset of Proteins from PDB
A dataset comprising 2190 largely non-homologous (≤25% sequence similarity), high resolution (≤1.8 Å) protein crystal structures was culled from the entire PDB52 using the PISCES server.53 The R-factor was chosen to be ≤25% for all the structures in the search procedure. From this dataset, proteins containing the sequence stretches XPPY, XPPPY, XPPPPY, and XPPPPPY were identified and selected. This reduced the number of proteins from 2190 to 653. Of these, 47 proteins were discarded because the crystallographic data did not contain complete electron density for the desired sequence stretches. The analysis was finally carried out with the remaining 606 protein structures in the non-redundant data set. The data set consisted of the PDB entries given in List S1 of the Supporting Information (polypeptide chain identifiers are indicated wherever homologous multiple chains are present).
Segregation of Sequences and Assignment of Conformational Regions for the Proline Residue
In the dataset, 809 sequences were retrieved that contained the sequence Xaa-Pro-Pro-Yaa. Of these, 60 contained a cis Pro-Pro peptide bond and the rest had a trans peptide bond between the prolines. The 60 cis examples were then segregated into blocks according to the secondary structure adopted by the Pro-Pro segment. A similar classification was done for the remaining 749 examples, in which the peptide bond between the prolines is trans. From the dataset, 20,654 sequences of the type Xaa-Pro-Yaa were retrieved from 2132 of the 2190 proteins [List S2 of the Supporting Information] and grouped into conformational blocks adopted by the Xaa-Pro and Pro-Yaa segments, with cis and trans peptide bonds treated separately. The flowchart in Figure 1 shows the scheme used. In the analysis, the following ranges were used to define the various conformations: α region: φ = −30° to −90°, ψ = −20° to −80°; γ region: φ = −40° to −100°, ψ = 40° to 100°; PII region: φ = −30° to −90°, ψ = 90° to 180°; Bridge region: φ = −50° to −140°, ψ = −25° to 25°; Extended region: φ = −100° to −160°, ψ = 90° to 180°.
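The region boundaries listed above can be expressed as a small lookup, sketched below. The boundary values are taken from the ranges given in the text; everything else is an assumption made for illustration: the boundaries are treated as inclusive, and because some ranges overlap (α with Bridge near ψ = −20°, γ with PII between ψ = 90° and 100°), the order in which regions are tested (α, γ, PII, Bridge, Extended) is an arbitrary choice and not one stated in the study. The names `classify_region` and `classify_pair` are hypothetical.

```python
# Region boundaries in degrees as (phi_min, phi_max, psi_min, psi_max).
# Assumption: boundaries are inclusive, and regions are tested in this order.
REGIONS = {
    "alpha":    (-90.0,  -30.0,  -80.0, -20.0),
    "gamma":    (-100.0, -40.0,   40.0, 100.0),
    "PII":      (-90.0,  -30.0,   90.0, 180.0),
    "Bridge":   (-140.0, -50.0,  -25.0,  25.0),
    "Extended": (-160.0, -100.0,  90.0, 180.0),
}


def classify_region(phi, psi):
    """Return the name of the first region containing (phi, psi), else 'Others'."""
    for name, (phi_lo, phi_hi, psi_lo, psi_hi) in REGIONS.items():
        if phi_lo <= phi <= phi_hi and psi_lo <= psi <= psi_hi:
            return name
    return "Others"


def classify_pair(first, second):
    """Label a two-residue segment, e.g. 'PII-Bridge', from two (phi, psi) tuples."""
    return f"{classify_region(*first)}-{classify_region(*second)}"


if __name__ == "__main__":
    print(classify_pair((-71.0, 168.0), (-74.0, -10.0)))
```

A different test order, or exclusive boundaries, would change the label only for points that fall in the overlapping strips.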
RESULTS AND DISCUSSION
Distribution of Proline Conformations for the Sequence Xaa-Pro-Pro-Yaa
The (φ, ψ) plots for the proline residues in diproline segments at positions i+1 and i+2 (cis and trans Pro-Pro peptide bond) are shown in Figures 2A–2D. The data indicate that in the cis examples, the conformation adopted by proline at position i+1 is invariably PII, and at position i+2 proline prefers a conformation in the PII or Bridge region. Hence, the most favorable conformation for the prolines in diproline segments with a cis Pro-Pro bond would be one that has a PII–Bridge or PII–PII combination. In the trans case, the conformation adopted by proline at position i+1 lies overwhelmingly in the PII and right-handed α-helical regions, whereas for position i+2 the major conformations are PII and α, with a substantial number of occurrences in the Bridge and the C7 (γ-turn) regions. Hence, considerable conformational diversity is observed in diproline segments with a trans peptide linkage between the prolines.
Classification of the Sequences Having a cis Pro-Pro Peptide Bond and trans Pro-Pro Peptide Bond
In our non-redundant dataset of 606 proteins, the Pro-Pro bond was observed to be cis in 60 instances and trans in 749 instances. Among the examples with cis bonds, 30 participated in type VIa1 β-turns (Table I), 21 adopted the PII–PII conformation, and the remaining 9 occurred in loop regions. Among the examples with trans bonds, 32 participated in α-helices (23), 3₁₀ helices (7), or type III β-turns (2) (Table I). γ-turns occurred in 27 examples, with one of the prolines in the diproline segment adopting a conformation in the C7 (γ-turn) region (Table I). Of the rest, 231 examples belonged to loop regions in proteins and 459 examples belonged to the PII–PII category.
Conformation of Pro-Pro Segments in Proteins
Table II lists the grouping of Pro-Pro segments in proteins into the following 10 categories: (i) α–α, (ii) α–PII, (iii) PII–α, (iv) PII–PII, (v) PII–γ, (vi) PII–Bridge, (vii) γ–PII, (viii) Extended–PII, (ix) Extended–Bridge, and (x) Others (segments that did not belong to any one of the categories mentioned above). Considering the sequence Xaa-Pro-Pro-Yaa, the first cis/trans in the table refers to the Xaa-Pro peptide bond being cis or trans and the second cis/trans refers to the Pro-Pro peptide bond being cis or trans.
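The two-part configuration label used in Table II can be sketched from the ω torsion angles of the two peptide bonds. The text does not state the criterion used to call a bond cis, so the cutoff below (|ω| ≤ 30° counts as cis, everything else as trans) is purely an assumption for illustration, and the function names are hypothetical.

```python
def peptide_bond_state(omega, cis_cutoff=30.0):
    """Label a peptide bond 'cis' or 'trans' from its omega angle in degrees.

    Assumption: |omega| <= cis_cutoff is called cis; the cutoff is illustrative.
    """
    # Wrap omega into the interval [-180, 180) so that 358 is treated as -2.
    wrapped = (omega + 180.0) % 360.0 - 180.0
    return "cis" if abs(wrapped) <= cis_cutoff else "trans"


def configuration(omega_xaa_pro, omega_pro_pro):
    """Return the label 'first-second' for the Xaa-Pro and Pro-Pro peptide bonds."""
    first = peptide_bond_state(omega_xaa_pro)
    second = peptide_bond_state(omega_pro_pro)
    return f"{first}-{second}"


if __name__ == "__main__":
    print(configuration(178.0, 0.0))
```

With this convention a segment whose Xaa-Pro bond has ω near 180° and whose Pro-Pro bond has ω near 0° falls in the trans–cis column.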
Cis–Cis and Trans–Cis Configurations. The data indicate that the cis–cis configuration of the peptide bonds is quite rare for the diproline segment. In six of the seven examples, the first proline adopts the PII conformation, and the second proline residue occurs preferentially in the Bridge region. A total of 53 examples were observed in the trans–cis category, with six of them belonging to the “Others” category. The first proline residue invariably adopts a conformation in the PII region (even in the six examples of the “Others” category, the conformation adopted by the first proline residue is PII), with the second proline overwhelmingly populating the Bridge region, which accounts for the large number of occurrences of type VIa1 turns. The next most populated conformation at position i+2 is PII. It is noteworthy that for a cis Pro-Pro peptide bond, there is almost no occurrence (only one example found) of the PII–α conformation, which is one of the most favored conformations noted for a trans Pro-Pro peptide bond.
FIGURE 1 Flowchart of the scheme used for the analysis.
Cis–Trans and Trans–Trans Configurations. The fifty-eight examples occurring in the cis–trans category (Table II) point to the fact that the differences in cis/trans isomerization energies are small for peptide bonds involving proline residues.49–51 Unlike the other two categories mentioned above, in this category a substantial population (43 examples) adopts a PII–PII conformation, followed by 8 examples of the PII–α conformation and 3 examples of the diproline segment taking up a PII–γ conformation. The number of examples in the PII–Bridge and Extended–PII categories is negligibly small (one example each). The data indicate that with a cis–trans peptide linkage, the PII–PII conformation is the most stable and favored conformation for the Pro-Pro segment in proteins. The highest number of examples (691) in the trans–trans category immediately leads to the conclusion that the trans peptide bond is strongly favored between the prolines of the diproline segment. Thirty-two examples of helical conformation (Table I), followed by a nearly equal number of examples (33) in the PII–Bridge region adopted by the diproline segment, lead to the conclusion that with a trans peptide bond linkage, α–α and PII–Bridge conformations are equally likely. However, the data strongly indicate that with a trans peptide linkage between the prolines, PII–PII is the most stable and favored conformation (416 examples). Left-handed polyproline II helices, which are “very locally driven,”54 have, however, not been investigated in this analysis. Twenty-four examples were noted in the C7 (γ-turn) region. The overall percentage distribution of conformational states (Table II) reveals that PII–PII and PII–α are the most favored states for the diproline segment, with percentage occurrences of 59.26% and 22%, respectively, followed by PII–Bridge, even though its percentage occurrence is much lower than that of the first two categories. The table indicates that the populations of the trans–cis and cis–trans states are comparable, indicating that the energy difference between these states is small. However, trans–trans is the most populated state, with a percentage occurrence of 85.43%.

FIGURE 2 Ramachandran map showing the distribution of backbone torsion angles (φ, ψ) for (A) Pro i+1 and (B) Pro i+2 for the sequence Xaa-Pro-Pro-Yaa for which the Pro-Pro peptide bond is cis, and (C) Pro i+1 and (D) Pro i+2 for the sequence Xaa-Pro-Pro-Yaa for which the Pro-Pro peptide bond is trans.

Table I List of Type VIa1 β-Turns, Type III β-Turns, Helices, and γ-Turns

Type VIa1 β-turns
PDB ID (chain)   Location        (φ, ψ, ω): Pro i+1 (deg)   (φ, ψ, ω): Pro i+2 (deg)
1NNL A           CP188P189A      −71, 168, 0                −74, −10, 178
1V0W A           CP349P350L      −75, 173, 2                −91, −2, −180
2F7V A           NP23P24R        −64, 163, 0                −85, −1, −180
2DPL A           KP301P302A      −73, 169, 1                −82, −17, −176
1AK0             NP88P69T        −68, 150, 0                −91, 3, −179
1LUG A           TP200P201L      −47, 137, 12               −86, 8, 168
1BS0 A           RP350P351T      −46, 152, −2               −94, 19, 177
1H4G A           RP106P107G      −82, 169, 3                −115, 15, 175
1IFR A           AP508P509T      −64, 155, 0                −93, 3, −178
1JFB A           DP91P92E        −63, 149, −3               −88, −6, −178
1JZ8 A           NP111P112F      −59, 142, 5                −102, 20, −176
1K0M A           CP90P91R        −62, 154, 4                −83, −12, −174
1N97 A           HP201P202L      −71, 148, 0                −77, −4, −179
2ERL             CP37P38Y        −65, 148, −3               −93, −2, 179
1V54 A           YP130P131L      −90, 163, 2                −77, −14, −180
1V6S A           VP322P323F      −59, 150, 0                −87, 17, −178
1S1D A           QP207P208G      −55, 147, 4                −85, 12, 175
1VLP A           RP284P285Y      −56, 141, 5                −83, −18, −176
1SG4 A           NP23P24V        −66, 161, 3                −81, −12, −169
1YXY A           YP81P82N        −67, 156, 4                −81, −6, −177
1ZZM A           FP15P16F        −68, 151, 1                −84, 5, −177
1TUO A           TP241P242S      −55, 153, 0                −91, 19, 180
2BW4 A           KP22P23F        −55, 149, 10               −87, 3, 177
1WXC A           LP168P169Y      −51, 141, 8                −88, −1, 173
2FHZ B           TP54P55D        −70, 158, 0                −94, −2, −180
2B0J A           KP114P115K      −62, 144, 6                −91, −1, −179
2F26 A           RP315P316A      −64, 149, 0                −101, −4, −179
2DCF A           EP29P30H        −80, 151, 0                −86, 1, −172
2JEK A           GP99P100D       −52, 144, 6                −91, 11, −176
2OJ5 A           LP376P377L      −48, 142, 8                −95, 15, 174

Helices and type III β-turns
PDB ID (chain)   Location        (φ, ψ, ω): Pro i+1 (deg)   (φ, ψ, ω): Pro i+2 (deg)   β-Turn/Helix
1D3G A           GP364P365V      −51, −35, 179              −62, −19, 178              α
1I6L A           YP126P127L      −57, −45, 178              −62, −34, 178              3₁₀
1IRD B           TP324P325V      −54, −44, 177              −66, −35, 174              α
1JG1 A           FP154P155K      −52, −37, 173              −56, −28, −178             3₁₀
1M22 A           MP327P328L      −42, −51, −180             −61, −39, −179             α
1O2D A           TP205P206S      −60, −48, 179              −65, −39, −180             α
1O8X A           CP41P42A        −57, −54, −178             −64, −20, 177              α
1QQF A           VP1250P1251V    −53, −44, −180             −64, −37, 179              α
1VNS             GP47P48L        −61, −58, −179             −61, −45, −180             α
1VNS             TP300P301R      −50, −43, 179              −53, −31, 179              α
3SIL             FP267P268M      −56, −36, −176             −62, −25, 177              Type III
1V54 A           LP106P107S      −57, −45, 179              −61, −42, 180              α
1R7A A           LP256P257L      −49, −58, 180              −71, −29, 179              α
1NZJ A           DP49P50R        −59, −35, 168              −53, −28, 180              3₁₀
1RK6 A           YP193P194A      −49, −44, 179              −61, −21, −177             3₁₀
1UCD A           WP11P12A        −45, −50, −180             −59, −41, −180             α
1V9F A           YP124P125I      −46, −36, 178              −58, −19, 177              3₁₀
1T6U A           KP65P66H        −45, −45, 179              −61, −33, 177              α
1OJ8 A           RP41P42R        −50, −48, −180             −52, −44, −179             α
1S7I A           IP106P107G      −43, −55, −180             −60, −23, −179             α/3₁₀
1VPM A           LFP20P21D       −50, −37, 176              −63, −19, −176             3₁₀
1YDI A           MP75P76A        −48, −48, −179             −55, −36, −180             α
2AVD A           NP139P140E      −57, −43, −176             −57, −32, 174              α
2AXO A           CP58P59A        −42, −46, 180              −56, −40, −180             α
2AEU A           NP241P242L      −53, −40, 178              −59, −40, 174              α
1WBE A           LP31P32F        −48, −40, 178              −57, −24, −180             α/3₁₀
2ETV A           YP213P214F      −49, −43, 176              −62, −35, 174              α
2DE3 A           IP76P77L        −59, −48, 180              −67, −30, 179              α
2HA8 A           KP86P87N        −51, −40, 177              −62, −20, 179              α/3₁₀
2IOI A           SP1125P1126L    −55, −52, −175             −60, −29, 180              Type III
2INU A           YP344P345F      −31, −54, 180              −48, −31, −178             3₁₀
2NT0 A           SP98P99A        −49, −44, 176              −59, −40, 180              α

γ-Turns
PDB ID (chain)   Location        (φ, ψ, ω): Pro i+1 (deg)   (φ, ψ, ω): Pro i+2 (deg)
1T9H A           RP73P74I        −65, 161, 178              −77, 96, 178
1U71 A           LP322P323H      −71, 156, −173             −76, 63, 177
1C61 A           YP339P340I      −74, 133, −176             −80, 63, 179
2CUL A           VP168P169G      −68, 50, 65                −42, 162, 180
1F8E A           SP166P167T      −88, 161, 179              −79, 98, −180
1LLF A           EP31P32V        −54, 122, −176             −83, 56, −174
2BKM A           GP61P62L        −57, 149, −174             −79, 63, −172
1PMI             DP353P354I      −77, 158, 180              −81, 42, 176
1QW9 A           AP315P316L      −66, 155, −178             −82, 64, −177
1UEK A           DP196P197Y      −69, 141, 174              −77, 60, 170
1VNS             KP395P396F      −51, 133, −179             −72, 56, 179
1SU8 A           LP584P585I      −81, 165, −171             −84, 67, −177
1S9U A           SP85P86W        −61, 126, −174             −82, 42, 179
1VLA A           EP101P102K      −66, 150, 175              −79, 73, −165
1X6O A           MP125P126D      −71, 157, 174              −77, 46, −177
1W23 A           VP239P240F      −78, 167, 173              −78, 83, −165
1T1U A           LP240P241I      −54, 158, −178             −78, 61, −165
2B61 A           TP203P204D      −76, 160, −179             −79, 63, −175
2EX4 A           IP24P25T        −55, 139, −179             −86, 72, −178
2FPQ A           KP61P62R        −78, 159, 179              −80, 55, 179
2AGY A           LP172P173K      −66, 151, −172             −77, 59, −179
2CDU A           VP120P121I      −81, 168, −175             −83, 47, 175
2EX2 A           AP163P164A      −52, 119, −176             −81, 76, −171
2FUK A           AP142P143A      −59, 125, −179             −70, 60, 178
2G8J A           FP62P63V        −75, 163, −177             −80, 68, −177
2ICY A           YP189P190G      −86, 131, −177             −77, 46, 177
2GCI A           VP149P150L      −71, 132, 174              −76, 60, −158
The puckering states of the pyrrolidine ring and their possible influence on diproline segment conformation have been studied in a separate analysis (to be published later).
α–PII Conformations in Trans–Cis and Trans–Trans Configurations
Three examples of the α–PII conformation observed in the data, which have not been observed earlier,55 merit mention. The first example occurs unexpectedly in the trans–cis category in the protein histone-lysine N-methyltransferase (PDB ID: 2F69 A), with proline occurring at positions 341 and 342 of the amino acid sequence. Proline at position 342 has been inserted by molecular modeling methods, and hence the cis peptide bond may be an outcome of this procedure.56 The remaining two examples are found in the trans–trans category. Of these, in the protein “Probable Glutaminase YBAS” (PDB ID: 1U60 A),57 the diproline segment is part of a bend characterized by the α–PII conformation connecting two β-strands. In the third example (Periplasmic Binding Protein BUGD; PDB ID: 2F5X A),58 the diproline segment is part of an eight-residue loop that connects a helix and a β-strand. Hence, the data suggest that adoption of the α–PII conformation is primarily governed by the local interactions present in that region of the protein, to facilitate folding and initiate specific interactions.
Number of Occurrences of Proline in Various Conformational States
Table III shows the number of occurrences of each conformational state for the prolines present in diproline segments. For cis proline, PII is the most preferred conformation at position i+1, whereas for position i+2 the preferred conformations adopted by the proline residue lie in the PII and Bridge regions. In the case of trans proline, PII is still the dominant conformation at both positions i+1 and i+2, followed by an appreciable number of occurrences in the right-handed α-helical conformation. The numbers of occurrences of proline in the Bridge region and in the γ-turn conformation are comparable. Thus, the table clearly indicates that PII is the most preferred conformation at position i+1 for both cis and trans proline, and that the number of appreciably populated conformational states occupied by proline at position i+2 is greater for trans proline than for cis proline.
Distribution of Flanking Residues and their Preferred Conformational States (φ, ψ)
The number of occurrences of all amino acids except proline in the flanking positions is given in Table IV. A histogram representation of these occurrences for each amino acid (grouped into hydrophobic, polar, and charged categories) except proline is also shown in Figure 3. Considering the cis Pro-Pro peptide bond (60 examples), for the left flanking position i, the distribution shows a greater population for polar and charged amino acids like Thr, Arg, and Lys. For position i+3 (right flanking position), hydrophobic amino acids like Leu and Phe show a greater affinity. For the trans configuration (749 examples), at the left flanking position (i) the data show an affinity of hydrophobic amino acids (Leu in particular) to occur in this position, whereas for the right flanking position (i+3), there is a greater tendency for hydrophobic amino acids like Gly, Ala, Val, and Leu and charged amino acids like Glu and Lys to occur. Among the polar amino acids only Thr shows an appreciable affinity to occur in this position.

Table II Conformation of Pro-Pro Segments in Proteins
Type               Cis–Cis   Cis–Trans   Trans–Cis   Trans–Trans   Overall %
α–α                —         —           —           32            3.95
α–PII              —         —           1           2             0.37
PII–α              —         8           1           169           22.00
PII–PII            2         43          19          416           59.26
PII–γ              —         3           —           23            3.21
PII–Bridge         4         1           26          33            8.27
γ–PII              —         —           —           1             0.12
Extended–PII       —         1           —           —             0.12
Extended–Bridge    —         —           —           1             0.12
Others             1         2           6           14            2.83
Total              7         58          53          691           809
%                  0.86      7.16        6.54        85.43
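The column totals and overall percentages of Table II can be recomputed from the tabulated counts, as sketched below. The counts are transcribed from Table II; treating a dash as zero and dividing by the grand total of 809 segments are assumptions of this sketch, and the function names are hypothetical.

```python
# Counts transcribed from Table II in the column order
# (cis-cis, cis-trans, trans-cis, trans-trans); a dash is taken as 0 (assumption).
TABLE_II = {
    "alpha-alpha":     (0, 0, 0, 32),
    "alpha-PII":       (0, 0, 1, 2),
    "PII-alpha":       (0, 8, 1, 169),
    "PII-PII":         (2, 43, 19, 416),
    "PII-gamma":       (0, 3, 0, 23),
    "PII-Bridge":      (4, 1, 26, 33),
    "gamma-PII":       (0, 0, 0, 1),
    "Extended-PII":    (0, 1, 0, 0),
    "Extended-Bridge": (0, 0, 0, 1),
    "Others":          (1, 2, 6, 14),
}


def column_totals(table):
    """Sum each peptide-bond configuration over all conformational categories."""
    return [sum(column) for column in zip(*table.values())]


def row_percentages(table):
    """Percentage of all segments that fall in each conformational category."""
    grand_total = sum(sum(row) for row in table.values())
    return {name: 100.0 * sum(row) / grand_total for name, row in table.items()}


if __name__ == "__main__":
    print(column_totals(TABLE_II))
    for name, percent in row_percentages(TABLE_II).items():
        print(f"{name:16s} {percent:6.2f}")
```

The cis–cis and trans–cis columns together give the 60 cis Pro-Pro examples, and the cis–trans and trans–trans columns give the 749 trans examples. Percentages recomputed in this way agree with the printed values to within about a tenth of a percent (for example, roughly 59.3% for PII–PII).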
The distribution of the backbone torsion angles φ, ψ in the left flanking position (i) and right flanking position (i+3) for all 19 amino acids except proline is shown in Supporting Information Figures S1–S5. Analysis of the left flanking position (i) reveals that the Extended region is preferred by Gly, Val, and Leu, whereas the Extended and polyproline regions are both favored by the amino acids Ala, Ser, Thr, Ile, Asp, Glu, Lys, Arg, Cys, Phe, and Tyr. The γ-turn region shows a marked affinity for the amino acids Asn, His, Phe, and Tyr, whereas the helical region is substantially populated by Asn only. Considering the right flanking position (i+3), the following preferences are observed: helical region (Asn, Gln, Val, and Leu); Bridge and helical (Gly, Lys, Arg, His, Phe, and Tyr); Extended and helical (Ala, Ser, Thr, Asp, Glu, Ile, and His); polyproline (Asp, Glu, and Ile). Met and Trp are present in very small numbers in the dataset, and hence no statistically meaningful conclusion can be drawn from their distribution plots.
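A tally of flanking residues of the kind reported in Table IV can be sketched as follows. This is only an illustration: it assumes each Xaa-Pro-Pro-Yaa example is available as a four-letter string together with the state of its Pro-Pro bond, and the function name `tally_flanking` is hypothetical. The four example segments are taken from the Location column of Table I.

```python
from collections import Counter


def tally_flanking(segments):
    """Count residues at the left (i) and right (i+3) flanking positions.

    segments: iterable of (stretch, state) with stretch like 'CPPA' and
    state either 'cis' or 'trans' (the Pro-Pro peptide bond).
    """
    left = {"cis": Counter(), "trans": Counter()}
    right = {"cis": Counter(), "trans": Counter()}
    for stretch, state in segments:
        if len(stretch) != 4 or stretch[1:3] != "PP":
            raise ValueError(f"not an Xaa-Pro-Pro-Yaa stretch: {stretch!r}")
        left[state][stretch[0]] += 1
        right[state][stretch[3]] += 1
    return left, right


if __name__ == "__main__":
    examples = [("CPPA", "cis"), ("CPPL", "cis"), ("GPPV", "trans"), ("YPPL", "trans")]
    left, right = tally_flanking(examples)
    print(left["cis"].most_common(), right["trans"].most_common())
```

Keeping separate counters for the cis and trans cases mirrors the four data columns of Table IV.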
Conformation of Xaa-Pro and Pro-Yaa Segments in Proteins (Sequences of the Type Xaa-Pro-Yaa)
A total of 20,654 sequences of the type Xaa-Pro-Yaa (Xaa and Yaa are amino acids other than proline) were retrieved from the dataset and grouped into various conformational categories, with a clear demarcation between cis and trans peptide bonds for the Xaa-Pro and Pro-Yaa segments. Table V lists the various conformational categories. Comparison with Table II clearly reveals more allowed combinations of conformational states for these segments than are allowed for the diproline segment, which underlines the effect of the torsional restriction imposed by the pyrrolidine ring of proline on its own conformation and on that of the neighboring residue. Table V indicates that the Xaa-Pro peptide bond exists preferably as the trans conformer. The same is valid for Pro-Yaa segments, with the cis conformer being populated to an even lesser extent. The data show that α–α, PII–α, PII–PII, and Extended–PII are the most populated states for Xaa-Pro and Pro-Yaa segments, as compared to the PII–α and PII–PII states observed for the Pro-Pro segment.

Table III Number of Occurrences of Proline in Various Conformations
                Xaa-P-P-Yaa (cis Proline)   Xaa-P-P-Yaa (trans Proline)   Xaa-P-P-P-Yaa (trans Proline)
Conformation    i+1       i+2               i+1        i+2                i+1     i+2     i+3
α               1         1                 34         209                —       2       12
PII             52 + 7    22                697 + 12   463 + 4            41      38      20
Bridge          —         30                —          35                 —       —       8
γ               —         —                 1          26                 —       —       1
Extended        —         —                 2          —                  —       —       —
Cis or trans proline refers to the Pro-Pro peptide bond being cis or trans. The numbers following the plus sign (set in bold font in the original) are examples from the category “Others.”

Table IV Number of Amino Acid Occurrences for the Left Flanking Position (i) and the Right Flanking Position (i+3) (seq. Xaa-P-P-Yaa)
              Left Flanking Position (i)      Right Flanking Position (i+3)
Amino acid    Cis Pro-Pro    Trans Pro-Pro    Cis Pro-Pro    Trans Pro-Pro
Gly (G)       4              41               3              85
Ala (A)       —              56               4              68
Val (V)       2              65               3              58
Leu (L)       5              126              8              63
Ile (I)       —              43               1              26
Met (M)       —              14               —              8
Phe (F)       1              53               11             23
Trp (W)       —              5                —              4
Asn (N)       5              39               1              28
Cys (C)       5              19               —              11
Ser (S)       2              28               1              41
Thr (T)       8              56               5              47
Tyr (Y)       3              26               6              24
Gln (Q)       4              31               2              34
Glu (E)       2              31               4              83
Asp (D)       4              29               3              36
Arg (R)       7              36               3              32
His (H)       1              24               4              28
Lys (K)       7              27               1              50
Cis Peptide Linkage. Table V indicates that α–α, α–PII, PII–α, PII–γ, γ–PII, and Extended–Bridge conformational combinations are less favored for the Xaa-Pro segment, as in the case of Pro-Pro segments. PII–PII and PII–Bridge are favored combinations in both cases. However, the Extended–PII combination, which is absent in the case of Pro-Pro, is heavily populated. This clearly indicates that when proline takes up a PII conformation, the preceding amino acid residue adopts either an Extended or a PII conformation in the majority of cases. However, when proline takes up a conformation in the Bridge region, the preceding amino acid invariably takes up a PII conformation. This agrees with the occurrence of type VIa1 turns in Xaa-Pro segments with a cis peptide linkage between Xaa and proline. Two categories that merit mention are the Extended–α (96 examples) and Extended–PII (334 examples) combinations, which are absent for diproline segments having a cis Pro-Pro peptide bond. These two categories represent examples belonging to cis-Pro-touch turns.59 The other combinations listed in Table V are infrequent.
The Pro-Yaa segment very rarely has a cis peptide bond between Pro and Yaa. There is no appreciable population in any category other than “Others.” Only nine examples of the PII–Extended conformation are observed.
Trans Peptide Linkage. For the trans Xaa-Pro peptide bond linkage, it is observed that PII–PII is the most favored combination of conformations, followed by the Extended–PII combination (unlike in the diproline segment analysis). α–α and PII–α are populated to a substantial extent, as in the case of Pro-Pro. However, both the PII–γ and PII–Bridge conformational blocks show a lower percentage than that observed for Pro-Pro. A total of 38 examples are observed for the α–PII combination, which was present only in trace quantities for Pro-Pro. Considering the category trans Pro-Yaa peptide bond, the conformational blocks are more populated than in any other category. It is observed that the PII–PII, α–α, and PII–α conformational combinations are highly preferred for this segment in the trans case, which corroborates the analysis of diproline segments presented earlier. The α–PII combination is present in large numbers in this category. Interestingly, the α–Bridge combination is quite heavily populated in this category.
FIGURE 3 Amino acid distribution (except proline) showing the number of occurrences of all amino acids in the flanking positions i and i+3 for the sequence stretch X-P-P-Y with (A) cis Pro-Pro peptide bond, (B) trans Pro-Pro peptide bond.
CONCLUSIONSThe data and analysis presented in this present study leads to
a number of conclusions pertaining to preferred proline con-
formation in diproline segments. It is observed that for cis
Pro-Pro peptide bond, the conformation adopted by the first
Proline lies in PII region whereas the second proline inevita-
bly adopts a conformation in the Bridge region, leading to
the formation of the type VIA1 b-turn structure. However, in
the trans case, the conformation adopted by the first proline
is overwhelmingly populated in the PII (polyproline) and
right-handed a-helical region. For position i12, the major
conformation adopted by proline is PII and a with a substan-
tial amount of occurrences in Bridge and the C7 (c-turn)region. The analysis also reveals that the cis–cis configuration
of the peptide bond is very rare when considering the dipro-
line segment (Table II). With a cis–trans peptide linkage, PII–
PII conformation is the most stable and favored conforma-
tion for the Pro-Pro segment in proteins. The trans peptide
bond is mostly favored between the diproline segment pro-
teins. With trans peptide bond linkage between the proline
residues, a–a and PII–Bridge conformations are equally likely.
The overall percentage distribution of conformational states for the diproline segment reveals that PII–PII and PII–α are the most favorable conformational states, with percentage occurrences of 59.26% and 21.97%, respectively. PII–Bridge is the third most preferred conformation, even though its percentage occurrence is much lower than that of the first two categories. The α–α and PII–γ conformations are nearly equally populated. The populations of the trans–cis and cis–trans states are comparable, indicating that the energy difference between these states is small. However, trans–trans is the most populated state, with a percentage occurrence of 85.43%.

Table V. Conformation of Xaa-Pro and Pro-Yaa Segments in Proteins (20,654 Sequences of the Type Xaa-Pro-Yaa)

| Type | Xaa-Pro Cis | Xaa-Pro Trans | Pro-Yaa Cis | Pro-Yaa Trans | Total | Overall % |
|---|---|---|---|---|---|---|
| α–α | 1 | 1624 | | 3886 | 5511 | 13.34 |
| α–PII | 6 | 38 | | 138 | 182 | 0.44 |
| α–γ | | 10 | | 26 | 36 | 0.10 |
| α–Bridge | 1 | 211 | | 1944 | 2156 | 5.22 |
| α–Extended | | 3 | | 456 | 459 | 1.11 |
| PII–α | 11 | 2418 | | 2007 | 4436 | 10.74 |
| PII–PII | 150 | 4344 | 3 | 3121 | 7618 | 18.44 |
| PII–γ | 1 | 257 | | 169 | 427 | 1.03 |
| PII–Bridge | 188 | 655 | 1 | 563 | 1407 | 3.41 |
| PII–Extended | 5 | 1 | 9 | 2983 | 2998 | 7.26 |
| Bridge–α | 3 | 5 | | 1193 | 1201 | 2.91 |
| γ–α | | | | 99 | 99 | 0.24 |
| γ–PII | | 3 | | 117 | 120 | 0.29 |
| γ–γ | | | | 3 | 3 | 0.01 |
| γ–Bridge | 2 | 2 | | 38 | 42 | 0.10 |
| γ–Extended | 1 | 1 | 1 | 192 | 195 | 0.47 |
| Bridge–PII | 1 | 4 | | 285 | 290 | 0.70 |
| Bridge–γ | | | | 24 | 24 | 0.06 |
| Bridge–Extended | 2 | | | 295 | 297 | 0.72 |
| Extended–α | 96 | 1383 | | 11 | 1490 | 3.61 |
| Extended–PII | 334 | 4290 | | 20 | 4644 | 11.24 |
| Extended–γ | 1 | 169 | | | 170 | 0.41 |
| Extended–Bridge | | 519 | | | 519 | 1.26 |
| Extended–Extended | 2 | 4 | | 11 | 17 | 0.04 |
| Others | 231 | 3677 | 33 | 3026 | 6967 | 16.86 |
| Total | 1036 | 19,618 | 47 | 20,607 | 41,308 | |
| % | 2.51 | 47.49 | 0.11 | 49.89 | | |

Cis or trans proline refers to the Xaa-Pro/Pro-Yaa peptide bond being cis or trans.
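The percentage row of Table V can be recomputed from its column totals. The following is a minimal Python sketch of that arithmetic, not the authors' own script (the study used Perl scripts); the names `totals`, `percent_of_all`, and `cis_fraction` are hypothetical and introduced only for illustration.

```python
# Column totals copied from the "Total" row of Table V.
totals = {
    ("Xaa-Pro", "cis"): 1036,
    ("Xaa-Pro", "trans"): 19618,
    ("Pro-Yaa", "cis"): 47,
    ("Pro-Yaa", "trans"): 20607,
}

# Both segment types pooled, as in the "Total" column of the table.
grand_total = sum(totals.values())

# Share of each column in the pooled total, rounded as in the "%" row.
percent_of_all = {
    key: round(100 * count / grand_total, 2) for key, count in totals.items()
}


def cis_fraction(segment):
    """Percentage of cis peptide bonds within one segment type."""
    cis = totals[(segment, "cis")]
    trans = totals[(segment, "trans")]
    return 100 * cis / (cis + trans)


for segment in ("Xaa-Pro", "Pro-Yaa"):
    print(f"{segment}: {cis_fraction(segment):.2f}% cis")
```

With these totals the rounded shares reproduce the % row of the table (2.51, 47.49, 0.11, 49.89). Taken within each segment type separately, the cis conformer accounts for about 5.0% of Xaa-Pro bonds and about 0.2% of Pro-Yaa bonds.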
Analysis and comparison of the conformational states in Xaa-Pro-Yaa
sequences reveal that the Xaa-Pro peptide bond exists preferably as the
trans conformer rather than the cis conformer. The same holds for the
Pro-Yaa segment, with the cis conformer populated to an even lesser extent.
The data show that α–α, PII–α, PII–PII, and Extended–PII are the most
populated states for Xaa-Pro and Pro-Yaa segments, whereas PII–PII and
PII–α are the states most often observed for the Pro-Pro segment.
Considering individual proline residues, PII is the most preferred
conformation at position i+1 for both cis and trans proline. The data
presented in Table V lead to the conclusion that, when proline takes up a
right-handed helical (α) conformation, the amino acid following it in the
majority of cases adopts either a right-handed helical (α) or Bridge
conformation. With proline in a PII conformation, the following amino acid
preferably adopts a conformation in the right-handed helical (α),
polyproline (PII), or Extended region of the Ramachandran map. The
Extended–α and Extended–PII blocks, which are quite heavily populated for
trans Xaa-Pro segments, are nearly absent for trans Pro-Yaa segments. Thus,
these results may lead to a better understanding of the behavior of proline
in diproline segments, which can then be used to design diproline-based
synthetic templates for biological and structural studies.
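The stated preferences of the residue following proline can be checked against the trans Pro-Yaa counts of Table V. The sketch below, in Python, is an illustration rather than the authors' analysis. The dictionary `trans_pro_yaa` and the function `following_residue_preferences` are hypothetical names. Placing the α–γ and α–Extended counts in the trans Pro-Yaa column is an assumption inferred from the column totals. The shares are taken over the five named states only and ignore the "Others" row.

```python
# Trans Pro-Yaa counts from Table V, keyed by proline state and then by the
# state of the following residue. Assumption: the alpha-gamma (26) and
# alpha-Extended (456) counts belong to the trans Pro-Yaa column, as
# inferred from the column totals of the table.
trans_pro_yaa = {
    "alpha": {"alpha": 3886, "PII": 138, "gamma": 26, "Bridge": 1944, "Extended": 456},
    "PII": {"alpha": 2007, "PII": 3121, "gamma": 169, "Bridge": 563, "Extended": 2983},
}


def following_residue_preferences(pro_state):
    """Percentage share of each following-residue state, largest first."""
    counts = trans_pro_yaa[pro_state]
    total = sum(counts.values())
    shares = {state: 100 * count / total for state, count in counts.items()}
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


for pro_state in trans_pro_yaa:
    prefs = following_residue_preferences(pro_state)
    summary = ", ".join(f"{state} {share:.1f}%" for state, share in prefs.items())
    print(f"Pro in {pro_state}: {summary}")
```

Under these assumptions, a proline in the α state is followed by an α or Bridge residue in roughly 60% and 30% of the counted cases, respectively. For a proline in the PII state, the following residue is spread over PII, Extended, and α, in that order.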
The authors sincerely thank Prof. P. Balaram (MBU, IISc) and Prof. N.V.
Joshi (CES, IISc) for their valuable comments on the preparation of
the manuscript. The authors also thank Dr. Raghurama Hegde, who
helped prepare the Perl scripts needed for the study.
Reviewing Editor: J. Andrew McCammon