April 23, 1999 11:2 WSPC/140-IJMPB 0083
International Journal of Modern Physics B, Vol. 13, No. 4 (1999) 325–361
© World Scientific Publishing Company
EXACTLY SOLVABLE MODEL OF PROTEIN FOLDING:
RUBIK’S MAGIC SNAKE MODEL
KAZUMOTO IGUCHI∗
70-3 Shinhari, Hari, Anan, Tokushima 774-0003, Japan
Received 14 December 1998
I study the conceptual framework of protein folding by considering an exactly solvable model, the Rubik’s magic snake model. I discuss the mathematical representation of the model, the model Levinthal paradox, the non-unique compact folded structure, the function of the chain, the ground state energy, the commensurability between the folded structure and the potential sequence, the relationship between the unique ground state and broken symmetry in this model, and the dual model and the inflation of the magic snake chain.
PACS number(s): 36.20.-r, 87.10.+e, 87.15.By
1. Introduction
Denaturation of proteins has long been of strong interest to both biological and
theoretical physicists.1 A denatured protein can fold into a native
three-dimensional structure that is coded by the amino acid sequence in the
primary structure of the protein.2 It has been a long-standing problem to answer the
following questions:
(1) How does the protein find a pathway to the folded structure quite rapidly from
astronomically many possibilities of the structure? — Levinthal paradox 3;
(2) Is the native structure unique? — Unique ground state problem4;
(3) What is the relationship between the native folded structure and the amino
acid sequence? — The second genetic code problem.5
There have been three main approaches to the study of the protein folding (PF)
problem:
(a) The first is the experimental determination of the PF.4,6 In this approach,
submillisecond laser pulse spectroscopy experiments6 have revealed the nature of
the relaxation process from unfolded to folded structures in the PF, where
∗E-mail: [email protected]
the pulse intensity in the relaxation process is fitted by the Williams–Watts
function, which describes a nonexponential decay.7
(b) The second is the so-called inverse PF problem.8 In this approach, it has been
conjectured that only of the order of a thousand structures are realized in
real PF in Nature, and that the other structures are obtained by combining
these structures.
(c) The third is the computer-aided determination of the PF. Many theoretical
investigations have recently appeared.9 In this approach, one models
a protein as a self-avoiding linear heteropolymer chain placed on a three- (or
two-) dimensional cubic (or square) lattice. Then, given an amino acid sequence
with defined potential differences, one searches by computer for the lowest
configurational energy among the many configurations of folded structures.
However, because computer power and time are limited, one is usually
restricted to an $N \times N \times N$ ($= N^3$ sites, $N = 3$ to $5$) cubic lattice for
this purpose. Therefore, even if one successfully obtains the lowest energy state of
the folded protein, one cannot so far answer the above questions from this approach.
Hence, one needs a quite different approach to understand the conceptual
framework of the PF problem, one that suffices to answer the above questions
conceptually. In this paper, I present such an alternative approach,
considering a toy model10 for the PF problem, part of which has been published
recently.11
It may be appropriate here to describe the motivation underlying the study of
a system of toy models. One would like, of course, to study the general PF problem
with any amino acid sequence. Such a program can be carried out formally.
It is, however, generally recognized that to draw any definite physical conclusions
from such a general program is very difficult. If one makes approximations on the
general problem in order to arrive at concrete results, one usually encounters the
great difficulty of defining and justifying the validity of the approximation made. I
therefore start instead from a concrete model, which is sufficiently simple so that
one might hope to be able to discuss the validity of the method of approach.
The organization of the paper is as follows. In Sec. 2, I will introduce the toy
model of Rubik’s magic snake chain for the PF and the mathematical representation
of the model using the four types of rotational operations for the conformation of
the model. In Sec. 3, I will discuss the model Levinthal paradox, which seems
important for discussing the Levinthal paradox in real proteins. In Sec. 4, I will
mathematically represent the compact folded structures. In Sec. 5, I will discuss
how to construct and classify the compact folded structures mathematically, using
a graph theoretical approach. In Sec. 6, I will discuss the function of the magic snake
chain. In Sec. 7, I will define the model Hamiltonian for the system. In Sec. 8, I will
present the ground state energy of the system. In Sec. 9, I will discuss the ground
state energy difference between the folded and the unfolded structures of the magic
snake chain. In Sec. 10, I will discuss the ground state energy difference between
the folded structures with different helicity of S = 0 and S = ±1. In Sec. 11, I
will discuss the commensurability between the potential sequence and the folded
structure. In Sec. 12, I will discuss the relationship between the appearance of the
lowest ground state and the concept of broken symmetry in the PF problem. In
Sec. 13, I will discuss the important nature that is inherent in the geometry of the
magic snake model, such as the dual model and the inflation of the magic snake
chain. In Sec. 14, I will draw a conclusion.
2. Toy Model of Protein Folding
Let us first introduce a toy model for understanding the conceptual framework of
PF. The reason why I study this model is described as follows.
In solid state physics, it is well known that a solid is built by close-packing
single building blocks (unit cells), and that the variety of solids
arises from the symmetry of the unit cells.12 This concept has recently been extended
to nonperiodic solids, the so-called quasicrystals, where the building blocks are
a combination of several unit segments of different shapes and the variety of
quasicrystals arises from the way the unit segments are piled up.13 Thus, the whole
macroscopic geometry of a solid is governed by the microscopic geometry of the unit
cell. In the sense that the native structure of a protein is governed by a sequence of
20 types of amino acids that forms a close-packed compact structure, the PF problem
seems similar to that of solid state physics. One would now like to ask whether or
not there exist in the PF such unit cells or segments that dominate the whole geometry
of the folded structure of proteins. This can be thought of as a kind of inverse
problem for the PF, because the normal problem for the PF is to determine the
folded structure given the amino acid sequence of a protein.
Is there any example of this? Yes, there are simple models which satisfy the
above condition. One is the Rubik’s magic snake model.10 Another is Fuller’s
tensegrity model,14 of which there are many varieties. Although Fuller’s
tensegrity models are more realistic for the PF problem, I restrict
myself to the former in this paper, because even though the model is very
simple, many interesting unsolved problems still exist within it.
The Rubik’s magic snake10 consists of 24 triangular segments. Each segment
has five faces: the top and bottom faces are right-angled
isosceles triangles, and the three side faces are two squares and one rectangle.
The square face of one segment is attached to the square face of its nearest
neighbor segment to make a chain of 24 triangular segments (Fig. 1). Each segment
can rotate about the attached face, and four rotation angles are singled out
to define four configurations of two adjacent segments: cis (c) (no rotation,
φ = 0°), trans (t) (rotation of φ = 180°), right gauche (g+) (rotation of
φ = 90°) and left gauche (g−) (rotation of φ = 270°)
(Fig. 2).
I now find simple geometrical constraints as
Fig. 1. Rubik’s magic snake model. The upper figure is the folded structure with helicity of S = 0 seen from the three-fold symmetry axis and the lower figure the unfolded structure.
Fig. 2. Four types of configuration between the two adjacent segments. There are cis (c) (no rotation, φ = 0°), trans (t) (rotation of φ = 180°), the right gauche (g+) (rotation of φ = 90°) and the left gauche (g−) (rotation of φ = 270°) positions, respectively.
$c^4 = cccc = 1$ (1)
which means that four successive operations of cis-configurations make a square
segment — a cycle or closed loop of period of four [Fig. 3(a)] and
g+g−g+g−g+g− = g−g+g−g+g−g+ = 1 (2)
which means that six alternating operations of the two types of gauche
configurations make a cycle of period six [Fig. 3(b)]. These can be regarded as defining
relations for the free group made by strings of the four symbols, c, t, g+ and g−.
Thus, the set of the four symbols forms an alphabet Λ for the folding problem
[i.e. Λ ≡ {c, t, g+, g−}].
Fig. 3. Defining relations for the free group constructed by Λ = {c, t, g±}. (a) The defining relation, $c^4 = 1$, is drawn and (b) the defining relation, g+g−g+g−g+g− = g−g+g−g+g−g+ = 1, is drawn.
3. Model Levinthal Paradox
I now encounter a problem very similar to the Levinthal paradox3 in this model.
Since there are four possibilities of configuration coded by the four symbols, c, t,
g+ and g−, at each attached face and there are 23 such rotational faces, the total
number of configurations of this magic snake is $4^{23} \approx 7 \times 10^{13}$. Here each configuration
is represented by a string (or word) of 23 letters of the four symbols such
as cctg+g−ttccttcg+g+g−g−cttctcg+g−, etc. Each string describes the sequence of
conformation of the chain. Do not confuse this with the amino acid sequence of the
chain constructed from the 20 amino acid codes. These are different from each other.
For example, I give some typical configurations: $t^{23}$ is a linear chain; $(g^+)^{23}$ [$(g^-)^{23}$]
the right (left) helix; $t^{10}cct^{11}$ a hairpin; $t^6g^+g^-t^6g^+g^-t^6g^+$ ($\approx 1$) a triangular loop.
Here $\approx$ means an equivalence relation.
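The counting and the example words above can be reproduced with a short script. This is a minimal sketch only: the helper `word`, the constant names and the ASCII spelling `g-` for g− are assumptions introduced here for illustration and are not part of the model's definition.

```python
# Illustrative sketch: conformations of the 24-segment chain written as
# words over the alphabet {c, t, g+, g-}. All names here are hypothetical.
ALPHABET = ("c", "t", "g+", "g-")
# Rotation angle in degrees attached to each symbol (Sec. 2).
ANGLE = {"c": 0, "g+": 90, "t": 180, "g-": 270}
N_JOINTS = 23  # 24 segments are joined by 23 rotatable faces


def word(*parts):
    """Build a word from (symbol, repeat) pairs, e.g. ("t", 6)."""
    out = []
    for symbol, repeat in parts:
        if symbol not in ALPHABET:
            raise ValueError("unknown symbol: %r" % (symbol,))
        out.extend([symbol] * repeat)
    return tuple(out)


# Number of words of length 23 over a four-letter alphabet.
total_conformations = len(ALPHABET) ** N_JOINTS

linear = word(("t", 23))                          # t^23
right_helix = word(("g+", 23))                    # (g+)^23
hairpin = word(("t", 10), ("c", 2), ("t", 11))    # t^10 c c t^11
triangle = word(("t", 6), ("g+", 1), ("g-", 1),
                ("t", 6), ("g+", 1), ("g-", 1),
                ("t", 6), ("g+", 1))              # triangular loop with ends

print(total_conformations)  # 70368744177664
```

Each example word has exactly 23 letters, one per rotatable face, which is a quick consistency check on any word written by hand.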
The meaning of the equivalence relation is as follows. The relation
$t^6g^+g^-t^6g^+g^-t^6g^+g^- = 1$ (3)
denotes an exact closed loop without ends [Fig. 4(a)]. The assignment of the sequence
of symbols is not unique, since one can also read the sequence backward along the
course of the chain. Therefore, the sequence of the symbols in reverse order,
$g^-g^+t^6g^-g^+t^6g^-g^+t^6 = 1$ (4)
also represents the same folded structure. Hence, all cyclic and anticyclic
permutations of the sequence of the symbols represent the same closed loop structure as
Fig. 4. Meaning of equivalence unity. (a) A closed loop structure of $g^+t^6g^-g^+t^6g^-g^+t^6g^- = 1$ (a triangular loop) is drawn. (b) The equivalent closed loop structure with the ends in the chain is shown, where $g^-$ is removed to place the ends. This structure is represented by $g^+t^6g^-g^+t^6g^-g^+t^6 \approx 1$.
well. This is always valid for any closed loop structure represented by “= 1”. If
there is no confusion, I use only one of them for the sake of simplicity.
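The statement that cyclic and anticyclic permutations describe the same closed loop can be turned into a simple test on words. The sketch below is an illustration under stated assumptions: the function names `readings` and `same_loop` are hypothetical, words are tuples of the ASCII tokens `c`, `t`, `g+`, `g-`, and "anticyclic" is taken to mean reading the word backward, as in Eqs. (3) and (4).

```python
def readings(loop):
    """All words describing the same closed loop: every cyclic shift of the
    word, read either forward or backward."""
    loop = tuple(loop)
    n = len(loop)
    result = set()
    for direction in (loop, loop[::-1]):
        for k in range(n):
            result.add(direction[k:] + direction[:k])
    return result


def same_loop(a, b):
    """True if the words a and b are related by a cyclic shift and/or reversal."""
    return len(a) == len(b) and tuple(b) in readings(a)


T6 = ("t",) * 6
# Eq. (3): t^6 g+ g- t^6 g+ g- t^6 g+ g-
eq3 = T6 + ("g+", "g-") + T6 + ("g+", "g-") + T6 + ("g+", "g-")
# Eq. (4): g- g+ t^6 g- g+ t^6 g- g+ t^6
eq4 = ("g-", "g+") + T6 + ("g-", "g+") + T6 + ("g-", "g+") + T6

print(same_loop(eq3, eq4))  # True
```

The check works purely on strings; it does not verify that a word actually closes geometrically.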
Let us consider the closed-loop case of the magic snake chain model. For exam-
ple, consider the case of Eq. (3). In this case, there are two end faces that meet at one
position in the closed loop of the magic snake model. Suppose that such end faces
meet at the position where the rotational conformation is represented by one of the
three symbols of t, g+ and g− (say, “g−”). This is mathematically represented by
removing one of the 24 symbols in the sequence. In the case of $t^6g^+g^-t^6g^+g^-t^6g^+g^-$,
the last $g^-$ is removed. But the geometry of the chain is still a closed loop [Fig. 4(b)].
This is the meaning of the equivalence relation. Therefore, a string representing a
closed loop with ends is said to be equivalent to unity ($\approx 1$).
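Removing one symbol from a closed-loop word to place the chain ends can be sketched as follows. The function name `open_chain` and the 0-based position argument are assumptions made for this illustration; the loop used is the triangular loop of Eq. (3), written with ASCII `g-` for g−.

```python
def open_chain(loop, k):
    """Open a closed loop of n symbols at position k (0-based).

    The word is rotated so that symbol k comes last and that symbol is then
    dropped, leaving an open-chain word of n - 1 symbols that starts just
    after the cut.
    """
    loop = tuple(loop)
    rotated = loop[k + 1:] + loop[:k + 1]
    return rotated[:-1]


T6 = ("t",) * 6
# Closed triangular loop of Eq. (3), 24 symbols.
closed = T6 + ("g+", "g-") + T6 + ("g+", "g-") + T6 + ("g+", "g-")

# Cutting at the last symbol (the final g-) gives the 23-letter word
# t^6 g+ g- t^6 g+ g- t^6 g+ quoted in the text.
ends_at_last = open_chain(closed, len(closed) - 1)

# One open-chain word for every possible position of the ends.
all_cuts = [open_chain(closed, k) for k in range(len(closed))]
```

Every cut yields a word of 23 letters, matching the 23 rotatable faces of the physical chain.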
Suppose that I need one second to flip a segment. Then, to find
a compact folded structure by exhaustive search I need about $4^{23}$ s $\approx 7 \times 10^{13}$ s $\approx 2 \times 10^{6}$ yr,
which vastly exceeds my lifetime. Therefore, if I search for the folded structure by
statistical random searching, there is no hope of my finding the desired structure in
my life. However, I can easily find a folded structure within a few minutes! This
situation seems very similar to the Levinthal paradox.3 Hence, I would like to call
this situation the model Levinthal paradox.
How can I find such a folded structure of the magic snake in a short time?
As one recognizes when trying to solve the toy model, a guiding principle
for reaching the folded structure is to find a locally compact, or close-packed, structure.
First, I want to make a local part of the chain as small as possible, which is the
structure of period four [i.e. the $c^4$-structure, see Fig. 3(a)]. But this is impossible because of
volume exclusion (i.e. self-avoidance) in the magic snake structure. Second, I want to
find the next smallest part, of period six, which is possible. The shape of this part is
slightly different from the closed loop $(g^+g^-)^3$; it is the locally close-packed structure represented as
−g+g+g−g+g−g−− (Fig. 5), which can be regarded as an example of short-ranged
local interaction of the chain.5 I continue the same procedure to find a final
folded structure with helicity of S = 0 (the concept of helicity will be discussed
later):
g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+g+g−g+g−g−g+g− ≈ 1(S = 0) , (5)
which is obtained from the corresponding closed loop structure without ends of the
chain:
g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+ = 1(S = 0) , (6)
by removing the last symbol “g+” (Fig. 6).
Fig. 5. The locally compact structure. This is constructed by changing the closed loop structure of period six, g+g−g+g−g+g− = 1 (or g−g+g−g+g−g+ = 1) to −g−g−g+g−g+g+− (or −g+g+g−g+g−g−−).
Fig. 6. The compact folded structures with helicity of S = 0, ±1. There is one three-fold symmetric axis for the folded structure of S = 0 while there is no three-fold symmetric axis for the folded structure of S = ±1.
This procedure saves much time in finding a folded structure, as follows. I first
make one six-segment module and then make the other three such modules successively.
Since the total number of configurations of a six-segment module is $4^6 = 4096$,
the total number of configurations to be searched for the four modules is about
$4096 \times 4 \approx 1.6 \times 10^4$. Therefore, the time needed to find the folded structure is of the order of
$1.6 \times 10^4$ s ≈ 273 min ≈ 4.5 h. In this way, finding a locally close-packed structure is
very significant for the PF problem, and it may resolve the Levinthal paradox in the
real PF problem.
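The two search-time estimates can be compared numerically. The sketch below assumes, as in the text, one second per trial conformation; the variable names and the use of a 365.25-day year are choices made here for illustration. It uses the exact count $4^{23}$ rather than an order-of-magnitude figure, so the exhaustive estimate comes out at a few million years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed length of a year

# Exhaustive search: one second for each of the 4^23 words.
exhaustive_seconds = 4 ** 23
exhaustive_years = exhaustive_seconds / SECONDS_PER_YEAR

# Modular search: four six-segment modules, 4^6 trials for each module.
modular_seconds = 4 * 4 ** 6
modular_minutes = modular_seconds / 60
modular_hours = modular_seconds / 3600

print(round(exhaustive_years))     # about 2.2 million years
print(modular_seconds)             # 16384
print(round(modular_minutes))      # 273
print(round(modular_hours, 2))     # 4.55
```

The ratio between the two estimates is what matters for the argument: building the structure module by module replaces a product of possibilities by a sum.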
4. Configurations of the Folded Structure
Let us consider next whether or not the compact folded structure is unique. Con-
trary to the expectation in the real PF problem,1–6,8,9 the folded structure of
Eq. (5) is not unique, but there are many other possible configurations. This is
a consequence of the closed loop structure of the folded structure of Eq. (6). There
are two square end faces in the magic snake model. These meet each other at
the same position in the closed loop of Eq. (6), which can be regarded as an ex-
ample of long-ranged nonlocal interaction of the chain.5 Unless this closed loop
structure is a real closed loop gluing the end faces, there appear 24 possibilities to
place the meeting position of the pair of end faces in the closed loop. This is math-
ematically represented by considering all cyclic permutations of the string given by
Eq. (5) such as g−g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+g+g−g+g−g−g+, etc.
Here, I have assumed that I always start reading the sequence from the first inter-
face between the first and second segments (i.e. the rotational face nearest to one
end) to the 23rd interface between the 23rd and 24th segments (i.e. the rotational
face nearest to the other end). Thus, the structure of Eq. (5) has 24 geometrically
equivalent degenerate configurations. This must be true in any kind of
closed loop structure of chains. However, this point is missing in the arguments in
the previous literature.1–6,8,9
Let us consider whether or not more folded structures exist. Since there is a
three-fold symmetry axis in the structure of Eq. (5), I can define the helicity (or
chirality) S such that the helicity of this folded structure is denoted by S = 0. By
the aid of this concept of helicity, I can find more folded structures with the right-
(left-) handedness where there is no three-fold symmetry axis and hence I define
helicity of S = 1(−1). These folded structures are given by
g+g−g+g−g−g+g+g−g−g+g−g+g+g−g+g−g−g+g+g−g−g+g− ≈ 1(S = −1) , (7)
g−g+g−g+g+g−g−g+g+g−g+g−g−g+g−g+g+g−g−g+g+g−g+ ≈ 1(S = 1) . (8)
Here there are 24 geometrically equivalent folded structures for each helicity
(Fig. 6). These are obtained from the closed loop structures of S = ±1:
g+g−g+g−g−g+g+g−g−g+g−g+g+g−g+g−g−g+g+g−g−g+g−g+ = 1 (S = −1) , (9)
g−g+g−g+g+g−g−g+g+g−g+g−g−g+g−g+g+g−g−g+g+g−g+g− = 1 (S = 1) , (10)
by removing the last symbol “g+(g−)”, respectively.
In this way, I find in total 24 geometrically equivalent degenerate folded
structures for each helicity in the magic snake model. All of the 24 structures for
each helicity are regarded as one structure once all positions of local and nonlocal
interactions of the chain are glued [see Eqs. (5), (9) and (10)]. Moreover, the
folded structures of different helicity are very similar in compactness.
Therefore, the compactness of all of these is the same at the level of classical mechanics.
Hence, I find in total 72 geometrically almost equivalent degenerate folded
structures.15
5. Constructing Folded Structures from Modules
There are very interesting aspects of the folded structures. Suppose that
there are four segments of period six, each of which is represented by either
g+g−g+g−g+g− = 1 or g−g+g−g+g−g+ = 1. These segments of period six can
be regarded as a model for modules or domains for the PF.16 So, I call the seg-
ments modules. If I put the four modules together to make one compact structure,
then this is geometrically very similar to the folded structures. The symmetry of
this compact structure is tetrahedral with four three-fold symmetry axes and there
are six positions where rectangular faces of one module are attached to those of
the other three modules. From this, I can construct the three types of the folded
structures with different helicities.
Mathematically, this is carried out as follows: Let us represent the structure of
four contact modules (i.e. an assembly or a cohesive structure of four modules) by
(g+g−g+g−g+g−) ∪ (g+g−g+g−g+g−)
∪ (g+g−g+g−g+g−) ∪ (g+g−g+g−g+g−) , (11)
where ∪ means disjoint union of the objects17 (Fig. 7). First, consider a pair of
modules of period six, which is attached to one another at one position, i.e.,
(g+g−g+g−g+g−) ∪ (g+g−g+g−g+g−) . (12)
If I pick a pair of adjacent attached rectangular faces in the modules and
exchange the positions of the connected rectangular faces, then this
operation makes a larger closed-loop-like module of period 12 from two
smaller modules of period six (Fig. 8). Mathematically, this is nothing but a
connected sum #,17 represented by
(g+g−g+g−g+g−) # (g+g−g+g−g+g−)
= g+g−g+g+g−g−g+g−g+g+g−g− = 1 , (13)
where all cyclic (and anticyclic) permutations of the symbols mean the same struc-
ture in the closed loops. Using this concept of the connected sum, if I do the same
Fig. 7. The cohesive assembly of four modules of period six. This is mathematically described as a disjoint union ∪ of the four modules.
Fig. 8. Connected sum of two modules. The case of two modules of period six is shown as an example. By the operation of the connected sum, the two separated modules are connected into a module of period 12. The left is labeled “disconnected”, denoted by d, and the right is labeled “connected”, denoted by c. Therefore, in this example, the connected sum is represented by an operation # : d → c.
thing for one module of period 12 and one module of period six, then I get
(g+g−g+g+g−g−g+g−g+g+g−g−) # (g+g−g+g−g+g−)
= g+g−g+g+g−g−g+g+g−g−g+g−g+g+g−g+g−g− = 1 , (14)
which represents a closed loop of period 18 and all cyclic (and anticyclic) permuta-
tions of the symbols stand for the same structure. Do the connected sum between
the loop modules of period 18 and of period six, once again. I obtain
(g+g−g+g+g−g−g+g+g−g−g+g−g+g+g−g+g−g−) # (g+g−g+g−g+g−)
= g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+g+g−g+g−g−g+g−g+ = 1 , (15)
which is a closed loop structure of period 24 with helicity S = 0 and also all cyclic
(and anticyclic) permutations of the symbols mean the same structure. To make
the two end points I remove one of the symbols in the string of Eq. (15). Hence,
there appear 24 possibilities to locate the ends in the loop. Thus, I finally obtain
the folded structure of S = 0, represented by Eq. (5).
Let us similarly do the connected sum for the two loop modules of period 12.
Then, I get
(g+g−g+g+g−g−g+g−g+g+g−g−) # (g+g−g+g+g−g−g+g−g+g+g−g−)
= g−g+g+g−g−g+g−g+g+g−g+g−g−g+g+g−g−g+g−g+g+g−g+g− = 1 , (16)
which is the closed loop structure with helicity of S = 1 and its conjugate represents
the one with helicity of S = −1, and all cyclic (and anticyclic) permutations of the
symbols stand for the same closed-loop folded structure. To make the two end
points I remove one of the symbols in the string of Eq. (16). Hence, there appear 24
possibilities to locate the ends in the closed loop. Thus, I finally obtain the folded
structure of S = ±1, represented by Eqs. (7) and (8) as discussed before.
In this way, the folded structures are obtained by using the concepts of mod-
ules and the connected sum. Here, the sequence of the connected-sum operations
represents an evolution for constructing the folded structure from the four small-
est modules of period six to one largest module of period 24, which is the magic
snake chain. Hence, the mathematical concept of the connected sum can describe
the evolution of the magic snake chain from smaller modules to a larger module.
This might give a hint to understand biological evolution of a protein from smaller
molecules to larger molecules.
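The strings quoted in Eqs. (13) to (15) can be checked for internal consistency. The sketch below does not implement the connected sum itself, since that operation is defined geometrically; it only verifies that the periods add up (6 + 6 = 12, 12 + 6 = 18, 18 + 6 = 24) and that the period-24 word of Eq. (15) is the same closed loop as Eq. (6). The helper names `parse` and `same_loop` and the ASCII spelling `g-` are assumptions made for this illustration.

```python
import re


def parse(text):
    """Split a string such as 'g+g-g+' into tokens ('g+', 'g-', 'g+')."""
    tokens = re.findall(r"g[+-]|[ct]", text)
    if "".join(tokens) != text:
        raise ValueError("unrecognised symbol in %r" % text)
    return tuple(tokens)


def same_loop(a, b):
    """True if b is a cyclic shift of a, read forward or backward."""
    n = len(a)
    if n != len(b):
        return False
    return any(d[k:] + d[:k] == b for d in (a, a[::-1]) for k in range(n))


MODULE6 = parse("g+g-g+g-g+g-")                                   # Eq. (2)
EQ13 = parse("g+g-g+g+g-g-g+g-g+g+g-g-")                          # period 12
EQ14 = parse("g+g-g+g+g-g-g+g+g-g-g+g-g+g+g-g+g-g-")              # period 18
EQ15 = parse("g+g-g+g-g-g+g-g+g+g-g+g-g-g+g-g+g+g-g+g-g-g+g-g+")  # period 24
EQ6 = parse("g+g-g+g-g-g+g-g+" * 3)                               # Eq. (6)

# Each connected sum should conserve the total number of symbols.
periods_add_up = (
    len(EQ13) == 2 * len(MODULE6)
    and len(EQ14) == len(EQ13) + len(MODULE6)
    and len(EQ15) == len(EQ14) + len(MODULE6)
)
print(periods_add_up)          # True
print(same_loop(EQ15, EQ6))    # True
```

A check of this kind is useful when transcribing long words by hand, where a dropped or duplicated symbol is easy to miss.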
The above mathematical construction of the folded structures can be also con-
sidered as follows: As described before in this section, if I consider the four modules
of period six, which are put together to make one cohesive structure, then there are
six positions where the 12 rectangular faces meet each other. Let us put numbers to
the six positions by 1 through 6. Let us denote two configurations at each position
by connected (c) and disconnected (d). Here c (d) means that the configuration of
the two modules that sandwich one position is connected (disconnected) such that
the two modules are kept connected to make one module (disconnected to make
Fig. 9. The six positions of the connected sum in the folded structure. These are numbered from 1 through 6.
Fig. 10. Duality in the connected sum. dddddd denotes an assembly of four separate modules of period six, while cccccc denotes another assembly of four separate modules of period six. These are dual to each other.
two separate modules) (Fig. 9). Therefore, since there are two possibilities at each
position, the total number of configurations is $2^6 = 64$.
Let us denote one configuration by $u_1u_2u_3u_4u_5u_6$, where $u_j \in \{c, d\}$. Now,
$dddddd = d^6$ represents the assembly of the four disconnected modules of period
six. Similarly, $cccccc = c^6$ represents the other assembly of the four disconnected
modules of period six, which is dual to the original one (Fig. 10). This is due to
the duality of the geometrical conformation of the assembly of the four modules of
period six. I call this the duality in the connected sum. Because of this geometrical
duality, only 32 of these 64 possibilities are meaningful. Now I find
the following:
(1) There is one ($= {}_6C_0$) possibility to construct an assembly of the four
disconnected modules of period six, described as $d^6$.
(2) There are 6 ($= {}_6C_1$) possibilities such as dddddc, where one module of period
12 and two modules of period six are assembled.
(3) There are three possibilities such as dddcdc, where two modules of period 12
are assembled. And there are 12 possibilities such as ddddcc, where one module
of period 18 and one module of period six are assembled.
(4) There are four possibilities of the closed-loop folded structure with S = 0 such
as dddccc, and there are three possibilities of the closed-loop folded structure
of S = 1(−1) such as ddccdc.
Hence, there are 32 configurations in total. Thus, one can represent the folded
structure in the language of the symbols c and d.
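The counting in the list above can be reproduced by enumeration. This is a sketch under the assumption that the duality simply exchanges c and d at all six positions; the names `configs`, `dual` and `classes` are introduced here for illustration. The script only counts configurations; deciding which module assembly a given string corresponds to requires the geometry of Fig. 9 and is not attempted.

```python
from collections import Counter
from itertools import product
from math import comb

# All strings u1...u6 with each u_j either c (connected) or d (disconnected).
configs = ["".join(u) for u in product("cd", repeat=6)]


def dual(u):
    """Exchange connected and disconnected at every position."""
    return u.translate(str.maketrans("cd", "dc"))


# Identify each configuration with its dual.
classes = {frozenset((u, dual(u))) for u in configs}

# Number of configurations with k connected positions: the binomial 6Ck.
by_connected = Counter(u.count("c") for u in configs)

# Items (1) to (4): 6C0 + 6C1 + 6C2 + half of 6C3, since configurations
# with three c's are paired with each other by the duality.
listed = comb(6, 0) + comb(6, 1) + comb(6, 2) + comb(6, 3) // 2

print(len(configs), len(classes), listed)  # 64 32 32
```

The split 1 + 6 + 15 + 10 matches the list: 15 = 3 + 12 for item (3) and 10 = 4 + 3 + 3 for item (4).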
6. Function of the Folded Structure
Let us consider the function of the folded structure of the magic snake chain. Amaz-
ingly, from its geometry, there is a cubic space with volume of two segments at the
center of the folded structure. Therefore, this shape of the folded structure is noth-
ing but a kind of shell which can contain something inside. Hence, the function of
the folded structure of the magic snake is a container. This function resembles that
of many real globular proteins such as cytochrome c in which a molecule like the
haem group is contained.6
There is another function that comes from duality of the folded structure. The
folded structure can be reversed by exchanging the role of the even and odd number
segments of the magic snake chain such that the reversed folded structure is identical
to the original one. In this sense, the folded structure is self-dual and I call this
duality the duality between the even and odd number segments.
I would like to note here the following geometrical nature of the folded structure.
Suppose that the total number of the segments of the magic snake chain is one or
two less than 24 such as 22 or 23, or it is one or two more than 24 such as 25 or 26.
The argument of the previous section works for the former case as well, but
it does not work so well for the latter case. In the former case, the chain can fold
into a similar compact folded structure lacking one or two segments, where
a hole appears in the shell of the structure (Fig. 11). On the other hand, in the
latter case, the chain cannot fold into a similar compact folded structure because of
the volume exclusion and the geometric effect of the one or two residual segments
Fig. 11. The folded structure with the lack of a segment. If the magic snake chain is constructed by the 23 segments, then it can fold into a compact folded structure. However, a hole appears in the shell of the folded structure.
Fig. 12. The folded structure with the residue of a segment. If the magic snake chain is constructed by the 25 segments, then it cannot fold into a compact folded structure, but can fold into a partially compact folded structure. This is due to the volume exclusion of the residual segment.
(Fig. 12). In this sense, the number 24 is a critical number for constructing a closely
packed folded structure in the magic snake chain. It is exactly a magic number for
the folded structure. This point will be discussed later in Sec. 13.
7. Model Hamiltonian
Let us now consider the stability of the folded structure. To do so, one needs
to consider the Hamiltonian of the system. Following the arguments in the literature,9
the configurational energy of the chain is given by
$H_c = \sum_{i<j} \Delta(r_i - r_j)\, E_{\sigma_i \sigma_j} ,$ (17)
where $\Delta(r_i - r_j) = 1$ if $r_i$ and $r_j$ are adjoining positions but $i$ and $j$ are not adjacent
in position along the sequence and $\Delta(r_i - r_j) = 0$ otherwise. Depending on the types
$\sigma_i$ of segments in contact, the interaction energy $E_{\sigma_i \sigma_j}$ is considered. Therefore, the
energy of Eq. (17) takes into account only the local and nonlocal interactions along
the sequence of the chain.5 For example, if the hydrophobic (H) and hydrophilic or
polar (P) segments are taken into account, then only three types of energies, $E_{\rm HH}$,
$E_{\rm HP}$, $E_{\rm PP}$, are assigned.9
There is another configurational energy which comes from the rotational con-
formation between the adjacent segments. This is given by
$H_{\rm rot} = \sum_i U_{\phi_{i,i+1}} ,$ (18)
where $U_{\phi_{i,i+1}}$ is the rotational energy with angle $\phi_{i,i+1}$ between adjacent segments.
For example, if the cis, trans, right gauche, and left gauche configurations are taken
into account, then only four energies, $U_c$, $U_t$, $U_{g^+}$ and $U_{g^-}$, are assigned.
When a protein is immersed in a medium such as water or oil, there emerges
an interaction energy between the segments of the protein and the surrounding
medium. This has been believed to be very important for the origin of the PF.5
This energy is given by
$H_m = \sum_i s_{\sigma_i} h_{\sigma_i} ,$ (19)
where $h_{\sigma_i}$ represents the energy cost according to the types $\sigma_i$ of the segments
along the sequence of the chain, and $s_{\sigma_i} = 1$ if the segment of type $\sigma_i$ faces the
surrounding medium, $s_{\sigma_i} = 0$ otherwise. For example, if the H and P segments are
taken into account, then only two types of energies, $h_{\rm H}$, $h_{\rm P}$, are assigned.
One must consider the quantum mechanical electronic energy of the system.
This comes from the energy of electrons in the protein chain under a certain con-
formation. This is given by the Hamiltonian:
$H_{\rm el} = \sum_{i,j} \left( t_{r_i,r_j} c_i^{\dagger} c_j + v_i \delta_{ij} c_i^{\dagger} c_i \right) ,$ (20)
where $t_{r_i,r_j}$ means the hopping integral between segments $r_i$ and $r_j$, such that
$t_{r_i,r_j} = 1$ if $r_i$ and $r_j$ are located in adjacent or adjoining segments along the
sequence of the chain and $t_{r_i,r_j} = 0$ otherwise, $v_j$ is the potential at segment $j$, and $c_j^{\dagger}$ is the
usual electron creation operator that obeys the anticommutation relations.
Finally, the rotational energy of the protein should be taken into account if the
protein is regarded as a rigid body. This is given by
$H_R = \frac{1}{2} I_m \Omega^2 ,$ (21)
where $I_m$ is the moment of inertia of the protein and $\Omega$ the angular velocity. But
this term is usually negligible, being very small or unimportant. Thus, the
total energy of the system is given by
$H_{\rm tot} = H_c + H_{\rm rot} + H_m + H_{\rm el} + H_R .$ (22)
However, in the standard arguments in the previous literature,9 only the energy of
Eq. (17) is taken into account for the PF.
8. Ground State Energy
The model Hamiltonian $H$ can be defined over any conformation of the magic snake
model, represented by a string of 23 letters from c, t, g+, g−. For the unfolded linear
chain structure $t^{23}$, there is no contact energy of local and nonlocal interactions
of the chain. Suppose further that the rectangular faces cost energy in a medium.
There are 24 such faces in the linear chain configuration. Hence, the total energy is
given by
$E^{\rm unfold} = 23U_t + \sum_{i=1}^{24} h_{\sigma_i} + E_{\rm el}^{\rm unfold} + \frac{1}{2} I_m^{\rm unfold} \Omega^2 .$ (23)
Here $E^{\rm unfold}_{\rm el}$ stands for the electronic energy of the linear chain (i.e. the total energy
of electrons filled in the spectrum for the linear chain). It is given by
$$E^{\rm unfold}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{\rm unfold}_j , \tag{24}$$
where the factor 2 on the right-hand side comes from spin degeneracy and
$E^{\rm unfold}_j$ are the eigenvalues of the Schrödinger equation:
$$H^{\rm unfold}_{\rm el} |\Psi_j\rangle = E^{\rm unfold}_j |\Psi_j\rangle , \tag{25}$$
with the potential arrangement under the linear chain configuration of the magic
snake chain, where $|\Psi_j\rangle = \sum_{j=1}^{24} \Psi_j c_j^\dagger |0\rangle$. I note here that I do not take into account
many-body effects such as electron–electron interactions in the chain and I assume
that there is one electron per segment (i.e. the half-filled case) for later purposes.
For the folded structure of S = 0 [Eq. (5)], there appear the contact local and
nonlocal interactions at seven positions between the 1st and 24th, the 1st and 19th,
the 3rd and 9th, the 5th and 23rd, the 7th and 13th, the 11th and 17th, and the
15th and 21st segments along the sequence of the chain, respectively. And there are
12 rectangular faces outward. Hence, the total energy is given by
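$$E^{\rm fold}_{S=0} = E^{S=0}_c + 23 U_g + \sum_{i={\rm even}} h_{\sigma_i} + E^{S=0}_{\rm el} + \frac{1}{2} I^{\rm fold}_m \Omega^2 , \tag{26}$$
$$E^{S=0}_c = E_{1,24} + E_{1,19} + E_{3,9} + E_{5,23} + E_{7,13} + E_{11,17} + E_{15,21} , \tag{27}$$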
where I have assumed that $U_{g^\pm} = U_g$ and the angular frequency $\Omega$ is the same as
that of the unfolded structure. Notice here that $E_c$ mainly comes from the interactions
between odd-numbered segments while $h_{\sigma_i}$ comes from even-numbered segments.
$E^{S=0}_{\rm el}$ is the total electronic energy of the system given by
$$E^{S=0}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=0}_j , \tag{28}$$
where the factor 2 on the right-hand side comes from spin degeneracy and
$E^{S=0}_j$ are the eigenvalues of the Schrödinger equation:
$$H^{S=0}_{\rm el} |\Psi_j\rangle = E^{S=0}_j |\Psi_j\rangle , \tag{29}$$
with the potential arrangement under the compact folded structure of S = 0 of the
magic snake chain.
Similarly, I can obtain the total energy $E^{\rm fold}_{S=\pm1}$ for the folded structure of S = ±1,
respectively. Here, the moment of inertia $I^{\rm fold}_m$ is also the same for these structures
since the compact structures are all identical. I now find
$$E^{\rm fold}_{S=\pm1} = E^{S=\pm1}_c + 23 U_g + \sum_{i={\rm even}} h_{\sigma_i} + E^{S=\pm1}_{\rm el} + \frac{1}{2} I^{\rm fold}_m \Omega^2 , \tag{30}$$
$$E^{S=\pm1}_c = E_{1,24} + E_{1,13} + E_{3,9} + E_{5,23} + E_{7,19} + E_{11,17} + E_{15,21} , \tag{31}$$
which is due to mirror symmetry of the folded structures with helicity of S = ±1.
Here $E^{S=\pm1}_{\rm el}$ is the total electronic energy of the system given by
$$E^{S=\pm1}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=\pm1}_j , \tag{32}$$
where the factor 2 on the right-hand side comes from spin degeneracy and
$E^{S=\pm1}_j$ are the eigenvalues of the Schrödinger equation:
$$H^{S=\pm1}_{\rm el} |\Psi_j\rangle = E^{S=\pm1}_j |\Psi_j\rangle , \tag{33}$$
with the potential arrangement under the compact folded structure of S = ±1 of
the magic snake chain. And I can, of course, do the same thing for any conformation
of the magic snake.
9. Ground State Energy Difference Between the Unfolded and the
Folded Structures
Let us consider which energy is lower: the ground state energy $E^{\rm unfold}$
of the unfolded structure or the ground state energy $E^{\rm fold}_{S=0,\pm1}$ of the folded structure.
This can be decided only if the amino acid sequence defining the conformational
interactions and potentials for the 24 segments along the chain is assigned.
For the sake of simplicity, I assume that the sequence is given by an alternating
repetition of the H and P segments such as
HPHPHPHPHPHPHPHPHPHPHPHP (34)
where I read this sequence from the left to the right putting the numbers of 1
through 24 .
In this case, Eqs. (23), (26) and (30) yield
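$$E^{\rm unfold} = 23 U_t + 12 (h_{\rm P} + h_{\rm H}) + E^{\rm unfold}_{\rm el} + \frac{1}{2} I^{\rm unfold}_m \Omega^2 , \tag{35}$$
$$E^{\rm fold}_{S=0} = E^{S=0}_c + 23 U_g + 12 h_{\rm P} + E^{S=0}_{\rm el} + \frac{1}{2} I^{\rm fold}_m \Omega^2 , \tag{36}$$
$$E^{\rm fold}_{S=\pm1} = E^{S=\pm1}_c + 23 U_g + 12 h_{\rm P} + E^{S=\pm1}_{\rm el} + \frac{1}{2} I^{\rm fold}_m \Omega^2 . \tag{37}$$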
In this case, I find that
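$$\Delta E_c \equiv -E^{S=0,\pm1}_c > 0 , \tag{38}$$
$$\Delta E_{\rm el} \equiv E^{\rm unfold}_{\rm el} - E^{S=0,\pm1}_{\rm el} > 0 , \tag{39}$$
$$\Delta I_m \equiv I^{\rm unfold}_m - I^{\rm fold}_m > 0 , \tag{40}$$
are always valid.18 Therefore, if I assume $U_t = U_g$ and $h_{\rm H} = -h_{\rm P} = h > 0$, then I
conclude
$$\Delta E \equiv E^{\rm unfold} - E^{\rm fold}_{S=0,\pm1} = 12 h_{\rm H} + \Delta E_c + \Delta E_{\rm el} + \frac{1}{2} (\Delta I_m) \Omega^2 \gg 0 . \tag{41}$$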
Hence, the ground state energy of the unfolded structure is much larger than the
ground state energy of the folded structures. Thus, the energy difference (i.e. the
energy gap) between the unfolded and the folded structures is mainly dominated
by the hydrophobic energy cost, the contact configurational energy, the electronic
ground state energy and the rigid body rotation energy of the system, as expected.
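As a check on Eq. (41), the subtraction of Eq. (36) or (37) from Eq. (35) can be written out term by term. This is only a restatement of the steps above under the same assumptions ($U_t = U_g$ and $h_{\rm H} = -h_{\rm P} = h > 0$), not an additional result; the label $S$ stands for any of $S = 0, \pm 1$.

```latex
\begin{align*}
E^{\rm unfold} - E^{\rm fold}_{S}
  &= 23\,(U_t - U_g) + 12\,(h_{\rm P} + h_{\rm H}) - 12\,h_{\rm P}
     - E^{S}_c + \bigl(E^{\rm unfold}_{\rm el} - E^{S}_{\rm el}\bigr)
     + \tfrac{1}{2}\bigl(I^{\rm unfold}_m - I^{\rm fold}_m\bigr)\Omega^2 \\
  &= 12\,h_{\rm H} + \Delta E_c + \Delta E_{\rm el}
     + \tfrac{1}{2}(\Delta I_m)\,\Omega^2 .
\end{align*}
```

Each of the four terms on the last line is positive by $h > 0$ and Eqs. (38)–(40), which is what makes the unfolded chain the higher-energy state.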
10. Ground State Energy Difference Between the Folded
Structures of S = 0 and S = ±1
Next, consider the energy difference between the folded structures of S = 0 and
S = ±1, which is defined by
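$$\Delta E^{\rm fold} \equiv E^{S=\pm1} - E^{S=0} = E^{S=\pm1}_c + E^{S=\pm1}_{\rm el} - \left( E^{S=0}_c + E^{S=0}_{\rm el} \right) = \Delta E^{\rm fold}_c + \Delta E^{\rm fold}_{\rm el} , \tag{42}$$
where
$$\Delta E^{\rm fold}_c \equiv E_{1,13} + E_{7,19} - (E_{1,19} + E_{7,13}) , \tag{43}$$
$$\Delta E^{\rm fold}_{\rm el} \equiv E^{S=\pm1}_{\rm el} - E^{S=0}_{\rm el} . \tag{44}$$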
For the magic snake model with the particular sequence of Eq. (34), it is natural
to assume that
$$E_{1,19} = E_{3,9} = E_{5,23} = E_{7,13} = E_{11,17} = E_{15,21} = E_{1,13} = E_{7,19} \equiv E_{\rm HH} < 0 \tag{45}$$
since all these are interactions between the same H-types of segments, except the
interaction between the ends of the chain,
$$E_{1,24} \equiv E_{\rm HP} < 0 . \tag{46}$$
Hence, $\Delta E^{\rm fold}_c = 0$. This provides
$$\Delta E^{\rm fold} = \Delta E^{\rm fold}_{\rm el} . \tag{47}$$
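The cancellation of the contact energies can be checked by counting the contact pairs of Eqs. (27) and (31) against the sequence of Eq. (34). The sketch below is only an illustration: the helper name `contact_energy` and the numerical values assigned to $E_{\rm HH}$, $E_{\rm HP}$ and $E_{\rm PP}$ are hypothetical.

```python
# Contact pairs (1-based segment numbers) read off from Eqs. (27) and (31).
S0_CONTACTS = [(1, 24), (1, 19), (3, 9), (5, 23), (7, 13), (11, 17), (15, 21)]
S1_CONTACTS = [(1, 24), (1, 13), (3, 9), (5, 23), (7, 19), (11, 17), (15, 21)]


def contact_energy(contacts, sequence, pair_energy):
    """Sum the pair energies E_{i,j} over a list of contacting segments."""
    total = 0.0
    for i, j in contacts:
        # Order-independent key: "HH", "HP" or "PP".
        pair = "".join(sorted(sequence[i - 1] + sequence[j - 1]))
        total += pair_energy[pair]
    return total


sequence = "HP" * 12                                  # Eq. (34)
pair_energy = {"HH": -1.0, "HP": -0.5, "PP": 0.0}     # hypothetical values

e_s0 = contact_energy(S0_CONTACTS, sequence, pair_energy)
e_s1 = contact_energy(S1_CONTACTS, sequence, pair_energy)
print(e_s0, e_s1, e_s1 - e_s0)   # -6.5 -6.5 0.0
```

Both structures contain six H–H contacts and one H–P contact (between the chain ends), so the difference vanishes for any choice of pair energies.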
To obtain this difference, I have to explicitly solve the Schrödinger equations of
Eqs. (25), (29) and (33), respectively. To do so, I have to assign the on-site potentials
$v_j$ and the hopping potentials $t_{r_i,r_j}$ explicitly, since a Schrödinger equation
such as Eq. (29) provides the eigenequation:
$$\sum_{i=1}^{24} t_{r_i,r_j} \Psi_i + v_j \Psi_j = E \Psi_j . \tag{48}$$
Let us first consider the case when there is no effect of the on-site potential:
vj = 0 (j = 1, . . . , 24) . (49)
This assumption means the following: Although the amino acid sequence is given
by Eq. (34), electrons inside the chain do not feel strongly the effects of the
potential differences between the segments of the H and P types, nor the effects of
the rotational conformations between the segments. Only the hydrophilic and
hydrophobic energy costs then play an important role in the energy of the system.
This is a good starting point for seeing how the electronic energy enters the PF problem.
In the case of the unfolded chain structure [Eq. (25)], I assign the hopping
potentials as
$$t_{j,j+1} = -t \quad (j = 1, \ldots, 23) , \tag{50}$$
otherwise $t_{r_i,r_j} = 0$, where t > 0 is the hopping potential (which is usually of order
of 0.1 ∼ 1 eV). This is due to the geometry of the linear chain structure, in which
electrons hop only between adjacent segments along the chain and there is no hopping
through contact surfaces. The solution of this case is very well-known in quantum chemistry. The
energy spectrum is given by
$$E^{\rm unfold}_j = -2t \cos\left( \frac{\pi j}{25} \right) \quad (j = 1, \ldots, 24) . \tag{51}$$
This provides the spectrum with the 24 eigenvalues:
±1.98423t, ±1.93717t, ±1.85955t, ±1.75261t, ±1.61803t, ±1.45794t ,
±1.27485t, ±1.07165t, ±0.851559t, ±0.618034t, ±0.374763t, ±0.125581t . (52)
Therefore, the total electronic energy of the system is given by
$$E^{\rm unfold}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{\rm unfold}_j = -2 (1.98423 + 1.93717 + 1.85955 + 1.75261 + 1.61803 + 1.45794 + 1.27485 + 1.07165 + 0.851559 + 0.618034 + 0.374763 + 0.125581) t = -29.845t . \tag{53}$$
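The closed form of Eq. (51) makes the linear-chain numbers easy to reproduce. The following is a minimal sketch, with function names chosen here for illustration, that lists the 24 levels and fills the lowest 12 with two electrons each (the half-filled case assumed in the text); t = 1 is used as the energy unit.

```python
import math


def linear_chain_levels(n_sites=24, t=1.0):
    """Levels E_j = -2 t cos(pi j / (n_sites + 1)), j = 1..n_sites, as in Eq. (51)."""
    return [-2.0 * t * math.cos(math.pi * j / (n_sites + 1))
            for j in range(1, n_sites + 1)]


def half_filled_energy(levels):
    """Total energy with the lower half of the levels doubly occupied."""
    occupied = sorted(levels)[: len(levels) // 2]
    return 2.0 * sum(occupied)


levels = linear_chain_levels()
total = half_filled_energy(levels)
print(round(levels[0], 5))   # -1.98423
print(round(total, 3))       # -29.852
```

Evaluated this way, the total comes out at about −29.85t, close to the value quoted in Eq. (53); the small difference in the third decimal does not affect the comparison with the folded structures.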
In the case of the folded structure of S = 0, I assume the hopping potentials
to be
$$t_{j,j+1} = -t \quad (j = 1, \ldots, 23) ,$$
$$t_{1,24} = t_{1,19} = t_{3,9} = t_{5,23} = t_{7,13} = t_{11,17} = t_{15,21} = -t , \tag{54}$$
otherwise tri,rj = 0, since the rectangular faces of the 1st and 19th, the 3rd and 9th,
the 5th and 23rd, the 7th and 13th, the 11th and 17th, the 15th and 21st segments
and the square faces of the 1st and 24th are in contact with each other, respectively,
such that the electron hopping may appear through the adjacent segments as well
as the nearest neighbor segments, which are given by Eq. (27). Now, I obtain the
energy spectrum with the 24 eigenvalues of Eq. (29) as
−2.56155t, −2.1889t(2), −2t, −1.61803t(2), −1.30278t(2), −t, −0.45685t(2) ,
0, 0.45685t(2), 0.618034t(2), t(2), 1.56155t, 2t, 2.1889t(2), 2.30278t(2) (55)
where (2) denotes double degeneracy of the level. This provides the total electronic
energy:
$$E^{S=0}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=0}_j = -2 (2.56155 + 2 \times 2.1889 + 2 + 2 \times 1.61803 + 2 \times 1.30278 + 1 + 2 \times 0.45685) t = -33.389t . \tag{56}$$
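For readers who wish to reproduce such a spectrum, the sketch below builds the 24 × 24 one-electron matrix from the chain bonds and the contact pairs of Eq. (54) and diagonalizes it. The text does not say which numerical method was used; a cyclic Jacobi rotation scheme is assumed here only because it needs nothing beyond the standard library, and all function names are illustrative.

```python
import math


def hopping_matrix(n_sites, contacts, t=1.0, onsite=None):
    """Symmetric matrix with -t on chain bonds and contacts, onsite[j] on the diagonal."""
    h = [[0.0] * n_sites for _ in range(n_sites)]
    bonds = [(j, j + 1) for j in range(1, n_sites)] + list(contacts)
    for i, j in bonds:                       # 1-based segment labels, as in the text
        h[i - 1][j - 1] = h[j - 1][i - 1] = -t
    if onsite is not None:
        for j, v in enumerate(onsite):
            h[j][j] = v
    return h


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:                        # off-diagonal part has vanished
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; |theta| <= pi/4.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):           # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):           # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def half_filled_energy(levels):
    """Total energy with the lower half of the levels doubly occupied."""
    return 2.0 * sum(sorted(levels)[: len(levels) // 2])


# Contact pairs of the S = 0 structure, Eq. (54).
S0_CONTACTS = [(1, 24), (1, 19), (3, 9), (5, 23), (7, 13), (11, 17), (15, 21)]
levels = jacobi_eigenvalues(hopping_matrix(24, S0_CONTACTS))
print([round(e, 5) for e in levels])
print(round(half_filled_energy(levels), 3))
```

The printed levels and total can be compared with Eqs. (55) and (56). Passing an empty contact list recovers the linear chain of Eq. (51), and replacing the contact list by that of Eq. (57), or supplying `onsite`, gives the other cases discussed below.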
Similarly, in the case of the folded structures of S = ±1, I assign the hopping
potentials as
$$t_{j,j+1} = -t \quad (j = 1, \ldots, 23) ,$$
$$t_{1,24} = t_{1,13} = t_{3,9} = t_{5,23} = t_{7,19} = t_{11,17} = t_{15,21} = -t , \tag{57}$$
otherwise tri,rj = 0, since the rectangular faces of the 1st and 13th, the 3rd and 9th,
the 5th and 23rd, the 7th and 19th, the 11th and 17th, the 15th and 21st segments,
and the square faces of the 1st and 24th are in contact with each other, respectively,
such that the electron hopping may appear through the adjacent segments as well
as the nearest neighbor segments, which are given by Eq. (31). Then I obtain the
energy spectrum with the 24 eigenvalues for Eq. (33) as
−2.56155t, −2.1889t, −2.11491t, −2.08187t, −1.61803t(2), −1.33784t ,
− 1.30278t, −t, −0.45685t, −0.268058t, 0, 0.254102t, 0.45685t ,
0.618034t(2), 0.715828t, 1.53675t, 1.56155t, 1.86081t ,
2t, 2.1889t, 2.30278t, 2.43519t . (58)
where (2) denotes double degeneracy of the level. This provides the total electronic
energy:
$$E^{S=\pm1}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=\pm1}_j = -2 (2.56155 + 2.1889 + 2.11491 + 2.08187 + 2 \times 1.61803 + 1.33784 + 1.30278 + 1 + 0.45685 + 0.268058) t = -33.098t . \tag{59}$$
The spectra of the above three cases are shown in Fig. 13.
I now find the energy difference between the folded structures of S = 0 and
S = ±1:
$$\Delta E^{\rm fold}_{\rm el} = E^{S=\pm1}_{\rm el} - E^{S=0}_{\rm el} = -33.098t - (-33.389t) = 0.291t > 0 . \tag{60}$$
Hence, the total electronic energy $E^{S=\pm1}_{\rm el}$ of the folded structure of S = ±1 is larger
than that of the folded structure of S = 0, so that the lowest ground state energy
of the system is realized in the folded structure with helicity S = 0. I would
like to remark that the electronic energies of the folded structures of S = 0,±1 are
24-fold degenerate for each helicity, since there is no potential difference between the
24 configurations of the folded structures with each helicity. In the same way, I can
Fig. 13. Electronic spectrum of the folded structures of S = 0,±1. The case of $v_j = 0$ and $t_{\rm end} = t_{\rm cont} = t = 1$ is shown.
calculate the energy difference between the unfolded and the folded structures of
S = 0,±1, respectively:
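$$\Delta E^{S=0}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=0}_{\rm el} = -29.845t - (-33.389t) = 3.544t , \tag{61}$$
$$\Delta E^{S=\pm1}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=\pm1}_{\rm el} = -29.845t - (-33.098t) = 3.253t . \tag{62}$$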
Therefore, the order of magnitude of the electronic energy difference between the
unfolded and the folded structures is much greater than that between the folded
structures of S = 0 and S = ±1. The former is 10 times as large as (i.e. one order
larger than) the latter. If I use this result together with Eqs. (35)–(37), (45) and
(46), I can rewrite Eq. (41) as
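$$\Delta E \equiv E^{\rm unfold} - E^{\rm fold}_{S=0(\pm1)} = 12 h_{\rm H} - 6 E_{\rm HH} + 3.544t\,(3.253t) + \frac{1}{2} (\Delta I_m) \Omega^2 \approx o(h_{\rm H}) - o(E_{\rm HH}) + o(t) \gg 0 . \tag{63}$$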
This shows that the hydrophobic interactions with the surrounding medium such as
water, the contact configurational interactions between the hydrophobic segments
as well as the electronic energy of the system are very important in the PF.
Second, as another example, let us consider the case when there is the effect of
the on-site potential. I now adopt the on-site potential as
$$v_j = t \; (-t) \tag{64}$$
for j = even (odd), where I have assigned the magnitude t to $v_j$ and the segments of type
P (H) are located on the even (odd) sites, for the sake of simplicity. This assumption
means the following: According to the amino acid sequence given by Eq. (34),
electrons inside the chain feel the effects of the potential differences between the
segments of the H and P types such that the potential value is t (−t) on the segment
of type P (H). In this case, there is a choice of sign in front of t in Eq. (64). This
different choice of the sign may cause a different energy spectrum, which is the
problem of commensurability of the potential sequence with the folded structure.
This will be discussed in the next section. On the other hand, I assume the same
hopping potentials for each structure as defined before.
Now, in the same way as before, I can calculate the electronic spectrum for
the unfolded and the folded structures of S = 0,±1, respectively. They are given
as follows: For the unfolded structure, I obtain the electronic spectrum with the
24 eigenvalues for Eq. (25) as
±2.22197t, ±2.18005t, ±2.11138t, ±2.01783t, ±1.90211t, ±1.76793t ,
± 1.62026t, ±1.46576t, ±1.31345t, ±1.17557t, ±1.06792t, ±1.00785t . (65)
This provides the total electronic energy:
$$E^{\rm unfold}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{\rm unfold}_j = -2 (2.22197 + 2.18005 + 2.11138 + 2.01783 + 1.90211 + 1.76793 + 1.62026 + 1.46576 + 1.31345 + 1.17557 + 1.06792 + 1.00785) t = -39.70t . \tag{66}$$
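The values in Eq. (65) can be reproduced without a numerical diagonalization. The closed form used below is not stated in the text; it is offered as an assumption that follows from a standard property of a nearest-neighbor chain with an alternating on-site potential ±v, for which the potential anticommutes with the hopping term, so that each level of Eq. (51) at $\epsilon_j$ becomes the pair $\pm\sqrt{v^2 + \epsilon_j^2}$. The function name is illustrative.

```python
import math


def staggered_chain_levels(n_sites=24, t=1.0, v=1.0):
    """Levels +/- sqrt(v^2 + eps_j^2), with eps_j = 2 t cos(pi j / (n_sites + 1))."""
    levels = []
    for j in range(1, n_sites // 2 + 1):
        eps = 2.0 * t * math.cos(math.pi * j / (n_sites + 1))
        e = math.hypot(v, eps)
        levels += [-e, e]
    return sorted(levels)


levels = staggered_chain_levels()          # v = t = 1, as in Eq. (64)
total = 2.0 * sum(levels[:12])             # lowest 12 levels, doubly occupied
print(round(levels[0], 5))                 # -2.22197
print(round(total, 3))
```

Because only $v^2$ enters, the same spectrum is obtained when the sign of the on-site potential is reversed on the linear chain.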
For the folded structure with helicity of S = 0, I obtain the electronic spectrum
with the 24 eigenvalues for Eq. (29) as
−2.56155t, −2.2687t(2), −2t, −1.79129t(2), −1.61803t(2) ,
− 1.56155t, −1.12406t(2), −t, 0.618034t(2), 0.738758t(2) ,
t, 1.56155t, 2t, 2.56155t, 2.654t(2), 2.79129t(2) , (67)
where (2) denotes double degeneracy of the level. This provides the total electronic
energy:
$$E^{S=0}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=0}_j = -2 (2.56155 + 2 \times 2.2687 + 2 + 2 \times 1.79129 + 2 \times 1.61803 + 1.56155 + 2 \times 1.12406 + 1) t = -41.45t . \tag{68}$$
For the folded structure with helicity of S = ±1, I obtain the electronic spectrum
with the 24 eigenvalues for Eq. (33) as
−2.56155t, −2.2687t, −2.16425t,−2.15309t, −1.79129t, −1.75844t ,
− 1.61803t(2),−1.56155t,−1.12406t, −1.09608t, −t ,
0.618034t(2), 0.738758t, 0.772866t, 0.90428t, 1.56155t ,
2.21183t, 2.39138t, 2.56155t, 2.654t, 2.79129t, 2.8915t , (69)
Fig. 14. Electronic spectrum of the folded structures of S = 0,±1. The case of $v_j = (-1)^j t$ and $t_{\rm end} = t_{\rm cont} = t = 1$ is shown.
where (2) denotes double degeneracy of the level. This provides the total electronic
energy:
$$E^{S=\pm1}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=\pm1}_j = -2 (2.56155 + 2.2687 + 2.16425 + 2.15309 + 1.79129 + 1.75844 + 2 \times 1.61803 + 1.56155 + 1.12406 + 1.09608 + 1) t = -41.43t . \tag{70}$$
The spectra of the above three cases are shown in Fig. 14.
From the above results, I find the energy difference between the folded structures
of S = 0 and S = ±1:
$$\Delta E^{\rm fold}_{\rm el} = E^{S=\pm1}_{\rm el} - E^{S=0}_{\rm el} = -41.43t - (-41.45t) = 0.02t > 0 . \tag{71}$$
This shows that in the present case of the two types of the on-site potentials
associated with the two types of segments, the energy difference becomes much smaller
than that in the previous case of no potential difference. Hence, in this case, the
electronic ground state energy of the folded structures becomes closer to being 72-fold
degenerate than 24-fold degenerate.
In the same way, I can calculate the energy difference between the folded and
the unfolded structures of S = 0,±1, respectively:
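$$\Delta E^{S=0}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=0}_{\rm el} = -39.70t - (-41.45t) = 1.75t , \tag{72}$$
$$\Delta E^{S=\pm1}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=\pm1}_{\rm el} = -39.70t - (-41.43t) = 1.73t , \tag{73}$$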
which become of the same order. I now find from Eq. (41) that
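$$\Delta E \equiv E^{\rm unfold} - E^{\rm fold}_{S=0(\pm1)} = 12 h_{\rm H} - 6 E_{\rm HH} + 1.75t\,(1.73t) + \frac{1}{2} (\Delta I_m) \Omega^2 \approx o(h_{\rm H}) - o(E_{\rm HH}) + o(t) \gg 0 . \tag{74}$$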
Hence, again, I find that the hydrophobic interactions with the surrounding medium
such as water, the contact configurational interactions between the hydropho-
bic segments and the electronic energy of the system are crucially important in
the PF.
11. Commensurability Between the Potential Sequence and the
Folded Structure
Let us consider the relationship between the potential sequence and the folded
structure. As was discussed before, if one considers the magic snake chain as a
classical object, then the ground state energy is highly degenerate such that many
similar folded structures can have the same ground state energy. However, if one
considers the magic snake chain as a quantum object, then some of the degenerate
ground state energies become lower than the others, even if they come from
the same folded structures.
Let us consider this problem. To see this point, let us go back to the case with
two types of potential. But now, I assume the different sign in front of the potential
as follows:
vj = −t (t) (75)
for j = even (odd), where I have assumed the value t to vj and the segments of
type P (H) are located on the even (odd) sites. On the other hand, all the hopping
potentials are kept in the same as before.
This assumption means the following: According to the amino acid sequence
given by Eq. (34), electrons inside the chain feel the effects of the potential
differences between the segments of the H and P types such that the potential value is
−t (t) on the segment of type P (H). Therefore, according to the change of the sign of
the on-site potential, the accumulation of electrons in the chain is influenced. If the
sign is negative (positive) on the segment of type H (P), then electrons are attracted to
(repelled from) the segment of type H (P), and vice versa. This difference in the
distribution of electrons in the chain can cause the ground state energy difference.
In the same way, I can calculate the electronic spectrum for the unfolded and
the folded structures of S = 0,±1, respectively. They are given as follows: For the
unfolded structure, I obtain the same electronic spectrum with the 24 eigenvalues
for Eq. (25) as Eq. (65), since in the linear chain structure there is no effect of the
potential sign difference.
For the folded structure with helicity of S = 0, I obtain the electronic spectrum
with the 24 eigenvalues for Eq. (29) as
−3t, −2.654t(2), −2.56155t, −2.30278t(2), −1.30278t(2), −t ,
−0.738758t(2), 0, t, 1.12406t(2), 1.30278t(2), 1.56155t ,
2t(2), 2.2687t(2), 2.30278t(2) , (76)
where (2) denotes double degeneracy of the level. This provides the total electronic
energy:
$$E^{S=0}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=0}_j = -2 (3 + 2 \times 2.654 + 2.56155 + 2 \times 2.30278 + 2 \times 1.30278 + 1 + 2 \times 0.738758) t = -41.12t . \tag{77}$$
For the folded structure with helicity of S = ±1, I obtain the electronic spectrum
with the 24 eigenvalues for Eq. (33) as
−3t, −2.654t, −2.61803t, −2.59562t, −2.30278t(2), −1.37824t ,
−1.30278t(2), −t, −0.738758t, −0.381966t, −0.34411t, t, 1.12406t ,
1.17167t, 1.30278t(2), 2t(3), 2.2687t, 2.30278t, 2.44469t , (78)
where (2) [(3)] denotes double (triple) degeneracy of the level. This provides the
total electronic energy:
$$E^{S=\pm1}_{\rm el} = 2 \sum_{j,\,{\rm occupied}} E^{S=\pm1}_j = -2 (3 + 2.654 + 2.61803 + 2.59562 + 2 \times 2.30278 + 1.37824 + 2 \times 1.30278 + 1 + 0.738758 + 0.381966) t = -43.16t . \tag{79}$$
The spectra of the above three cases are shown in Fig. 15.
From the above results, I find the energy difference between the folded structures
of S = 0 and S = ±1:
$$\Delta E^{\rm fold}_{\rm el} = E^{S=\pm1}_{\rm el} - E^{S=0}_{\rm el} = -43.16t - (-41.12t) = -2.04t < 0 . \tag{80}$$
The energy differences between the unfolded and the folded
structures of S = 0,±1 are given by
$$\Delta E^{S=0}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=0}_{\rm el} = -39.70t - (-41.12t) = 1.42t , \tag{81}$$
$$\Delta E^{S=\pm1}_{\rm el} = E^{\rm unfold}_{\rm el} - E^{S=\pm1}_{\rm el} = -39.70t - (-43.16t) = 3.46t , \tag{82}$$
respectively. I now find from Eq. (41) that
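$$\Delta E \equiv E^{\rm unfold} - E^{\rm fold}_{S=0(\pm1)} = 12 h_{\rm H} - 6 E_{\rm HH} + 1.42t\,(3.46t) + \frac{1}{2} (\Delta I_m) \Omega^2 \approx o(h_{\rm H}) - o(E_{\rm HH}) + o(t) \gg 0 . \tag{83}$$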
Hence, once again, I find that the hydrophobic interactions with the surrounding
medium such as water, the contact configurational interactions between the hy-
drophobic segments and the electronic energy of the system are crucially important
in the PF. This situation is schematically shown in Fig. 16, and it is frequently
Fig. 15. Electronic spectrum of the folded structures of S = 0,±1. The case of $v_j = -(-1)^j t$ and $t_{\rm end} = t_{\rm cont} = t = 1$ is shown.
Fig. 16. The landscape of the ground state energy of the system. The horizontal axis means the whole configuration space of the structures while the vertical axis means the ground state energy of the system. The landscape is not like a funnel structure but more like a bucket structure.
called the funnel structure of the ground state energy of the system in the literature
(Ref. 9). However, as is shown in Fig. 16, it is more like a bucket structure, since
the structure has a bottom.
Equation (80) shows that in the case of Eq. (75), the ground state energy of
the folded structure with helicity S = ±1 is lower than that of the folded structure
Fig. 17. Schematic diagram of the ground state energy. If $v_{\rm P} > 0 > v_{\rm H}$, then $E^{S=0}_{\rm el}$ is the lowest such that $E^{S=0}_{\rm el} < E^{S=\pm1}_{\rm el}$ (solid lines). On the other hand, if $v_{\rm P} < 0 < v_{\rm H}$, then $E^{S=0}_{\rm el} > E^{S=\pm1}_{\rm el}$ (gray lines). For convenience, the case of Eqs. (94)–(96) is also shown in this figure (light gray lines). Thus, it is apparent that the ground state degeneracy changes and is removed as the potential sequence changes.
with helicity S = 0. This is opposite to the former case of the on-site potential of
Eq. (64). The difference is also much larger than that
in the previous cases of no potential difference and of the on-site potential of
Eq. (64). Hence, in this case, the electronic ground state energy
of the folded structures becomes closer to being 24-fold degenerate than 72-fold
degenerate. Thus, for $v_{\rm P} > 0 > v_{\rm H}$ ($v_{\rm P} < 0 < v_{\rm H}$),
$E^{S=0}_{\rm el} < (>)\, E^{S=\pm1}_{\rm el}$, so that the structure with S = 0 (S = ±1) has the lowest
electronic energy. This is schematically shown in Fig. 17. I would like to note
here that this tendency depends upon the choice of the sign for $t_{r_i r_j}$. In the above
argument, I have used the condition $t_{r_i r_j} < 0$. However, if I use the condition
$t_{r_i r_j} > 0$, then the situation would be opposite: for $v_{\rm P} > 0 > v_{\rm H}$ ($v_{\rm P} < 0 < v_{\rm H}$),
$E^{S=0}_{\rm el} > (<)\, E^{S=\pm1}_{\rm el}$.
The above result shows that even if the ground state energy is highly degenerate
at the level of classical mechanics, a more favorable structure with a lower
ground state energy may emerge from the degenerate ground state at the level of
quantum mechanics. Indeed, the above difference of the ground state energies between
the two different signs of the on-site potential comes only from the electronic ground
state energy. Therefore, it takes place only when the system is treated as a quantum
object by quantum mechanics. This is what I mean by the word “commensurability”
of the potential sequence with the folded structure. However, this point is totally
absent in the previous literature (Refs. 4, 5 and 9).
12. The Unique Ground State and Broken Symmetry
From the discussions in the previous sections, I have found that there are mainly
three types of sequence which are crucially important in the PF.
(1) The first type is the conformational sequence coded by the rotational configurations,
c, t, g+, g−. This contributes to the energies, $U_c$, $U_t$, $U_{g^+}$, $U_{g^-}$.
(2) The second type is the sequence coming from the effects of the side chains of a
protein such as the sequence of the hydrophilic and hydrophobic segments, H,
P. This contributes to the energies, $E_{\rm HH}$, $E_{\rm HP}$, $E_{\rm PP}$; $h_{\rm P}$, $h_{\rm H}$.
(3) The third is the potential sequence affecting electrons inside the chain such
as $v_j$, $t_{r_i,r_j}$. This contributes to the energies, $t_c$, $t_t$, $t_{g^\pm}$, $t_{\rm cont}$; $v_{\rm P}$, $v_{\rm H}$, where
$t_{\rm cont}$ means the electron hopping potential by the short-ranged and long-ranged
contact interactions between the contact segments along the chain.
All of these energies are related to the primary structure of the amino acid sequence.
To show this point, I have worked out the particular case with the conformational
sequence of the unfolded sequence of t23 and the folded sequences of Eqs. (5), (7)
and (9).
What can I do for other sequences? Of course, I can do the same thing for any
of them. But a much more fascinating idea for this problem is to use the concept
of broken symmetry. Suppose there is a sequence that is a bit different from the
sequence of Eq. (34). This discrepancy may break the symmetry of the chain
and yield a nondegenerate ground state, so that one of the folded structures is
more favorable. For a general sequence, the multiple-fold degeneracy of the ground
state may likewise be destroyed, producing a unique ground state of the magic
snake chain. This situation would provide a hint to consider the relationship between
the unique ground state of the system and the sequence of the chain. Hence, I
conjecture that this is also true for the real PF problem and in this sense the
second genetic code problem is nothing more than the unique ground state problem
in the real PF problem.
To investigate the above conjecture, let us first consider the on-site potential, $v_j$,
and the hopping potential, $t_{r_i r_j}$. These potentials are related to the configurations
of the two adjacent or nearest neighbor segments as well as the type of each segment
in the magic snake chain. Therefore, they are parameterized, in general, as follows:
$$t_{r_i r_j} = \begin{cases} t_{i,i+1} \equiv -t(\phi_{i,i+1}) & \text{if } r_i, r_j \in {\rm NN} \\ t_{1,24} \equiv -t_{\rm end} & \text{if } r_i = 1,\ r_j = 24 \\ t_{i,i+n} \equiv -t_{\rm cont} & \text{if } r_i, r_j \in {\rm SR, LR} \\ 0 & \text{otherwise} \end{cases} , \tag{84}$$
$$v_j \equiv v_{\sigma_j} , \tag{85}$$
where NN means the nearest neighbor and SR (LR) the short-ranged (long-ranged)
contact interaction through the segments along the chain, $\sigma_j$ is the type of the jth
segment, and I assume that $t(\phi_{i,i+1}), t_{\rm end}, t_{\rm cont} > 0$. Furthermore, it is reasonable
to assume that $t(\phi_{i,i+1})$ can be parameterized as
$$t(\phi_{i,i+1}) = t_0 + \delta t_r , \tag{86}$$
where $\delta t_r = \delta t_c, \delta t_t, \delta t_{g^\pm} \equiv \delta t_\pm$.
From the above parameter setting, Eq. (85) gives the sequence:
vH, vP, vH, vP, vH, vP, vH, vP, vH, vP, vH, vP ,
vH, vP, vH, vP, vH, vP, vH, vP, vH, vP, vH, vP . (87)
On the other hand, Eq. (84) gives the sequences of the hopping potential along the
chain: For the unfolded structure of t23, I have
−t,−t,−t,−t,−t,−t,−t,−t,−t,−t,−t,−t ,
− t,−t,−t,−t,−t,−t,−t,−t,−t,−t,−t , (88)
where t ≡ t0 + δtt. And I have
−t+,−t−,−t+,−t−,−t−,−t+,−t−,−t+,−t+,−t−,−t+ ,
− t−,−t−,−t+,−t−,−t+,−t+,−t−,−t+,−t−,−t−,−t+,−t− , (89)
for the folded structure of S = 0 and
−t±,−t∓,−t±,−t±,−t∓,−t±,−t±,−t∓,−t∓,−t±,−t∓,−t± ,
− t±,−t∓,−t±,−t∓,−t∓,−t±,−t±,−t∓,−t∓,−t±,−t∓ , (90)
for the folded structure of S = ±1, respectively, where $t_\pm \equiv t_0 + \delta t_\pm = t + (\delta t_\pm - \delta t_t)$.
Let us calculate the electronic ground state energy for this case, assuming that
vH = −t, vP = t, t+ = 1.5t, t− = t and tcont = tend = t, where the hopping
potentials through the contact faces are assumed to be the same as before. Now
I obtain the 24 eigenvalues for the folded structures of S = 0,±1, respectively, as
follows: For the folded structure of S = 0, I have
−3.41108t, −3.04527t, −2.9478t, −2.82609t, −2.48973t, −2.45298t ,
−1.7551t, −1.68918t, −1.32111t, −0.89807t, −0.89048t,−0.0523378t ,
1.100115t, 1.10838t, 1.1307t, 1.53181t, 1.57999t, 1.87538t, 2.27708t ,
2.42318t, 2.60062t, 2.60556t, 2.77775t, 2.86764t . (91)
For the folded structure of S = 1, I have
−3.43742t, −3.06193t, −3.0121t, −2.8248t, −2.58789t, −2.41811t ,
−1.74706t, −1.69604t, −1.37906t, −1.00607t, −0.820767t,−0.0564129t ,
t, 1.10546t, 1.20365t, 1.49754t, 1.58527t, 1.84581t, 2.38611t, 2.40616t ,
2.63656t, 2.7218t, 2.79472t, 2.86457t . (92)
For the folded structure of S = −1, I have
−3.39715t, −3.0595t, −2.95476t, −2.79955t, −2.58069t, −2.40453t ,
− 1.71997t, −1.66809t, −1.334467t, −0.955957t, −0.829761t,−0.0772168t ,
1.00089t, 1.09728t, 1.21308t, 1.45491t, 1.59244t, 1.85655t, 2.31277t ,
2.38584t, 2.59758t, 2.67628t, 2.79326t, 2.81099t . (93)
From these, I obtain the electronic ground state energy:
$$E^{S=0}_{\rm el} = -47.58t , \tag{94}$$
$$E^{S=1}_{\rm el} = -41.41t , \tag{95}$$
$$E^{S=-1}_{\rm el} = -41.81t , \tag{96}$$
respectively. The spectra of the above three cases are shown in Fig. 18.
Thus, the degeneracy of the electronic ground state energies between the folded
structures of S = 1 and S = −1 is removed by the difference of the hopping
potential sequences. This is what I mean by the word “broken symmetry”, which
picks up one of the potential sequences to lower the ground state energy.
Let us further consider the effect of the broken symmetry. As was discussed so
far, each ground state energy of the unfolded and the folded structures of S = 0,±1
is 24-fold degenerate, since there is no effect of the location of the end faces along
the chain on the ground state energy. However, if one of the on-site or hopping
potentials is affected by the location of the two end faces in the chain, then this
situation discriminates the ground state energy. To show that the different location
of the end faces leads to the non-degenerate ground state energy, let us suppose
Fig. 18. Electronic spectrum of the folded structures of S = 0,±1. The case of $v_{\rm H} = -t$, $v_{\rm P} = t$, $t_+ = 1.5t$, $t_- = t$ and $t_{\rm end} = t_{\rm cont} = t = 1$ is shown.
that the two end faces are placed in between the first and last segments. In this
case, it is natural to assume that the hopping potential between the 1st and 24th
segments, tend = t1,24, is changed such as
tend = t′ . (97)
Let us then calculate the ground state energy, assuming t′ = 0.2t, for example.
Now, I get the 24 eigenvalues for the folded structure of S = 0:
−3.39024t, −3.04662t, −2.87395t, −2.8226t, −2.52766t, −2.42333t ,
− 1.76479t, −1.6342t, −1.16859t, −0.932509t, −0.888087t ,
− 0.0980352t, 1.101262t, 1.08192t, 1.15696t, 1.49632t, 1.59128t, 1.89808t ,
2.18623t, 2.30263t, 2.60062t, 2.62237t, 2.775385t, 2.86772t . (98)
This provides the ground state energy:
$$E^{S=0}_{{\rm el},(1,24)} = -47.14t . \tag{99}$$
On the other hand, if the two end faces are located in between the ith and
the (i + 1)th segments, then I assume that ti,i+1 = −t′ while the others are not
changed, and so forth. Let us next calculate the ground state energy, assuming
t10,11 = t11,10 = −0.2t, for example. Now, I get the 24 eigenvalues for the folded
structure of S = 0:
−3.35032t, −3.08966t, −2.89371t, −2.78998t, −2.50062t, −2.45168t ,
− 1.74959t, −1.59238t, −1.3443t, −0.919229t, −0.817707t,−0.0895456t ,
1.101373t, 1.06853t, 1.13301t, 1.52641t, 1.62462t, 1.85335t ,
2.21695t, 2.42356t, 2.55864t, 2.60954t, 2.75533t, 2.80506t . (100)
This provides the ground state energy:
$$E^{S=0}_{{\rm el},(10,11)} = -47.06t . \tag{101}$$
The spectra of the above two cases are shown in Fig. 19.
The above result shows that the different location of the ends in the closed
loop structure of the magic snake chain causes the different electronic ground state
energy, which means that the 24-fold ground state degeneracy can be removed by
the effect of the location of the end faces in the chain. This can be thought of as an
example of the pinning of a defect by the potential sequence or the broken symmetry
by a defect between the folded protein structure and the potential sequence where
I have regarded the location of the end faces as a defect. In this way, I would like
to conclude that the second genetic code problem is related to the breaking of the
symmetry of the degenerate ground state energy, which leads to the unique ground state
energy of the system at the quantum mechanical level.
Fig. 19. Electronic spectrum of the folded structure of S = 0 with the effect of the location of the ends in the magic snake chain. (a) The end faces are located in between the 1st and the 24th segments. $t_{1,24} = t_{24,1} = 0.2t$ is used. (b) The end faces are located in between the 10th and the 11th segments. $t_{10,11} = t_{11,10} = 0.2t$ is used. Here t = 1 is assumed for both cases.
13. Discussion
In this section, I would like to remark on two further important features of the
geometry of the magic snake chain model:
(1) There is a class of equivalent models, which can be regarded as a dual structure
model of the magic snake chain.
(2) The magic snake model with 24 segments can be inflated to a model whose
number of segments is a multiple of 24.
First, let us consider case (1). There is another class of models in which
the triangular segment that constitutes the magic snake chain is replaced
by a bent rod segment (Fig. 20). I call this model the magic rod model (Fig. 21).
It looks more like the standard protein folding model that uses the self-avoiding
random walk on a cubic lattice.9 However, in the magic rod model the rods
are allowed to meet or attach to each other at one position in space, while in the
standard lattice model the chain is not allowed to do so. This place of attachment
corresponds to the position where two rectangular faces meet to make
contact surfaces in the magic snake chain (Fig. 22). Since the faces of a segment
in the magic snake chain turn out to be the points of a segment in the magic rod
model, the magic rod model can be thought of as a dual model to the magic snake
model.
Fig. 20. The correspondence between the magic snake model and the magic rod model. The triangular segment of the magic snake model corresponds to the bent rod segment of the magic rod model.
Fig. 21. The folded structure of the magic rod model. These are dual models in the sense that the faces of the segments in the magic snake model correspond to the points at the segments in the magic rod model.
Fig. 22. Relationship between the contact faces of the magic snake model and the contact rods of the magic rod model. The attachment or contact of the rod segments is allowed in the magic rod model, while it is forbidden in the standard protein folding model using the three-dimensional random walk on a cubic lattice.
Second, let us consider case (2). The magic snake chain constructed
from 24 segments can be made longer so that it consists of more segments.
In particular, there is a class of magic numbers for which one can find
the same type of folded structure of the magic snake chain. It is given by the
numbers:
N_m = 24 × m for m = 1, 3, 5, .... (102)
I call this class of the magic snake chains the inflated magic snake chains.
Fig. 23. Inflation scheme of the magic snake chain model. If the unit segment is inflated to the unit segment with m (= 1, 3, 5, ...) triangular segments, then the inflated magic snake chain realizes the same folded structure as that of the original magic snake chain.
Fig. 24. The compact folded structure of the inflated magic snake chain with 72 segments. The case of helicity S = 0 is shown.
For example, consider the case of m = 3. The inflated magic snake chain is
constructed from 24 inflated segments, each of which consists of three segments
(Fig. 23). Hence, the total number of segments in this magic snake chain is 72. The folded
structure is drawn in Fig. 24 (see Ref. 19). Then, I find that the total number of all
configurations of this inflated magic snake chain is 4^{71} ≈ 5.6 × 10^{42}. If I follow the
argument of the model Levinthal paradox in Sec. 3, then I have to conclude that, generally
speaking, it is impossible to find a compact folded structure in a practical
time by random searching. This is true even with a high performance
supercomputer: even if the supercomputer runs as fast as 10^{-12} s per step (i.e. 1
teraflops), the search needs about 10^{-12} × 10^{42} = 10^{30} s ≈ 10^{22} yr. Nevertheless, I can
find such a folded structure without searching the whole configuration space.
I obtain the above class of folded structures once I inflate the system
by the inflation scheme for the constituent segments in the chain, from the smaller
structure to the larger structures. This procedure saves a huge amount of time in finding
a folded structure. Hence, I conjecture
that this kind of inflation of the unit segments in the chain structure can be a clue
to understanding the evolution of the protein structure or the protein architecture.
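The order-of-magnitude estimate above can be reproduced with a short calculation. The following is a minimal sketch and not part of the original analysis. It assumes four states per joint, so that a chain of N segments has 4^(N-1) configurations (consistent with the count 4^71 quoted for 72 segments), and one configuration tested every 10^-12 s. The function names and the Julian-year conversion are illustrative choices.

```python
# Sketch of the model Levinthal estimate for inflated magic snake chains.
# Assumptions (labeled, not taken verbatim from the paper):
#   - a chain of n segments has 4**(n - 1) configurations,
#   - a random search tests one configuration every 1e-12 s.

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, illustrative choice


def magic_number(m: int) -> int:
    """Number of segments N_m = 24 * m of the inflated chain, m odd (Eq. 102)."""
    if m < 1 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    return 24 * m


def configuration_count(n_segments: int) -> int:
    """Total number of chain configurations, assuming 4 states per joint."""
    return 4 ** (n_segments - 1)


def exhaustive_search_years(n_segments: int, seconds_per_step: float = 1e-12) -> float:
    """Time in years needed to visit every configuration exactly once."""
    return configuration_count(n_segments) * seconds_per_step / SECONDS_PER_YEAR


if __name__ == "__main__":
    for m in (1, 3, 5):
        n = magic_number(m)
        print(f"m={m}: N={n}, configurations={configuration_count(n):.2e}, "
              f"search time={exhaustive_search_years(n):.2e} yr")
```

Without rounding 5.6 × 10^42 down to 10^42, the search time for m = 3 comes out closer to 10^23 yr. Either way the time is astronomically long, and it grows by a factor of 4^48 with each further step of inflation.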
From this point of view, there seems to exist a formal analogy between natural
language and the PF problem. Let us suppose that one writes a paper of 10^3 letters.
If one were to search the whole configuration space of 26^{1000} ≈ 10^{1415} texts, the
attempt would be meaningless, since it could never be completed. Therefore, one never
searches like this. Instead, one first uses a dictionary of words (i.e. a finite set of
words), where words are constructed from a finite number of letters. Second, one
combines the words to make a set of sentences. Third, one combines the sentences
to make a set of paragraphs. Finally, one arranges the paragraphs to make a paper,
and so forth.
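The size of the letter-by-letter search space is easy to check numerically. This is a minimal sketch, assuming a 26-letter alphabet and a text of 1000 letters as in the example above; the function name is illustrative.

```python
import math


def log10_search_space(alphabet_size: int, n_letters: int) -> float:
    """Return log10 of alphabet_size ** n_letters, the number of possible texts."""
    return n_letters * math.log10(alphabet_size)


if __name__ == "__main__":
    exponent = log10_search_space(26, 1000)
    print(f"26^1000 is about 10^{exponent:.0f}")
```

Working with the logarithm keeps the estimate within floating-point range. As a cross-check, Python's exact integers give the digit count of 26 ** 1000 directly through len(str(26 ** 1000)).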
In the PF problem, the situation is almost the same. If one searches the entire
configuration space of the structures, then it is a hopeless procedure.
Instead, one first searches for structures of small segments constructed by a
combination of the unit segments, which are modules (i.e. the secondary structure).
A module therefore corresponds to a word in language. Second, one uses the
modules to make a domain structure (i.e. the tertiary structure). A domain therefore
corresponds to a sentence in language. Third, one combines the domains to make a
three-dimensional structure (i.e. the quaternary structure). A quaternary
structure therefore corresponds to a paragraph in language.
In this way, the formal analogy between language and the PF problem is
established. This is shown in Fig. 25. Although the correspondence is only formal as
a physical problem, the two are almost the same problem in the mathematical sense
that an exhaustive search of the configuration space can never be completed in either
case. Thus, the above analogy is more than accidental. Hence, I speculate that
this analogy will play an important role in understanding the real PF problem.
Fig. 25. Formal analogy between natural language and the protein folding problem.
14. Conclusion
In conclusion, I have discussed a toy model of Rubik's magic snake in order to
elucidate the conceptual framework of the PF problem. Even in this model, there
are many interesting problems such as the model Levinthal paradox, the nonunique
folded structure, the construction of a folded structure from its modules, the function
of the chain, and the dual model and inflation of the magic snake chain. I have
introduced the model Hamiltonian to discuss the ground state energy of the system.
The ground state is highly degenerate for the particular sequence of Eq. (34) as a
consequence of the high degree of commensurability, and this degeneracy is removed by an
arbitrary sequence, which breaks the symmetry and leads to a unique ground state energy.
This type of argument may be useful for further investigation of the intriguing nature of
the PF.
Acknowledgments
I would like to thank Prof. Mitiko Go, Prof. Satoshi Takahashi and Prof. Chao Tang
for sending me their recent works. I also thank Prof. Mitiko Go and Prof. Satoshi
Takahashi for useful discussions and Kazuko Iguchi for continuous support and
encouragement. This work is partially supported by The Mitsubishi Foundation.
References
1. T. E. Creighton, Proteins (Freeman, New York, 1993); G. E. Schulz and R. H. Schirmer, Principles of Protein Structure (Springer-Verlag, New York, 1979).
2. C. J. Epstein, R. F. Goldberger and C. B. Anfinsen, Cold Spring Harbor Symp. Quant. Biol. 28, 439 (1963); C. B. Anfinsen, Science 181, 223 (1973).
3. C. Levinthal, J. Chim. Phys. 65, 44 (1968).
4. H. Frauenfelder, K. Chu and R. Philipp, "Physics From Proteins" in Biologically Inspired Physics, ed. L. Peliti (Plenum Press, New York, 1991).
5. H. S. Chan and K. A. Dill, Physics Today 24 (1993), and references therein.
6. S. Takahashi, S.-R. Yeh, T. K. Das, C.-K. Chan, D. S. Gottfried and D. L. Rousseau, Nature Struc. Biology 4, 45 (1997); S.-R. Yeh, S. Takahashi, B. Fan and D. L. Rousseau, ibid. 4, 51 (1997); M. M. Millonas and D. A. Hanck, Phys. Rev. Lett. 80, 401 (1998).
7. G. Williams and D. C. Watts, Trans. Faraday Soc. 66, 80 (1970); M. F. Schlesinger and E. W. Montroll, Proc. Natl. Acad. Sci. USA 81, 1280 (1984).
8. J. U. Bowie, R. Luty and D. Eisenberg, Science 253, 164 (1991); M. Levitt and C. Chothia, Nature 261, 552 (1992); J. U. Bowie and D. Eisenberg, Curr. Opin. Struc. Biol. 3, 437 (1993); M. Wilmanns and D. Eisenberg, Proc. Natl. Acad. Sci. USA 90, 1379 (1993).
9. A. Sali, E. I. Shakhnovich and M. Karplus, Nature 369, 248 (1994); E. I. Shakhnovich, Phys. Rev. Lett. 72, 3907 (1994); P. G. Wolynes, J. N. Onuchic and D. Thirumalai, Science 267, 1619 (1995); J. Wang, J. Onuchic and P. Wolynes, Phys. Rev. Lett. 76, 4861 (1996); P.-A. Lindgard and H. Bohr, ibid. 77, 779 (1996); H. Li, R. Helling, C. Tang and N. S. Wingreen, Science 273, 666 (1996); H. Li, C. Tang and N. S. Wingreen, ibid. 79, 765 (1997); T. Haliloglu, I. Bahar and B. Erman, ibid. 79, 3090 (1997); H. J. Bussemaker, D. Thirumalai and J. K. Bhattacharjee, ibid. 79, 3530 (1997); E. D. Nelson, L. F. Teneyck and J. N. Onuchic, ibid. 79, 3534 (1997); C. Micheletti, F. Seno, A. Maritan and J. R. Banavar, ibid. 80, 2237 (1998).
10. This is a toy for kids, invented by Rubik, who also created the famous Rubik cube. E. Rubik, Magic Snake (Tsukuda Original, Tokyo, 1996). The latest version of this toy has recently become available: Poki Poki Magic Snake (Tsukuda Original, Tokyo, 1998).
11. K. Iguchi, Mod. Phys. Lett. B12, 499 (1998).
12. C. Kittel, Introduction to Solid State Physics, 7th edition (Wiley, New York, 1996); N. W. Ashcroft and N. D. Mermin, Solid State Physics (Saunders College, New York, 1976).
13. P. J. Steinhardt and S. Ostlund, The Physics of Quasicrystals (World Scientific, Singapore, 1987).
14. D. E. Ingber, Scientific American 278, 30 (1998). One of the simplest tensegrity models has been sold as a toy for babies (called Squwish) from the Boston Museum.
15. This statement is valid only if the system is regarded as a classical object. If one considers the quantum mechanical characters of the model, such as the electronic spectrum and vibrations, then it becomes false, because the folded structure for each helicity provides different eigenvalues. This will be discussed later.
16. M. Go, Nature 290, 90 (1981); Proc. Natl. Acad. Sci. (USA) 80, 1964 (1983); M. Go and M. Nosaka, Cold Spring Harb. Symp. Quant. Biol. 52, 915 (1987); C. Titiger, S. Whyard and V. K. Walker, Nature 361, 470 (1993).
17. For example, see M. Nakahara, Geometry, Topology and Physics (Adam Hilger, New York, 1990).
18. The relation I_m^{unfold} > I_m^{fold} is obvious from the geometry of the unfolded and the folded structures of the magic snake chain. If I explicitly calculate them, then I get I_m^{unfold} = 330 I_0 and I_m^{fold} ≈ 10.8 I_0 with I_0 = ma^2, where a is the side length of a square surface and m the mass of a segment. Hence, I_m^{unfold} > I_m^{fold}. The difference ΔE_{el} ≡ E_{el}^{unfold} - E_{el}^{S=0,±1} will be discussed in the later sections.
19. I would like to comment here that the structure shown in Fig. 24 looks very similar to the three-dimensional structure of insulin: both structures exhibit a single three-fold symmetry axis. This is a remarkable coincidence between the folded structure of Rubik's magic snake model and a real globular protein structure. There is also a striking similarity between the former and the structure of cytochrome c. See Protein Data Bank (www.pdb.bnl.gov). Hence, this shows a kind of reality of our mathematical approach to the problem of the real PF. On the other hand, if one uses the standard lattice models of Ref. 9, then one is unable to predict such physically meaningful three-dimensional structures.