Post on 21-Aug-2020
NOTICE: this is the author's version of a work that was accepted for publication in Methods.
Changes resulting from the publishing process, such as peer review, editing, corrections,
structural formatting, and other quality control mechanisms may not be reflected in this
document. Changes may have been made to this work since it was submitted for publication.
A definitive version will be published as DOI: 10.1016/j.ymeth.2009.04.004.
Constraint counting on RNA structures:
Linking flexibility and function
Simone Fulle1, Holger Gohlke1,2*
1Department of Biological Sciences, Molecular Bioinformatics Group,
Goethe-University, Frankfurt, Germany
2Department of Mathematics and Natural Sciences, Institute of Pharmaceutical and
Medicinal Chemistry,
Heinrich-Heine-University, Düsseldorf, Germany
Running title head: Constraint counting on RNA structures
* Universitätsstr. 1, 40225 Düsseldorf, Germany. Phone: (+49) 211 81 13662. Fax: (+49)
211 81 13847. E-mail: gohlke@uni-duesseldorf.de.
Constraint counting on RNA structures – S. Fulle, H. Gohlke 2
Abstract
RNA structures are highly flexible biomolecules that can undergo dramatic
conformational changes required to fulfill their diverse functional roles. Constraint counting
on a topological network representation of an RNA structure can very efficiently provide
detailed insights into the intrinsic flexibility characteristics of the biomolecule. In the
network, vertices represent atoms and edges represent covalent and strong non-covalent
bonds as well as angle constraints. The method was initially applied successfully to identify
rigid and flexible regions in proteins. Here, we present recent progress in extending the
approach to RNA structures. As a case study, we analyze stability characteristics of the
ribosomal exit tunnel and relate these findings to the tunnel’s active role in co-translational
processes.
Keywords: flexibility/rigidity, topological network representation, ribosome, exit tunnel,
antibiotics
Introduction
Precise knowledge about what can move and how provides important insights into the
physical basis of the function of biomolecules. This holds true particularly for RNA
structures, which take on diverse roles in the cell, ranging from transfer of the genetic code to
regulation via RNA interference or riboswitches to catalysis of chemical reactions. To
achieve these diverse functional roles, RNA structures undergo many motional modes,
spanning a large range of amplitudes and timescales and often triggered by external factors
(1): small changes such as shifts of a few nucleotides are observed upon binding of
aminoglycoside antibiotics to the aminoacyl-tRNA site in the ribosomal RNA (2), whereas
binding of argininamide to HIV-1 TAR RNA leads to large relative movements of both
helical domains (3, 4).
Determining RNA structures, e.g., by X-ray crystallography, provides us with static
snapshots along conformational transitions, whereas the underlying dynamical processes
remain largely unclear. Cryo electron microscopy (cryo-EM) combined with single-particle
reconstruction allows for the visualization of even large structures at different transition
states, along which the underlying transitions can be inferred (5). By fitting atomic models
from X-ray structures into cryo-EM maps, these analyses can nowadays lead to insights in
atomic detail. Nevertheless, the main sources of information about the dynamics of RNA structures are still
crystallographic B-values, atomic fluctuations derived from NMR structural ensembles,
NMR relaxation measurements, and residual dipolar couplings (6, 7).
Alternatively, Molecular Dynamics (MD) simulations are widely applied to supplement
our understanding of RNA structure and dynamics in atomic detail (8-10). Unfortunately, the
simulations are still too computationally expensive to investigate RNA structures on a routine
basis for simulation times beyond the range of hundreds of nanoseconds. Consequently, more
efficient alternatives have been developed, which allow for simulations of biomolecules
within some hours or days, exploring large scale motions occurring over long timescales,
and/or investigating biomolecules as large as the ribosomal complex (11-14). One intriguing
example along these lines is the ratchet-like motion of the 70S ribosome observed in cryo-
EM experiments (15), which could be successfully detected by different elastic network
normal mode analyses (11, 13, 14).
In normal mode analysis (NMA) only a few of the lowest energy vibrational modes are
sufficient to represent collective movements of biomolecules (16), and classical all-atom
NMA methods can be considered reliable for describing internal motions of RNA structures
(17). However, as NMA approximates the potential energy landscape of a molecule by a
harmonic function, the description of motion is limited to those movements that occur in the
vicinity of a structure located at an energy minimum. In recent years, a number of other mode
analysis approaches have been introduced to describe the internal motions of RNA structures,
each showing a different compromise between computational cost and level of detail in the
calculation. Among the most widely used ones are approaches based on coarse-grained
network models of the RNA structure: the Gaussian Network Model (18, 19) and the
Anisotropic Network Model (20, 21). Although these models do not provide insights into
dynamical aspects in atomic detail, a comprehensive evaluation of the Gaussian Network
Model approach reveals that predicted fluctuations at the nucleotide level agree well with
experimental data (19, 22). On the contrary, the Anisotropic Network Model has been shown
to perform unsatisfactorily for predicting directions and magnitudes of motions of loosely
packed molecular systems, as given for most RNA structures (17).
A contact map is another network representation of a biomolecule structure, where
nucleotides and amino acids are represented as nodes and inter-residue contacts as edges. By
investigating characteristic network parameters of contact maps, functional sites in RNA
structures can be identified. As an example, the peptidyl transferase center (PTC), the A-site,
and the exit tunnel of the ribosome could be identified based on centrality measures of the
ribosomal contact map (23). Furthermore, this approach also distinguished mutations that
strongly affect ribosome function and assembly from mild mutations (23).
In this review, we focus on recent developments in the field of rigidity theory to
determine flexible and rigid regions of RNA and RNA-protein structures (24). For this,
constraint counting is applied to a topological network representation of the biomolecules.
The approach usually takes a few seconds so that it is also very efficiently applicable to large
macromolecules such as the ribosomal complex (11, 25). Rigid regions are those parts of a
molecule that have a well-defined equilibrium structure and are expected to move as a rigid
body with six degrees of freedom. Thus, no relative motion is allowed within rigid regions. In
turn, flexible regions are hinge regions of the molecule where bond-rotational motions can
occur without a high cost of energy.
In the following, we will first describe the constraint counting theory and then present a
new topological network representation of RNA structures that reliably captures RNA
flexibility/rigidity. As an application example, we will analyze stability characteristics of the
ribosomal exit tunnel and relate these findings to the tunnel’s active role in co-translational
processes.
Determining RNA flexibility by constraint counting
In Figure 1, a workflow of a flexibility analysis is shown for HIV-1 TAR RNA. For the
analysis, a single, static 3D structure of RNA is required as input, e.g., as obtained from the
Protein Data Bank (PDB) or the Nucleic Acid Database (NDB). For a full atomic
representation, missing hydrogen atoms have to be added. Afterwards, the biomolecule is
modeled as a topological network, where vertices (joints) represent atoms and edges (struts)
represent covalent and non-covalent bond constraints (strong hydrogen bonds, salt bridges,
and hydrophobic interactions) as well as angular constraints. In contrast to force fields, where
forces are of varying strength, in a topological network representation a constraint is "all-or-
nothing". Thus, sufficiently strong forces, which are included in the network, need to be
distinguished from weaker ones, which are omitted. Modeling covalent constraints is
straightforward in that respect. As the flexibility of RNA structures strongly depends on non-
covalent interactions, however, appropriately modeling these constraints is crucial for
correctly predicting RNA rigidity and flexibility (24).
Modeling of a topological network representation of RNA
The influence of non-covalent constraints on the network rigidity of RNA structures
can be understood in the following way. In 3-space, a structure consisting of n atoms has 3n
degrees of freedom, six of which describe the rotational and translational rigid body motions.
The flexibility of the structure is determined by the number of independent internal degrees
of freedom dof, which is given by subtracting six global degrees of freedom and the number
of independent constraints C from the overall number of degrees of freedom (Eq. 1).
\[ \mathrm{dof} = 3n - 6 - C \qquad \text{(Eq. 1)} \]
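As a rough illustration of Eq. 1, the count can be carried out by hand for toy networks. The following minimal sketch is not part of any published software; the function name `internal_dof` is hypothetical, and the number of independent constraints C is assumed to be known in advance (identifying which constraints are independent is the actual task of the pebble game described below in the text).

```python
def internal_dof(n_atoms: int, n_independent_constraints: int) -> int:
    """Independent internal degrees of freedom according to Eq. 1.

    dof = 3n - 6 - C, where the six global rigid-body motions
    (three translations, three rotations) are subtracted.
    """
    if n_atoms < 3:
        raise ValueError("the count assumes at least three non-collinear atoms")
    dof = 3 * n_atoms - 6 - n_independent_constraints
    if dof < 0:
        raise ValueError("more constraints given than can be independent")
    return dof


# Three atoms joined by three distance constraints form a rigid triangle.
print(internal_dof(3, 3))  # 0
# Four atoms with five distance constraints: two triangles sharing an edge,
# which leaves one hinge rotation about the shared edge.
print(internal_dof(4, 5))  # 1
```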
Hydrogen bonds and salt bridges are included as distance and angular constraints
between the hydrogen and the acceptor atom as well as two neighboring atoms, thereby
removing three degrees of freedom from the network (Figure 2) (26). Hydrogen bonds are
included depending on their geometry and interaction energy. For this, potential hydrogen
bonds are ranked according to an energy function that takes into account the hybridization
state of donor and acceptor atoms as well as their mutual orientation (26). By tuning the
energy threshold EHB strong hydrogen bonds can be distinguished from weaker ones.
Choosing EHB = -0.6 kcal/mol corresponds to the thermal energy at room temperature and so
provides a natural choice (26). EHB values of -1.0 kcal/mol have also been reported in the
literature, resulting in more flexible networks (27, 28).
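The effect of the energy threshold can be sketched with a simple filter. This is only an illustration under stated assumptions: the `HBond` record, the atom labels, and the energies are invented for the example, the energy function itself is not reproduced here, and whether a bond exactly at the cutoff is kept is an arbitrary choice of this sketch.

```python
from typing import NamedTuple


class HBond(NamedTuple):
    hydrogen: str   # illustrative atom label, not taken from a real structure file
    acceptor: str   # illustrative atom label
    energy: float   # kcal/mol; more negative means stronger


def select_hbonds(hbonds, e_hb=-0.6):
    """Keep hydrogen bonds whose energy is at or below the cutoff e_hb."""
    return [hb for hb in hbonds if hb.energy <= e_hb]


# Invented energies, chosen only to show the effect of the cutoff.
candidates = [
    HBond("G17:H1", "C45:N3", -4.1),
    HBond("A22:H61", "U40:O4", -2.5),
    HBond("U23:H3", "A27:N7", -0.8),
    HBond("C24:H41", "G26:O6", -0.3),
]

print(len(select_hbonds(candidates, e_hb=-0.6)))  # 3
print(len(select_hbonds(candidates, e_hb=-1.0)))  # 2
```

With the stricter cutoff of -1.0 kcal/mol fewer hydrogen bonds enter the network as constraints, which is why the resulting network is more flexible.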
Hydrophobic interactions are considered between a pair of carbon and/or sulfur atoms
if the distance between the atoms is smaller than the sum of the van der Waals radii (1.7 Å for
carbon, 1.8 Å for sulfur) plus a variable threshold. The threshold value is set to 0.25 Å in the
case of the protein parametrization (29), resulting in a distance cutoff DHC = 3.65 Å for two
carbon atoms. Hydrophobic interactions are modeled such that two degrees of freedom are
removed from the network. This is supposed to mimic a less geometrically restrained
interaction compared to a hydrogen bond.
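The distance criterion amounts to simple arithmetic, which the following sketch reproduces. The van der Waals radii and the 0.25 Å threshold are the values quoted above; the function names are hypothetical, and the cutoff is rounded to two decimals only to avoid floating-point noise.

```python
VDW_RADIUS = {"C": 1.7, "S": 1.8}  # Å, values quoted in the text


def hydrophobic_cutoff(elem_a: str, elem_b: str, threshold: float = 0.25) -> float:
    """Sum of the two van der Waals radii plus the variable threshold (Å)."""
    return round(VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] + threshold, 2)


def is_hydrophobic_contact(elem_a: str, elem_b: str, distance: float,
                           threshold: float = 0.25) -> bool:
    """True if the interatomic distance is smaller than the cutoff."""
    return distance < hydrophobic_cutoff(elem_a, elem_b, threshold)


print(hydrophobic_cutoff("C", "C"))            # 3.65
print(hydrophobic_cutoff("C", "S"))            # 3.75
print(is_hydrophobic_contact("C", "C", 3.60))  # True
print(is_hydrophobic_contact("C", "C", 3.70))  # False
```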
Finally, interactions with divalent ions such as Mg2+ are known to affect the
conformational flexibility of RNA structures (30) and can be included as covalent bonds in
the network, together with surrounding water molecules when available.
Rigid cluster decomposition of RNA
Given a network representation of the RNA structure, the pebble game (31, 32), a fast
combinatorial algorithm, is applied to determine the number and spatial distribution of bond-
rotational degrees of freedom in the network. Based on the accessibility of rotational degrees
of freedom, each bond is identified as either part of a rigid cluster or a flexible link in
between.
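To convey the idea behind the pebble game, the sketch below implements its simplest variant, the two-dimensional (2,3)-pebble game for networks of points connected by distance constraints. This is an assumption made purely for illustration: the three-dimensional bond-bending version applied to biomolecules (31, 32) follows the same principle but uses different bookkeeping, and the function name `pebble_game_2d` is hypothetical. Each vertex owns two pebbles (its two degrees of freedom in the plane); an edge is independent if two free pebbles can be gathered at each of its end points, in which case one pebble is placed on the edge.

```python
def pebble_game_2d(n_vertices, edges):
    """(2,3)-pebble game on a 2D network of points and distance constraints.

    Returns (independent_edges, redundant_edges, free_pebbles).
    """
    pebbles = [2] * n_vertices             # free pebbles per vertex
    out = [[] for _ in range(n_vertices)]  # out[u]: edges covered by a pebble of u

    def find_pebble(root, hold):
        """Try to bring one more free pebble to `root` without using `hold`."""
        seen = {root}
        if hold is not None:
            seen.add(hold)
        parent = {}
        stack = [root]
        while stack:
            x = stack.pop()
            for y in list(out[x]):
                if y in seen:
                    continue
                seen.add(y)
                parent[y] = x
                if pebbles[y] > 0:
                    # y spends a free pebble to cover the edge to its parent;
                    # every edge on the path back to root is re-covered from
                    # its far end, which frees one pebble at root.
                    pebbles[y] -= 1
                    node = y
                    while node != root:
                        p = parent[node]
                        out[p].remove(node)
                        out[node].append(p)
                        node = p
                    pebbles[root] += 1
                    return True
                stack.append(y)
        return False

    independent, redundant = [], []
    for u, v in edges:
        if u == v:
            raise ValueError("self-loops are not meaningful constraints")
        # Gather two pebbles at u, then two at v while holding those at u.
        while pebbles[u] < 2 and find_pebble(u, None):
            pass
        while pebbles[v] < 2 and find_pebble(v, u):
            pass
        if pebbles[u] == 2 and pebbles[v] == 2:
            pebbles[u] -= 1        # place one pebble on the new edge
            out[u].append(v)
            independent.append((u, v))
        else:
            redundant.append((u, v))
    return independent, redundant, sum(pebbles)


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
ind, red, free = pebble_game_2d(4, square)
print(len(ind), len(red), free - 3)   # 4 0 1
ind, red, free = pebble_game_2d(4, square + [(0, 2), (1, 3)])
print(len(ind), len(red), free - 3)   # 5 1 0
```

In the plane, three of the remaining free pebbles always correspond to the global rigid-body motions, so `free - 3` is the number of internal degrees of freedom: the open square retains one (it can shear), whereas the doubly braced square is rigid and contains one redundant constraint.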
As an example, a rigid cluster decomposition is given for unbound HIV-1 TAR-RNA
(Figure 1). Two larger rigid clusters are identified, located at the lower stem (comprising
nucleotides G17-G21 and C41-C45, colored in blue) and upper stem (comprising nucleotides
G26-G28 and C37-C39, colored in red). On the contrary, the bulge and the loop regions are
identified as flexible regions. Note that flexibility and rigidity are static properties and only
determine the possibility of motion, where nothing actually moves. Constraint counting thus
says nothing about the directions and magnitudes of existing motions (33). Yet, identifying
rigid and flexible regions already gives insights into the location of possible motions of the
biomolecule. Indeed, the rigid cluster decomposition is in agreement with experimental
findings that in the free conformation of TAR-RNA, the two stable helical stems collectively
undergo a large-amplitude, hinge-like motion around the flexible bulge region (34-36).
---Figure 1---
---Figure 2---
While the decomposition into rigid clusters and flexible regions only provides a
qualitative picture, a continuous quantitative measure is given by a flexibility index f_i defined
for each covalent bond i (Eq. 2) (26).
\[
f_i =
\begin{cases}
\dfrac{F_j}{H_j} & \text{in an underconstrained region } j \\
0 & \text{in an isostatically rigid cluster} \\
-\dfrac{R_k}{C_k} & \text{in an overconstrained region } k
\end{cases}
\qquad \text{(Eq. 2)}
\]
A constraint in the network is considered to be independent, if breaking it affects the
flexibility of the network. In contrast, a constraint is redundant if it can be removed without
influencing the network rigidity. In underconstrained (flexible) regions j, the flexibility index
f_i relates the number of independent internal degrees of freedom (F_j) to the number of
potentially rotatable bonds (H_j); in overconstrained regions k, the number of redundant bonds
(R_k) is related to the number of constraints (C_k). If there are as many internal degrees of
freedom as there are constraints, the region is isostatically rigid. The flexibility index ranges
from -1 to 1, with negative values in rigid regions and positive values in flexible ones.
Overall, the index allows quantifying how much more flexible an underconstrained region is
compared to a minimally rigid region or how much more stable an overconstrained region is
(37).
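The three cases of the flexibility index can be written as a small function. This is a sketch under stated assumptions: the region labels and the function name are hypothetical, and the counts F_j, H_j, R_k, and C_k are assumed to be available from a preceding rigidity analysis.

```python
def flexibility_index(region: str, *, dof: int = 0, rotatable_bonds: int = 0,
                      redundant: int = 0, constraints: int = 0) -> float:
    """Flexibility index of a bond, given the type of region it belongs to.

    underconstrained: F_j / H_j    (dof / rotatable_bonds), positive
    isostatic:        0
    overconstrained:  -R_k / C_k   (-redundant / constraints), negative
    """
    if region == "underconstrained":
        return dof / rotatable_bonds
    if region == "isostatic":
        return 0.0
    if region == "overconstrained":
        return -redundant / constraints
    raise ValueError(f"unknown region type: {region!r}")


# A flexible region with 2 internal degrees of freedom spread over 8 rotatable bonds.
print(flexibility_index("underconstrained", dof=2, rotatable_bonds=8))   # 0.25
# A rigid region with 3 redundant bonds among 12 constraints.
print(flexibility_index("overconstrained", redundant=3, constraints=12)) # -0.25
```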
Predicting RNA mobility using constrained geometrical simulations
To determine actual motion and its amplitude requires introducing a kinematics that
produces real movements. The computed decomposition of the macromolecular structure into
rigid and flexible regions can be used in a subsequent step as input for coarse-grained
simulations (27, 28, 33), which explore the molecule’s mobility. For example, the constrained
geometrical simulation (CGS) approach FRODA (28) (Framework Rigidity Optimized
Dynamic Algorithm) moves flexible parts of a molecule through stereochemically allowed
regions of conformational space using random Brownian type (Monte Carlo) dynamics,
whereas atoms in rigid clusters are moved collectively. Figure 3 shows an ensemble of RNA
conformers generated by a FRODA simulation within a few hours of computational time in
comparison with the conformational space of the HIV-1 TAR-RNA spanned by the NMR
ensemble.
---Figure 3---
A topological network representation for analyzing flexibility characteristics of RNA
structures
The approach, implemented in the FIRST (Floppy Inclusion and Rigid
Substructure Topology) software package (26), was initially applied successfully to
proteins (26, 29, 37-40). Although it is straightforward to investigate RNA structures based on
the same flexibility and rigidity concepts applied to proteins, one needs to keep in mind that
both systems have different structural features. The dense packing in the core of proteins is
predominantly determined by interactions of hydrophobic side chains, and proteins are
generally globular (41). In contrast, RNA structures are elongated and more loosely packed
than proteins. In addition, they are stabilized mainly by hydrogen bonds and base stacking
interactions (17, 41). Thus, a network representation that has been developed for proteins
may not be appropriate for RNA systems. Indeed, we showed that a protein-based
parameterization does not capture the flexibility characteristics of RNA structures satisfactorily
but in general renders RNA structures too rigid (24).
In order to get a better understanding of the scope and limitations of the present
approach for RNA structures, we thoroughly tested different criteria to include hydrophobic
interactions and hydrogen bonds in a topological network representation of RNA structures
(24). Starting out by analyzing the network rigidity of a canonical A-form RNA, it became
obvious that it is the inclusion of hydrophobic contacts into the RNA topological network that
is crucial for an accurate flexibility prediction and that the number of contacts between
adjacent bases needs to be limited in order to capture the flexibility characteristics of RNA
reliably. The criteria to include non-covalent constraints were then adjusted based on
comparing results from constraint counting with crystallographic B-values of a tRNA(Asp)
structure. In addition, conformational variabilities of NMR-derived ensembles of RNA
structures were compared with atomic fluctuations determined from FRODA-generated
ensembles (24). As the final RNA parameterization, I) the number of hydrophobic
interactions for base stacking is limited to one to prevent excessive hydrophobic contacts
between sequentially adjacent bases, II) the distance threshold up to which hydrophobic
contacts are included between bases is set to DHC = 3.55 Å between two carbon atoms, and
III) an energy cutoff for the inclusion of hydrogen bonds of EHB = -1.0 kcal/mol is used.
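The three settings can be collected in a small configuration, together with a filter that applies the first two to a list of candidate hydrophobic contacts. This is a sketch under loudly stated assumptions: residues are identified by consecutive integer sequence numbers within one chain, the single contact retained between sequentially adjacent bases is taken to be the shortest one (the text above does not specify which contact is kept), and all names are hypothetical.

```python
RNA_PARAMETERS = {
    "max_stacking_contacts": 1,   # I)   contacts between sequentially adjacent bases
    "d_hc": 3.55,                 # II)  Å, carbon-carbon distance cutoff
    "e_hb": -1.0,                 # III) kcal/mol, hydrogen bond energy cutoff
}


def filter_hydrophobic_contacts(contacts, d_hc=3.55, max_stacking=1):
    """Select hydrophobic contacts between bases.

    contacts: iterable of (residue_i, residue_j, distance_in_angstrom).
    Contacts at or beyond d_hc are dropped; for sequentially adjacent
    residues at most `max_stacking` contacts are kept, shortest first
    (assumption of this sketch).
    """
    kept = []
    per_pair = {}
    for res_i, res_j, dist in sorted(contacts, key=lambda c: c[2]):
        if dist >= d_hc:
            continue
        if abs(res_i - res_j) == 1:
            pair = (min(res_i, res_j), max(res_i, res_j))
            if per_pair.get(pair, 0) >= max_stacking:
                continue
            per_pair[pair] = per_pair.get(pair, 0) + 1
        kept.append((res_i, res_j, dist))
    return kept


# Invented candidate contacts for illustration.
candidates = [(1, 2, 3.40), (1, 2, 3.50), (1, 2, 3.30),
              (2, 3, 3.60), (1, 5, 3.50), (1, 5, 3.45)]
print(filter_hydrophobic_contacts(candidates))
# [(1, 2, 3.3), (1, 5, 3.45), (1, 5, 3.5)]
```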
With these settings, flexibility predictions on RNA demonstrate good agreement both
qualitatively and quantitatively with experimental mobility data (24). For example, constraint
counting now identifies those nucleotides (U8 and U48, G26 and G45) in a tRNA(Asp)
structure as flexible that are known to function as hinge regions (24). Likewise, convincing
results are found for mobility predictions from constrained geometric simulations
on these networks using FRODA. As an example, Figure 4 shows a comparison of the
conformational variability of an NMR ensemble (PDB code 1ANR) with atomic fluctuations
calculated by FRODA for HIV-1 TAR-RNA, which results in a fair correlation coefficient of
R² = 0.53 (see also Figure 3). Also note the good agreement between experimentally observed
and computed absolute amplitudes of motions, which was achieved without involving any
scaling.
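A comparison of this kind can be sketched as follows. The sketch assumes that conformational variability is measured as the per-atom root-mean-square fluctuation (RMSF) about the mean position of an already superimposed ensemble and that R² denotes the squared Pearson correlation coefficient; both are assumptions of this illustration, as are the function names, and the exact protocol is described in (24).

```python
import math


def rmsf(ensemble):
    """Per-atom RMSF (same length unit as the input coordinates).

    ensemble: list of conformers, each a list of (x, y, z) tuples,
    with identical atom order and already superimposed.
    """
    n_conf = len(ensemble)
    n_atoms = len(ensemble[0])
    values = []
    for a in range(n_atoms):
        mean = [sum(conf[a][k] for conf in ensemble) / n_conf for k in range(3)]
        msd = sum(
            sum((conf[a][k] - mean[k]) ** 2 for k in range(3))
            for conf in ensemble
        ) / n_conf
        values.append(math.sqrt(msd))
    return values


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Two conformers of a two-atom toy system: atom 0 moves, atom 1 is fixed.
toy = [[(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
       [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]
print(rmsf(toy))                             # [1.0, 0.0]
print(r_squared([1, 2, 3], [1, 3, 2]))       # 0.25
```

Because both fluctuation profiles are computed in ångströms, their absolute amplitudes can be compared directly without any scaling factor.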
Finally, the results using the new RNA parametrization were shown to be superior
compared to predictions based on a parametrization used previously by Jernigan and
coworkers (11) or developed for proteins (26, 29): the new RNA parametrization outperforms
the other two in 7 of the 12 tested NMR ensembles (24). When compared to GNM
calculations, a comparable performance is found, with the FRODA-generated ensembles
providing more detail than the coarse-grained GNM (24).
---Figure 4---
General considerations and recommendations on the network representation
The charged character of RNA structures results in a strong solvation and association of
ionic molecules. Both are known to affect the conformational flexibility of RNA structures
(42-44). Especially binding mediated by divalent ions such as Mg2+ should be included as
constraints in a network representation (24). Similarly, structural water molecules may
influence the stability of RNA structures. So far, only when bound to Mg2+ ions, water
molecules were considered as part of the constraint network (24). Other than that, water
molecules have not been included in the constraint counting analysis, mainly because it is
difficult to distinguish tightly bound water molecules from fast-exchanging ones at the RNA
interface. Results from MD simulations can complement structural data in this respect (45).
However, by incorporating data from computationally expensive MD simulations, the
advantage of the highly efficient constraint counting approach with computing times on the
order of seconds even for the large ribosomal subunit would be lost. Encouragingly, previous
findings showed only a negligible difference in the flexibility characteristics of a protein-
protein complex when structural waters were considered (37). In addition, the influence of
solvent on the structural stability is already implicitly considered by including hydrophobic
interactions as constraints into the network (40).
Other improvements of the network representation that can be anticipated include
incorporating repulsion between negatively charged phosphate groups or modeling base
stacking interactions differently depending on the base types and the sequential context (24).
Stacking interactions in general increase in the order pyrimidine-pyrimidine < purine-
pyrimidine < purine-purine bases (46) and are larger for sequences rich in G-C rather than
A-U base pairs (47, 48). Regarding RNA/protein interface regions, which are strongly
stabilized by the formation of hydrogen bond networks and intermolecular hydrophobic cores
(49), we suggest modeling these interactions according to the protein-based parameterization
(24, 25). On the same account, it appears reasonable to model the interface between a ligand
and RNA structures according to the protein-based parameterization. Finally, the quality of
the obtained flexibility characteristics depends on the quality of the experimental structure
used for the network representation, and we recommend using X-ray structures resolved to
< 2.5 Å.
Flexibility characteristics of the ribosomal exit tunnel as a case
study
Constraint counting provides rigidity and flexibility information at various structural
levels: I) flexibility characteristics at the bond level are instructive for analyzing, e.g., binding
site regions; II) flexibility characteristics of larger regions can be related to potential global
conformational changes; III) rigid cluster decompositions provide hints about movements of
structural parts as rigid bodies. In this section, we present constraint counting results on the
large ribosomal subunit to exemplify how rigidity and flexibility information can be used for
understanding biological function from a structural perspective. In particular, we focus on the
ribosomal exit tunnel and its active role in co-translational processes (50-54). The ribosome is
the protein synthesis machinery of the cell. After peptide bond formation at the peptidyl
transferase center (PTC) (55) the nascent polypeptide chain leaves the ribosome via the
ribosomal exit tunnel, which spans the entire large subunit of the ribosome and has a length
of ~80 Å (56). The tunnel surface is composed mainly of 23S rRNA, but non-globular parts
of ribosomal proteins contribute as well (57). The narrowest part of the tunnel is formed by
the proteins L4 and L22, which are highlighted in blue and red, respectively, in Figure 5a.
Global backbone flexibility
The flexibility characteristics of the tunnel obtained from constraint counting are shown
color-coded in Figure 5b. For this, flexibility indices of the backbone regions of ribosomal
RNA and proteins were averaged and subsequently assigned to the phosphorus and Cα-atoms,
respectively (25, 37). Blue color indicates overconstrained regions, red color flexible regions;
minimally rigid regions are colored in green. The constraint counting identifies large parts of
the tunnel neighboring regions as rigid, whereas clusters of flexible tunnel components are
located at the PTC, the tunnel entrance, and the exit region (25). This picture also holds if one
considers the flexibility characteristics of side chains instead (data not shown). For this,
flexibility indices of either the glycosidic bonds in the case of nucleotides or the Cα-Cβ bonds
in the case of amino acids were analyzed.
Overall, the results demonstrate a conserved stable structural environment of the tunnel,
which makes deformations of the tunnel that actively move peptides down the tunnel
unlikely (58). Furthermore, the stable environment rules out that the tunnel can adapt
widely enough to allow tertiary folding of nascent chains (52). Yet, the approach identifies
local zones of flexible nucleotides within the tunnel. Strikingly, these flexible zones agree
with previously identified zones of folding at the secondary level (59).
---Figure 5---
Flexibility characteristics at the bond level
Characterizing the flexibility of specific covalent bonds of antibiotics binding sites
provides hints as to the selectivity of antibiotics binding. Clinically important antibiotics
inhibit the activity of eubacterial ribosomes by binding, e.g., to the active site crevice at the
PTC, where the peptide bond formation occurs and the nascent peptide is released into the
tunnel region. A major part of the crevice is formed by two sequentially adjacent bases that
splay apart such that a wedge-shaped hydrophobic gap in between is formed, which allows
for stabilizing hydrophobic interactions with antibiotics (Figure 6a) (60). Subtle structural
differences within the antibiotics binding pockets of the prokaryotic and eukaryotic
ribosomes (61, 62) are the key to antibiotics selectivity (63). In this regard, the archaeal
H. marismortui ribosome is distinct from eubacterial ones of E. coli, D. radiodurans, and
T. thermophilus as the former possesses typical eukaryotic elements at the principal antibiotic
target sites and requires much higher than clinically relevant antibiotics concentrations for
binding (60, 63-66). Interestingly, such structural differences are also reflected in different
flexibility characteristics of the antibiotic binding crevices. Whereas the glycosidic bonds of
the crevice-forming nucleotides show a dual flexibility character in the case of the
H. marismortui structure (one glycosidic bond is flexible whereas the other one is rigid), the
two glycosidic bonds are found to be structurally stable across all three analyzed eubacterial
structures (25). When comparing the crevice of the H. marismortui structure to corresponding
clefts of the eubacterial structures, a wider active site crevice is found for the latter ones. This
has led to the notion that the active site cleft may always be open in eubacterial ribosomes
(67), which could be the reason why eubacteria are more sensitive to some of the active site
crevice antibiotics than archaea: with the cleft already open, none
of the binding free energy would have to be expended to adopt the appropriate bound conformational
state (67). Our constraint counting results support this notion: eubacteria
would benefit if the open conformation of the active site crevice is structurally stabilized, as
given by overconstrained glycosidic bonds of its nucleotides.
Investigating the flexibility characteristics of specific covalent bonds also provides
insights into the active role of tunnel components in co-translational elongation regulation
(25). Experimental evidence suggests that certain nascent peptide chains can interact with the
tunnel, thereby stalling translation elongation (68). Several models have been proposed to
explain how the constriction point in the tunnel formed by ribosomal proteins L22 and L4 is
involved in this arrest (5, 51, 69, 70). In fact, studies of L22 mutant strains demonstrate this
protein’s active role (50, 69). L22 consists of a globular domain at the subunit surface and a
β-hairpin region that lines part of the tunnel wall. Constraint counting identifies flexible
residues in the otherwise rigid β-hairpin of L22. Encouragingly, the identified flexible
residues lie at the center of two hinge regions that have been inferred from an observed
conformational change of L22 upon binding of troleandomycin in the ribosomal exit tunnel,
too (51). The molecular hinges allow for a conformational change of the loop region towards
the opposite side of the tunnel wall (Figure 6b). In the presence of such a swung
conformation the tunnel voyage of nascent polypeptides would be blocked. The constraint
counting result thus strongly supports the hypothesis that L22 can function as a gate keeper
(51).
---Figure 6---
Collective correlated movements inferred from mechanically coupled rigid clusters
The question that remains is how the ribosome senses the
interaction between the nascent peptide chain and the tunnel. Based on cryo-EM derived
models of E. coli ribosome complexes at different states, a cascade of RNA rearrangements
was proposed that propagate from peptide chain-induced conformational changes of 23S
rRNA bases in the ribosomal exit tunnel through the large ribosomal subunit, influence the
small subunit, and lead to elongation arrest (70).
To investigate this finding we made use of the fact that within a rigid cluster a hierarchy of
regions of varying stability can exist. In order to reveal such a hierarchy, it is possible to
simulate a melting of the constraint network by successively removing hydrogen bonds in the
order of increasing strength. The underlying assumption is that weaker hydrogen bonds break
first if the molecular system is heated. Initially, analyses of this type have been used to
investigate unfolding events within proteins (29). In the case of the large ribosomal subunit,
we used the approach to identify those parts of the subunit that are weakly coupled to the
subunit core (25). Notably, elements that rapidly peel off the large rigid core of the subunit
and form rigid clusters by themselves upon melting agree very well with those elements for
which significant conformational changes have been observed in the above cryo-EM
experiments. This demonstrates that ribosomal elements observed to be mobile are indeed
only weakly coupled to the subunit core. Furthermore, the analysis revealed that, for some of
the cryo-EM proposed rearrangements, information may be transmitted through mechanically
coupled, structurally stable regions from the induced conformational changes within the
tunnel. This transmission resembles a domino effect-like transformation, which, in the case of
the tRNA sites being involved, occurs over a distance > 100 Å.
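The melting procedure described above, i.e., removing hydrogen bonds in the order of increasing strength, can be sketched as a simple generator. This is an illustration only: the function name is hypothetical, the energies are invented, and the rigid cluster decomposition that would be repeated on each diluted network is not included.

```python
def dilution_series(hbond_energies):
    """Yield (removed_energy, remaining_energies), weakest hydrogen bond first.

    Energies are in kcal/mol; a less negative value means a weaker bond.
    """
    remaining = sorted(hbond_energies)   # strongest (most negative) first
    while remaining:
        weakest = remaining.pop()        # last element is the least negative
        yield weakest, list(remaining)


for removed, left in dilution_series([-1.2, -0.7, -3.0]):
    # In a full analysis, the network built from `left` would be
    # decomposed into rigid clusters at this point.
    print(removed, left)
# -0.7 [-3.0, -1.2]
# -1.2 [-3.0]
# -3.0 []
```

Tracking at which step a structural element separates from the largest rigid cluster gives the hierarchy of stability referred to above.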
Conclusion
Constraint counting on a topological network representation of RNA structures
provides a deeper understanding of the flexibility characteristics of RNA down to the atomic
level in a computational time on the order of seconds. Recently, a new parameterization for
the topological network representation has been introduced that reliably captures flexibility
characteristics of RNA structures (24). Compared to classical MD simulations, constraint
counting can be efficiently applied to large biomolecules such as the ribosome (11, 25),
which consists of more than 10^5 atoms.
The internal degrees of freedom within a biomolecular network representation can be
analyzed at different levels of detail. First, flexibility characteristics at the bond level are
instructive for analyzing, e.g., binding site regions; second, flexibility characteristics of larger
regions can be related to potential global movements; finally, rigid cluster decompositions
provide hints about movements of structural parts as rigid bodies. That way, static properties
of a biomolecule can be linked to biological function.
While constraint counting has so far been used primarily in a descriptive manner for RNA,
predictive flexibility analyses have been presented for proteins (40, 71). In those studies,
constraint counting was used to identify structural features that impact the thermostability of
a protein. This information can
be exploited subsequently by pointing to residues that should be varied to obtain a system
with higher thermostability. One can imagine using such an approach likewise for RNA for
predicting the impact of mutations that damage tertiary contacts. Finally, constraint counting
provides a natural coarse-grained representation of a biomolecule, which can be used
subsequently in computer simulations of the motions within the molecule (27, 28, 33).
Acknowledgements
This work was supported by the DFG (SFB 579, "RNA-ligand interactions"), Goethe-
University, Frankfurt, and Heinrich-Heine-University, Düsseldorf. SF acknowledges
financial support from the Hessian Science Program, the Frankfurt International Graduate
School for Science (FIGSS), and the Otto-Stern-School in Frankfurt. HG acknowledges
fruitful discussions at the workshop “Dynamics under constraints II”, McGill University's
Bellairs Research Institute, Barbados, 2007.
Figure captions
Figure 1: Workflow for flexibility prediction of RNA structures based on constraint
counting. A PDB structure is used as input (I.) and modeled as a topological network
representation (II.). Constraints between nearest neighbors are indicated by straight lines,
constraints between next nearest neighbors (angle constraints) by dashed lines. For reasons of
clarity, angle constraints are only indicated in the sugar and base scaffolds, and hydrogen
bonds between bases are omitted. Hydrophobic constraints are indicated by black dashed
lines. Flexible hinges are shown in red, minimally rigid regions in green, and overconstrained
regions in blue. Based on the accessibility of rotational degrees of freedom, each bond is
identified as either part of a rigid region or a flexible link in between. The resulting rigid
cluster decomposition of an unbound HIV-1 TAR RNA structure (PDB code 1ANR) using a
new RNA parametrization (24) is shown in the right panel (III.). Rigid clusters are depicted
as uniformly colored bodies.
Figure 2: Modeling of hydrogen bonds and hydrophobic interactions in a bond-bending
network within protein (a) and RNA (b) structures. Vertices represent atoms and edges
represent covalent and non-covalent constraints. Constraints between next-nearest neighbors
(shown by dashed lines) define coordination angles between bonded atoms. For reasons of
clarity, only angle constraints associated with hydrogen bonds and hydrophobic interactions
are indicated. Hydrogen bonds are modeled by a bond between the hydrogen and the acceptor
atom and two angular constraints associated with these atoms. Hydrophobic interactions are
modeled by a flexible linkage consisting of three additional vertices (pseudoatoms). This
restricts the maximum distance between the two hydrophobic atoms, while allowing them to
slide with respect to one another.
Figure 3: Conformational space of the HIV-1 TAR-RNA spanned by a) the NMR
ensemble (PDB code 1ANR) and b) conformations generated by a FRODA simulation using
the RNA parameterization (24). From the NMR ensemble, the first 10 models of the PDB
entry are shown. From the FRODA simulation, the starting structure (orange) as well as
conformations generated every 2000 steps for a total length of 20 000 steps are shown.
Following the setup of previous MD studies (4, 72), the second model of the NMR ensemble
(orange) was used as a starting structure for the FRODA simulation.
Figure 4: Atomic fluctuations predicted by FRODA simulations vs. conformational
variabilities as measured by NMR (PDB code 1ANR) for HIV-1 TAR-RNA. The alignment
was carried out with respect to all phosphorus atoms of the structure.
Figure 5: a) Atomic structure of the large ribosomal subunit. The narrowest part of the
tunnel is formed by the proteins L4 and L22, which are highlighted in blue and red,
respectively. A nascent peptide (turquoise) is emerging from the tunnel exit. The location of
the PTC is shown, and the binding position of TAO is indicated by a yellow star. b) Color-
coded representation of the flexibility characteristics of the ribosomal exit tunnel obtained by
constraint counting. The coloring of the backbone atoms of the RNA part is according to the
flexibility index of the P atoms and according to the Cα-atoms in the protein part. Blue color
indicates overconstrained regions and red color flexible regions; minimally rigid regions are
colored in green.
Figure 6: a) Binding mode of anisomycin (PDB code 1K73) in the active site crevice of a
H. marismortui 50S structure. A major part of the crevice is formed by two sequentially
adjacent bases EC2451 and EC2452 that splay apart such that a wedge-shaped hydrophobic
gap in between is formed, which allows for stabilizing hydrophobic interactions with the p-
methoxy-phenyl group of anisomycin (60). b) View through the ribosomal exit tunnel. The
tunnel constriction-forming proteins L4 and L22 are colored according to the flexibility
values of the Cα-atoms. Overconstrained regions are indicated by blue color, and flexible
regions are shown in red color. The swung conformation of L22 observed upon binding of
troleandomycin (PDB code 1OND) is shown in turquoise. Constraint counting on the 50S
structure from H. marismortui identified two flexible hinge regions located at residues EC85
and EC93 (E. coli numbering) in the β-hairpin of L22.
Figures
---Figure 1---
---Figure 2---
---Figure 3---
---Figure 4---
---Figure 5---
---Figure 6---
References
[1] H. Al-Hashimi, N. Walter, Current Opinion in Structural Biology 18 (2008) 321-329.
[2] D. Fourmy, S. Yoshizawa, J.D. Puglisi, Journal of Molecular Biology 277 (1998) 333-345.
[3] J.D. Puglisi, R. Tan, B.J. Calnan, A.D. Frankel, J.R. Williamson, Science 257 (1992) 76-80.
[4] R. Nifosi, C.M. Reyes, P.A. Kollman, Nucleic Acids Research 28 (2000) 4944-4955.
[5] K. Mitra, J. Frank, Annual Review of Biophysics and Biomolecular Structure 35 (2006) 299-317.
[6] A. Perez, A. Noy, F. Lankas, F.J. Luque, M. Orozco, Nucleic Acids Research 32 (2004) 6144-6151.
[7] M. Getz, X. Sun, A. Casiano-Negroni, Q. Zhang, H.M. Al-Hashimi, Biopolymers 86 (2007) 384-402.
[8] M. Karplus, J.A. McCammon, Nature Structural Biology 9 (2002) 646-652.
[9] M. Orozco, A. Noy, A. Perez, Current Opinion in Structural Biology 18 (2008) 185-193.
[10] E.S. McDowell, N.a. Spacková, J. Sponer, N. Walter, Biopolymers 85 (2007) 169-184.
[11] Y. Wang, A.J. Rader, I. Bahar, R. Jernigan, Journal of Structural Biology 147 (2004) 302-314.
[12] J. Trylska, V. Tozzini, J.A. McCammon, Biophysical Journal 89 (2005) 1455-1463.
[13] O. Kurkcuoglu, Z. Kurkcuoglu, P. Doruker, R.L. Jernigan, Proteins: Structure, Function, and Bioinformatics (2008) in press.
[14] F. Tama, M. Valle, J. Frank, C.L. Brooks, Proceedings of the National Academy of Sciences of the United States of America 100 (2003) 9319-9323.
[15] J. Frank, R. Agrawal, Nature 406 (2000) 318-322.
[16] Y. Bomble, D. Case, Biopolymers 89 (2008) 722-731.
[17] A.W. Van Wynsberghe, Q. Cui, Biophysical Journal 89 (2005) 2939-2949.
[18] I. Bahar, A. Atilgan, B. Erman, Folding and Design 2 (1997) 173-181.
[19] I. Bahar, R.L. Jernigan, Journal of Molecular Biology 281 (1998) 871-884.
[20] A.R. Atilgan, S.R. Durell, R.L. Jernigan, M.C. Demirel, O. Keskin, I. Bahar, Biophysical Journal 80 (2001) 505-515.
[21] E. Eyal, L.-W. Yang, I. Bahar, Bioinformatics 22 (2006) 2619-2627.
[22] L.W. Yang, A.J. Rader, X. Liu, C.J. Jursa, S.C. Chen, H.A. Karimi, I. Bahar, Nucleic Acids Research 34 (2006) W24-W31.
[23] H. David-Eden, Y. Mandel-Gutfreund, Nucleic Acids Research 36 (2008) 4641-4652.
[24] S. Fulle, H. Gohlke, Biophysical Journal 94 (2008) 4202-4219.
[25] S. Fulle, H. Gohlke, Journal of Molecular Biology 387 (2009) 502-517.
[26] D.J. Jacobs, A.J. Rader, L.A. Kuhn, M.F. Thorpe, Proteins-Structure Function and Bioinformatics 44 (2001) 150-165.
[27] A. Ahmed, H. Gohlke, Proteins-Structure Function and Bioinformatics 63 (2006) 1038-1051.
[28] S. Wells, S. Menor, B. Hespenheide, M.F. Thorpe, Physical Biology 2 (2005) 127-136.
[29] A.J. Rader, B.M. Hespenheide, L.A. Kuhn, M.F. Thorpe, Proceedings of the National Academy of Sciences of the United States of America 99 (2002) 3540-3545.
[30] D.E. Draper, RNA 3 (2004) 335-343.
[31] D. Jacobs, B. Hendrickson, Journal of Computational Physics 137 (1997) 346-365.
[32] D.J. Jacobs, Journal of Physics A: Mathematical and General 31 (1998) 6653-6668.
[33] H. Gohlke, M.F. Thorpe, Biophysical Journal 91 (2006) 2115-2120.
[34] H.M. Al-Hashimi, Y. Gosser, A. Gorin, W. Hu, A. Majumdar, D.J. Patel, Journal of Molecular Biology 315 (2002) 95-102.
[35] Q. Zhang, X. Sun, E.D. Watt, H.M. Al-Hashimi, Science 311 (2006) 653-656.
[36] C. Musselman, H.M. Al-Hashimi, I. Andricioaei, Biophysical Journal 93 (2007) 411-422.
[37] H. Gohlke, L.A. Kuhn, D.A. Case, Proteins-Structure Function and Bioinformatics 56 (2004) 322-337.
[38] B.M. Hespenheide, A.J. Rader, M.F. Thorpe, L.A. Kuhn, Journal of Molecular Graphics and Modelling 21 (2002) 195-207.
[39] A.J. Rader, I. Bahar, Polymer 45 (2004) 659-668.
[40] S. Radestock, H. Gohlke, Engineering in Life Sciences 8 (2008) 507-522.
[41] C. Hyeon, R.I. Dima, D. Thirumalai, Journal of Chemical Physics 125 (2006)
[42] P. Auffinger, E. Westhof, Journal of Molecular Biology 300 (2000) 1113-1131.
[43] P. Auffinger, Y. Hashem, Current Opinion in Structural Biology 17 (2007) 325-333.
[44] P. Auffinger, S. Louise-May, E. Westhof, Biophysical Journal 76 (1999) 50-64.
[45] A.C. Vaiana, E. Westhof, P. Auffinger, Biochimie 88 (2006) 1061-1073.
[46] W. Saenger (1984) Principles of Nucleic Acid Structure, Springer-Verlag, New York.
[47] R.L. Ornstein, R. Rein, D.L. Breen, R.D. Macelroy, Biopolymers 17 (1978) 2341-2360.
[48] J. Gralla, D.M. Crothers, Journal of Molecular Biology 78 (1973) 301-319.
[49] A. Perederina, N. Nevskaya, O. Nikonov, A. Nikulin, P. Dumas, M. Yao, I. Tanaka, M. Garber, G. Gongadze, S. Nikonov, RNA 8 (2002) 1548-1557.
[50] H. Nakatogawa, K. Ito, Cell 108 (2002) 629-636.
[51] R. Berisio, F. Schluenzen, J. Harms, A. Bashan, T. Auerbach, D. Baram, A. Yonath, Nature Structural Biology 10 (2003) 366-370.
[52] R.J. Gilbert, P. Fucini, S. Connell, S.D. Fuller, K.H. Nierhaus, C.V. Robinson, C.M. Dobson, D.I. Stuart, Molecular Cell 14 (2004) 57-66.
[53] C.A. Woolhead, P.J. McCormick, A.E. Johnson, Cell 116 (2004) 725-736.
[54] S. Etchells, U. Hartl, Nature Structural & Molecular Biology 11 (2004) 391-392.
[55] P. Nissen, J. Hansen, N. Ban, P.B. Moore, T.A. Steitz, Science 289 (2000) 920-930.
[56] N.R. Voss, M. Gerstein, T.A. Steitz, P.B. Moore, Journal of Molecular Biology 360 (2006) 893-906.
[57] N. Ban, P. Nissen, J. Hansen, P.B. Moore, T.A. Steitz, Science 289 (2000) 905-920.
[58] P.B. Moore, T.A. Steitz, Annual Review of Biochemistry 72 (2003) 813-850.
[59] J. Lu, C. Deutsch, Nature Structural & Molecular Biology 12 (2005) 1123-1129.
[60] J.L. Hansen, P.B. Moore, T.A. Steitz, Journal of Molecular Biology 330 (2003) 1061-1075.
[61] T. Auerbach, A. Bashan, A. Yonath, Trends in Biotechnology 22 (2004) 570-576.
[62] A. Yonath, A. Bashan, Annual Review of Microbiology 58 (2004) 233-251.
[63] A. Yonath, Molecules and Cells 20 (2005) 1-16.
[64] A.S. Mankin, R.A. Garrett, Journal of Bacteriology 173 (1991) 3559-3563.
[65] J.L. Sanz, I. Marín, D. Ureña, R. Amils, Canadian Journal of Microbiology 39 (1993) 311-317.
[66] J. Hansen, J. Ippolito, N. Ban, P. Nissen, P. Moore, T. Steitz, Molecular Cell 10 (2002) 117-128.
[67] G. Blaha, G. Gürel, S.J. Schroeder, P.B. Moore, T.A. Steitz, Journal of Molecular Biology 379 (2008) 505-519.
[68] T. Tenson, M. Ehrenberg, Cell 108 (2002) 591-594.
[69] C.A. Woolhead, A.E. Johnson, H.D. Bernstein, Molecular Cell 22 (2006) 587-598.
[70] K. Mitra, C. Schaffitzel, F. Fabiola, M.S. Chapman, N. Ban, J. Frank, Molecular Cell 22 (2006) 533-543.
[71] D.R. Livesay, D.J. Jacobs, Proteins-Structure Function and Bioinformatics 62 (2006) 130-143.
[72] Y. Mu, G. Stock, Biophysical Journal 90 (2006) 391-399.