NOTICE: this is the author's version of a work that was accepted for publication in Methods.

Changes resulting from the publishing process, such as peer review, editing, corrections,

structural formatting, and other quality control mechanisms may not be reflected in this

document. Changes may have been made to this work since it was submitted for publication.

A definitive version will be published as DOI: 10.1016/j.ymeth.2009.04.004.

Constraint counting on RNA structures:

Linking flexibility and function

Simone Fulle1, Holger Gohlke1,2*

1Department of Biological Sciences, Molecular Bioinformatics Group,

Goethe-University, Frankfurt, Germany

2Department of Mathematics and Natural Sciences, Institute of Pharmaceutical and

Medicinal Chemistry,

Heinrich-Heine-University, Düsseldorf, Germany

Running title head: Constraint counting on RNA structures

* Universitätsstr. 1, 40225 Düsseldorf, Germany. Phone: (+49) 211 81 13662. Fax: (+49)

211 81 13847. E-mail: [email protected].

Abstract

RNA structures are highly flexible biomolecules that can undergo dramatic

conformational changes required to fulfill their diverse functional roles. Constraint counting

on a topological network representation of an RNA structure can provide very efficiently

detailed insights into the intrinsic flexibility characteristics of the biomolecule. In the

network, vertices represent atoms and edges represent covalent and strong non-covalent

bonds and angle constraints. Initially, the method has been successfully applied to identify

rigid and flexible regions in proteins. Here, we present recent progress in extending the

approach to RNA structures. As a case study, we analyze stability characteristics of the

ribosomal exit tunnel and relate these findings to the tunnel’s active role in co-translational

processes.

Keywords: flexibility/rigidity, topological network representation, ribosome, exit tunnel,

antibiotics


Introduction

Precise knowledge about what can move and how provides important insights into the

physical basis of the function of biomolecules. This holds true particularly for RNA

structures, which take on diverse roles in the cell, ranging from transfer of the genetic code to

regulation via RNA interference or riboswitches to catalysis of chemical reactions. To

achieve these diverse functional roles, RNA structures undergo many motional modes,

spanning a large range of amplitudes and timescales and often triggered by external factors

(1): small changes such as shifts of a few nucleotides are observed upon binding of

aminoglycoside antibiotics to the aminoacyl-tRNA site in the ribosomal RNA (2), whereas

binding of argininamide to HIV-1 TAR RNA leads to large relative movements of both

helical domains (3, 4).

Determining RNA structures, e.g., by X-ray crystallography, provides us with static

snapshots along conformational transitions, whereas the underlying dynamical processes

remain largely unclear. Cryo electron microscopy (cryo-EM) combined with single-particle

reconstruction allows for the visualization of even large structures at different transition

states, along which the underlying transitions can be inferred (5). By fitting atomic models

from X-ray structures into cryo-EM maps, these analyses can nowadays lead to insights in

atomic detail. Nevertheless, the main sources of information about the dynamics of RNA structures are still

crystallographic B-values, atomic fluctuations derived from NMR structural ensembles,

NMR relaxation measurements, and residual dipolar couplings (6, 7).

Alternatively, Molecular Dynamics (MD) simulations are widely applied to supplement

our understanding of RNA structure and dynamics in atomic detail (8-10). Unfortunately, the

simulations are still too computationally expensive to investigate RNA structures on a routine

basis for simulation times beyond the range of hundreds of nanoseconds. Consequently, more

efficient alternatives have been developed, which allow for simulations of biomolecules


within some hours or days, exploring large scale motions occurring over long timescales,

and/or investigating biomolecules as large as the ribosomal complex (11-14). One intriguing

example along these lines is the ratchet-like motion of the 70S ribosome observed in cryo-

EM experiments (15), which could be successfully detected by different elastic network

normal mode analyses (11, 13, 14).

In normal mode analysis (NMA), only a few of the lowest-energy vibrational modes are

sufficient to represent collective movements of biomolecules (16), and classical all-atom

NMA methods can be considered reliable for describing internal motions of RNA structures

(17). However, as NMA approximates the potential energy landscape of a molecule by a

harmonic function, the description of motion is limited to those movements that occur in the

vicinity of a structure located at an energy minimum. In recent years, a number of other mode

analysis approaches have been introduced to describe the internal motions of RNA structures,

each showing a different compromise between computational cost and level of detail in the

calculation. Among the most widely used ones are approaches based on coarse-grained

network models of the RNA structure: the Gaussian Network Model (18, 19) and the

Anisotropic Network Model (20, 21). Although these models do not provide insights into

dynamical aspects in atomic detail, a comprehensive evaluation of the Gaussian Network

Model approach reveals that predicted fluctuations at the nucleotide level agree well with

experimental data (19, 22). In contrast, the Anisotropic Network Model has been shown

to perform unsatisfactorily for predicting directions and magnitudes of motions of loosely

packed molecular systems, as given for most RNA structures (17).

A contact map is another network representation of a biomolecule structure, where

nucleotides and amino acids are represented as nodes and inter-residue contacts as edges. By

investigating characteristic network parameters of contact maps, functional sites in RNA

structures can be identified. As an example, the peptidyl transferase center (PTC), the A-site,


and the exit tunnel of the ribosome could be identified based on centrality measures of the

ribosomal contact map (23). Furthermore, this approach also distinguished mutations that

strongly affect ribosome function and assembly from mild mutations (23).

In this review, we focus on recent developments in the field of rigidity theory to

determine flexible and rigid regions of RNA and RNA-protein structures (24). For this,

constraint counting is applied to a topological network representation of the biomolecules.

The approach usually takes a few seconds, so it can also be applied efficiently to large

macromolecules such as the ribosomal complex (11, 25). Rigid regions are those parts of a

molecule that have a well-defined equilibrium structure and are expected to move as a rigid

body with six degrees of freedom. Thus, no relative motion is allowed within rigid regions. In

turn, flexible regions are hinge regions of the molecule where bond-rotational motions can

occur without a high cost of energy.

In the following, we will first describe the constraint counting theory and then present a

new topological network representation of RNA structures that reliably captures RNA

flexibility/rigidity. As an application example, we will analyze stability characteristics of the

ribosomal exit tunnel and relate these findings to the tunnel’s active role in co-translational

processes.


Determining RNA flexibility by constraint counting

In Figure 1, a workflow of a flexibility analysis is shown for HIV-1 TAR RNA. For the

analysis, a single, static 3D structure of RNA is required as input, e.g., as obtained from the

Protein Data Bank (PDB) or Nucleic Acid Database (NDB). For a full atomic

representation, missing hydrogen atoms have to be added. Afterwards, the biomolecule is

modeled as a topological network, where vertices (joints) represent atoms and edges (struts)

represent covalent and non-covalent bond constraints (strong hydrogen bonds, salt bridges,

and hydrophobic interactions) as well as angular constraints. In contrast to force fields, where

forces are of varying strength, in a topological network representation a constraint is "all-or-

nothing". Thus, sufficiently strong forces, which are included in the network, need to be

distinguished from weaker ones, which are omitted. Modeling covalent constraints is

straightforward in that respect. As the flexibility of RNA structures strongly depends on non-

covalent interactions, however, appropriately modeling these constraints is crucial for

correctly predicting RNA rigidity and flexibility (24).

Modeling of a topological network representation of RNA

The influence of non-covalent constraints on the network rigidity of RNA structures

can be understood in the following way. In 3-space, a structure consisting of n atoms has 3n

degrees of freedom, six of which describe the rotational and translational rigid body motions.

The flexibility of the structure is determined by the number of independent internal degrees

of freedom dof, which is given by subtracting six global degrees of freedom and the number

of independent constraints C from the overall number of degrees of freedom (Eq. 1).

dof = 3n - 6 - C          (Eq. 1)
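
As a minimal sketch of the counting in Eq. 1, the following Python function evaluates the number of internal degrees of freedom for a given atom count and number of independent constraints. The function name and the toy example (a four-atom chain) are illustrative assumptions and are not taken from the FIRST software.

```python
def internal_dof(n_atoms: int, n_independent_constraints: int) -> int:
    """Evaluate Eq. 1, dof = 3n - 6 - C, for a non-linear structure in 3-space."""
    if n_atoms < 3:
        raise ValueError("Eq. 1 assumes at least three non-collinear atoms")
    dof = 3 * n_atoms - 6 - n_independent_constraints
    if dof < 0:
        # C counts independent constraints only, so it can never exceed 3n - 6.
        raise ValueError("more independent constraints than available degrees of freedom")
    return dof


# Toy example (assumption): a chain of four atoms with three bond-length and
# two bond-angle constraints keeps one internal degree of freedom, namely the
# torsion around the central bond.
chain_dof = internal_dof(4, 5)
print(chain_dof)
```

Note that Eq. 1 only gives the global count; where the remaining degrees of freedom are located in the network is not determined by this formula alone.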


Hydrogen bonds and salt bridges are included as distance and angular constraints

between the hydrogen and the acceptor atom as well as two neighboring atoms, thereby

removing three degrees of freedom from the network (Figure 2) (26). Hydrogen bonds are

included depending on their geometry and interaction energy. For this, potential hydrogen

bonds are ranked according to an energy function that takes into account the hybridization

state of donor and acceptor atoms as well as their mutual orientation (26). By tuning the

energy threshold EHB, strong hydrogen bonds can be distinguished from weaker ones.

A threshold of EHB = -0.6 kcal/mol corresponds to the thermal energy at room temperature and so

provides a natural choice (26). EHB values of -1.0 kcal/mol have also been reported in the

literature, resulting in more flexible networks (27, 28).
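
The effect of the energy threshold can be illustrated with a short Python sketch. It assumes that hydrogen-bond energies have already been computed by some energy function; the class, the function name, the atom labels, and the energy values are hypothetical and serve only to show how a stricter cutoff leaves fewer constraints in the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HydrogenBond:
    hydrogen: str   # label of the donor hydrogen (hypothetical naming scheme)
    acceptor: str   # label of the acceptor atom
    energy: float   # kcal/mol; more negative means stronger


def select_hydrogen_bonds(candidates, e_hb=-0.6):
    """Keep only hydrogen bonds that are at least as strong as the cutoff e_hb."""
    return [hb for hb in candidates if hb.energy <= e_hb]


# Made-up candidate list, for illustration only.
candidates = [
    HydrogenBond("G26:H1", "C39:N3", -4.1),
    HydrogenBond("A22:H61", "U40:O4", -0.8),
    HydrogenBond("U23:HO2'", "A22:OP1", -0.3),
]

default_set = select_hydrogen_bonds(candidates, e_hb=-0.6)
stricter_set = select_hydrogen_bonds(candidates, e_hb=-1.0)
print(len(default_set), len(stricter_set))
```

With the stricter cutoff of -1.0 kcal/mol, the bond at -0.8 kcal/mol is dropped, so fewer constraints enter the network and it becomes more flexible.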

Hydrophobic interactions are considered between a pair of carbon and/or sulfur atoms

if the distance between the atoms is smaller than the sum of the van der Waals radii (1.7 Å for

carbon, 1.8 Å for sulfur) plus a variable threshold. The threshold value is set to 0.25 Å in the

case of the protein parametrization (29), resulting in a distance cutoff DHC = 3.65 Å for two

carbon atoms. Hydrophobic interactions are modeled such that two degrees of freedom are

removed from the network. This is meant to mimic a less geometrically restrained

interaction compared to a hydrogen bond.
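
The distance criterion can be sketched in Python as follows. Only the van der Waals radii and the 0.25 Å threshold come from the text; the function names and the tuple-based atom representation are assumptions, and any further selection rules a real implementation may apply are not modeled here.

```python
import math

# van der Waals radii in Å, as quoted in the text
VDW_RADIUS = {"C": 1.7, "S": 1.8}


def hydrophobic_cutoff(elem_a: str, elem_b: str, threshold: float = 0.25) -> float:
    """Distance cutoff: sum of the van der Waals radii plus a variable threshold."""
    return VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] + threshold


def is_hydrophobic_contact(atom_a, atom_b, threshold: float = 0.25) -> bool:
    """Each atom is an (element, (x, y, z)) tuple; only C and S atoms qualify."""
    (elem_a, xyz_a), (elem_b, xyz_b) = atom_a, atom_b
    if elem_a not in VDW_RADIUS or elem_b not in VDW_RADIUS:
        return False
    return math.dist(xyz_a, xyz_b) < hydrophobic_cutoff(elem_a, elem_b, threshold)


# Two carbon atoms: the protein threshold of 0.25 Å gives the 3.65 Å cutoff.
print(round(hydrophobic_cutoff("C", "C"), 2))
print(is_hydrophobic_contact(("C", (0.0, 0.0, 0.0)), ("C", (3.6, 0.0, 0.0))))
```

Lowering the threshold shrinks the cutoff accordingly, which reduces the number of hydrophobic constraints in the network.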

Finally, interactions with divalent ions such as Mg2+ are known to affect the

conformational flexibility of RNA structures (30) and can be included as covalent bonds in

the network, together with surrounding water molecules when available.

Rigid cluster decomposition of RNA

Given a network representation of the RNA structure, the pebble game (31, 32), a fast

combinatorial algorithm, is applied to determine the number and spatial distribution of bond-

rotational degrees of freedom in the network. Based on the accessibility of rotational degrees


of freedom, each bond is identified as either part of a rigid cluster or a flexible link in

between.
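
To convey the idea behind the pebble game, the sketch below implements its simplest form for a two-dimensional bar-joint network, where every vertex carries two pebbles (one per degree of freedom) and an edge is independent only if four pebbles can be gathered at its two endpoints. This is a didactic simplification and an assumption on our part: the analysis of RNA structures uses a three-dimensional bond-bending network, which is not implemented here, and all names are illustrative.

```python
def pebble_game_2d(n_vertices, edges):
    """Classify the edges of a 2D bar-joint network as independent or redundant.

    Returns (independent, redundant, floppy_modes), where floppy_modes is the
    number of internal degrees of freedom (2n - 3 - number of independent edges).
    Assumes n_vertices >= 2 and no self-loops.
    """
    pebbles = [2] * n_vertices               # free pebbles per vertex
    out = [[] for _ in range(n_vertices)]    # out[u]: edges u -> v covered by a pebble of u

    def find_pebble(root, blocked):
        """Search along covered edges for a free pebble and shift it to root."""
        parent = {root: None}
        stack = [root]
        while stack:
            current = stack.pop()
            for nxt in out[current]:
                if nxt in parent or nxt == blocked:
                    continue
                parent[nxt] = current
                if pebbles[nxt] > 0:
                    # Reverse the path root -> ... -> nxt to move one pebble to root.
                    pebbles[nxt] -= 1
                    node = nxt
                    while parent[node] is not None:
                        prev = parent[node]
                        out[prev].remove(node)
                        out[node].append(prev)
                        node = prev
                    pebbles[root] += 1
                    return True
                stack.append(nxt)
        return False

    independent, redundant = [], []
    for u, v in edges:
        # Try to gather two free pebbles on each endpoint.
        while pebbles[u] + pebbles[v] < 4:
            if pebbles[u] < 2 and find_pebble(u, v):
                continue
            if pebbles[v] < 2 and find_pebble(v, u):
                continue
            break
        if pebbles[u] + pebbles[v] == 4:
            out[u].append(v)                 # cover the new edge with a pebble of u
            pebbles[u] -= 1
            independent.append((u, v))
        else:
            redundant.append((u, v))
    return independent, redundant, sum(pebbles) - 3


# A square is floppy; adding both diagonals makes it rigid with one redundant bar.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(pebble_game_2d(4, square)[2])
print(len(pebble_game_2d(4, square + [(0, 2), (1, 3)])[1]))
```

Edges that end up redundant mark overconstrained regions, whereas free pebbles that remain beyond the three needed for rigid-body motion in the plane indicate residual flexibility.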

As an example, a rigid cluster decomposition is given for unbound HIV-1 TAR-RNA

(Figure 1). Two larger rigid clusters are identified, located at the lower stem (comprising

nucleotides G17-G21 and C41-C45, colored in blue) and upper stem (comprising nucleotides

G26-G28 and C37-C39, colored in red). In contrast, the bulge and the loop regions are

identified as flexible regions. Note that flexibility and rigidity are static properties and only

determine the possibility of motion, where nothing actually moves. Constraint counting thus

says nothing about the directions and magnitudes of existing motions (33). Yet, identifying

rigid and flexible regions already gives insights into the location of possible motions of the

biomolecule. Indeed, the rigid cluster decomposition is in agreement with experimental

findings that in the free conformation of TAR-RNA, the two stable helical stems collectively

undergo a large-amplitude, hinge-like motion around the flexible bulge region (34-36).

---Figure 1---

---Figure 2---

While the decomposition into rigid clusters and flexible regions only provides a

qualitative picture, a continuous quantitative measure is given by a flexibility index fi defined

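for each covalent bond i (Eq. 2) (26).

\[
f_i =
\begin{cases}
\dfrac{F_j}{H_j} & \text{in an underconstrained region} \\[6pt]
0 & \text{in an isostatically rigid cluster} \\[6pt]
-\dfrac{R_k}{C_k} & \text{in an overconstrained region}
\end{cases}
\qquad \text{(Eq. 2)}
\]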


A constraint in the network is considered to be independent, if breaking it affects the

flexibility of the network. In contrast, a constraint is redundant if it can be removed without

influencing the network rigidity. In underconstrained (flexible) regions j, the flexibility index

fi relates the number of independent internal degrees of freedom (Fj) to the number of

potentially rotatable bonds (Hj); in overconstrained regions k the number of redundant bonds

(Rk) is related to the number of constraints (Ck). If there are as many internal degrees of

freedom as there are constraints, the region is isostatically rigid. The flexibility index ranges

from -1 to 1, with negative values in rigid regions and positive values in flexible ones.

Overall, the index allows quantifying how much more flexible an underconstrained region is

compared to a minimally rigid region or how much more stable an overconstrained region is

(37).
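
The three cases of the flexibility index described above can be written as a small Python helper. This is a sketch under the assumption that the region type and the counts F_j, H_j, R_k, and C_k have already been obtained from a rigidity analysis; the function name and argument layout are illustrative.

```python
def flexibility_index(region: str, count: int = 0, total: int = 1) -> float:
    """Flexibility index of a bond, given the region it belongs to.

    region = "under": count = F_j (independent internal degrees of freedom),
                      total = H_j (potentially rotatable bonds)
    region = "iso":   isostatically rigid cluster, index is 0
    region = "over":  count = R_k (redundant bonds), total = C_k (constraints)
    """
    if region == "iso":
        return 0.0
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("expected 0 <= count <= total and total > 0")
    if region == "under":
        return count / total      # positive values: flexible
    if region == "over":
        return -count / total     # negative values: overconstrained
    raise ValueError(f"unknown region type: {region!r}")


print(flexibility_index("under", 2, 8))
print(flexibility_index("over", 3, 12))
print(flexibility_index("iso"))
```

Because the counts are bounded by the totals, the returned value always lies between -1 and 1, in line with the range stated above.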

Predicting RNA mobility using constrained geometrical simulations

Determining actual motion and its amplitude requires introducing kinematics that

produce real movements. The computed decomposition of the macromolecular structure into

rigid and flexible regions can be used in a subsequent step as input for coarse-grained

simulations (27, 28, 33), which explore the molecule’s mobility. For example, the constrained

geometrical simulation (CGS) approach FRODA (28) (Framework Rigidity Optimized

Dynamic Algorithm) moves flexible parts of a molecule through stereochemically allowed

regions of conformational space using random Brownian type (Monte Carlo) dynamics,

whereas atoms in rigid clusters are moved collectively. Figure 3 shows an ensemble of RNA

conformers generated by a FRODA simulation within a few hours of computational time in

comparison with the conformational space of the HIV-1 TAR-RNA spanned by the NMR

ensemble.


---Figure 3---

A topological network representation for analyzing flexibility characteristics of RNA

structures

Initially, the approach, implemented in the FIRST (Floppy Inclusion and Rigid

Substructure Topology) software package (26), has been successfully applied to the protein

world (26, 29, 37-40). Although it is straightforward to investigate RNA structures based on

the same flexibility and rigidity concepts applied to proteins, one needs to keep in mind that

both systems have different structural features. The dense packing in the core of proteins is

predominantly determined by interactions of hydrophobic side chains, and proteins are

generally globular (41). In contrast, RNA structures are elongated and more loosely packed

than proteins. In addition, they are stabilized mainly by hydrogen bonds and base stacking

interactions (17, 41). Thus, a network representation that has been developed for proteins

may not be appropriate for RNA systems. Indeed, we could show that a protein-based

parameterization does not capture flexibility characteristics of RNA structures satisfactorily

but rather leads to overly rigid RNA structures in general (24).

In order to get a better understanding of the scope and limitations of the present

approach for RNA structures, we thoroughly tested different criteria to include hydrophobic

interactions and hydrogen bonds in a topological network representation of RNA structures

(24). Starting out by analyzing the network rigidity of a canonical A-form RNA, it became

obvious that it is the inclusion of hydrophobic contacts into the RNA topological network that

is crucial for an accurate flexibility prediction and that the number of contacts between

adjacent bases needs to be limited in order to capture the flexibility characteristics of RNA

reliably. The criteria to include non-covalent constraints were then adjusted based on

comparing results from constraint counting with crystallographic B-values of a tRNA(Asp)


structure. In addition, conformational variabilities of NMR-derived ensembles of RNA

structures were compared with atomic fluctuations determined from FRODA-generated

ensembles (24). As the final RNA parameterization, I) the number of hydrophobic

interactions for base stacking is limited to one to prevent excessive hydrophobic contacts

between sequentially adjacent bases, II) the distance threshold up to which hydrophobic

contacts are included between bases is set to DHC = 3.55 Å between two carbon atoms, and

III) an energy cutoff for the inclusion of hydrogen bonds of EHB = -1.0 kcal/mol is used.

With these settings, flexibility predictions on RNA demonstrate good agreement both

qualitatively and quantitatively with experimental mobility data (24). For example, constraint

counting now identifies those nucleotides (U8 and U48, G26 and G45) in a tRNA(Asp)

structure as flexible that are known to function as hinge regions (24). Likewise, convincing

results are obtained for mobility predictions obtained by constrained geometric simulations

on these networks using FRODA. As an example, Figure 4 shows a comparison of the

conformational variability of an NMR ensemble (PDB code 1ANR) with atomic fluctuations

calculated by FRODA for HIV-1 TAR-RNA, which yields a fair squared correlation coefficient of

R^2 = 0.53 (see also Figure 3). Also note the good agreement between experimentally observed

and computed absolute amplitudes of motions, which was achieved without involving any

scaling.

Finally, the results using the new RNA parametrization were shown to be superior

to predictions based on a parametrization used previously by Jernigan and

coworkers (11) or developed for proteins (26, 29): the new RNA parametrization outperforms

the other two in 7 of the 12 tested NMR ensembles (24). When compared to GNM

calculations, a comparable performance is found, with the FRODA-generated ensembles

providing more detail than the coarse-grained GNM (24).


---Figure 4---

General considerations and recommendations on the network representation

The charged character of RNA structures results in strong solvation and in the association of

ions. Both are known to affect the conformational flexibility of RNA structures

(42-44). Especially binding mediated by divalent ions such as Mg2+ should be included as

constraints in a network representation (24). Similarly, structural water molecules may

influence the stability of RNA structures. So far, water molecules have been considered

part of the constraint network only when bound to Mg2+ ions (24). Other than that, water

molecules have not been included in the constraint counting analysis, mainly because it is

difficult to distinguish tightly bound water molecules from fast-exchanging ones at the RNA

interface. Results from MD simulations can complement structural data in this respect (45).

However, by incorporating data from computationally expensive MD simulations, the

advantage of the highly efficient constraint counting approach with computing times on the

order of seconds even for the large ribosomal subunit would be lost. Encouragingly, previous

findings showed only a negligible difference in the flexibility characteristics of a protein-

protein complex when structural waters were considered (37). In addition, the influence of

solvent on the structural stability is already implicitly considered by including hydrophobic

interactions as constraints into the network (40).

Other improvements of the network representation that can be anticipated include

incorporating repulsion between negatively charged phosphate groups or modeling base

stacking interactions differently depending on the base types and the sequential context (24).

Stacking interactions in general increase in the order pyrimidine-pyrimidine < purine-

pyrimidine < purine-purine bases (46) and are larger for sequences rich in G-C rather than

A-U base pairs (47, 48). Regarding RNA/protein interface regions, which are strongly


stabilized by the formation of hydrogen bond networks and intermolecular hydrophobic cores

(49), we suggest modeling these interactions according to the protein-based parameterization

(24, 25). By the same token, it appears reasonable to model the interface between a ligand

and RNA structures according to the protein-based parameterization. Finally, the quality of

the obtained flexibility characteristics depends on the quality of the experimental structure

used for the network representation, and we recommend using X-ray structures resolved to

< 2.5 Å.

Flexibility characteristics of the ribosomal exit tunnel as a case

study

Constraint counting provides rigidity and flexibility information at various structural

levels: I) flexibility characteristics at the bond level are instructive for analyzing, e.g., binding

site regions; II) flexibility characteristics of larger regions can be related to potential global

conformational changes; III) rigid cluster decompositions provide hints about movements of

structural parts as rigid bodies. In this chapter, we present constraint counting results on the

large ribosomal subunit to exemplify how rigidity and flexibility information can be used for

understanding biological function from a structural perspective. In particular, we focus on the

ribosomal exit tunnel and its active role in co-translational processes (50-54). The ribosome is

the protein synthesis machinery of the cell. After peptide bond formation at the peptidyl

transferase center (PTC) (55) the nascent polypeptide chain leaves the ribosome via the

ribosomal exit tunnel, which spans the entire large subunit of the ribosome and has a length

of ~80 Å (56). The tunnel surface is composed mainly of 23S rRNA, but non-globular parts

of ribosomal proteins contribute as well (57). The narrowest part of the tunnel is formed by

the proteins L4 and L22, which are highlighted in blue and red, respectively, in Figure 5a.


Global backbone flexibility

The flexibility characteristics of the tunnel obtained from constraint counting are shown

color-coded in Figure 5b. For this, flexibility indices of the backbone regions of ribosomal

RNA and proteins were averaged and subsequently assigned to the phosphorus and Cα-atoms,

respectively (25, 37). Blue color indicates overconstrained regions, red color flexible regions;

minimally rigid regions are colored in green. The constraint counting identifies large parts of

the tunnel neighboring regions as rigid, whereas clusters of flexible tunnel components are

located at the PTC, the tunnel entrance, and the exit region (25). This picture also holds if one

considers the flexibility characteristics of side chains instead (data not shown). For this,

flexibility indices of either the glycosidic bonds in the case of nucleotides or the Cα-Cβ bonds

in the case of amino acids were analyzed.
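
The averaging and coloring step described above can be sketched as follows. The sketch assumes that the flexibility indices of the backbone bonds are already available per residue; which bonds enter the average, the residue labels, the index values, and the three-class color scheme are assumptions made for illustration.

```python
from statistics import fmean


def residue_backbone_flexibility(bond_indices_by_residue):
    """Average the backbone bond flexibility indices of each residue.

    The resulting value would be assigned to the phosphorus atom of a
    nucleotide or to the C-alpha atom of an amino acid for display.
    """
    return {
        residue: fmean(values)
        for residue, values in bond_indices_by_residue.items()
        if values
    }


def color_class(index: float) -> str:
    """Map an averaged index to the color convention used for the tunnel figure."""
    if index < 0:
        return "blue"    # overconstrained
    if index > 0:
        return "red"     # flexible
    return "green"       # minimally rigid


# Hypothetical per-residue backbone indices.
averaged = residue_backbone_flexibility({
    "G2061": [-0.2, -0.4],
    "A2062": [0.3, 0.1],
    "U2585": [0.0, 0.0],
})
print({residue: color_class(value) for residue, value in averaged.items()})
```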

Overall, the results demonstrate a conserved stable structural environment of the tunnel,

which makes it unlikely that deformations of the tunnel actively move peptides down the

tunnel (58). Furthermore, the stable environment rules out that the tunnel can adapt

widely such as to allow tertiary folding of nascent chains (52). Yet, the approach identifies

local zones of flexible nucleotides within the tunnel. Strikingly, these flexible zones agree

with previously identified zones of folding at the secondary level (59).

---Figure 5---

Flexibility characteristics at the bond level

Characterizing the flexibility of specific covalent bonds of antibiotics binding sites

provides hints as to the selectivity of antibiotics binding. Clinically important antibiotics

inhibit the activity of eubacterial ribosomes by binding, e.g., to the active site crevice at the


PTC, where the peptide bond formation occurs and the nascent peptide is released into the

tunnel region. A major part of the crevice is formed by two sequentially adjacent bases that

splay apart such that a wedge-shaped hydrophobic gap in between is formed, which allows

for stabilizing hydrophobic interactions with antibiotics (Figure 6a) (60). Subtle structural

differences within the antibiotics binding pockets of the prokaryotic and eukaryotic

ribosomes (61, 62) are the key to antibiotics selectivity (63). In this regard, the archaeal

H. marismortui ribosome is distinct from eubacterial ones of E. coli, D. radiodurans, and

T. thermophilus as the former possesses typical eukaryotic elements at the principal antibiotic

target sites and requires much higher than clinically relevant antibiotics concentrations for

binding (60, 63-66). Interestingly, such structural differences are also reflected in different

flexibility characteristics of the antibiotic binding crevices. Whereas the glycosidic bonds of

the crevice-forming nucleotides show a dual flexibility character in the case of the

H. marismortui structure (one glycosidic bond is flexible whereas the other one is rigid), the

two glycosidic bonds are found to be structurally stable across all three analyzed eubacterial

structures (25). When comparing the crevice of the H. marismortui structure to corresponding

clefts of the eubacterial structures, a wider active site crevice is found for the latter ones. This

has led to the notion that the active site cleft may always be open in eubacterial ribosomes

(67), which could be the reason why eubacteria are more sensitive to some of the active site

crevice antibiotics than archaea: with the conformation already open, none

of the binding free energy would need to be expended to adopt the appropriate bound conformational

state (67). Our constraint counting results provide support to this observation: eubacteria

would benefit if the open conformation of the active site crevice is structurally stabilized, as

given by overconstrained glycosidic bonds of its nucleotides.

Investigating the flexibility characteristics of specific covalent bonds also provides

insights into the active role of tunnel components in co-translational elongation regulation


(25). Experimental evidence suggests that certain nascent peptide chains can interact with the

tunnel, thereby stalling translation elongation (68). Several models have been proposed to

explain how the constriction point in the tunnel formed by ribosomal proteins L22 and L4 is

involved in this arrest (5, 51, 69, 70). In fact, studies of L22 mutant strains demonstrate this

protein’s active role (50, 69). L22 consists of a globular domain at the subunit surface and a

β-hairpin region that lines part of the tunnel wall. Constraint counting identifies flexible

residues in the otherwise rigid β-hairpin of L22. Encouragingly, the identified flexible

residues lie at the center of two hinge regions that have also been inferred from an observed

conformational change of L22 upon binding of troleandomycin in the ribosomal exit tunnel

(51). The molecular hinges allow for a conformational change of the loop region towards

the opposite side of the tunnel wall (Figure 6b). In the presence of such a swung

conformation, the passage of nascent polypeptides through the tunnel would be blocked. The constraint

counting result thus strongly supports the hypothesis that L22 can function as a gate keeper

(51).

---Figure 6---

Collective correlated movements inferred from mechanically coupled rigid clusters

The question that remains to be answered, then, is how the ribosome senses the

interaction between the nascent peptide chain and the tunnel. Based on cryo-EM derived

models of E. coli ribosome complexes at different states, a cascade of RNA rearrangements

was proposed that propagate from peptide chain-induced conformational changes of 23S

rRNA bases in the ribosomal exit tunnel through the large ribosomal subunit, influence the

small subunit, and lead to elongation arrest (70).


To investigate this finding we made use of the fact that within a rigid cluster a hierarchy of

regions of varying stability can exist. In order to reveal such a hierarchy, it is possible to

simulate a melting of the constraint network by successively removing hydrogen bonds in the

order of increasing strength. The underlying assumption is that weaker hydrogen bonds break

first if the molecular system is heated. Initially, analyses of this type have been used to

investigate unfolding events within proteins (29). In the case of the large ribosomal subunit,

we used the approach to identify those parts of the subunit that are weakly coupled to the

subunit core (25). Notably, elements that rapidly peel off the large rigid core of the subunit

and form rigid clusters by themselves upon melting agree very well with those elements for

which significant conformational changes have been observed in the above cryo-EM

experiments. This demonstrates that ribosomal elements observed to be mobile are indeed

only weakly coupled to the subunit core. Furthermore, the analysis revealed that, for some of

the cryo-EM proposed rearrangements, information may be transmitted through mechanically

coupled, structurally stable regions from the induced conformational changes within the

tunnel. This transmission resembles a domino effect-like transformation, which, in the case of

the tRNA sites being involved, occurs over a distance > 100 Å.
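
The simulated melting described in this section can be outlined as a simple loop that removes hydrogen bonds one at a time, weakest first. The generator below is a sketch under the assumption that hydrogen-bond energies (in kcal/mol, more negative meaning stronger) are given; the function name and the energy values are hypothetical, and the rigidity analysis that would be rerun at every step is not included.

```python
def melting_steps(hbond_energies):
    """Yield (removed_energy, remaining_energies) while removing hydrogen bonds
    in the order of increasing strength, i.e. the weakest bond first."""
    remaining = sorted(hbond_energies)   # most negative (strongest) first
    while remaining:
        weakest = remaining.pop()        # least negative energy is the weakest bond
        yield weakest, list(remaining)


# Hypothetical energies; in practice each remaining set would define a new
# constraint network that is decomposed into rigid clusters again.
for removed, remaining in melting_steps([-0.8, -4.2, -2.1]):
    print(removed, remaining)
```

Tracking at which step a structural element separates from the largest rigid cluster then indicates how strongly that element is coupled to the core.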


Conclusion

Constraint counting on a topological network representation of RNA structures

provides a deeper understanding of the flexibility characteristics of RNA down to the atomic

level in a computational time on the order of seconds. Recently, a new parameterization for

the topological network representation has been introduced that reliably captures flexibility

characteristics of RNA structures (24). Compared to classical MD simulations, constraint

counting can be efficiently applied to large biomolecules such as the ribosome (11, 25),

which consists of more than 10^5 atoms.

The internal degrees of freedom within a biomolecular network representation can be

analyzed at different levels of detail. First, flexibility characteristics at the bond level are

instructive for analyzing, e.g., binding site regions; second, flexibility characteristics of larger

regions can be related to potential global movements; finally, rigid cluster decompositions

provide hints about movements of structural parts as rigid bodies. That way, static properties

of a biomolecule can be linked to biological function.

While primarily used in a descriptive manner for RNA so far, predictive flexibility

analyses have been presented for proteins (40, 71). Here, constraint counting was used to

identify structural features that impact the thermostability of a protein. This information can

be exploited subsequently by pointing to residues that should be varied to obtain a system

with higher thermostability. One can imagine using such an approach likewise for RNA for

predicting the impact of mutations that damage tertiary contacts. Finally, constraint counting

provides a natural coarse-grained representation of a biomolecule, which can be used

subsequently in computer simulations of the motions within the molecule (27, 28, 33).


Acknowledgements

This work was supported by the DFG (SFB 579, "RNA-ligand interactions"), Goethe-

University, Frankfurt, and Heinrich-Heine-University, Düsseldorf. SF acknowledges

financial support from the Hessian Science Program, the Frankfurt International Graduate

School for Science (FIGSS), and the Otto-Stern-School in Frankfurt. HG acknowledges

fruitful discussions at the workshop “Dynamics under constraints II”, McGill University's

Bellairs Research Institute, Barbados, 2007.


Figure captions

Figure 1: Workflow for flexibility prediction of RNA structures based on constraint

counting. A PDB structure is used as input (I.) and modeled as a topological network

representation (II.). Constraints between nearest neighbors are indicated by straight lines,

constraints between next nearest neighbors (angle constraints) by dashed lines. For reasons of

clarity, angle constraints are only indicated in the sugar and base scaffolds, and hydrogen

bonds between bases are omitted. Hydrophobic constraints are indicated by black dashed

lines. Flexible hinges are shown in red, minimally rigid regions in green, and overconstrained

regions in blue. Based on the accessibility of rotational degrees of freedom each bond is

identified as either part of a rigid region or a flexible link in between. The resulting rigid

cluster decomposition of an unbound HIV-1 TAR RNA structure (PDB code 1ANR) using a

new RNA parameterization (24) is shown in the right panel (III.). Rigid clusters are depicted

as uniformly colored bodies.

Figure 2: Modeling of hydrogen bonds and hydrophobic interactions in a bond-bending

network within protein (a) and RNA (b) structures. Vertices represent atoms and edges

represent covalent and non-covalent constraints. Constraints between next-nearest neighbors

(shown by dashed lines) define coordination angles between bonded atoms. For reasons of

clarity, only angle constraints associated with hydrogen bonds and hydrophobic interactions

are indicated. Hydrogen bonds are modeled by a bond between the hydrogen and the acceptor

atom and two angular constraints associated with these atoms. Hydrophobic interactions are

modeled by a flexible linkage consisting of three additional vertices (pseudoatoms). This

restricts the maximum distance between the two hydrophobic atoms, while allowing them to

slide with respect to one another.
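The modeling of non-covalent interactions described for Figure 2 can be sketched as operations on a simple network data structure. The sketch assumes that angle constraints are stored as next-nearest-neighbor edges and that the two angle constraints of a hydrogen bond are the donor-hydrogen-acceptor angle and the hydrogen-acceptor-neighbor angle. Angle constraints along the hydrophobic tether are left out because the caption does not specify them. All names, including the atom labels, are hypothetical.

```python
def new_network():
    # Atoms plus two kinds of edges: nearest-neighbor (bond) and
    # next-nearest-neighbor (angle) constraints.
    return {"atoms": set(), "bonds": set(), "angles": set()}


def add_hydrogen_bond(net, donor, hydrogen, acceptor, acceptor_neighbor):
    """Add one bond (H...A) and two angle constraints (assumed choice)."""
    net["atoms"].update((donor, hydrogen, acceptor, acceptor_neighbor))
    net["bonds"].add(frozenset((hydrogen, acceptor)))
    net["angles"].add(frozenset((donor, acceptor)))              # angle at H
    net["angles"].add(frozenset((hydrogen, acceptor_neighbor)))  # angle at A


def add_hydrophobic_tether(net, atom_a, atom_b, tether_id):
    """Link two atoms through a chain of three pseudoatoms."""
    pseudo = [f"PS{tether_id}_{k}" for k in range(1, 4)]
    chain = [atom_a] + pseudo + [atom_b]
    net["atoms"].update(chain)
    for u, v in zip(chain, chain[1:]):
        net["bonds"].add(frozenset((u, v)))
    return pseudo


net = new_network()
add_hydrogen_bond(net, "N1", "H1", "O6", "C6")
add_hydrophobic_tether(net, "C5", "C8", tether_id=1)
print(len(net["atoms"]), len(net["bonds"]), len(net["angles"]))
```

Covalent bonds, such as the one between donor and hydrogen, are assumed to have been added to the network beforehand and are not created by these helper functions.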


Figure 3: Conformational space of the HIV-1 TAR-RNA spanned by a) the NMR

ensemble (PDB code 1ANR) and b) conformations generated by a FRODA simulation using

the RNA parameterization (24). From the NMR ensemble, the first 10 models of the PDB

entry are shown. From the FRODA simulation, the starting structure (orange) as well as

conformations generated every 2000 steps for a total length of 20 000 steps are shown.

Following the setup of previous MD studies (4, 72), the second model of the NMR ensemble

(orange) was used as a starting structure for the FRODA simulation.

Figure 4: Atomic fluctuations predicted by FRODA simulations vs. conformational

variabilities as measured by NMR (PDB code 1ANR) for HIV-1 TAR-RNA. The alignment

was carried out with respect to all phosphorus atoms of the structure.

Figure 5: a) Atomic structure of the large ribosomal subunit. The narrowest part of the

tunnel is formed by the proteins L4 and L22, which are highlighted in blue and red,

respectively. A nascent peptide (turquoise) is emerging from the tunnel exit. The location of

the PTC is shown, and the binding position of TAO is indicated by a yellow star. b) Color-

coded representation of the flexibility characteristics of the ribosomal exit tunnel obtained by

constraint counting. Backbone atoms are colored according to the flexibility index of the P

atoms in the RNA part and of the Cα atoms in the protein part. Blue color

indicates overconstrained regions and red color flexible regions; minimally rigid regions are

colored in green.

Figure 6: a) Binding mode of anisomycin (PDB code 1K73) in the active site crevice of an

H. marismortui 50S structure. A major part of the crevice is formed by two sequentially

adjacent bases EC2451 and EC2452 that splay apart such that a wedge-shaped hydrophobic


gap in between is formed, which allows for stabilizing hydrophobic interactions with the p-

methoxy-phenyl group of anisomycin (60). b) View through the ribosomal exit tunnel. The

tunnel constriction-forming proteins L4 and L22 are colored according to the flexibility

values of the Cα-atoms. Overconstrained regions are indicated by blue color, and flexible

regions are shown in red color. The swung conformation of L22 observed upon binding of

troleandomycin (PDB code 1OND) is shown in turquoise. Constraint counting on the 50S

structure from H. marismortui identified two flexible hinge regions located at residues EC85

and EC93 (E. coli numbering) in the β-hairpin of L22.


Figures

---Figure 1---


---Figure 2---


---Figure 3---


---Figure 4---


---Figure 5---


---Figure 6---


References

[1] H. Al-Hashimi, N. Walter, Current Opinion in Structural Biology 18 (2008) 321-329.
[2] D. Fourmy, S. Yoshizawa, J.D. Puglisi, Journal of Molecular Biology 277 (1998) 333-345.
[3] J.D. Puglisi, R. Tan, B.J. Calnan, A.D. Frankel, J.R. Williamson, Science 257 (1992) 76-80.
[4] R. Nifosi, C.M. Reyes, P.A. Kollman, Nucleic Acids Research 28 (2000) 4944-4955.
[5] K. Mitra, J. Frank, Annual Review of Biophysics and Biomolecular Structure 35 (2006) 299-317.
[6] A. Perez, A. Noy, F. Lankas, F.J. Luque, M. Orozco, Nucleic Acids Research 32 (2004) 6144-6151.
[7] M. Getz, X. Sun, A. Casiano-Negroni, Q. Zhang, H.M. Al-Hashimi, Biopolymers 86 (2007) 384-402.
[8] M. Karplus, J.A. McCammon, Nature Structural Biology 9 (2002) 646-652.
[9] M. Orozco, A. Noy, A. Perez, Current Opinion in Structural Biology 18 (2008) 185-193.
[10] E.S. McDowell, N.a. Spacková, J. Sponer, N. Walter, Biopolymers 85 (2007) 169-184.
[11] Y. Wang, A.J. Rader, I. Bahar, R. Jernigan, Journal of Structural Biology 147 (2004) 302-314.
[12] J. Trylska, V. Tozzini, J.A. McCammon, Biophysical Journal 89 (2005) 1455-1463.
[13] O. Kurkcuoglu, Z. Kurkcuoglu, P. Doruker, R.L. Jernigan, Proteins: Structure, Function, and Bioinformatics (2008) in press.
[14] F. Tama, M. Valle, J. Frank, C.L. Brooks, Proceedings of the National Academy of Sciences of the United States of America 100 (2003) 9319-9323.
[15] J. Frank, R. Agrawal, Nature 406 (2000) 318-322.
[16] Y. Bomble, D. Case, Biopolymers 89 (2008) 722-731.
[17] A.W. Van Wynsberghe, Q. Cui, Biophysical Journal 89 (2005) 2939-2949.
[18] I. Bahar, A. Atilgan, B. Erman, Folding and Design 2 (1997) 173-181.
[19] I. Bahar, R.L. Jernigan, Journal of Molecular Biology 281 (1998) 871-884.
[20] A.R. Atilgan, S.R. Durell, R.L. Jernigan, M.C. Demirel, O. Keskin, I. Bahar, Biophysical Journal 80 (2001) 505-515.
[21] E. Eyal, L.-W. Yang, I. Bahar, Bioinformatics 22 (2006) 2619-2627.
[22] L.W. Yang, A.J. Rader, X. Liu, C.J. Jursa, S.C. Chen, H.A. Karimi, I. Bahar, Nucleic Acids Research 34 (2006) W24-W31.
[23] H. David-Eden, Y. Mandel-Gutfreund, Nucleic Acids Research 36 (2008) 4641-4652.
[24] S. Fulle, H. Gohlke, Biophysical Journal 94 (2008) 4202-4219.
[25] S. Fulle, H. Gohlke, Journal of Molecular Biology 387 (2009) 502-517.
[26] D.J. Jacobs, A.J. Rader, L.A. Kuhn, M.F. Thorpe, Proteins-Structure Function and Bioinformatics 44 (2001) 150-165.
[27] A. Ahmed, H. Gohlke, Proteins-Structure Function and Bioinformatics 63 (2006) 1038-1051.
[28] S. Wells, S. Menor, B. Hespenheide, M.F. Thorpe, Physical Biology 2 (2005) 127-136.
[29] A.J. Rader, B.M. Hespenheide, L.A. Kuhn, M.F. Thorpe, Proceedings of the National Academy of Sciences of the United States of America 99 (2002) 3540-3545.
[30] D.E. Draper, RNA 3 (2004) 335-343.
[31] D. Jacobs, B. Hendrickson, Journal of Computational Physics 137 (1997) 346-365.
[32] D.J. Jacobs, Journal of Physics A: Mathematical and General 31 (1998) 6653-6668.
[33] H. Gohlke, M.F. Thorpe, Biophysical Journal 91 (2006) 2115-2120.
[34] H.M. Al-Hashimi, Y. Gosser, A. Gorin, W. Hu, A. Majumdar, D.J. Patel, Journal of Molecular Biology 315 (2002) 95-102.
[35] Q. Zhang, X. Sun, E.D. Watt, H.M. Al-Hashimi, Science 311 (2006) 653-656.
[36] C. Musselman, H.M. Al-Hashimi, I. Andricioaei, Biophysical Journal 93 (2007) 411-422.
[37] H. Gohlke, L.A. Kuhn, D.A. Case, Proteins-Structure Function and Bioinformatics 56 (2004) 322-337.
[38] B.M. Hespenheide, A.J. Rader, M.F. Thorpe, L.A. Kuhn, Journal of Molecular Graphics and Modelling 21 (2002) 195-207.
[39] A.J. Rader, I. Bahar, Polymer 45 (2004) 659-668.
[40] S. Radestock, H. Gohlke, Engineering in Life Sciences 8 (2008) 507-522.
[41] C. Hyeon, R.I. Dima, D. Thirumalai, Journal of Chemical Physics 125 (2006)
[42] P. Auffinger, E. Westhof, Journal of Molecular Biology 300 (2000) 1113-1131.
[43] P. Auffinger, Y. Hashem, Current Opinion in Structural Biology 17 (2007) 325-333.
[44] P. Auffinger, S. Louise-May, E. Westhof, Biophysical Journal 76 (1999) 50-64.
[45] A.C. Vaiana, E. Westhof, P. Auffinger, Biochimie 88 (2006) 1061-1073.
[46] W. Saenger (1984) Principles of Nucleic Acid Structure, Springer-Verlag, New York.
[47] R.L. Ornstein, R. Rein, D.L. Breen, R.D. Macelroy, Biopolymers 17 (1978) 2341-2360.
[48] J. Gralla, D.M. Crothers, Journal of Molecular Biology 78 (1973) 301-319.
[49] A. Perederina, N. Nevskaya, O. Nikonov, A. Nikulin, P. Dumas, M. Yao, I. Tanaka, M. Garber, G. Gongadze, S. Nikonov, RNA 8 (2002) 1548-1557.
[50] H. Nakatogawa, K. Ito, Cell 108 (2002) 629-636.
[51] R. Berisio, F. Schluenzen, J. Harms, A. Bashan, T. Auerbach, D. Baram, A. Yonath, Nature Structural Biology 10 (2003) 366-370.
[52] R.J. Gilbert, P. Fucini, S. Connell, S.D. Fuller, K.H. Nierhaus, C.V. Robinson, C.M. Dobson, D.I. Stuart, Molecular Cell 14 (2004) 57-66.
[53] C.A. Woolhead, P.J. McCormick, A.E. Johnson, Cell 116 (2004) 725-736.
[54] S. Etchells, U. Hartl, Nature Structural & Molecular Biology 11 (2004) 391-392.
[55] P. Nissen, J. Hansen, N. Ban, P.B. Moore, T.A. Steitz, Science 289 (2000) 920-930.
[56] N.R. Voss, M. Gerstein, T.A. Steitz, P.B. Moore, Journal of Molecular Biology 360 (2006) 893-906.
[57] N. Ban, P. Nissen, J. Hansen, P.B. Moore, T.A. Steitz, Science 289 (2000) 905-920.
[58] P.B. Moore, T.A. Steitz, Annual Review of Biochemistry 72 (2003) 813-850.
[59] J. Lu, C. Deutsch, Nature Structural & Molecular Biology 12 (2005) 1123-1129.
[60] J.L. Hansen, P.B. Moore, T.A. Steitz, Journal of Molecular Biology 330 (2003) 1061-1075.
[61] T. Auerbach, A. Bashan, A. Yonath, Trends in Biotechnology 22 (2004) 570-576.
[62] A. Yonath, A. Bashan, Annual Review of Microbiology 58 (2004) 233-251.
[63] A. Yonath, Molecules and Cells 20 (2005) 1-16.
[64] A.S. Mankin, R.A. Garrett, Journal of Bacteriology 173 (1991) 3559-3563.
[65] J.L. Sanz, I. Marín, D. Ureña, R. Amils, Canadian Journal of Microbiology 39 (1993) 311-317.
[66] J. Hansen, J. Ippolito, N. Ban, P. Nissen, P. Moore, T. Steitz, Molecular Cell 10 (2002) 117-128.
[67] G. Blaha, G. Gürel, S.J. Schroeder, P.B. Moore, T.A. Steitz, Journal of Molecular Biology 379 (2008) 505-519.
[68] T. Tenson, M. Ehrenberg, Cell 108 (2002) 591-594.
[69] C.A. Woolhead, A.E. Johnson, H.D. Bernstein, Molecular Cell 22 (2006) 587-598.
[70] K. Mitra, C. Schaffitzel, F. Fabiola, M.S. Chapman, N. Ban, J. Frank, Molecular Cell 22 (2006) 533-543.
[71] D.R. Livesay, D.J. Jacobs, Proteins-Structure Function and Bioinformatics 62 (2006) 130-143.
[72] Y. Mu, G. Stock, Biophysical Journal 90 (2006) 391-399.
