Abstract
Ribozymes, which carry out phosphoryl transfer reactions, often require Mg2+ ions for catalytic activity. The correct folding of the active site and ribozyme tertiary structure is also regulated by metal ions in a manner which is not fully understood. Here, we employ coarse-grained molecular simulations to show that individual structural elements of the group I ribozyme from the bacterium Azoarcus form spontaneously in the unfolded ribozyme even at very low Mg2+ concentrations, and are transiently stabilized by coordination of Mg2+ ions to specific nucleotides. However, competition for scarce Mg2+ and topological constraints arising from chain connectivity prevent complete folding of the ribozyme. A much higher Mg2+ concentration is required for complete folding of the ribozyme and stabilization of the active site. When Mg2+ is replaced by Ca2+ the ribozyme folds but the active site remains unstable. Our results suggest that group I ribozymes utilize the same interactions with specific metal ligands for both structural stability and chemical activity.
Since the remarkable discovery that RNA molecules can function as enzymes1,2, an ever-increasing repertoire of cellular functions has been associated with these versatile molecules3. Execution of these diverse functions, which include control of gene expression and protein synthesis, often requires RNA enzymes (ribozymes) to fold to a compact, functionally competent structure with catalytic metal ions bound at the active site. For example, self-splicing of group I introns is catalyzed by Mg2+ ions which coordinate directly to the chemically active RNA groups4–8. The close relationship between site-specific Mg2+ binding and catalytic activity implies that precise folding of the ribozyme structure is of critical importance. However, folding of the highly negatively charged ribozymes is itself mediated by metal ions9–12 using mechanisms that have yet to be fully elucidated13–20. In other words, a molecular description of how metal ions facilitate the navigation of the rugged energy landscapes of ribozymes is lacking. Here, we address the problem of ion-driven ribozyme folding using computer simulations of the group I intron from the purple bacterium Azoarcus21,22.
The high-resolution structure of the Azoarcus intron is known in complex with two exons in the conformation preceding the second splicing step7,22,23 (state pre-2S, Fig. 1). The tertiary structure of the Azoarcus intron in the pre-2S state closely resembles the structure of the group I intron from the ciliate Tetrahymena in the enzymatic form24,25, in which the exons and intron’s internal guide sequence are absent. In our work we have modeled the enzymatic form of the Azoarcus intron, assuming that its native conformation is the pre-2S conformation shown in Fig. 1. The crystal structure of the intron shows Mg2+ ions located in the regions with high concentration of negatively charged phosphate groups22,23. The Mg2+ ions in the intron core are either proximal to or directly bound to phosphates that were identified by Tb3+ cleavage experiments as candidates for specific interactions with divalent metal ions12. Two Mg2+ ions in the active site coordinate the reactive phosphate (reactive phosphoryl group), and it has been proposed that they are involved in catalysis7. Experiments indicate that high (1 M) concentrations of monovalent ions or submillimolar concentrations of Mg2+ ions can cause group I ribozymes to fold into a conformation which is nearly identical to the native state12,26. However, the ribozyme catalytic activity requires a Mg2+ concentration above 1 mM14. Based on these findings it was suggested that the unfolded Azoarcus ribozyme assembles into an inactive compact intermediate, which undergoes subsequent reorganization into the native conformation14,27 due to specific binding of Mg2+ ions in the ribozyme core. Both the intermediate and native conformations were shown to be stabilized by native tertiary interactions. However, neither the precise structural differences between these conformations nor the role of metal ions in their assembly is understood.
Here we use coarse-grained computer simulations to determine the structural properties of the inactive intermediate and, for the first time, to make explicit the relationship between Mg2+ coordination and the folding of RNA into an active conformation.
To date an accurate and computationally efficient general simulation technique for studying RNA folding has not been developed. All-atom simulations of RNA in water, originally conceived in the context of protein folding and dynamics, would potentially provide us with the most detailed information on the folding process. However, large uncertainties in atomistic force fields and the difficulty in obtaining adequate conformational sampling have impeded general application of all-atom simulations in the folding studies. Recently it has become possible to generate folding trajectories of a few relatively small proteins in atomistic detail28–30. But for RNA, the limitations of all-atom approaches are more formidable, because the tertiary structure of even small ribozymes is known to take from milliseconds to seconds to form. Such long simulation times are not currently possible in all-atom simulations, even using the most advanced technology available. The force fields themselves are known to be inaccurate for the thermodynamics of basic RNA structure formation, such as base stacking31. Furthermore, since RNA folding is driven by metal ions, it is essential that experimental ionic conditions be employed in computational studies. In all-atom simulations the ions necessary for the folding of the structure must be contained in a very small simulation box, which implies ion concentrations that far exceed physiological concentrations.
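The link between box size and ion concentration mentioned above can be made concrete with a short calculation. The following is a minimal sketch, not part of the original study: the function names and the example numbers (ten ions, a 100 Å box, two ions at 0.1 mM) are illustrative assumptions chosen only to show the scaling.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1.0e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L


def molar_concentration(n_ions, box_edge_angstrom):
    """Molar concentration of n_ions in a cubic box with the given edge (Å)."""
    volume_liters = box_edge_angstrom ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return n_ions / (AVOGADRO * volume_liters)


def box_edge_for_concentration(n_ions, conc_molar):
    """Edge (Å) of the cubic box that holds n_ions at the given molar concentration."""
    volume_liters = n_ions / (AVOGADRO * conc_molar)
    return (volume_liters / LITERS_PER_CUBIC_ANGSTROM) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical all-atom setup: 10 divalent ions in a 100 Å box.
    print(f"{molar_concentration(10, 100.0) * 1e3:.1f} mM")
    # Hypothetical dilute setup: 2 ions at 0.1 mM require a much larger box.
    print(f"{box_edge_for_concentration(2, 1.0e-4):.0f} Å")
```

Under these assumptions, ten ions in a 100 Å box already correspond to roughly 17 mM, whereas holding two ions at 0.1 mM requires a box edge of more than 300 Å, which illustrates why dilute ionic conditions are costly to reach with explicit solvent.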
In order to solve the problem of how ions drive ribozyme folding we have developed our own force field for RNA based on a coarse-grained model, in which each nucleotide is replaced by three interaction sites, representing a phosphate, a sugar and a base32–34. Our model is one of a class of Gō-like35 models for RNA which employ a simplified description of RNA energetics in implicit solvent32,36,37. The common simplification used in all Gō-like models is that intramolecular attractive interactions are defined only between the residues that appear to be in contact in the native structure of the RNA molecule. This definition ensures that the native structure of any molecule is the minimum energy structure. By contrast, in all-atom force fields generic attractive potentials are applied to all interatomic pairs and the molecule is not guaranteed to fold into its native structure. The basic drawback of Gō-like models is that they cannot capture any partially folded intermediate states stabilized by non-native interactions. To improve on this approximation, we have gone one step beyond standard Gō-like models and included non-native secondary structure interactions in our RNA model. In particular, we model all base stacking interactions between consecutive nucleotides, as well as hydrogen bond interactions between any bases G (guanine) and C (cytosine), A (adenine) and U (uracil), or G and U. Hydrogen bond and stacking interactions which stabilize the tertiary structure are defined only for the interactions present in the native structure, following the general strategy of Gō-like models. Because secondary structure interactions are substantially stronger than tertiary interactions, we expect that any long-lived misfolded states will be primarily stabilized by non-native secondary structure interactions, and that non-native tertiary interactions would not play a significant role in determining the thermodynamics of ribozyme formation.
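The defining idea of a Gō-like model, attraction only between pairs that are in contact in the native structure, can be sketched in a few lines. The code below is a deliberately simplified illustration and not the force field used in this work: the distance cutoff, the minimum chain separation, the square-well energy and all function names are assumptions made for the example, whereas the actual model uses continuous hydrogen bond and stacking potentials.

```python
import math


def native_contacts(native_coords, cutoff=10.0, min_separation=2):
    """Pairs (i, j), i < j, closer than `cutoff` (Å) in the native structure.

    `cutoff` and `min_separation` are illustrative choices, not values
    taken from the paper.
    """
    contacts = set()
    n = len(native_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(native_coords[i], native_coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def go_energy(coords, contacts, native_coords, epsilon=1.0, tolerance=1.2):
    """Square-well Gō energy: -epsilon for every native contact that is formed.

    A contact counts as formed when the pair distance is within
    `tolerance` times its native distance. Non-native pairs contribute nothing.
    """
    energy = 0.0
    for i, j in contacts:
        r_native = math.dist(native_coords[i], native_coords[j])
        if math.dist(coords[i], coords[j]) <= tolerance * r_native:
            energy -= epsilon
    return energy


if __name__ == "__main__":
    native = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]    # toy folded chain
    extended = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0)]  # toy unfolded chain
    contacts = native_contacts(native, cutoff=5.0)
    print(go_energy(native, contacts, native), go_energy(extended, contacts, native))
```

Because only native pairs are rewarded, no conformation can score lower than the native one in this toy energy, which is the sense in which the native structure is the minimum energy structure by construction.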
Although Gō-like models of RNA have been successful in a variety of applications32,36–42, they are typically constructed with a reduced number of energetic parameters, and hence are applicable to a limited range of ion concentrations and temperature. In sharp contrast, the force field used in this study (see Methods) is able to reproduce the experimental thermodynamic and structural data for several different RNA molecules under a relatively wide range of solution conditions. Direct comparisons of the force field predictions with the measured data are presented in Supplementary Methods. The success of the current model in achieving quantitative agreement with experiment is due to a combination of the careful treatment of RNA interactions and explicit inclusion of all ions, which are modeled as spheres characterized by an appropriate charge and radius. This simple description of ions proves to be sufficiently accurate for the aims of the current study, suggesting that the folding of the Azoarcus ribozyme is controlled largely by the ion charge density.
Results and Discussion
Local and global folding of the Azoarcus ribozyme
We report the results of coarse-grained simulations of Mg2+-driven folding of the Azoarcus ribozyme. The generated equilibrium trajectories are sufficiently long so that we can observe multiple unfolding/refolding of individual tertiary elements in the RNA and track the uptake of Mg2+ ions by each element. We focus on the folding of the six principal elements of the ribozyme tertiary structure that undergo distinct folding transitions (Fig. 1): (1) the stack exchange junction, SE, which anchors the native pseudoknot P3, (2) the central triple helix TH, (3-4) two peripheral tetraloop-tetraloop receptor interactions, TL2-TR8 and TL9-TR5, (5) the G site, comprising the G-binding pocket and bound reactive nucleotide ΩG206, and (6) the active site, which is formed by interactions of loop J8/7 with the G site, TH and P4. We could detect stable formation of the G-binding pocket only upon binding of ΩG206 in the pocket, in support of the earlier kinetic studies of G binding43, and so we do not regard these as distinct events. Similarly, the junction J2/3 forms concomitantly with TL2-TR8, and hence is not considered to be an independent tertiary motif. The folding transitions of different tertiary motifs proceed in the order illustrated in Supplementary Fig. S1 and contribute to the global compaction of the ribozyme, which appears as a single transition in the radius of gyration Rg with increasing Mg2+ concentration (Fig. 2a). The midpoint, cm, of the Rg transition as a function of Mg2+ concentration is not sensitive to K+ concentration, consistent with the idea that the transition is driven by Mg2+ ions. Our results confirm that, in the absence of Mg2+, increasing KCl from 12 to 50 mM is sufficient to induce significant reduction in Rg (Fig. 2a). However, the elements of the tertiary structure do not form in 50 mM KCl without Mg2+, except for 17% occurrence of the folded G site (Fig. 2b and Supplementary Fig. S2). 
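The radius of gyration used above to monitor global compaction can be computed directly from the coordinates of the interaction sites. The following is a minimal sketch that assumes equal weights for all sites; whether the original analysis used mass weighting is not stated here, and the function name and toy coordinates are illustrative.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of 3D points (same units as input)."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(p, center) ** 2 for p in coords) / n
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    compact = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]
    extended = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0)]
    # The extended toy chain has the larger Rg.
    print(radius_of_gyration(compact), radius_of_gyration(extended))
```

Averaging this quantity over equilibrium conformations at each Mg2+ concentration yields a compaction curve of the kind shown in Fig. 2a.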
In addition to unstable tertiary structure, we find that the helix P3 is unpaired in low Mg2+ and its stability curve closely follows that of tertiary motif SE with increasing Mg2+ concentration. This and further results for the correlations between secondary and tertiary structure formation can be found in Supplementary Discussion.
Mg2+ coordination of the folded ribozyme
The spatial distributions of Mg2+ ions at different stages in the ribozyme assembly contain pronounced peaks at RNA sites characterized by high affinity for Mg2+ (Fig. 3). These Mg2+ concentration profiles are fingerprints, which identify site specific ion-RNA interactions that direct the folding of tertiary structure. In 30 mM Mg2+, the highest concentration considered, the Azoarcus ribozyme is fully folded (Fig. 2a, b). The majority of high-affinity sites in the Mg2+ fingerprint at 30 mM (Fig. 3a) are consistent with the positions of Mg2+ ions resolved in the crystal structure of the intron22,23. At the lowest Mg2+ concentration at which the ribozyme is folded (4 mM, Fig. 2a, b), the local Mg2+ ion concentration at the high-affinity sites is the same as in 30 mM, whereas it decreases noticeably elsewhere (Fig. 3a). This indicates that the ribozyme tertiary structure is sustained by localized Mg2+ ions.
Mg2+ coordination of the unfolded ribozyme
The Mg2+ fingerprints in Fig. 3 reveal that distinct subsets of the peaks, associated with the formation of each of the tertiary motifs, emerge as the Mg2+ concentration is gradually increased. The presence of very few Mg2+ ions per RNA is sufficient to trigger folding of the G site, SE and TH in the unfolded ribozyme. In submillimolar Mg2+ and 50 mM KCl these tertiary motifs form intermittently, as illustrated by the equilibrium trajectories in Supplementary Fig. S3. At 0.2 mM Mg2+ the ensemble of RNA conformations partitions into six structural classes characterized by the formation of the G site, SE and TH: (1) in 33% of conformations only the G site is formed, (2) in 12% of conformations only TH, (3) in 2.2% only SE, (4) in 2.8% the G site and SE, (5) in 0.7% SE and TH, and (6) in the remainder of the conformations none of the three motifs is formed. As we will discuss below, the complete folding of the G site and TH is mutually exclusive in submillimolar Mg2+ due to a phenomenon which we call folding frustration. Folding frustration occurs if topological restrictions arising from chain connectivity prevent the free energies of all interaction sites from being simultaneously minimized.
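The class populations quoted above for 0.2 mM Mg2+ determine both the size of the fully unfolded class and the overall probability of each motif. The short bookkeeping sketch below uses only the percentages given in the text; the dictionary layout and function name are illustrative choices.

```python
# Populations (%) of the structural classes at 0.2 mM Mg2+, as quoted in the text.
# Each key is the set of tertiary motifs formed in that class.
classes = {
    frozenset({"G"}): 33.0,
    frozenset({"TH"}): 12.0,
    frozenset({"SE"}): 2.2,
    frozenset({"G", "SE"}): 2.8,
    frozenset({"SE", "TH"}): 0.7,
}

# Class 6 (no motif formed) is whatever remains of 100%.
unfolded = 100.0 - sum(classes.values())
classes[frozenset()] = unfolded


def motif_probability(motif):
    """Total population (%) of all classes in which `motif` is formed."""
    return sum(p for motifs, p in classes.items() if motif in motifs)


if __name__ == "__main__":
    print(f"class 6: {unfolded:.1f}%")
    for motif in ("G", "TH", "SE"):
        print(f"{motif}: {motif_probability(motif):.1f}%")
```

With these numbers the fully unfolded class accounts for about 49.3% of conformations, and the G site, TH and SE are formed in about 35.8%, 12.7% and 5.7% of conformations, respectively. No class contains both the G site and TH, which is the mutual exclusivity noted in the text.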
The Mg2+ fingerprint characteristic of class 1 RNA conformations with the folded G site has sharp maxima at residues A127 and G130 (Fig. 3b). A similar Mg2+-RNA interaction pattern is also observed in the crystal structure of the folded intron22,23, where a single Mg2+ ion is coordinated via a water molecule to phosphates 127 and 130, which are at distances 5.7 Å and 6.6 Å from the ion, respectively. The formation of the stack exchange junction SE in the absence of other motifs (class 3 conformations) is accompanied by the accumulation of Mg2+ ions in a cavity lined by phosphates 41, 138 and 167-169 (Fig. 3c and Supplementary Fig. S4). There are no resolved Mg2+ ions in the cavity in the crystal structure of the folded intron, possibly due to the diffuse nature of the local ion distribution. The Mg2+ fingerprint of the triple helix TH in class 2 conformations has substantial peaks around nucleotides 51 and 126 (Fig. 3c). We associate these peaks with two Mg2+ ions resolved in the crystal structure, localized at distances 3.8 Å and 2.6 Å from phosphates 51 and 126, respectively. Analysis of individual folding events (Supplementary Fig. S3) indicates that the appearances of the unique Mg2+ fingerprints in Fig. 3b, c are precisely correlated in time with the formation of the corresponding tertiary motifs. When none of the tertiary structure is formed (class 6 conformations), the ribozyme does not have sites with high affinity for Mg2+ (Fig. 3b). This is consistent with our finding that coordination of Mg2+ ions to transiently formed tertiary motifs initiates folding of the ribozyme structure.
Of the three tertiary motifs, the triple helix TH shows the strongest affinity for Mg2+ (Fig. 3c) and, consequently, the largest increase in stability in submillimolar Mg2+ (Fig. 2b). Interestingly, in the absence of Mg2+, the motif SE is most stable, indicating that it is not the intrinsic stability of a tertiary structure which determines its ability to capture Mg2+ ions. This ability depends primarily on the details of the electrostatic potential of the phosphate-lined recruiting pockets associated with structurally diverse tertiary motifs.
Mg2+ fingerprints of the peripheral motifs
In some cases, folding of a tertiary structure is a prerequisite for subsequent folding of another motif. For example, the tetraloop-tetraloop receptor interaction between domains P2 and P8 (TL2-TR8) can be detected only when the stack exchange junction SE is correctly folded. Figure 3d illustrates the Mg2+ fingerprint corresponding to the simultaneous presence of folded SE and TL2-TR8 in 0.4 mM Mg2+. Comparison with the Mg2+ fingerprint of SE establishes the Mg2+ binding region associated with TL2-TR8 itself (Figs 3d and 4a). The predicted region, occupied by a Mg2+ ion in the crystal structure of the intron (Fig. 4a), is not proximal to the P2 tetraloop, indicating that Mg2+ ions stabilize tertiary interactions indirectly.
The Mg2+-RNA interaction pattern of TL9-TR5 is similar to that of TL2-TR8. The TL9-TR5 motif unfolds rapidly in submillimolar Mg2+, which complicates the determination of its Mg2+ fingerprint. Comparison of multi-motif fingerprints in 1.2 mM Mg2+ in the absence and presence of TL9-TR5 folding (Fig. 3e) reveals a diffuse Mg2+ binding region associated with TL9-TR5 (Fig. 4b). Similarities between Mg2+ coordination of TL2-TR8 and TL9-TR5 become apparent upon structural alignment of these motifs (Fig. 4a, b). In the crystal structure, a Mg2+ ion is found at the periphery of the TL9-TR5 binding region (Fig. 4b), corroborating the diffuse character of the Mg2+ ion distribution observed in simulations.
Final stage of folding
The peaks around phosphates 48, 88, 124-125, 133, 171-172 and +1 (207) in the Mg2+ fingerprints in Fig. 3a are not associated with the G site, SE, TH, TL2-TR8 or TL9-TR5, but emerge cooperatively above 1 mM Mg2+. The Mg2+ ion coordination with phosphates 48 and 133, also found in the crystal structure, stabilizes coaxial stacking of the native pseudoknot P3 and helix P7 (Fig. 4c). The probability for these domains to stack coaxially increases with Mg2+ concentration with an approximate midpoint of 1.5 mM (Supplementary Fig. S5), which is noticeably higher than the midpoint for coaxial stacking of P3 and P8 (SE in Fig. 2b). The peaks involving phosphates 88, 124-125, 171-172, 207 are associated with the formation of tertiary contacts in the ribozyme active site. In support of this, the growth of the peak at reactive phosphate 207 with increasing Mg2+ concentration (red and blue curves in Fig. 2a) parallels the folding curve of the active site (red curve in Fig. 2b and Supplementary Fig. S2). Analysis of the spatial distribution of Mg2+ ions at the active site shows that it is strongly localized around two distinct maxima (Fig. 4d). These maxima are consistent with two Mg2+ ions bound at the active site in the crystal structure of the intron, which are essential to the catalytic activity7. This demonstrates that the Mg2+ ions at the active site serve the dual purpose of structural stabilization and catalysis (Fig. 4d). The active site forms cooperatively with a complementary electronegative pocket, lined by phosphates 124-125, 127 and 171 (Fig. 4e). A Mg2+ ion occupying this pocket, observed in simulations as well as in the crystal structure (Fig. 4e), further contributes to active site stabilization. A cooperative link between the two sites has also been confirmed by Tb3+ cleavage experiments that pointed to the ability of nucleotide 171 to bind Mg2+ and switch the ribozyme from the inactive to the active state12.
Anti-cooperativity and cooperativity of tertiary interactions
Interactions between tertiary motifs cause the stability of some motifs to change nonmonotonically with Mg2+ concentration (Fig. 2b and Supplementary Fig. S2). One example is the interaction between the G site and TH, which are linked directly by the RNA chain, resulting in folding frustration and anti-cooperativity between these motifs. For the G site and TH to be simultaneously folded, residues U126-A127 must adopt an entropically unfavorable extended conformation (Fig. 1a). Consequently, at less than 2 mM Mg2+, the stability of the base triple G53-C91-U126 in the TH decreases when the stability of the G site increases, and vice versa (Fig. 2b and Supplementary Figs S2 and S6). The stability of the base triple C52-G92-G125 in the TH is also dependent, to a lesser extent, on the formation of the G site (Supplementary Fig. S5). Only when the Mg2+ concentration exceeds 4 mM does its stabilizing effect on the TH overcome the anti-cooperativity effects, allowing the folding of G53-C91-U126 to proceed to completion (Fig. 2b and Supplementary Fig. S2). We find that, despite strong anti-cooperative correlation between the TH and G site, the folded TH in fact stabilizes other elements of the ribozyme tertiary structure (see Supplementary Discussion).
We have also observed a destabilizing effect of the peripheral motif TL9-TR5 on the interactions of J8/7 with P4 and TH, which causes the stability of the active site to decrease around the folding transition midpoint of TL9-TR5 (Fig. 2b and Supplementary Figs S2, S5 and S6). The folding of the active site and the folding of TL9-TR5 draw together coaxially stacked domains P5-P4-P6a and P7-P3-P8 in mutually inconsistent relative orientations (Supplementary Fig. S7). When neither the active site nor TL9-TR5 is folded, the angle between the two domains, γ, undergoes large fluctuations with a mean close to 70° (Fig. 5a and Supplementary Fig. S2). The formation of the active site alone increases the mean value of γ to approximately 90°. In contrast, the folding of TL9-TR5 in the absence of the folded active site results in a relatively narrow distribution of γ with a mean below 60° (Fig. 5a and Supplementary Fig. S2). As a consequence of this conformational conflict between the active site and TL9-TR5, only one of the two motifs is observed in the majority of ribozyme conformations below 3 mM Mg2+. Increasing the Mg2+ concentration above 4 mM leads to significant population of the native conformation, in which all core and peripheral interdomain contacts are formed and γ is narrowly distributed around 65° (Fig. 5a and Supplementary Fig. S1), as compared to 67° in the crystal structure22. In the native conformation the compromise value of γ is attained through the formation of a kink between helices P9 and P9.0 (Fig. 1). At 30 mM Mg2+ the population of the native conformation remains below 100% because the active site is not completely stable (Supplementary Fig. S5). It is possible that the conformational mobility in the ribozyme core in the absence of substrate is necessary for efficient substrate binding and catalytic function.
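An interdomain angle such as γ can be evaluated as the angle between two direction vectors, one per coaxially stacked domain. The sketch below assumes that each domain axis has already been reduced to a single vector (for example, one joining the two ends of the stack); how the axes were defined in the original analysis is not specified here, so the function name and inputs are hypothetical.

```python
import math


def interdomain_angle(axis_a, axis_b):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm = math.hypot(*axis_a) * math.hypot(*axis_b)
    # Clamp to [-1, 1] to guard against round-off before taking acos.
    cos_gamma = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_gamma))


if __name__ == "__main__":
    # Two illustrative axes in the xy-plane, 60 degrees apart.
    axis_1 = (1.0, 0.0, 0.0)
    axis_2 = (0.5, math.sqrt(3.0) / 2.0, 0.0)
    print(f"{interdomain_angle(axis_1, axis_2):.1f}")
```

Collecting this angle over a trajectory and grouping the frames by which motifs are folded gives conditional distributions of the kind summarized in Fig. 5a.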
We propose that an ensemble of conformations in which the principal domains P5-P4-P6a and P7-P3-P8 are formed but are not in the native orientation represents the native-like inactive intermediates, Ic, observed experimentally44,45 (gray, blue, green in Fig. 5a and Supplementary Fig. S2). Specific details of this conformational ensemble depend on the concentration of the monovalent ion. High concentration of KCl enhances the stability of tetraloop-tetraloop receptor motifs, thus promoting the conformations with folded TL9-TR5 while decreasing the population of conformations with the folded active site (Fig. 5a and Supplementary Fig. S2). It should be possible to distinguish various conformational states and confirm this prediction using FRET experiments in which one pair of fluorescent markers is attached to nucleotides forming a peripheral contact between P5 and P9, and another pair to nucleotides connecting J8/7 and P4 or J8/7 and TH (Fig. 1a). The proposed nature of the native-like intermediates explains why the motif TL9-TR5 is unfolded in one of the four molecules comprising the unit cell in the crystal structure of the Tetrahymena ribozyme25.
Early studies of the unfolding of the group I intron of bacteriophage T4 described the cooperative loss of most of the tertiary interactions upon heating, which appeared as a single two-state transition46. More recent mutation studies of the Azoarcus ribozyme folding concluded that the initial population of the native-like intermediates is also guided by a cooperative network of tertiary interactions with the TH helix at its center14. Surprisingly, strong anti-cooperativity between the peripheral motif TL9-TR5 and interdomain contacts in the ribozyme core emerges as the folding progresses to the native structure at higher Mg2+ concentration14. These results are in complete accord with our simulations which place partial formation of the TH at the beginning and coexistence of TL9-TR5 and the active site during the final stage of ribozyme folding with increasing Mg2+ concentration. We find that it is precisely the folding frustration between the core and peripheral tertiary contacts, characteristic of the native conformation, that leads to the population of native-like states at intermediate Mg2+ concentration. The possibility that not all elements of RNA tertiary structure are linked cooperatively was also discussed in the context of kinetic studies of the G binding pocket43, which was shown to undergo a distinct folding transition upon binding of guanosine. The results presented here provide support and the much-needed structural underpinning for these insightful experiments.
Folding in Ca2+
To examine the specific requirement of Mg2+ for the catalytically competent assembly of the Azoarcus ribozyme, we carried out additional simulations using Ca2+ instead of Mg2+. Our results for the tertiary structural equilibria in 50 mM KCl with varying Ca2+ concentration, summarized in Supplementary Fig. S8, show that the folding transition midpoint in Ca2+ is higher than in Mg2+ and the ribozyme folded state is less compact. We find that most of the tertiary structure has formed in Ca2+ with the exception of the active site and the base triple G53-C91-U126 in the TH (Supplementary Fig. S8). In addition, the probability for P3 and P7 to stack coaxially is less than 100% in 30 mM Ca2+ (Supplementary Fig. S8), pointing to the presence of conformations with incompletely formed domain P7-P3-P8. The majority of compact conformations in Ca2+ are similar to an intermediate observed for Mg2+, in which domains P5-P4-P6a and P7-P3-P8 are completely formed and joined by the TL9-TR5 but not J8/7-P4 or J8/7-TH contacts (green in Fig. 5a, b). The probability of formation of the native conformation, representing a potentially active ribozyme, is low in 30 mM Ca2+ (Fig. 5b).
The stark difference in the stability of the active site in Mg2+ and Ca2+ cannot be attributed to the electrostatic energy alone. Indeed, the energy of Coulomb interaction between a phosphate group and a Ca2+ ion at the distance of closest approach is only 25% less than the analogous energy for a Mg2+ ion. This leads us to conclude that the most important discriminating factor between Mg2+ and Ca2+ ions is not the Coulomb energy, but the size exclusion of larger Ca2+ ions from the active site. In the crystal structure of the active intron7, two Mg2+ ions bound at the active site are separated by 3.9 Å (Fig. 4d). This short distance is only slightly larger than the diameter of a single Ca2+ ion, indicating that the active site cannot accommodate two Ca2+ ions without significant steric frustration between them. Similarly, in Fig. 4e the radius of gyration of the electronegative binding pocket adjacent to the active site is 3.8 Å, which just equals the sum of the phosphate and Ca2+ radii. A Ca2+ ion cannot effectively bind in such a tight configuration without disturbing the binding pocket itself and neighboring RNA structure.
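The two numbers quoted above, a 25% smaller contact energy for Ca2+ and a 3.8 Å sum of the phosphate and Ca2+ radii, can be combined in a short consistency check. Because the Coulomb energy of equal charges scales as the inverse of the separation, the ratio of energies equals the inverse ratio of contact distances. The sketch below uses only these two rounded figures from the text; the actual radii are listed in Supplementary Table S2 and are not reproduced here, so the derived values are approximate.

```python
# Rounded values quoted in the text.
D_CA = 3.8           # Å, sum of phosphate and Ca2+ radii (closest approach)
ENERGY_RATIO = 0.75  # E(phosphate-Ca2+) / E(phosphate-Mg2+), "only 25% less"


def mg_contact_distance(d_ca, energy_ratio):
    """Closest-approach distance for Mg2+ implied by the Coulomb energy ratio.

    For equal charges E is proportional to 1/d, so E_Ca / E_Mg = d_Mg / d_Ca.
    """
    return energy_ratio * d_ca


if __name__ == "__main__":
    d_mg = mg_contact_distance(D_CA, ENERGY_RATIO)
    print(f"implied phosphate-Mg2+ contact distance: {d_mg:.2f} Å")
    print(f"implied difference in ion radii: {D_CA - d_mg:.2f} Å")
```

Under these assumptions the phosphate-Mg2+ contact distance comes out near 2.85 Å, so the two ions differ in radius by roughly 0.95 Å. A modest change in Coulomb energy thus accompanies a substantial change in size, in line with the size exclusion argument made in the text.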
Conclusions
Our study establishes that site specific interactions between Mg2+ ions and individual tertiary motifs occur even in the unfolded ribozyme. Due to this extraordinary specificity, as few as two Mg2+ ions per RNA (0.1 mM Mg2+ in Fig. 2b) can serve to nucleate transient folding of key tertiary motifs. At such low Mg2+ concentrations the folding of the tertiary motifs is mutually exclusive since the ions must be released before another motif can fold. With increasing Mg2+ concentration, multiple motifs fold in parallel in accordance with their affinity for Mg2+ ions and subject to the topological constraints of the RNA. A complex order of equilibrium assembly arises from these interactions, with the stability of some tertiary structure changing non-monotonically with Mg2+ concentration (Fig. 2b). Although the principal helical domains in the Azoarcus ribozyme can also fold in Ca2+, their correct relative orientation and the organization of the active site require Mg2+ ions, which have a much higher charge density. Our results demonstrate that the Mg2+ coordination pattern necessary for catalytic activity also provides the basis for structural stability of the active site (Fig. 4d). This conclusion is further supported by our all-atom simulations described in Supplementary Discussion, as well as by experiment12. Such harmony between chemical and structural requirements reduces the possibility that the active site is folded and occupied by catalytically inactive ions which must be displaced before the splicing reaction can proceed. Only the Mg2+-coordinated active site is ordered, poising the ribozyme for substrate recognition and catalytic activity, thus effectively speeding up the rate of reaction.
The folding mechanism we have discovered — in which folding of individual structural elements results in formation of phosphate-lined binding pockets that recruit stabilizing Mg2+ ions, even at Mg2+ concentrations for which global folding is frustrated — is likely to be quite general. The interactions between Mg2+ ions and the RNA are determined by the structural properties of these binding pockets, rather than by a specific RNA sequence. Furthermore, homology between the structural elements in the ribozyme studied here and many other functional RNA molecules suggests that the relationship between Mg2+ binding and folding elucidated here should hold in other ribozymes.
Methods
The force field used in this study is an extension of our earlier model34. It takes into account bond length and valence angle constraints, secondary and tertiary structure hydrogen bonding, secondary and tertiary structure base stacking, excluded volume repulsions and electrostatic interactions. Below we summarize the new elements in the interaction potentials that were introduced to address the problem of folding of a large ribozyme.
The potentials for bond lengths and valence angles are carried over from the original model34. Both models incorporate the hydrogen bonds present in the crystal structure of the RNA molecule, which are determined by submitting the structure to the WHAT IF server at http://swift.cmbi.ru.nl. In the current model other non-native canonical base pairs can form between any A and U, G and C, and G and U separated by at least four nucleotides along the chain. The potential for hydrogen bonds, which was in the original model, is in the new model, where the common function u1 is a combination of harmonic potentials chosen to bias the structure to ideal A-form helix for canonical bonds or to the crystal structure for non-canonical bonds34. The rapidly decaying exponential form of the revised UHB describes the short range of hydrogen bonds more accurately, which is important when there is a large number of non-native interactions. is an adjustable parameter.
Stacking interactions between two consecutive nucleotides are modeled using the same potential in both models, , where u2 is a sum of harmonic terms that bias the stack structure to A-form helix34. The parameters take different values for sixteen (16) different nucleotide dimers, r(XpY), where X, Y represents A, C, G, or U. The are temperature-dependent, , where h and s are tuned for each r(XpY) individually so as to yield the melting temperatures (Tm) and entropies of r(XpY) stacking obtained from experiment, as detailed previously34. The parameters h (but not s) which result from this learning procedure are functions of a single free energy correction ΔG034. ΔG0 is the second adjustable parameter.
Tertiary stacking interactions between nonconsecutive nucleotides were not included in the original model. We have identified twenty seven (27) stacks between nonconsecutive nucleotides in the crystal structure of the Azoarcus intron22 (Supplementary Table S1). These stacks are described here using the interaction potential , where u1 is the same as for hydrogen bonds. Note that native tertiary stacks are modeled similarly to native hydrogen bonds in the original model, because there are no fundamental differences in the geometry of base pairing and base stacking in a coarse-grained representation of RNA. Ust is the third adjustable parameter.
All electrostatic interactions are modeled using Coulomb potential, divided by the temperature dependent dielectric constant of water47. Solvent molecules are not explicitly included in simulation. The charges for RNA sites and ions are given in Supplementary Table S2. In the original force field the ions were modeled implicitly34.
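A Coulomb interaction screened by a temperature-dependent dielectric constant can be sketched as follows. The polynomial used for the dielectric constant of water is the empirical Malmberg-Maryott fit (temperature in °C); it is assumed here for illustration and may differ from the expression in ref. 47. The charges in the example are likewise illustrative and are not the values from Supplementary Table S2.

```python
COULOMB_CONSTANT = 332.0637  # kcal Å / (mol e^2), converts e^2/Å to kcal/mol


def water_dielectric(temp_celsius):
    """Dielectric constant of water from an empirical cubic fit (assumed form)."""
    t = temp_celsius
    return 87.740 - 0.4008 * t + 9.398e-4 * t ** 2 - 1.410e-6 * t ** 3


def coulomb_energy(q_i, q_j, r_angstrom, temp_celsius):
    """Coulomb energy (kcal/mol) of charges q_i, q_j (in e) at distance r (Å)."""
    return COULOMB_CONSTANT * q_i * q_j / (water_dielectric(temp_celsius) * r_angstrom)


if __name__ == "__main__":
    # Illustrative charges: a unit negative site and a divalent cation.
    for temp in (25.0, 60.0):
        eps = water_dielectric(temp)
        energy = coulomb_energy(-1.0, 2.0, 3.0, temp)
        print(f"T = {temp:.0f} C: dielectric {eps:.1f}, U = {energy:.2f} kcal/mol")
```

Because the dielectric constant of water decreases on heating, the same pair of charges interacts more strongly at higher temperature in this description, an effect that a fixed dielectric constant would miss.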
Excluded volume repulsion between sites i and j (RNA sites or ions) separated by distance r (Å) is described by the modified Lennard-Jones potential, where Dij = Ri + Rj and . Ri and εi for ions and RNA are listed in Supplementary Table S2. UmLJ models hard repulsions that decay on the short length scale of 1.6 Å. The use of UmLJ simplifies parameterization of the model, because the short range of repulsions makes our quantitative results insensitive to the specific values of εi. The distance 1.6 Å ensures that UmLJ becomes a standard Lennard-Jones potential for a pair of the smallest particles, two Mg2+ ions. Ri for RNA have been adapted from the original model34. In Supplementary Methods we provide justification for our choice of Ri for divalent ions, and also demonstrate that RNA thermodynamics is relatively insensitive to Ri for K+.
Using comparison of simulation and experimental melting curves for an RNA hairpin and pseudoknot we have set kcal/mol, ΔG0 = 0.85 kcal/mol and kcal/mol (Supplementary Methods). In the original model ΔG0 = 0.6 kcal/mol34, which results in small differences in h between the two models (s are the same). h and s for sixteen dimers are listed in Supplementary Table S3.
The proposed force field may be applied to other RNA molecules provided the structure of RNA is available to determine a network and geometric parameters of non-canonical hydrogen bonds and nonconsecutive stacks, as was done in the simulations reported here. Further details of these simulations can be found in Supplementary Methods.
Author contributions
N.A.D. and D.T. conceived and designed the project, analyzed the simulation data and co-wrote the paper. N.A.D. performed the simulations.
Acknowledgements
This work was supported by a grant from the National Science Foundation (CHE 13-61946). Correspondence and requests for materials should be addressed to D.T.