Abstract
We present a new approach to determining the conformational changes associated with biological function, and demonstrate its capabilities in the context of experimental single-particle cryo-EM snapshots of ryanodine receptor (RyR1), a Ca2+-channel involved in skeletal muscle excitation/contraction coupling. These results include the detailed conformational motions associated with functional paths including transitions between energy landscapes. The functional motions differ substantially from those inferred from discrete structures, shedding new light on the gating mechanism in RyR1. The differences include the conformationally active structural domains, the nature, sequence, and extent of conformational motions involved in function, and the way allosteric signals are transduced within and between domains. The approach is general, and applicable to a wide range of systems and processes.
Introduction
In equilibrium, each conformational state of a macromolecule is occupied with a probability $p_i \propto \exp(-E_i / k_B T)$, where $E_i$ is the free energy of the conformational state $i$, $k_B$ the Boltzmann constant, and $T$ the temperature. Single-particle snapshots of a sufficiently large number of macromolecules will include all accessible conformational states, with the number of snapshots emanating from each state determined by its occupation probability. For a macromolecule with N atoms, the conformational landscape has 3N degrees of freedom.
Biological function, however, involves structural blocks, e.g., an alpha helix, or an entire molecular domain. Function can thus be described in terms of a small set of so-called conformational coordinates, each describing the concerted motions of a large number of atoms. The number of degrees of freedom exercised during unperturbed function, and the conformational coordinates relevant to such function, must be experimentally determined. The choice of conformational coordinates is not unique, but independent, or at least mutually orthogonal, coordinates are the most convenient.
Conformations can be represented as points on energy landscapes spanned by the conformational coordinates. The energy of each observed conformational state can be determined from the number of times it has been sighted, through the Boltzmann relation between the occupation probability $p_i$ and energy of the state $E_i$. Function unfolds along heavily populated (“least-action”) paths on energy landscapes (1).
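As a minimal sketch of the Boltzmann relation invoked above, the snippet below converts snapshot counts on a grid of conformational coordinates into relative energies in units of k_B T. The function name `energy_landscape` and the example counts are illustrative assumptions, not part of the pipeline used in this work.

```python
import numpy as np

def energy_landscape(counts):
    """Relative energy E_i / (k_B T) from snapshot counts, using p_i ∝ exp(-E_i / k_B T).

    Bins with no sightings are assigned +inf (energy above the observable limit).
    """
    counts = np.asarray(counts, dtype=float)
    energy = np.full(counts.shape, np.inf)
    seen = counts > 0
    # E_i - E_min = -k_B T * ln(n_i / n_max); the most populated bin defines zero.
    energy[seen] = -np.log(counts[seen] / counts.max())
    return energy

# Hypothetical 2x3 grid of snapshot counts over two conformational coordinates
counts = np.array([[1000, 368, 0],
                   [135, 50, 1]])
E = energy_landscape(counts)
```

In this toy grid the most populated bin sits at zero energy, and a bin with roughly 1/e of its population sits about one k_B T higher.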
It has long been recognized that landscapes specifying the free energy of each conformation offer a powerful framework for discussing function (2, 3), and the literature abounds with sketches of such landscapes. The few experimentally determined energy landscapes, however, are predominantly one-dimensional (i.e., involve a single conformational coordinate), and are described in terms of qualitative, ad hoc, or externally imposed reaction coordinates not guaranteed to capture the changes relevant to function.
In the absence of suitable, experimentally determined energy landscapes, efforts to infer function often rely on powerful maximum likelihood classification methods (4, 5) to sort single-particle snapshots into a user-defined number of discrete conformational clusters. A 3D structure is then extracted from each cluster. In general, the sequence in which these structures appear in the course of function, and indeed their relevance to function are unknown. Under such circumstances, functional inference necessarily involves linear interpolations between clusters, if only conceptually. As the number of ways in which two discrete structures can be transformed into each other is essentially unlimited, functional inference by discrete clustering is fraught with difficulty.
The primary goals of this article are as follows. First, to demonstrate the compilation of experimental energy landscapes associated with complex biological function, including those involving more than one landscape. Second, to identify the conformational paths associated with function, even when they involve interlandscape transitions. Third, to demonstrate that the motions revealed by such functional analysis are significantly different from those inferred by discrete clustering methods. And finally, to outline the new biological insights gained by studying the continuous conformational changes associated with function.
For concreteness, the comparison is made with reference to ryanodine receptor type 1 (RyR1), a large, functionally complex, fourfold-symmetric (C4) ion channel in the sarcoplasmic reticulum membrane. RyR1 is a calcium-activated calcium channel critical to excitation/contraction coupling in skeletal muscle. Several recent cryo-EM studies have characterized, in exquisite detail, the many discrete structures obtained by clustering techniques (see, e.g., (6-10)). The functional information inferred from these studies includes the conformational states assumed by the channel (6-8), and the effects of activation and gating induced by ligand binding (9, 10). Nonetheless, our understanding of key functional processes, such as the allosteric coupling between the cytoplasmic shell and pore of the channel, remains incomplete. RyR1 thus offers an ideal opportunity to compare the conformational information deduced by standard clustering methods with that revealed by the functional approach used here.
This article is organized as follows. We first outline how manifold-based geometric machine learning (1, 11-14) can be used to determine the experimental energy landscapes of RyR1 with and without ligands in terms of rigorously derived, mutually orthogonal conformational reaction coordinates associated with ligand binding. This provides important insights into the mechanisms underlying the process of ligand binding, including the presence of multiple routes to ligand binding, and the associated functional motions. These observations are then contrasted with the results obtained by analyzing discrete structures. The comparison reveals major differences in the delineation of functionally active structural domains, the nature, sequence, and extent of motions associated with function, and the way allosteric signal is propagated to functionally important remote sites. The article concludes with a discussion of the new biological insights provided by the new approach.
Functional analysis of RyR1
We have previously demonstrated that experimental single-particle snapshots of molecular machines idling in equilibrium on a single energy landscape can be used to determine functionally relevant conformational motions in terms of rigorously derived orthogonal coordinates (1). Such an energy landscape reveals all conformations with energies up to an upper limit set by the vanishing occupation probability of high-energy states. The key point is that thermal fluctuations in equilibrium lead to sightings of all states up to the limit set by the number of snapshots in the dataset (SM section 1).
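The energy limit mentioned above can be illustrated with the Boltzmann relation under one simplifying assumption of ours: a state is observable only if it contributes at least one snapshot. (The full treatment is in SM section 1 and is not reproduced here.) Writing $n_i$ and $n_{\max}$, our notation, for the number of snapshots in state $i$ and in the most populated state:

```latex
\frac{n_i}{n_{\max}} = \exp\!\left(-\frac{E_i - E_{\min}}{k_B T}\right),
\qquad n_i \ge 1
\;\Longrightarrow\;
E_i - E_{\min} \le k_B T \,\ln n_{\max}.
```

Under this assumption the accessible energy range grows only logarithmically with the size of the dataset.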
An important feature of the present study is the pooling of cryo-EM snapshots from two experiments. In one experiment, RyR1 macromolecules were in equilibrium with a thermal bath without any activating ligands. In the other, the macromolecules were in equilibrium with both a thermal bath and a reservoir of ligands, specifically calcium, ATP, and caffeine (9) (SM section 2). This pooling of data allows both species (with and without ligands, henceforth ±ligand) to be described in terms of the same set of mutually orthogonal conformational coordinates. The resulting ±ligand energy landscapes reveal the heavily populated conformational conduits, which we associate with routes relevant to ligand binding (Fig. 1A).
Subject to reasonable assumptions, Fermi’s Golden Rule (15, 16) (SM section 3) is then used to estimate the transition probability between the two landscapes, with “hotspots” identifying the most probable transition points between the landscapes (Fig. 1B). Three-dimensional (3D) movies compiled along heavily populated conduits on these landscapes reveal the conformational motions associated with ligand binding, in some regions with near-atomic resolution.
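The formalism itself is given in SM section 3 and is not reproduced here. As a rough illustration only, the sketch below assumes a conformation-independent coupling, so that a Golden-Rule-like transition map reduces to the product of the occupancy of the initial state and the density of final states at the same point of a shared conformational grid. The function `transition_map` and the counts are hypothetical.

```python
import numpy as np

def transition_map(counts_minus, counts_plus):
    """Illustrative hotspot map: product of normalized occupancies on a shared grid.

    Assumption (ours, not the paper's stated formula): the coupling does not
    depend on conformation, so the map is proportional to
    (occupancy of the initial state) x (density of final states).
    Requires at least one bin populated on both landscapes.
    """
    p_minus = np.asarray(counts_minus, dtype=float)
    p_plus = np.asarray(counts_plus, dtype=float)
    p_minus = p_minus / p_minus.sum()
    p_plus = p_plus / p_plus.sum()
    overlap = p_minus * p_plus
    return overlap / overlap.sum()  # normalized over the grid

# Hypothetical counts on a 1x4 grid of conformational bins
minus = np.array([[900, 80, 20, 0]])   # without ligands
plus = np.array([[0, 10, 90, 900]])    # with ligands
hot = transition_map(minus, plus)
hotspot = np.unravel_index(np.argmax(hot), hot.shape)
```

In this toy example the hotspot falls between the two energy minima, where both species are sparsely but simultaneously populated.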
In greater detail, the 791,956 cryo-EM snapshots of RyR1 molecules analyzed in this study comprised about the same number of molecules in equilibrium with reservoirs with and without ligands (Ca2+, ATP, and caffeine) prior to cryo-freezing (9). (For details, and for a discussion of the residual ligand concentration in the no-ligand solution, see SM section 2.) These snapshots were grouped into 1,117 uniformly spaced orientational bins by standard procedures (5). Geometric (manifold-based) analysis of the pooled dataset revealed at least four significant conformational reaction coordinates, each describing a concerted set of continuous changes. Further detailed analysis was restricted to the first (i.e., the most important) two conformational reaction coordinates (RC1 and RC2 for short).
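The text does not spell out the manifold algorithm at this point. Purely as a stand-in, the sketch below uses a generic diffusion map, one widely used manifold-learning technique, to show how low-dimensional coordinates can be extracted from a set of data vectors. The function `diffusion_map`, the kernel width `epsilon`, and the toy data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords=2):
    """Generic diffusion-map embedding of the rows of X (stand-in technique)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X**2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / epsilon)                 # Gaussian kernel
    d = K.sum(axis=1)
    # Symmetric normalization; same eigenvalues as the Markov matrix D^-1 K
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]            # descending eigenvalues
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / vecs[:, [0]]                 # right eigenvectors of D^-1 K
    # Skip the trivial constant eigenvector (eigenvalue 1)
    return psi[:, 1:n_coords + 1] * vals[1:n_coords + 1]

# Toy data: 60 points along a half circle, i.e., one underlying degree of freedom
theta = np.linspace(0.0, np.pi, 60)
X = np.column_stack([np.cos(theta), np.sin(theta)])
coords = diffusion_map(X, epsilon=0.05, n_coords=2)
```

For this toy curve the first returned coordinate tracks the position along the arc, which is the sense in which such coordinates describe a concerted, continuous change.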
The architecture of the ryanodine receptor can be divided into three major regions: the channel pore, responsible for calcium efflux from the sarcoplasmic reticulum; an activation core, responsible for ligand binding and channel activation; and a large cytoplasmic shell serving as a platform for the binding of many regulatory proteins. Broadly speaking, conformational changes along RC1 involve the shell, the activation core, and the pore; those along RC2 the shell (SM section 4).
Fig. 1A shows the ±ligand energy landscapes of RyR1. Assuming the probability of a collision (not binding) with a ligand is independent of conformation, the probability of a transition between equivalent points on the two landscapes can be estimated from Fermi’s Golden Rule for the period immediately after the exposure of RyR1 molecules to the reservoir containing ligands (SM section 3). As shown in Fig. 1B, the inter-landscape transition probability displays specific “hotspot” regions, where a significant number of ligand-free and ligand-bound macromolecules have the same conformation. The most probable routes to ligand binding start from the region of lowest energy on the –ligand landscape (“START” in Fig. 1A), reach one of the hotspot transition points (“HOT”) with a probability of ~2%, cross to the +ligand landscape with a probability equal to ~0.45% of the probability of a collision with a ligand, and terminate in the region of lowest energy on the +ligand landscape (“FINISH”). This means ~0.01% of collisions with a ligand lead to binding.
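The final figure follows from multiplying the two approximate probabilities quoted above, treating the two steps as independent:

```latex
P(\text{binding} \mid \text{collision})
\approx \underbrace{0.02}_{\text{reach hotspot}} \times \underbrace{0.0045}_{\text{cross at hotspot}}
= 9 \times 10^{-5} \approx 0.01\%.
```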
The displacement of inter-landscape transition hotspots (red regions in Fig. 1B) from minimum energy regions (magenta dots) on both landscapes highlights the need for significant conformational changes before and after transition between the two energy landscapes. At the same time, the presence of several inter-landscape transition hotspots reveals a multiplicity of routes to ligand binding with comparable transition probabilities. The curved nature of all routes to binding (e.g., the white line in Fig. 1A) emphasizes the inappropriateness of deducing functional information from discrete “START” and “FINISH” structures at the extremes of the conformational range.
The results outlined above already elucidate longstanding questions regarding the “population shift” vs. “induced fit” models of ligand binding. Broadly speaking, “population shift” (17) requires a conformational change before, “induced fit” (18) a conformational change after ligand binding. Our results show ligand binding, at least in RyR1, proceeds via a continuum of conformations increasingly different from, and at higher energies than, the minimum-energy conformation on the –ligand landscape. These higher-energy conformations are reached thermally via “population shift”. Collision with a ligand then transfers RyR1 to the +ligand energy landscape, where a downward slope in energy drives further continuous conformational changes to the minimum-energy, ligand-bound state. The conformational changes after collision with a ligand constitute an “induced fit”. Our results thus show that each of the two opposing models describes a different part of the actual process; at least for RyR1, binding entails specific conformational changes both before and after collision with a ligand.
The exact apportionment of the conformational changes before and after collision depends, of course, on the point at which the transition to the +ligand landscape takes place. Although transitions can occur over a relatively broad region, the highest probabilities are concentrated at a few “hotspots” (Fig. 1B). The positions of the ±ligand energy minima relative to the broad region of significant transition probability indicate that most ligand-binding events in RyR1 involve a greater element of “population shift” than “induced fit”, as suggested in (19), but this could be system-specific.
Functional analysis versus linear interpolation between discrete structures
The above discussion makes it clear that our functional analysis provides a wealth of new information on the mechanism of ligand binding. The new information includes the conformational coordinates relevant to function, the functional routes to ligand binding, the points at which transitions occur between the +ligand and –ligand landscapes, and the transition probabilities and branching ratios for different functional paths. None of this information can be obtained by studying discrete clusters.
Indeed, the availability of energy landscapes enables us, for the first time, to investigate the relationship between the discrete classes produced by maximum likelihood clustering (20) of the same RyR1 snapshots (7, 9). The data analyzed here were previously clustered into 16 conformational classes, two of which were classified as “junk” (9). The distribution of the snapshots (on the energy landscapes) contributing to each class is shown in Fig. 2. It is difficult to discern a systematic relationship between the positions of the different classes. Indeed, many of the clusters do not correspond to notable regions of the landscapes, with snapshots from “junk” clusters randomly distributed over the landscapes. Class 2 (no ligands) and class 3 (with ligands) were taken in the previous cluster-based analysis (7, 9) to represent the functionally relevant extremes of the conformational range. Function was then inferred by interpolation between the 3D structures obtained from these two classes. Below, we compare the conformational changes obtained from such interpolation, with the changes associated with the functional trajectory on the energy landscapes (Fig. 1).
Before doing so, however, we note two important points. First, the extent to which maximum likelihood clustering is based on functionally relevant conformational reaction coordinates is unknown. The position of a discrete cluster on the relevant energy landscape (Fig. 2) thus represents a projection from an unknown space onto the space spanned by the two conformational reaction coordinates most relevant to ligand binding and channel gating. Second, interpolation (“morphing”) between two or more static structures along a putative functional path is generally acknowledged as invalid. But the approach is nonetheless often used to infer function from static structures, because the functional path connecting two discrete clusters is unknown. In the absence of other information, this makes functional inference by interpolation all but inevitable, if only conceptually.
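To make the interpolation ("morphing") procedure concrete, here is a minimal sketch of linear interpolation between two sets of atomic coordinates. The function `linear_morph` and the two-atom structures are hypothetical; real structures would first need to be aligned and atom-matched.

```python
import numpy as np

def linear_morph(start, finish, n_frames):
    """Linearly interpolate atomic coordinates between two structures.

    start, finish: (n_atoms, 3) arrays with atoms in the same order.
    Returns an (n_frames, n_atoms, 3) array; frame 0 is start, the last is finish.
    """
    start = np.asarray(start, dtype=float)
    finish = np.asarray(finish, dtype=float)
    t = np.linspace(0.0, 1.0, n_frames)[:, None, None]
    return (1.0 - t) * start + t * finish

# Hypothetical two-atom structures (coordinates in Å)
start = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
finish = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 0.0]])
frames = linear_morph(start, finish, n_frames=5)
# Per-atom displacement magnitude relative to the first frame
displacement = np.linalg.norm(frames - frames[0], axis=-1)
```

By construction every atom moves along a straight line at a constant rate, so an interpolation of this kind carries no information about the order in which different domains move.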
We now turn to the conformationally active structural domains and their motions. Fig. 3 and Movies 1 and 2 compare the displacements revealed by our functional analysis with those inferred from interpolating between the discrete clusters (ringed classes in Fig. 2) used in a previous study (10). These discrete classes lie close to the energy minima on the ±ligand energy landscapes, i.e., the START and FINISH of the inter-landscape functional path.
There are major differences between the results of functional analysis and those inferred by interpolating between the discrete clusters. These differences include the structural domains involved in motion, as well as the sequence and extent of displacements (Figs. 3-5, Movies 1-6). For example, in contrast to the results from cluster analysis, functional analysis shows that: a) the N-terminal domains (NTD) lead the sequence of motions; b) a significant section of the macromolecule remains rigidly static during function; and, c) the motions in the activation core and shell are coupled (Fig. 3, Movies 1-2).
The above conformational changes are substantially different from those inferred by interpolating between the discrete structures, even though the clusters used are close to the termini of the functional path. Conformational changes specific to the N-terminal domains have not been described in previous high-resolution cryo-EM studies, nor have specific elements of the shell been shown to be more rigid than others (8, 9, 21).
Importantly, functional analysis reveals the motions involved in pore opening are significantly different from those inferred by discrete cluster analysis. Specifically, functional analysis shows the atomic displacements in the pore domain during channel opening are primarily located in the transmembrane region of the channel/pore scaffold (Fig. 4 and Movies 3-4). RyR1 function is known to be sensitive to mutations in this region; 40% of such mutations are implicated in Central Core Disease (see, e.g., http://www.uniprot.org/uniprot/P21817). In contrast, the displacements inferred from discrete cluster analysis primarily involve the cytoplasmic region of the pore-lining helix S6. These results graphically highlight the limitations of using static clusters for functional inference.
Insights into the RyR1 gating mechanism
We now discuss the implications of our functional results for the allosteric mechanisms responsible for channel pore opening upon binding of activating ligands. This discussion is illuminated by distance measurements at important sites, which quantify the consequences of functional motions.
The atomic coordinates of RyR1 obtained by modeling along the functional trajectory reveal the conformational changes at the binding sites of the ligands Ca2+, ATP and caffeine as ligand binding proceeds. Starting at the minimum-energy point of the –ligand landscape, the Ca2+ and ATP binding sites gradually contract until the transition to the +ligand landscape, after which the conformations of the binding-sites stabilize (Fig. 6). This is in line with the “population shift” view described earlier. The caffeine binding site, on the other hand, displays a more complex behavior: two aromatic amino-acids approach each other to accommodate, and potentially stabilize, caffeine binding, while two orthogonal amino-acids move apart (Fig. 7). The upward movement of the backbone supporting Ile 4996 may explain the calcium potentiation effect of caffeine (22), as this would bring the calcium-coordinating amino-acid Thr 5001 on the opposite side of this loop closer to its calcium-bound position (Fig. 6).
It is known that ligand binding stabilizes the activation domain in a conformation suitable for pore opening (9). Our results show this activated state can be assumed also in the ligand-free state, albeit with low, or no conductance (Fig. 8). Consistent with the low probability of channel opening in the absence of calcium and ATP (23), the energy landscape shows only ~2% of RyR molecules assume the activated conformation in the ligand-free state. Further pore opening requires the binding of a ligand followed by an “induced fit” to the minimum-energy conformation of the ligand-bound receptor. The relatively broad +ligand energy minimum (Fig. 1) is also consistent with previous observations: Brownian motions of the shell (6-8) give rise to conformational dispersion along RC2; and the spikes in conductance, interpreted as channels flickering between the open and closed states (23), cause dispersion along RC1.
Despite extensive work (see, e.g., (9)) it has proved difficult to clarify the way the conformational changes associated with ligand binding in the activation domain lead to gating and pore opening. Our distance measurements reveal potentially important atomic motions as the functional trajectory is traversed, most strikingly along a previously unobserved allosteric conduit connecting the ligand-binding sites in the Csol domain to the EF-hand (Fig. 5, Movies 5-6). Frame-by-frame measurements indicate the displacements begin at the calcium binding site, propagating along a narrow “vein” to the EF-hand (Fig. 5). Movement of the EF-hand has been observed before (8, 9). Our analysis, however, uncovers a previously unobserved allosteric conduit between the ligand-binding sites and the EF-hand, revealing the mechanical motions underlying signal transduction (Movie 5). In contrast, the displacements inferred from discrete cluster analysis are distributed uniformly over large regions, with no special feature indicating targeted signal transduction (Fig. 5, Movie 6). The functional analysis shows the narrow band of displacements first appears on the –ligand landscape, i.e., before ligand binding. This further supports the notion that “population-shift” is involved in the first part of the ligand-binding process, whereby a ligand stabilizes fleeting conformational fluctuations present before ligand binding.
The observed coupling of the EF-hand movement to ligand-binding points to the potential role of the EF-hand in gating, or its regulation. The movement leads to an interaction between the EF-hand and the S2S3 domain of the pore pseudo-voltage sensor. The small movement of the S2S3 domain associated with that interaction may be relevant to gating, but this hypothesis remains to be tested.
As noted earlier, the NTD is among the first domains affected by the transition between ligand-free and ligand-bound states (Fig. 3, Movies 1-2), suggesting a potentially important role for NTDs in gating. Indeed, the NTDs give rise to important interprotomer interactions, which are lost during channel dilation and subsequent pore opening, and a number of disease-causing mutations are located at these interfaces (24-26). It is thus important to understand whether NTDs and other inter-domain contacts involved in the gating mechanism are destabilized by the binding of ligands prior to pore opening, or whether they are sufficiently weak to be broken by Brownian motions during pore opening, a mechanism previously described as the “zipper hypothesis” (27). To clarify this question, we investigated the distance between interprotomer contacts as the functional path is traversed. The analysis was limited to backbone-to-backbone distances, as the resolution of our present study is currently limited by the number of snapshots to ~4.5 Å in the core of the channel, which precludes reliable measurement of side-chain positions (SM section 5).
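A backbone-to-backbone distance measurement of this kind can be sketched as follows. The exact distance definition used in this work is not stated here, so the minimum inter-group distance, the function `min_backbone_distance`, and the toy trajectory are all assumptions for illustration.

```python
import numpy as np

def min_backbone_distance(traj, idx_a, idx_b):
    """Minimum distance between two groups of backbone atoms in every frame.

    traj: (n_frames, n_atoms, 3) coordinates; idx_a, idx_b: lists of atom indices.
    Returns an (n_frames,) array in the units of traj.
    """
    traj = np.asarray(traj, dtype=float)
    a = traj[:, idx_a, :]                       # (n_frames, n_a, 3)
    b = traj[:, idx_b, :]                       # (n_frames, n_b, 3)
    diff = a[:, :, None, :] - b[:, None, :, :]  # all atom pairs, per frame
    dist = np.linalg.norm(diff, axis=-1)        # (n_frames, n_a, n_b)
    return dist.min(axis=(1, 2))

# Toy trajectory: atoms 0 and 1 are fixed; atom 2 approaches atom 0 along z
traj = np.zeros((3, 3, 3))
traj[:, 1, 0] = 1.0
traj[:, 2, 2] = [5.0, 4.0, 3.0]
d = min_backbone_distance(traj, idx_a=[0, 1], idx_b=[2])
```

Plotting such a distance against the frame index along the functional path is one way to reveal stepwise, nonlinear behavior at a contact.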
Two interprotomer contacts display significantly nonlinear behavior, suggesting a possible role in gating (Fig. 9). The first such contact is formed between the EF-hand and the S2S3 domain of the neighboring protomer, as outlined above. We observe a stepwise motion bringing these two domains into close proximity well before the transition to the ligand-bound state and pore opening (Fig. 9B). The EF-hand movement is therefore not correlated with pore opening, but with channel activation. As deletion of the EF-hand does not affect channel activation (28), a role in regulation, or calcium inactivation appears more likely (29).
The second significant contact is situated between the NTD β8–β9 loop and the activation domain. These two domains move apart by ~1.5 Å in a stepwise fashion as the interlandscape transition point is approached (Fig. 9A). The loop, containing a number of charged residues suitable for ionic interactions, has been suggested to play an important role in gating (24, 26). This contact is therefore a “gate” candidate, where strong ionic interactions would prevent pore-opening before ligand binding. The contact would be broken by conformational changes in the activation domain upon ligand binding, thus allowing pore opening. Other, weaker “zipper” interactions would then be broken and reformed at intervals by the Brownian motion of the channel, accounting for the observed “flickering” behavior of RyR1. This tempting hypothesis for the RyR1 gating mechanism remains to be experimentally tested.
Finally, the IP3 receptor, with a homologous calcium activation mechanism, does not have EF-hands and an S2S3 domain, but its N-terminal domains are homologous to the RyR1 NTDs, where the activating ligand IP3 binding site is located (30, 31). The IP3 receptor could thus have a homologous mechanism, where binding of calcium and IP3 leads to conformational changes in the activation domain and N-terminal domains of IP3R, followed by pore opening.
Discussion
We now turn to the more general implications of our work. Inferring biological function from structure is a paramount goal of structural biology. Singly and together, the results presented here highlight the importance of basing functional inference on experimentally determined energy landscapes.
Our approach is based on three concepts with wide-ranging implications. First, biological function involves a rich set of continuous conformational changes inadequately described by discrete structures of unknown relationship. Second, thermal fluctuations in equilibrium lead to sightings of all states up to an energy limit set by the number of snapshots in the dataset. This makes it possible to compile the energy landscapes needed for a rigorous description of the thermodynamics of function. And third, the course of a biological process can be inferred by pooling data from ensembles in equilibrium with reservoirs corresponding to the initial and final states of the process, provided continuous, functionally relevant conformational changes can be mapped. We believe the energy landscapes, the inter-landscape transition maps, the new information on the conformationally active structural domains, and the nature, sequence, and extent of important displacements involved in function – in this case ligand binding, channel activation, and channel gating – demonstrate the importance of the function-based analytical approach used here.
Clearly, equilibrium measurements cannot answer all questions regarding dynamics. It would thus be illuminating to compare the present results with those obtained from the observation of non-equilibrium ensembles engaged in reaction. Also, using larger equilibrium datasets, it would be interesting to investigate the role of conformations lying at higher energies. These constitute future tasks.
Conclusions
We have presented a new approach to determining conformational changes associated with biological function. The results demonstrate the possibility of a rigorous description of biological function at high spatial resolution. The new insights include the nature, sequence, and extent of conformational motions involved in function, and the way allosteric signals are transduced to remote sites. The approach is general, and thus applicable to a wide range of systems and processes.
(4,548 words, 9 figures, 6 movies)
Movie captions
Movie 1: Evolution of the atomic model of RyR1 along the functional trajectory of Fig. 1. Color bar shows the magnitude of the atomic displacements.
Movie 2: Evolution of the atomic model of RyR1 obtained by linear interpolation between the “START” and “FINISH” discrete structures (boxed classes in Fig. 2). Color bar shows the magnitude of the atomic displacements.
Movie 3: Conformational evolution (backbone representation) in the pore domain along the functional trajectory of Fig. 1. Color bar shows the magnitude of the atomic displacements. Singular value decomposition (SVD) was applied to the movie of atomic models to reduce noise.
Movie 4: Conformational evolution (backbone representation) in the pore domain obtained by linear interpolation between two discrete structures. Color bar shows the magnitude of the atomic displacements.
Movie 5: Conformational evolution in the asymmetric unit of the activation core domain along the functional trajectory of Fig. 1. Colored spheres indicate ligand-binding sites (yellow: Ca2+; magenta: caffeine; brown: ATP). Color bar shows the magnitude of the atomic displacements. Singular value decomposition (SVD) was applied to the movie of atomic models to reduce noise.
Movie 6: Conformational evolution in the asymmetric unit of the activation core domain obtained by linear interpolation. Colored spheres indicate ligand-binding sites (yellow: Ca2+; magenta: caffeine; brown: ATP). Color bar shows the magnitude of the atomic displacements.
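The SVD noise reduction mentioned in the captions of Movies 3 and 5 can be illustrated with a truncated SVD over the frames of a coordinate movie. The number of retained components is not stated in the captions, so the `rank` parameter, the function `svd_denoise`, and the synthetic data below are assumptions for illustration.

```python
import numpy as np

def svd_denoise(frames, rank):
    """Keep the leading `rank` singular components of a coordinate movie.

    frames: (n_frames, n_atoms, 3). The mean structure is removed before the
    SVD and added back afterwards.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    flat = frames.reshape(n_frames, -1)       # one row per frame
    mean = flat.mean(axis=0)
    u, s, vt = np.linalg.svd(flat - mean, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return (low_rank + mean).reshape(frames.shape)

# Synthetic movie: one concerted motion of 10 atoms over 20 frames, plus noise
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)[:, None, None]
mode = rng.normal(size=(10, 3))
clean = t * mode
noisy = clean + 0.05 * rng.normal(size=clean.shape)
denoised = svd_denoise(noisy, rank=1)
```

Retaining too few components would suppress genuine motions along with the noise, so the choice of rank is a trade-off.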
Supplementary movie captions
Movie S1: Conformational changes along the functional trajectory of Fig. 1 (trans-membrane view).
Movie S2: Conformational changes along the functional trajectory of Fig. 1 (cytoplasmic view).
Movie S3: Conformational changes along the functional trajectory of Fig. 1 obtained by modeling (trans-membrane view).
Movie S4: Conformational changes along the functional trajectory of Fig. 1 obtained by modeling (cytoplasmic view).
Author Contributions
AD designed and implemented the data-analytical approach, a robust pipeline for deriving energy landscapes, transition probability maps, and continuous conformational changes from cryo-EM snapshots.
AG processed the cryo-EM data using standard methods. AG and DBH designed the molecular modeling and distance measurement strategy. DBH performed the molecular modeling and distance measurements.
GM helped with implementation of data-analysis algorithms, mainly for the energy landscapes and the continuous conformational movies.
PS co-developed the geometric manifold-based approach, co-investigated the effect of coarse graining on energy landscapes, and helped implement the software.
JF co-designed the study, analyzed the results in terms of the concepts of single-particle cryo-EM and the dynamics of molecular machines, and co-wrote the paper.
AO co-designed the study and the manifold-based data-analytical algorithms, identified and co-analyzed the effect of coarse-graining, proposed the ligand-binding model and formalism for estimating interlandscape transition probabilities, and wrote the paper.
AO, AD, AG, DBH and JF analyzed and interpreted the results.
Acknowledgments
We acknowledge valuable discussions with E. Lattman, G. Phillips, M. Schmidt, and members of the UWM data analysis group. The research conducted at UWM was supported by the US Department of Energy, Office of Science, Basic Energy Sciences under award DE-SC0002164 (algorithm design and development, and data analysis), by the US National Science Foundation under awards STC 1231306 (numerical trial models) and 1551489 (underlying analytical models), and by the UWM Research Growth Initiative. The work performed by JF was supported by HHMI, NIH GM55440, and NIH GM29169. The work performed by DBH and AG was supported by CUNY.