Abstract
Despite the development of massively parallel computing hardware including inexpensive graphics processing units (GPUs), it has remained infeasible to simulate the folding of atomistic proteins at room temperature using conventional molecular dynamics (MD) beyond the μs scale. Here we report the folding of atomistic, implicitly solvated protein systems with folding times τf ranging from ~100 μs to ~10 s using the weighted ensemble (WE) strategy in combination with GPU computing. Starting from an initial structure or set of structures, WE organizes an ensemble of GPU-accelerated MD trajectory segments via intermittent pruning and replication events to generate statistically unbiased estimates of rate constants for rare events such as folding; no biasing forces are used. Although the variance among atomistic WE folding runs is significant, multiple independent runs are used to reduce and quantify statistical uncertainty. Three systems were examined: NTL9 at low solvent viscosity (yielding τf ~ 5 μs), NTL9 at water-like viscosity (τf ~ 40 μs), and Protein G at low viscosity (τf ~ 10 s). In all cases the folding time, uncertainty, and ensemble properties could be estimated from WE simulation; for Protein G, this characterization required significantly less overall computing than would be required to observe a single folding event with conventional MD simulations. Our results suggest discrepancies with experimental folding times that should enable improvement of force fields and solvent models.
Introduction
Elucidating the kinetics and mechanisms of protein folding has been a decades-long focus of molecular biophysics, both experimental and theoretical/computational.1–19 Significant challenges remain, however, notably whether molecular dynamics (MD) simulations will provide the hoped-for reproducible and atomically detailed folding trajectories.1, 11, 13–14, 20–23 Despite isolated reports of success,24 MD simulations generally have not produced room temperature atomistic folding trajectories beyond the μs timescale even with modern hardware.25 Promising results have been reported using path-sampling techniques26–30 but no simulation methodology has emerged as a general-purpose tool for folding, especially for timescales beyond the μs range.
Here we report substantial progress in the application of the weighted ensemble (WE) path sampling method31–35 to room-temperature folding at the ms and sec scales, exploiting the power of GPU and cluster computing. We study three atomistic implicitly solvated systems: NTL9 with low and high-friction solvent, as well as Protein G at low friction. These are costly studies, requiring aggregate trajectory totals of tens to hundreds of μs per system, but they enable fairly precise (order-of-magnitude) estimation of folding rate constants. In earlier work, Ensign and Pande25 were able to estimate the WW-domain folding time of ~100 μs at room temperature using distributed computing with a total cost of 400–500 μs per system. To our knowledge, there are no other reports of room-temperature atomistic protein folding at the ms scale and beyond. Prior folding studies of NTL9 and Protein G were conducted at high temperature (355 K1/370 K36, and 350 K1, respectively) because of the prohibitive room-temperature timescales.
In addition to information about protein folding, the ability to quantify rate constants for slow-timescale biomolecular behavior is a critical step in model (force field) development. Although MD simulation is now a standard tool in structural biology studies,37–40 the governing parameters of MD force fields have been determined based on energy minima41–45 whereas energy barriers are expected to govern kinetic behavior. Given the evident importance of dynamic biomolecular phenomena, it is critical to obtain simulation-based rate constants to permit further refinement of force fields. Simply put, force fields cannot be assessed fully without the ability to compute kinetic observables, and we report on significant progress in this regard.
The WE method employed in the present report is one of a number of path sampling approaches based on rigorous statistical mechanics32, 46–51 capable of yielding unbiased rate constants. Although all these methods are theoretically well-grounded, WE does offer the pragmatic advantage of being fully independent of the dynamics engine employed, which has enabled its application with a wide range of both molecular and cell-scale simulation software.33, 52–58 This versatility facilitated the integration of the WESTPA software package59 with the GPU-accelerated version of the AMBER molecular dynamics package60–62 as employed here. The WE method yields ensembles of fully continuous trajectories from which non-equilibrium observables can be calculated, including kinetic and mechanistic properties.
Results
The WE procedure takes advantage of running in parallel multiple simulations with well-defined probabilities (or weights) in a conformational space that typically is divided based on pre-defined progress coordinates (see Fig. 1A).31 The trajectory pruning and replication strategy facilitates progress along the coordinates and guarantees a constant total weight of all trajectories during the WE simulation (see SI Methods for more details). Fig. 1B shows a comparison of a brute-force MD simulation with a typical WE simulation, both starting from the same unfolded NTL9 structure. After ~7 μs of aggregate simulation time, the NTL9 Cα-RMSD in the MD simulation remains > 6 Å, whereas in the WE simulation folded NTL9 structures with Cα-RMSD < 1 Å are sampled. The probability flux of simulations reaching the target state allows estimation of the folding kinetics and the interrogation of continuous trajectories can provide information on folding mechanisms.
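The pruning and replication step described above can be illustrated with a minimal sketch, assuming a simple scheme in which each bin is brought to a fixed walker count by splitting the heaviest walker or merging the two lightest. The function name `resample_bin`, the `(state, weight)` representation, and the specific split/merge policy are illustrative assumptions, not the protocol or the WESTPA implementation used in this work (see SI Methods for the actual parameters). The point of the sketch is that both operations leave the total weight in the bin unchanged.

```python
import random


def resample_bin(walkers, target_count, rng=random):
    """Resample one bin to `target_count` walkers, conserving total weight.

    `walkers` is a list of (state, weight) pairs; `state` can be any object
    (hypothetical stand-in for a trajectory's current configuration).
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")
    walkers = list(walkers)
    if not walkers:
        return walkers  # an empty bin stays empty

    # Replication: split the heaviest walker into two equal-weight copies.
    while len(walkers) < target_count:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()  # last element is the heaviest
        walkers += [(state, weight / 2.0), (state, weight / 2.0)]

    # Pruning: merge the two lightest walkers. The survivor is chosen with
    # probability proportional to weight and carries the combined weight.
    while len(walkers) > target_count:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        survivor = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(survivor, w1 + w2)] + walkers[2:]

    return walkers
```

Because the survivor of a merge is selected in proportion to weight, the expected weight carried by each trajectory history is preserved, which is what allows resampling without biasing the dynamics.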
The data in Figs. 2–4 show that the probability flux into the folded states, which is an estimator for the rate constant,63 reaches a steady value in all three atomistic folding systems: NTL9 at low friction, NTL9 at high friction, and Protein G at low friction. The steady values indicate the systems have relaxed into steady states as a function of elapsed molecular time and that errors in estimating the force field specific folding rate constants for the given starting structure are governed by statistical noise. Note that although the average rate constant is dominated by a relatively small fraction of runs, the dominating runs switch during the course of the trajectories (Figs. S1-S3). The "molecular time", tmol, shown in Figs. 2–4 represents the time elapsed during individual trajectories. WE uses an ensemble of trajectories which all require computing resources, and aggregate simulation times are given in Table 1 (see SI for WE parameters and computing resources). Additional runs for the NTL9 systems were performed with alternative WE protocols to confirm the consistent, unbiased nature of the data: Figs. S4 and S5 show consistent time evolution of the folding flux for both low and high-friction systems.
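As a rough illustration of how a probability flux becomes a folding-time estimate, the sketch below assumes that the flux at one WE iteration is the total weight arriving in the folded state divided by the iteration length, that fluxes are averaged over independent runs, and that the folding time is the reciprocal of the resulting rate constant. The function names and all numbers in the usage note are hypothetical and are not taken from the data in Figs. 2–4.

```python
def flux_per_iteration(arrived_weights, tau):
    """Probability flux (per unit time) into the target state.

    `arrived_weights` are the weights of trajectories that reached the
    target during one iteration of length `tau` (same time unit throughout).
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return sum(arrived_weights) / tau


def mean_flux(per_run_fluxes):
    """Average the flux over independent WE runs at one molecular time."""
    if not per_run_fluxes:
        raise ValueError("need at least one run")
    return sum(per_run_fluxes) / len(per_run_fluxes)


def folding_time(rate_constant):
    """Folding time as the reciprocal of the steady-state rate constant."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return 1.0 / rate_constant
```

For example, with made-up numbers, `flux_per_iteration([1e-9, 3e-9], tau=0.1)` gives the flux for one run in units of inverse `tau` units; averaging that with a second run in which nothing arrived halves the rate estimate and doubles the inferred folding time. This also shows why a few high-flux runs can dominate the average.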
The present study necessarily estimated folding times specific to the chosen force field and solvent model, and also conditioned on the starting structures. The novelty of the results is their relatively high precision and unbiased nature due to the theoretical foundations of the WE method.34 Hence, although comparisons to experimental folding times are shown in Table 1, readers are cautioned that the present study should be considered a first step in assessment of molecular models and initial ensembles. Given these caveats, the rough agreement with experimental values is encouraging but also points to the need for further investigation of solvent modeling and initial ensembles as discussed below.
A comparison of the folding times (specific for the force field) and the aggregate simulation times as given in Table 1 also enables assessment of the effectiveness of the WE protocol. In the case where WE exhibits the least enhancement of sampling, namely NTL9 at low friction (Fig. 2), estimating the force field specific folding time of ~5 μs required ~100 μs of aggregate simulation. Fig. 2 reveals that much of the computation was used to confirm steady-state behavior and in fact the folding time could have been inferred from substantially less computation. In principle, similar results could have been obtained via 5–10 independent standard MD runs totaling the same aggregate simulation time. However, given the experimental ms folding time, it is unlikely such MD runs would have been attempted, and WE provided a reliable estimate with an affordable amount of computing effort. The higher-friction NTL9 study, which should be a better mimic of aqueous viscosity,64–66 reveals a folding time of ~40 μs (Fig. 3) that is essentially prohibitive for harvesting multiple events via conventional MD, even on modern GPU platforms. The value of the WE protocol is unambiguous for the slower Protein G system, where a ~10 s folding time is estimated in much less than a ms of aggregate simulation time (Fig. 4). By comparison, the computational cost of rate estimation here is significantly less than the previously reported overall cost of ~500 μs to estimate a ~65 μs room-temperature folding time.25
During the WE process, a variety of folding trajectories are simulated, enabling unbiased computation of ensemble properties. The weighted distributions of Cα-RMSD values shown in Figs. S6A and S7A for the NTL9 simulations and in Fig. S8A for the Protein G simulation serve as effective folding free energy profiles, which indicate that NTL9 folding has a free energy minimum at Cα-RMSD ≈ 6 Å and Protein G at Cα-RMSD ≈ 10 Å. These regions are separated from the folded state by a free energy barrier, suggesting a definition of the transition region and thus allowing calculation of the transition times (event durations) of the continuous WE folding trajectories. The event duration, a quantity of growing interest,67–68 depends on the exact event starting point and on the solvent viscosity.69–70 For NTL9 at low viscosity, the distributions of event duration have a peak at 1.5–2 ns (Fig. S6B), while at the higher water-like viscosity the peaks occur at slightly larger values of ~4–5 ns (Fig. S7B). For Protein G, the event duration peaks are less clearly defined but occur in the range of ~2–7 ns (Fig. S8B).
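To make concrete how a weighted distribution of a progress coordinate can be read as an effective free energy profile, the following sketch accumulates trajectory weights into histogram bins and converts the normalized probabilities to −kT ln P. The function name, the bin edges, and the value kT ≈ 0.596 kcal/mol (corresponding to an assumed temperature of 300 K) are illustrative assumptions and not parameters taken from this study.

```python
import math


def free_energy_profile(values, weights, bin_edges, kT=0.596):
    """Return -kT ln P for each bin, shifted so the lowest bin is zero.

    `values` are progress-coordinate samples (e.g. RMSD), `weights` their
    WE weights, and `bin_edges` ascending edges. Bins that received no
    weight are reported as math.inf. kT is in kcal/mol (assumed 300 K).
    """
    nbins = len(bin_edges) - 1
    hist = [0.0] * nbins
    for x, w in zip(values, weights):
        for i in range(nbins):
            if bin_edges[i] <= x < bin_edges[i + 1]:
                hist[i] += w  # accumulate weight, not counts
                break
    total = sum(hist)
    if total <= 0:
        raise ValueError("no weight fell inside the histogram range")
    profile = [
        -kT * math.log(h / total) if h > 0 else math.inf for h in hist
    ]
    lowest = min(profile)
    return [f - lowest for f in profile]
```

The key design point is that each sample contributes its weight rather than a unit count; with equal weights the function reduces to an ordinary histogram-based profile.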
A visual analysis of representative intermediate structures sheds light on the folding mechanisms. The NTL9 molecular structures shown in Fig. 5A illustrate that during the folding process the α-helix is formed first, followed by the formation of the N-terminal β-hairpin. A putative rate-limiting step of NTL9 folding is characterized by the association of the C-terminal β-strand with the N-terminal β-hairpin through hydrogen bonds. During the final steps (1 Å < Cα-RMSD < 4 Å), the protein reduces its solvent-accessible surface area by ~5 nm² when forming the remaining native hydrogen bonds, bending the N-terminal β-hairpin turn, and aligning the α-helix with the β-sheet. Similarly, Protein G (Fig. 5B) folds by first forming the α-helix and both β-hairpins and then bringing them all closer to each other, which appears to define the main free energy barrier, before connecting the two hairpins with hydrogen bonds and establishing the 4-stranded β-sheet. From the initial formation of the secondary structural elements to the fully folded structure (i.e., 1 Å < Cα-RMSD < 10 Å), Protein G reduces its overall surface area by ~8 nm².
Discussion
The data reported here suggest that molecular dynamics calculations may soon be able to measure precisely and regularly a broad array of experimentally relevant timescales characterizing functional motions of biomolecules. Such measurements are necessarily limited by the accuracy of the underlying model equations (i.e., the force field) but understanding and correcting force field mis-calibrations is essential for progress in computational structural biology. These corrections will not be possible without reliable kinetics measurements, and the present data yield roughly order-of-magnitude precision (Table 1). Current force fields can suffer inaccuracies exceeding 1 kcal/mol for free energy minima71–73 and errors at least as large are expected for the barriers which govern kinetics, which have not been part of force field parametrization.21, 41–42, 74–77 Note for reference that an order of magnitude change in an Arrhenius factor exp(−ΔG/RT) corresponds to a shift in ΔG of 1.36 kcal/mol at room temperature (RT ln 10); hence uncertainty of only ±0.68 kcal/mol corresponds to a tenfold range.
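The Arrhenius arithmetic in the preceding paragraph can be checked with a short calculation. The sketch below assumes a temperature of 298 K and the gas constant R = 0.0019872 kcal/(mol K); the function name is hypothetical. A change in the factor exp(−ΔG/RT) by a multiplicative `factor` corresponds to a shift in ΔG of RT ln(factor).

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def barrier_shift_for_rate_factor(factor, temperature=298.0):
    """Shift in the barrier (kcal/mol) that changes exp(-dG/RT) by `factor`.

    The temperature of 298 K is an assumed value for "room temperature".
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return R_KCAL * temperature * math.log(factor)


# A tenfold change in rate: RT ln 10, about 1.36 kcal/mol at 298 K.
tenfold_shift = barrier_shift_for_rate_factor(10.0)

# Half of that, applied in either direction, spans the same tenfold range.
half_width = tenfold_shift / 2.0
```

At 300 K the same calculation gives a slightly larger value (about 1.37 kcal/mol), so the conclusion is insensitive to the exact choice of room temperature.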
Accuracy in kinetics also depends on the solvent model. Implicit solvation was employed in the present study, i.e., water molecules were not explicitly modeled. Because such models are in common use,36, 78–82 it is important to assess their kinetic accuracy. Although the overall computational cost is higher at water-like viscosity (γ = 80 ps−1), the estimated average rate constants are found to be only slightly lower than at low solvent viscosity (γ = 5 ps−1), consistent with prior investigation of the issue.25 Going forward, additional comparison to explicit-solvent folding rate constants will be an important goal.
Another limitation of the present study is intrinsic to protein folding generally, namely ambiguity regarding the unfolded state ensemble. Experimentally, proteins are denatured chemically or with temperature,83–86 each of which should yield a different unfolded ensemble, and the sensitivity of refolding to the denaturing process is an under-explored topic.87 Given that some folding times are ms-scale or less, measurements may be sensitive to experimental protocols (e.g., mixing, cooling) occurring on the same timescales. Because of these ambiguities, we chose to keep our study as controlled as possible and focused specifically on folding from a single initial structure, recognizing the importance of future study of ensemble-initialized folding. Our mechanistic discussion above must be seen as restricted to this condition.
Quantification of statistical uncertainty was a central part of this study, and numerous repeated WE simulations were required to overcome the large variance of the present folding protocol. Although a large variance is generally and rightly a cause for concern in data analysis, our ability to perform tens of truly independent simulations distinguishes this work from typical molecular simulation studies; furthermore, the data presented here exhibited convincingly steady rate estimates. As described elsewhere, neither traditional standard-error analysis nor bootstrapping properly quantifies uncertainty in small-size/large-variance data sets.88 We therefore employed a Bayesian bootstrapping approach which is superior at characterizing precision in such data.88–89 Nevertheless, no analysis method can correct for insufficient sampling of an unknown distribution, and we estimate that the nominal 95% Bayesian credibility regions reported here empirically correspond to ~60% probability of bracketing the true mean; such uncertainty in the error analysis is intrinsic to the modest sample sizes.88 Future studies will clearly benefit from variance-reduction strategies, which have been proposed.90–91 Lastly, we note that we did not perform time averaging (beyond data smoothing over 0.1 or 1 ns windows), so some additional precision gains are possible, but they are complicated by time correlations in WE data.92
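For readers unfamiliar with the idea, a generic Bayesian bootstrap for the mean can be sketched as follows. This is the textbook form, in which each replicate reweights the observed values with Dirichlet(1, ..., 1) weights and the credibility region is read from percentiles of the reweighted means; it is an assumption that this matches the general approach cited above, and it is not the analysis code used for Table 1. The function name, number of draws, and example values are hypothetical.

```python
import random


def bayesian_bootstrap_mean(samples, n_draws=5000, cred=0.95, rng=random):
    """Credibility region for the mean via a generic Bayesian bootstrap.

    Each draw assigns Dirichlet(1, ..., 1) weights to the observed values
    and records the weighted mean; the region is taken from percentiles.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    means = []
    for _ in range(n_draws):
        # Normalized unit-rate exponentials are Dirichlet(1, ..., 1) weights.
        g = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(g)
        means.append(sum(gi * x for gi, x in zip(g, samples)) / total)
    means.sort()
    lo = means[int((1.0 - cred) / 2.0 * (n_draws - 1))]
    hi = means[int((1.0 + cred) / 2.0 * (n_draws - 1))]
    return lo, hi
```

Applied to per-run flux estimates, a call such as `bayesian_bootstrap_mean(run_fluxes)` returns a nominal 95% region. Note that the region can never extend beyond the smallest and largest observed values, which is one way to see why small samples from a heavy-tailed distribution yield regions that bracket the true mean less often than the nominal level suggests.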
The weighted ensemble method was chosen over other rigorous path sampling approaches10, 26–30, 46–51 and Markov state models (MSMs).93–94 Compared to other path sampling methods, WE offers fully scalable parallelization and does not require hard-coding within the dynamics engine in order to "catch" trajectories as they cross interfaces.33 When compared to MSMs, WE not only avoids any approximation but also offers continuous trajectories and the fine temporal resolution needed to infer mechanistic details occurring on 5–10 ns timescales (Figs. S6–S8). By contrast, modern well-validated MSMs often require lag times >100 ns.93–94
Acknowledgements
We gratefully acknowledge support from the NIH (Grant GM115805) and from the OHSU Center for Spatial Systems Biomedicine. Computing support was provided by the Center for Research Computing at the University of Pittsburgh. Also, we would like to acknowledge the research effort by Sundar Raman Subramaniam, who performed significant initial simulations on the NTL9 system. Helpful comments on the manuscript were provided by Lillian Chong.
References
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.
- 73.
- 74.
- 75.
- 76.
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.
- 86.
- 87.
- 88.
- 89.
- 90.
- 91.
- 92.
- 93.
- 94.