Abstract
Advances in biomolecular sciences are closely linked to our ability to chart the energy landscapes of biomolecules with atomic details. Here we validate a new paradigm to characterise thermodynamics and kinetics of millisecond timescale conformational transitions between ground state and transient excited states in the enzyme cyclophilin A (CypA). We describe a novel methodology that combines molecular dynamics simulations and Markov State modelling with NMR measurements to provide atomic-level insights into the nature of CypA transient conformational states. The computed conformational ensembles also enabled the predictive design and experimental validation of a single-site mutant that dramatically perturbs millisecond timescale loop motions, converting a CypA excited state into the ground state. The resulting models open up new horizons for targeting CypA with inhibitors and pave the way towards rational design of protein energy landscapes for protein engineering and drug discovery purposes.
The structural characterization of conformational changes between long-lived low-energy (ground) states and short-lived sparsely populated (excited) states in biological macromolecules remains a major unsolved challenge in structural biology. Despite advances in integrating data from multiple biophysical techniques,1–12 atomistic descriptions of the conformational changes that link the ground and excited states of a protein remain scarce. New approaches to provide such descriptions could transform the fields of drug discovery and protein engineering.13–15
While X-ray diffraction provides atomic resolution structures, it struggles to resolve sparsely populated states (ca. <20%), and time-resolution of conformational motions has been possible in only a handful of special cases.16,17 Molecular dynamics (MD) simulations provide atomic resolution, but their accuracy is limited by the approximate nature of potential energy functions. MD also only readily characterizes transient states that exchange with a ground state within a few microseconds, and efforts to surpass this timescale are generally accompanied by loss of temporal resolution.18–21 NMR relaxation experiments provide temporal resolution on a broad range of timescales but lack spatial resolution in comparison with X-ray diffraction or molecular modelling techniques.22–25
Here we implement and validate an approach that combines MD, NMR and X-ray crystallography to characterize in atomic detail a millisecond timescale conformational change in the human enzyme cyclophilin A (CypA).26–28 Extensive biophysical studies have previously sought to characterize its diverse millisecond timescale motions using X-ray and NMR, but a detailed atomistic description of the conformational changes involved has remained elusive.29–33 CypA is a major drug target, but an atomic-level description of its ground state has proven insufficient to develop isoform-selective CypA inhibitors.34–41 The structural ensembles reported herein give key insights into the nature of a millisecond conformational change in CypA, and these guided the design and experimental validation of a single-site mutant that converts a transient state into a ground state, thereby unlocking new opportunities for CypA inhibitor design efforts.
Results and discussion
An experimentally validated ensemble of Cyclophilin A conformations
The workflow outlined in Fig. 1a was applied to CypA. Previous NMR results had identified a set of residues displaying significant chemical-exchange contribution (Rex > 4 s-1) to the transverse relaxation rate R2.29 The majority of those residues clustered in the region between Asp66 and Gly75, herein referred to as the 70s-loop. Accelerated molecular dynamics (aMD) simulations that enhanced motions specifically in the vicinity of the 70s-loop, while maintaining the integrity of the overall fold, were initiated from the X-ray structure of wild-type CypA (Fig. 1a, step I; see SI for details).42,43
aMD-biased simulations successfully generated large-scale rearrangements of the 70s-loop as well as of the loop between residues Ala101 and Gln111 (hereafter referred to as the 100s-loop) on timescales of a few hundred ns. To recover the equilibrium properties of the aMD-generated conformational changes, CypA snapshots that were representative of the conformational transition were pooled with snapshots from the initial and final conformational states sampled during the aMD simulations. These structures were used as inputs for a follow-up stage that featured rounds of equilibrium MD simulations and Markov state model (MSM) construction. This step was iterated until a self-consistent MSM was obtained. Three rounds amounting to a cumulative sampling time of 187.5 μs were sufficient to produce a numerically stable MSM (Fig. S1-S2).
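The MSM construction step described above can be illustrated with a minimal sketch. It assumes the MD frames have already been discretised into integer state labels (the clustering step is not shown), and it uses a plain row-normalised count matrix with power iteration rather than the estimator actually used in this work, which is described in the SI. The function names, the toy trajectory and the lag of one frame are illustrative assumptions.

```python
from collections import Counter

def count_transitions(dtrajs, lag):
    """Count i -> j transitions observed `lag` frames apart in each trajectory."""
    counts = Counter()
    for traj in dtrajs:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[(i, j)] += 1
    return counts

def transition_matrix(counts, n_states):
    """Row-normalise the counts into transition probabilities T[i][j]."""
    T = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        if total == 0:
            raise ValueError(f"state {i} has no outgoing counts")
        T.append([c / total for c in row])
    return T

def stationary_distribution(T, n_iter=10000, tol=1e-12):
    """Power iteration on pi <- pi T, starting from a uniform distribution."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Made-up discretised trajectory: 0 = closed 70s-loop, 1 = open 70s-loop
dtrajs = [[0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0]]
T = transition_matrix(count_transitions(dtrajs, lag=1), n_states=2)
pi = stationary_distribution(T)
```

In an iterative protocol of the kind described here, the model would be re-estimated after each round of equilibrium MD and the equilibrium populations monitored until they stop changing; production work would normally rely on a dedicated MSM package with reversible estimators and lag-time validation.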
The conformational ensemble was cross-validated by back-calculation of a set of NMR observables that included standard and exact NOEs, 3J-coupling measurements and Residual Dipolar Couplings (RDCs).6,44 Observables were also back-calculated from a 40-structure CYANA NMR ensemble,31 a 20-structure DYANA NMR ensemble,45 and a representative CypA X-ray structure.46 The MD ensemble was found to reproduce about 90% and 95% of derived proton-proton distances from eNOE and NOE measurements, respectively, which was comparable to both NMR ensembles and slightly better than the X-ray structure. 3J-coupling values were also computed for backbone HN-HA, HN-C, and HN-CB pairs (Fig. 2b). HN-HA RMSD values for the MD ensemble are slightly above those reported in benchmark studies of smaller proteins,47,48 and above those obtained with the CYANA ensemble. The DYANA ensemble, which was not refined against the particular set of 3J-coupling values used here, shows poorer agreement. The RMSD for 3J-coupling values derived from the X-ray structure is similar to the one obtained from the MD ensemble. There is little difference in accuracy for HN-C and HN-CB pairs between the MD and NMR ensembles, whereas the X-ray structure is slightly worse. Finally, the RDCs are better reproduced with the CYANA ensemble, whereas the X-ray structure, MD and DYANA ensembles are of similar accuracy (Fig. 2c). Overall the accuracy of the MD ensemble, which does not include any refinement step against NOEs, 3J-couplings or RDCs, was deemed encouraging, and further insights into loop motion mechanisms were sought by detailed inspection of the MSM.
Atomic-level insights in millisecond time scale CypA loop motion mechanisms
An MSM description of CypA 70s- and 100s-loop dynamics is presented in Fig. 2d. For ease of interpretation, the MSM states were further grouped into a coarser, five-state description (see SI for details). The most populated macro-state (orange in Fig. 2d) resembles the conformation observed in most X-ray structures reported for CypA, with the 70s- and 100s-loops in ‘closed’ and ‘open’ conformations, respectively. In this 70s-loop closed/100s-loop open (closed/open, orange in Fig. 2d) macro-state both loops undergo low positional fluctuations (average Cα RMSF values of 1.1 Å and 2.7 Å for the 70s- and 100s-loops, respectively). This closed/open macro-state interchanges with a similarly flexible closed/closed macro-state (red, RMSF 1.1 Å and 2.6 Å). The 100s-loop is as flexible in the open/open (magenta, RMSF 2.49 Å) and open/closed (blue, RMSF 3.33 Å) states as in the orange and red macro-states. However, the 70s-loop displays greater positional fluctuations (RMSF 4.4 Å and 4.1 Å). This suggests that the open 70s-loop is disordered in these states, in contrast with the rigid structure observed in both the X-ray structure and in the closed/open macro-state. The transition between closed and open 70s-loop conformations proceeds via an intermediate semi-open 70s-loop macro-state (teal) that features positional fluctuations intermediate between the other macro-states (RMSF ca. 2.9 Å).
The order-disorder transition of the 70s-loop is critically controlled by a network of hydrogen bonds established between the backbone nitrogen atoms of residues Thr68 to Gly74 and the side-chain of Asp66. Progression from the well-populated closed/open or closed/closed macro-states to the intermediate semi-open macro-state requires partial breaking of this array of hydrogen bonds (Gly72-Gly75), to allow partial unwinding of the 70s-loop, before progression to completely disordered 70s-loop macro-states, where no interaction between the side chain of Asp66 and backbone amide N of Thr68-Gly74 is observed (Fig. 2e). Structures of rhesus macaque TRIMCyp (RhTC) showing that a D66N-R69H double substitution promotes the opening of the 70s-loop strengthen the proposed mechanism.49,50 However, whereas RhTC adopts a structured open 70s-loop conformation that is resolved in X-ray experiments, the present calculations suggest that the closed 70s-loop observed in the X-ray structure of human CypA exchanges with a previously uncharacterized open and disordered 70s-loop conformation.
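The per-loop RMSF values quoted above are averages of per-atom positional fluctuations. A minimal sketch of that calculation follows; it assumes the frames belonging to one macro-state have already been superimposed on a common reference, and the function names and toy coordinates are hypothetical rather than taken from the analysis scripts used in this work.

```python
import math

def rmsf(frames):
    """Per-atom RMSF from pre-aligned frames; frames[t][i] is the (x, y, z) of atom i."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

def mean_over_residues(values, residue_ids, first, last):
    """Average per-residue values over the inclusive residue range first..last."""
    selected = [v for v, r in zip(values, residue_ids) if first <= r <= last]
    return sum(selected) / len(selected)

# Made-up two-atom, two-frame example: atom 0 is static, atom 1 moves by +/- 1 Å
frames = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
          [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]]
per_atom = rmsf(frames)
loop_mean = mean_over_residues(per_atom, residue_ids=[66, 67], first=66, last=67)
```

For a real trajectory the frames would contain one Cα coordinate per residue, and the range passed to `mean_over_residues` would span the loop of interest (for example Asp66 to Gly75 for the 70s-loop).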
A rationally designed single-site mutant converts a transient CypA state into the ground state and disrupts millisecond time scale motions of the 70s loop
The central role of Asp66 in this hydrogen-bond network led to the hypothesis that mutation of Asp66 to alanine (D66A) would shift the 70s-loop towards a conformation resembling the intermediate semi-open state. The hypothesis was assessed by applying the same MD/MSM protocol to CypA structures featuring the D66A mutation (Fig. 3a). Calculations were also carried out for the H70A mutant as a negative control. Although His70 is spatially close to Asp66, this mutant has previously been reported to cause only a minor change in the 70s-loop motions.29
The computations predicted that the closed/open (orange) state is significantly destabilized in the D66A mutant, with a population decreasing from ca. 40% to less than 15% (Fig. 3a). Meanwhile, the intermediate (teal) state seen in CypA is considerably stabilized, becoming the most abundant (40%). Overall, with respect to CypA, the population of states in which the 70s-loop is ordered has decreased from about 70% to 25%. In addition, while Mean First Passage Times (MFPT) between closed and open macro-states are similar for CypA and H70A, they have decreased by 10- to 15-fold (Fig. 3c) in D66A, suggesting that ordered-to-disordered transitions in the 70s-loop happen faster in D66A than in CypA or H70A.
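Mean first passage times of the kind compared above can be obtained from an MSM transition matrix T estimated at lag time τ by solving m_i = τ + Σ_j T_ij m_j for every state i outside the target set, with m_i = 0 inside it. The sketch below solves this by fixed-point iteration; the two-state matrices are made-up numbers, not the CypA or D66A models, and the function name is hypothetical.

```python
def mfpt_to_target(T, target, lag_time=1.0, tol=1e-12, max_iter=1_000_000):
    """MFPT from every state to the target set, in the units of lag_time.

    Iterates m_i = lag_time + sum_j T[i][j] * m_j over non-target states j,
    holding m_i = 0 for target states.
    """
    n = len(T)
    target = set(target)
    m = [0.0] * n
    for _ in range(max_iter):
        new = [
            0.0 if i in target
            else lag_time + sum(T[i][j] * m[j] for j in range(n) if j not in target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, m)) < tol:
            return new
        m = new
    raise RuntimeError("MFPT iteration did not converge")

# Made-up two-state models: state 0 = ordered loop, state 1 = disordered loop
T_slow = [[0.9, 0.1], [0.2, 0.8]]
T_fast = [[0.5, 0.5], [0.2, 0.8]]
m_slow = mfpt_to_target(T_slow, target=[1])[0]
m_fast = mfpt_to_target(T_fast, target=[1])[0]
fold_change = m_slow / m_fast
```

Fixed-point iteration is used here only to keep the sketch dependency-free; for slow processes, where the iteration converges slowly, a direct linear solve is the more practical choice.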
Experimental evidence for the dramatically altered energy landscape of D66A was sought by carrying out NMR experiments. Thus single-labelled and double-labelled CypA and D66A were expressed in and purified from BL21(DE3)pLysS E. coli cells (Fig. S4-S5).
The well-dispersed HSQC spectra recorded for the D66A mutant indicated that the protein was natively folded (Fig. S6). A chemical shift perturbation (CSP) analysis, with respect to CypA, indicated strikingly larger changes for the D66A mutant compared to the previously reported H70A mutant (Fig. 4a, Table S1). The largest deviations are observed in the 70s-loop region with CSP values of up to 9.0 ppm. Additional perturbations occur around residues 50 and 110 that flank the 70s-loop region. These measurements are in good agreement with the predicted CSP patterns obtained through back-calculation of chemical shifts from the computed ensembles for CypA, D66A and H70A (Fig. 4b). Moreover, the neighbour corrected intrinsically disordered protein (ncIDP) web server was used to compare chemical shift changes with respect to random coil values (Fig. 4c). The collective displacement of chemical shifts in the 70s-loop towards random-coil values supported the MD-predicted order-disorder transition.
Further corroboration was provided by co-crystallization and X-ray diffraction experiments on CypA and D66A in complex with inhibitor 1 (Table S2).36 Comparison of electron density profiles revealed that the 70s-loop adopts an ordered and closed conformation in the CypA:1 complex (Fig 4d) while in the D66A:1 complex electron density in the 70s-loop region is largely absent (Fig. 4e), consistent with a predominantly disordered conformation. The electron density for other parts of the protein and for 1 was similar, suggesting that the conformational change is restricted to the 70s-loop region. Moreover, the ITC estimated Kd value of 1 is ca. 12-fold weaker for D66A than for CypA (Fig. 4f and 4g) suggesting that the order-disorder transition of the 70s-loop weakens binding of compounds to the Abu and Pro pockets without completely disrupting the CypA active site. This is in line with the conservation of binding mode between CypA and D66A observed for 1 by X-ray crystallography.
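As a worked illustration of what a 12-fold change in Kd means energetically, the ratio can be converted into a binding free energy difference via ΔΔG = RT ln(Kd,D66A / Kd,CypA). The temperature of 298.15 K used below is an assumption, since the measurement temperature is given in the SI rather than in this section, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_from_kd_ratio(fold_change, temperature=298.15):
    """Binding free energy difference (kcal/mol) for a given Kd fold change.

    fold_change is Kd(mutant) / Kd(wild type); the default temperature is assumed.
    """
    return R_KCAL * temperature * math.log(fold_change)

# A 12-fold weaker Kd corresponds to roughly 1.5 kcal/mol at the assumed temperature
ddg = ddg_from_kd_ratio(12.0)
```

A penalty of this size is modest, which is in keeping with the interpretation that the active site is perturbed but not disrupted.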
The narrow difference between effective correlation times for internal motions of a large number of residues and global tumbling times for both CypA and D66A precluded the determination of generalized S2 order parameters to characterize differences in ps-ns timescale motions between CypA and D66A (Fig. S7). Instead, {1H}-15N heteronuclear NOE values were back-calculated by combining MD trajectories with NMR measurements of global tumbling times. The measured heteronuclear NOE values for D66A are consistently lower than for CypA in the 70s-loop region, even though the global tumbling times are similar (9.1 and 9.2 ns respectively). This suggests increased ps-ns timescale motions for D66A (Fig. 4h), an observation that is in line with the back-calculated NOE values, which show a significant dip in the 70s-loop region for D66A (Fig. 4i).
CPMG measurements alone turned out to be insufficient to reliably fit intermediate exchange parameters for D66A. However, additional R1ρ experiments provided sufficient data to enable a combined fit of the R1ρ and CPMG measurements (Fig. S7). In both cases the data could be fitted to a two-state exchange model with similar kex values (ca. 2000 s-1), but with significantly different excited state populations (pb ca. 0.5% for D66A and ca. 2% for CypA). The fit also enabled determination of the magnitude of the chemical shift changes Δω for residues showing significant dispersions for CypA, and the magnitude and sign of chemical shift changes for D66A.
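To make the roles of kex, pb and Δω in a two-state fit concrete, the sketch below evaluates the Luz-Meiboom expression, which is the fast-exchange approximation R2,eff = R2,0 + (pa pb Δω²/kex)[1 − (4ν/kex) tanh(kex/(4ν))]. This is an illustrative simplification and not necessarily the model fitted in this work, which is described in the SI. The 2 ppm shift difference, the 60.8 MHz 15N Larmor frequency and the function names are assumptions chosen for the example.

```python
import math

def delta_omega(delta_ppm, larmor_mhz):
    """Convert a shift difference in ppm to rad/s at the given Larmor frequency (MHz)."""
    return 2.0 * math.pi * delta_ppm * larmor_mhz

def rex_fast(p_b, dw, kex):
    """Fast-exchange limit of the exchange contribution: Rex = p_a * p_b * dw**2 / kex."""
    return (1.0 - p_b) * p_b * dw ** 2 / kex

def r2eff_luz_meiboom(nu_cpmg, r20, p_b, dw, kex):
    """Luz-Meiboom fast-exchange dispersion curve R2,eff(nu_cpmg) in s^-1."""
    x = kex / (4.0 * nu_cpmg)
    return r20 + rex_fast(p_b, dw, kex) * (1.0 - math.tanh(x) / x)

# Same kex and shift difference, two excited-state populations (values assumed)
dw = delta_omega(delta_ppm=2.0, larmor_mhz=60.8)
rex_2pct = rex_fast(p_b=0.02, dw=dw, kex=2000.0)
rex_half_pct = rex_fast(p_b=0.005, dw=dw, kex=2000.0)
ratio = rex_2pct / rex_half_pct
```

With kex and Δω held fixed, Rex scales with the product pa pb, so in this approximation a fourfold drop in pb alone reduces the dispersion amplitude by about a factor of four.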
Remarkably, the pattern and magnitude of Δω between ground state and excited state are dramatically different between CypA and D66A (Fig. 5a, 5b, Table S3). Major differences are also apparent in the comparison of measured Rex values for CypA and D66A, which shows that the large dispersions measured for residues in the 70s-loop in CypA have been completely quenched in D66A (Fig. 5c, Table S4) while Rex contributions can be measured in D66A for residues outside the 70s region. This suggests that D66A may undergo different millisecond exchange processes. CPMG dispersion profiles for individual residues (Fig. 5d) further support this hypothesis. Residues in the 70s-loop region (e.g. Asp/Ala66; Gly74) that exhibited strong dispersion in CypA are no longer sensitive to CPMG pulses in D66A, whereas other residues show more pronounced or similar dispersions elsewhere in the structure (e.g. Gly80, Asn87). Furthermore, the difference in chemical shifts measured between CypA and D66A proteins in their ground states correlates well with the difference in chemical shifts between CypA ground and excited states (Fig. 5e). This provides additional evidence that the exchange process involving the 70s-loop in CypA is an order-disorder transition. By contrast, the differences in chemical shifts measured between CypA and D66A ground states do not correlate with the differences in chemical shifts between D66A ground and excited states (Fig. 5f). Altogether the data suggest that the exchange process measured for D66A is unrelated to an order-disorder transition of the 70s-loop region.
Discussion
The present results demonstrate that a comprehensive description of specific millisecond-timescale processes in proteins can be obtained by combining molecular simulations with NMR measurements. The resulting conformational ensemble for CypA describes a range of NMR observables with accuracy comparable to that of NMR structure-refinement protocols. Yet there are remarkable structural differences. The DYANA ensemble does not display large amplitude conformational changes of the 70s-loop or the 100s-loop.45 Conformational variability in the CYANA ensemble is mostly in the 70s-loop region,31 but the closed state structures in that ensemble do not resemble that observed in the X-ray structure. By contrast the present MD-derived ensemble suggests that both the 70s-loop and 100s-loop undergo large amplitude motions. The source of these discrepancies may be linked to the nature of the NMR measurements used to validate the ensembles. Indeed, back-calculated NOE, J-coupling and RDC values from a single X-ray structure are only slightly less accurate than those obtained with the various NMR and MD ensembles (Fig. 2a-c). Most of the available NMR observables report on the position of residues in structured regions of the protein, and experimental data to characterize the flexible loop regions are relatively scarce. Indeed, reanalysis of the MD ensemble using only states from the closed/open (orange) macro-state yields observables of similar accuracy to the full ensemble (Fig. S8). This stresses the need to validate conformational ensembles with experimentally testable hypotheses.
Our results open the door to in silico generation of such hypotheses for manipulating thermodynamic and kinetic parameters in a protein ensemble, information that is lacking in traditional workflows for NMR-based structure calculations. There has been much debate about the relationship between millisecond timescale motions in CypA and catalysis.51 The present results suggest that the previously reported millisecond timescale process involving the 70s-loop region in CypA is a local order-disorder transition that does not dramatically perturb the shape of the active site, as evidenced by X-ray and ITC measurements on a CypA-ligand complex. These results are in line with previously reported observations that millisecond timescale motions of the 70s-loop do not appear to be linked with the catalytic cycle.52,53 Previous work has shown that while human CypA only weakly restricts the HIV-2 virus, mutations in the 70s-loop region of RhTC confer potent HIV-2 restriction.49 The proposed mechanism involved alternative conformations of the 70s-loop thought to be inaccessible to CypA.50 Our results suggest instead that CypA’s energy landscape readily permits exploration of large-scale rearrangements of the 70s-loop; thus detailed characterisation of transient conformational states emerges as a powerful strategy to understand how mutations in protein sequence may evolve new function.
Alternative conformational states of the 70s-loop in CypA also offer prospects for the design of next-generation inhibitors that could potentially overcome isoform-selectivity concerns.54,55 The D66A mutant offers a template for such efforts, and further work to validate the mechanism of order-disorder transition in the 70s-loop of other cyclophilin isoforms is warranted. Engineering methodologies based on antibodies have been developed to trap transient conformational states and facilitate ligand discovery efforts.56–58 Our results suggest that molecular simulations may be used to enhance the effectiveness of conformational trapping technologies.
The ability to describe millisecond timescale conformational changes in proteins by MD simulations is shedding new light on NMR relaxation measurements. The interpretation of such experiments is traditionally performed by fitting data to two- or three-state phenomenological models. By contrast MSMs constructed from MD datasets typically use a much larger number of states to describe protein dynamics. In the present case there is an apparent discrepancy between CPMG-derived populations for the CypA excited state (ca. 2%, Table S3) and those for the CypA open 70s-loop macro-states from the MSM (ca. 25%, Fig. 2e). Quantitative agreement is not expected owing to technical limitations of the simulations. However, other factors may contribute to this discrepancy. Inspection of the distributions of chemical shifts of snapshots sampled from the various macro-states shows that in many cases there is a significant overlap in chemical shift values between distinct macro-states (Fig. S10). Re-analysis of the MSMs, under the assumption that open 70s-loop conformations with chemical shifts similar to closed 70s-loop conformations cannot be distinguished from the ground state, leads to populations for the ground state and excited state (ca. 91/9% respectively, Table S5) that are in closer agreement with the NMR measurements. This suggests that two-state CPMG analyses may underestimate the populations of minor states owing to a degree of degeneracy of chemical shifts with respect to different protein conformations.
Overall this work has demonstrated that molecular simulation methodologies may now be combined with NMR to offer atomic-level details about millisecond-timescale conformational changes in proteins. Such ability is expected to be widely enabling for studies of protein structure-dynamics-function relationships, and for unlocking new opportunities in bioengineering or drug discovery.
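The re-analysis argument can be illustrated with a small sketch in which open-loop snapshots whose chemical shift lies close to the ground-state value are counted towards the apparent ground state. The threshold criterion, the function name and all numbers below are assumptions made for illustration; the actual procedure behind Table S5 is described in the SI.

```python
def apparent_populations(snapshots, ground_shift, threshold):
    """Apparent two-state populations as seen through a single chemical shift.

    snapshots: iterable of (weight, is_open, shift_ppm) tuples.
    An open snapshot counts as 'excited' only if its shift differs from
    ground_shift by more than threshold (ppm); otherwise it is treated as
    indistinguishable from the ground state.
    """
    snapshots = list(snapshots)
    total = sum(w for w, _, _ in snapshots)
    excited = sum(
        w for w, is_open, shift in snapshots
        if is_open and abs(shift - ground_shift) > threshold
    )
    return (total - excited) / total, excited / total

# Made-up ensemble: 30% of the weight is in open conformations, but only a
# third of that has a shift clearly distinct from the closed-state value.
toy = [(0.7, False, 120.0), (0.2, True, 120.3), (0.1, True, 124.0)]
ground, excited = apparent_populations(toy, ground_shift=120.0, threshold=1.0)
```

In this toy case the structural open-state population is 30% while the apparent excited-state population is 10%, which is the qualitative effect invoked above.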
Methods
Detailed simulation and experimental methods are provided in the SI.
Contributions
M.D.W., P.N.B., A.N.H., A.J.B., J.J-J. and J.M. conceived the overall strategy of the study. J. J-J., A.S.J.S.M. and J.M. conceived and implemented the aMD/MD/MSM approach, carried out and analysed the computational work. A.G., G.K. and A.J.B. performed and analysed NMR experiments. A.D.S. and A.N.H. synthesized and purified compound 1. C.G., H.I., P.N.B. and M.D.W. performed X-ray crystallography. A.G., H.I. and P.N.B. performed Iso-Thermal Calorimetry experiments. All authors discussed the results, designed experiments and J. J-J, G.K., A.J.B. and J.M. wrote the manuscript.
Acknowledgements
The authors thank Prof. Beat Vogeli for kindly providing the set of NMR observables of CypA and Prof. Xavier Hanoulle and Prof. Guy Lippens for kindly providing the NMR residue assignment of wild-type CypA. J. M. is supported by a Royal Society University Research Fellowship. The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 655667 awarded to J. J-J., from the European Research Council under the European Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement No. 336289 and from EPSRC (grant no. EP/P011330/1). This project made use of time on the ARCHER UK National Supercomputing Service (http://www.archer.ac.uk) granted via the UK High-End Computing Consortium for Biomolecular Simulation, HECBioSim (http://hecbiosim.ac.uk), supported by EPSRC (grant no. EP/L000253/1).
Footnotes
¬ Current affiliations: A.G.: Department of Chemistry, University of Warwick, Coventry, CV4 7AL, United Kingdom; C.G.: Department of Biomedicine, University of Bergen, 5020 Bergen, Norway; A.D.S.: Sygnature Discovery, Biocity, Nottingham NG1 1GR, United Kingdom.