Abstract
Under normal cellular conditions, the tumor suppressor protein p53 is kept at low levels in part due to ubiquitination by MDM2, a process initiated by binding of MDM2 to the intrinsically disordered transactivation domain (TAD) of p53. Although many experimental and simulation studies suggest that disordered domains such as p53 TAD bind their targets nonspecifically before folding to a tightly associated conformation, the molecular details are unclear. Toward a detailed prediction of binding mechanism, pathways and rates, we have performed large-scale unbiased all-atom simulations of p53-MDM2 binding. Markov State Models (MSMs) constructed from the trajectory data predict p53 TAD peptide binding pathways and on-rates in good agreement with experiment. The MSM reveals that two key bound intermediates, each with a non-native arrangement of hydrophobic residues in the MDM2 binding cleft, control the overall on-rate. Using microscopic rate information from the MSM, we parameterize a simple four-state kinetic model to (1) determine that induced-fit pathways dominate the binding flux over a large range of concentrations, and (2) predict how modulation of residual p53 helicity affects binding, in good agreement with experiment. These results suggest new ways in which microscopic models of bound-state ensembles can be used to understand biological function on a macroscopic scale.
Author Summary
Many cell signaling pathways involve protein-protein interactions in which an intrinsically disordered peptide folds upon binding its target. Determining the molecular mechanisms that control these binding rates is important for understanding how such systems are regulated. In this paper, we show how extensive all-atom simulations combined with kinetic network models provide a detailed mechanistic understanding of how tumor suppressor protein p53 binds to MDM2, an important target of new cancer therapeutics. A simple four-state model parameterized from the simulations shows a binding-then-folding mechanism, and recapitulates experiments in which residual helicity boosts binding. This work goes beyond previous simulations of small-molecule binding, to achieve pathways and binding rates for a large peptide, in good agreement with experiment.
Introduction
The transcription activator p53 plays a central role in tumor suppression1. Cellular levels of p53 are normally kept low by targeted degradation by the E3 ubiquitin ligase MDM2 (mouse double minute 2), whose N-terminal domain binds residues 17-29 of the p53 transactivation domain (TAD) in a deep hydrophobic cleft2. The p53 TAD is intrinsically disordered3, but forms a helix when bound to MDM2. Various types of cellular stress, such as DNA damage, lead to disruption of p53-MDM2 binding and an increase in p53 expression, which in turn promotes cellular repair or apoptosis. Thus, the discovery of potent competitive inhibitors that can disrupt the p53-MDM2 binding interaction has been an important strategy for developing new cancer therapeutics4,5. The availability of structural information has also made p53-MDM2 a valuable model system for the study of protein-protein interactions and the development of new classes of peptidomimetics6–9, often alongside computational design efforts10,11.
A consensus of experimental and simulation studies suggests that intrinsically disordered protein (IDP) domains such as p53 TAD bind their receptors through an induced-fit “fly-casting” mechanism, whereby binding occurs first, followed by structuring into higher-affinity poses12–17. It has been proposed that this mechanism facilitates binding to multiple partners in complex regulatory networks, and may enable fast association rates important for signaling. The structural and kinetic properties of IDPs are thought to fine-tune many signaling interactions18. Recently, Borcherds et al. have shown that the extent of residual helicity of p53 TAD can modulate p53-MDM2 binding affinity as well as signaling dynamics in cells19. An important challenge for molecular simulation is thus to predict binding pathways and association rates of IDPs to their targets, and the detailed molecular mechanisms responsible for shaping them.
In this work, we use extensive all-atom molecular simulations in explicit solvent, combined with state-of-the-art Markov State Model (MSM) approaches, to investigate the p53-MDM2 binding mechanism. Recent MSM studies have examined the mechanisms by which protein receptors recognize small molecules20–23 and here we extend similar methods to model the coupled folding and binding of a larger peptide (p53 TAD peptide) to MDM2.
Methods
Molecular simulation
Simulations starting from unbound states were performed on the Folding@home distributed computing platform24 with Gromacs 4.5.425 using the Amber ff99sb-ildn-nmr force field26 and TIP3P explicit solvent. A number of initial starting configurations were selected from conformational clustering of implicit-solvent REMD simulations, each placed at different distances within 12 Å from the binding site. A total of 2776 trajectories were generated, amounting to ~831 μs of aggregate simulation data (Figure S1).
MSM construction
Recent methodological advances have exploited the variational approach to conformational dynamics27 to enable the construction of optimal MSMs given the available trajectory data28,29. To implement this approach, we used time-structure-based Independent Component Analysis (tICA)30,31 to project the trajectory data to a low-dimensional subspace that best preserves the slowest conformational transitions. Using all pair distances between Cα + Cβ atoms of p53 and the binding pocket of MDM2 (see Supporting Information for details), we constructed a time-lagged correlation matrix C(Δt) and corresponding covariance matrix from the pair distances using a lag time of Δt = 5 ns. The tICA components α are found by maximizing the objective function ⟨αi|C(Δt)|αi⟩ subject to certain constraints30. Once the data were projected to the tICA subspace, distance-based clustering using the k-means algorithm was performed to obtain MSM metastable state definitions. To select hyper-parameters such as the number of tICA components, clustering method, MSM lag time, and the number of MSM microstates, we performed variational cross-validation using the GMRQ method of McGibbon et al.28 on over 120 MSMs. As in previous work32, we find that tICA distance metrics are better than rmsd or dihedral angle metrics, and k-means clustering performs better than k-centers. The optimal MSM, used in all subsequent analysis, is constructed using 10 tICA components, 600 microstates and a 5 ns MSM lag time τlag (Figure S2). The MSM transition matrix T(τlag) was estimated using a maximum likelihood method33,34. The model is validated by implied timescales, which plateau near the chosen lag time of 5 ns (Figure S3), and by Chapman-Kolmogorov tests (Figure S4). All models were built using MSMbuilder 3.333 and MDTraj 1.535 software packages.
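The estimation and validation steps can be illustrated on a toy scale. The sketch below is not the MSMBuilder pipeline used in this work; it is a minimal pure-Python illustration, assuming a hypothetical two-state discrete trajectory, of counting transitions at a lag time, building a reversible transition matrix by symmetrizing the counts (a simple stand-in for the maximum likelihood estimator), and converting the non-unit eigenvalue into an implied timescale via t = −τlag / ln λ.

```python
import math

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` frames (sliding window)."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1.0
    return counts

def transition_matrix(counts):
    """Row-normalize symmetrized counts.

    Symmetrizing, (C + C^T) / 2, enforces detailed balance. It is a simple
    stand-in for, not the same as, a maximum likelihood reversible estimator.
    """
    n = len(counts)
    sym = [[0.5 * (counts[i][j] + counts[j][i]) for j in range(n)]
           for i in range(n)]
    return [[sym[i][j] / sum(sym[i]) for j in range(n)] for i in range(n)]

def implied_timescale_two_state(T, lag_time_ns):
    """Implied timescale t = -tau / ln(lambda_2) for a 2x2 transition matrix.

    For a 2x2 row-stochastic matrix the eigenvalues are 1 and T00 + T11 - 1.
    """
    lam2 = T[0][0] + T[1][1] - 1.0
    return -lag_time_ns / math.log(lam2)

# Hypothetical discrete trajectory (state index per frame), illustration only.
dtraj = [0] * 40 + [1] * 10 + [0] * 40 + [1] * 10
C = count_matrix(dtraj, n_states=2, lag=1)
T = transition_matrix(C)
# Assumes one lag step corresponds to 5 ns, matching the lag time in the text.
t2 = implied_timescale_two_state(T, lag_time_ns=5.0)
print("transition matrix:", T)
print("implied timescale (ns):", t2)
```

In a validation like the one described above, this calculation would be repeated over a range of lag times, and the lag time would be chosen where the implied timescales level off.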
Results
Binding precedes folding. A projection of the simulation data to two reaction coordinates (the rmsd of p53 to its native structure, and the distance of p53 to the MDM2 binding pocket) suggests that binding of p53 precedes folding of p53, consistent with the “fly-casting” mechanism (Figure 1). The distance vs. rmsd landscape can be manually partitioned into four states: folded-bound (blue), unfolded-bound (green), folded-unbound (red), and unfolded-unbound (cyan). These states were defined using a bound-state distance cutoff of 1.2 nm and an rmsd cutoff of 0.2 nm. Projecting the 600 MSM microstates to this landscape, we find most of the population in the bound states, with only one microstate corresponding to the folded-unbound state (red).
Projections of the simulation data to the two largest tICA components, corresponding to the slowest conformational dynamics, show a very different landscape (Figure 2). The folded-bound state (blue) is composed of a single well-populated basin, closely matching the native co-crystal structure (within 1.3 Å backbone rmsd), with the side chains of F19, W23 and L26 correctly inserted into the binding pocket of MDM2. In contrast, the unfolded-bound state (green) is distributed throughout the tICA landscape. The two predominant basins of the unfolded-bound state correspond to p53 bound in two different misfolded states, each with residue F19 in its native binding groove, but W23 outside of the binding cleft. As can be seen from the eigenvector structure of the MSM (Figure S5), transitions from these basins control the slowest timescales of binding.
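The four-state partition amounts to two threshold tests. The following is a minimal sketch, assuming that each microstate (or frame) is summarized by a single distance and a single rmsd value and that the cutoffs are applied as strict inequalities; the function name and the example values are hypothetical.

```python
BOUND_CUTOFF_NM = 1.2   # distance of p53 to the MDM2 pocket (cutoff from the text)
FOLDED_CUTOFF_NM = 0.2  # rmsd of p53 to its native structure (cutoff from the text)

def classify_state(distance_nm, rmsd_nm):
    """Assign one of the four macrostates from (distance, rmsd).

    Assumption: values strictly below a cutoff count as bound / folded.
    """
    fold_label = "folded" if rmsd_nm < FOLDED_CUTOFF_NM else "unfolded"
    bind_label = "bound" if distance_nm < BOUND_CUTOFF_NM else "unbound"
    return fold_label + "-" + bind_label

# Hypothetical (distance, rmsd) pairs in nm, for illustration only.
examples = [(0.4, 0.13), (0.8, 0.55), (2.5, 0.15), (3.0, 0.70)]
labels = [classify_state(d, r) for d, r in examples]
print(labels)
```

Lumping the 600 microstates into four macrostates then reduces to grouping microstate indices by the label returned for each one.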
Transition pathways and rates. To estimate pathways, fluxes and rates of p53 association, we used Transition Path Theory (TPT), which we briefly describe here and refer readers to other references for more details36–38. In TPT, source states (A) and sink states (B) first need to be defined for the transition process of interest. The remaining states are considered to be intermediate states (I). Next, committor probabilities $q_i^+$, defined as the probability that a trajectory started from state i will reach B before state A, are computed from the MSM transition matrix. The total flux, giving the expected number of observed A → B transitions per time unit τ, is:

$$F = \sum_{i \in A} \sum_{j \notin A} \pi_i T_{ij} q_j^+ ,$$

where $\pi_i$ is the equilibrium population of state i and $T_{ij}$ is the MSM transition probability from state i to state j. The rate of reaction A → B, $k_{AB}$, can then be computed as:

$$k_{AB} = \frac{F}{\tau \sum_{i} \pi_i q_i^-} ,$$

where $q_i^- = 1 - q_i^+$ is the backward committor for dynamics obeying detailed balance.
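A small worked example may make the TPT quantities concrete. The sketch below is not the MSMBuilder implementation used in this work; it assumes the standard TPT expressions for reversible dynamics, with flux F = Σ(i in A) Σ(j not in A) π_i T_ij q_j and rate k_AB = F / (τ Σ_i π_i (1 − q_i)), where q is the forward committor. The three-state transition matrix is hypothetical.

```python
def stationary_distribution(T, n_iter=5000):
    """Power iteration for pi = pi T, with T row-stochastic."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    total = sum(pi)
    return [p / total for p in pi]

def forward_committor(T, source, sink, n_iter=5000):
    """Iterate q_i = sum_j T_ij q_j on intermediates; q = 0 on A, q = 1 on B."""
    n = len(T)
    q = [1.0 if i in sink else 0.0 for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if i not in source and i not in sink:
                q[i] = sum(T[i][j] * q[j] for j in range(n))
    return q

def tpt_rate(T, source, sink, lag_time):
    """Return (flux per lag step, rate k_AB) for the A -> B reaction."""
    n = len(T)
    pi = stationary_distribution(T)
    q = forward_committor(T, source, sink)
    flux = sum(pi[i] * T[i][j] * q[j]
               for i in source for j in range(n) if j not in source)
    # The backward committor is 1 - q under detailed balance.
    k_ab = flux / (lag_time * sum(pi[i] * (1.0 - q[i]) for i in range(n)))
    return flux, k_ab

# Hypothetical 3-state model: 0 = source (A), 1 = intermediate, 2 = sink (B).
T = [[0.90, 0.10, 0.00],
     [0.05, 0.85, 0.10],
     [0.00, 0.02, 0.98]]
q = forward_committor(T, source={0}, sink={2})
flux, k_ab = tpt_rate(T, source={0}, sink={2}, lag_time=5.0)  # lag in ns
print("committors:", q)
print("k_AB (1/ns):", k_ab)
```

For this chain the intermediate committor can be checked by hand: q_1 = T_12 / (T_10 + T_12) = 0.10 / 0.15 = 2/3.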
We first used TPT to estimate overall p53 binding on- and off-rates (kon and koff) from the 600-microstate MSM, using the unfolded-unbound and folded-bound states as the source and sink states for kon, respectively (and vice versa for estimating koff). For comparison, we constructed a four-macrostate MSM by manually lumping the 600 microstates according to our four-state definitions. The results (Table I) show that the kon estimated from the 600-microstate MSM is very close to the experimental kon (within a factor of 2.7). Macrostate lumping into four states further accelerates the dynamic timescales, with kon predictions still within a factor of 5.6. As expected given the available trajectory lengths, koff estimates from both models are severely over-estimated.
A four-state kinetic model predicts an induced-fit mechanism for p53 binding. To analyze whether the binding mechanism follows a conformational selection (CS) or induced fit (IF) mechanism (or aspects of both), we computed the reactive flux for each mechanism according to the method introduced by Hammes et al.40, illustrated in Figure 3. In this model, association of a ligand can occur either through a weak-binding (w) form of p53, or a tight-binding (t) form, with interconversions between these two possible when unbound or bound. Here, we slightly modify our interpretation of the model for use with disordered peptide binding; in our case it is the receptor MDM2 which can select or induce folded (helical) states of p53 TAD peptide. Following Hammes et al., we compute the reactive flux for conformational selection pathways, FCS, and the reactive flux for induced-fit pathways, FIF, both of which depend on [MDM2]f, the free MDM2 concentration. The derivation is shown in the Supporting Information.
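To illustrate the kind of calculation involved, the sketch below follows the general form of a flux description in which each pathway is a sequence of two steps and the pathway flux is the inverse of the summed inverse one-way step fluxes. This is an assumed form written for illustration, not a transcription of the expressions derived in the Supporting Information; all function names, parameter names and numerical values are hypothetical.

```python
def serial_flux(step_fluxes):
    """Flux through sequential steps: inverse of the summed inverse step fluxes."""
    return 1.0 / sum(1.0 / f for f in step_fluxes)

def conformational_selection_flux(k_wt, k_on_t, p_w, p_t, mdm2_free):
    """CS pathway: unbound weak -> unbound tight, then the tight form binds MDM2."""
    return serial_flux([k_wt * p_w, k_on_t * p_t * mdm2_free])

def induced_fit_flux(k_on_w, k_wt_bound, p_w, p_w_bound, mdm2_free):
    """IF pathway: the weak form binds MDM2, then bound weak -> bound tight."""
    return serial_flux([k_on_w * p_w * mdm2_free, k_wt_bound * p_w_bound])

def induced_fit_fraction(f_cs, f_if):
    """Fraction of the binding flux carried by induced-fit pathways."""
    return f_if / (f_cs + f_if)

# Hypothetical rate constants and concentrations, NOT fitted values.
f_cs = conformational_selection_flux(k_wt=1.0e4, k_on_t=1.0e7,
                                     p_w=1.0e-6, p_t=1.0e-9, mdm2_free=1.0e-6)
f_if = induced_fit_flux(k_on_w=1.0e7, k_wt_bound=1.0e5,
                        p_w=1.0e-6, p_w_bound=1.0e-8, mdm2_free=1.0e-6)
print("induced-fit fraction:", induced_fit_fraction(f_cs, f_if))
```

Because the slower step dominates a sum of inverse fluxes, a pathway whose intermediate is barely populated carries little flux even when its individual rate constants are large.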
To obtain the relative amounts of reactive flux that occur by conformational selection vs. induced fit pathways, we use our MSM model to make initial estimates of all eight rates in the four-state kinetic model shown in Figure 3. We tried two different approaches to make these initial estimates: (1) directly from the transition probabilities of a four-macrostate MSM derived for our state classifications, and (2) using TPT with pairs of relevant states selected as the source and sink states. Both sets of estimates are listed in Table S1. We find that the two methods yield very similar results, except that estimates from the transition matrix are more than an order of magnitude larger than the estimate from TPT, due to enforcing detailed balance with a low equilibrium population predicted for the folded-unbound state. Therefore, in the following analysis we use only the parameters estimated from TPT. From this initial estimate, we then scale the two off-rates to reproduce the experimental binding affinity of p53 to MDM2 (see Supporting Information).
Unlike our MSM, which was constructed from simulations performed at a fixed concentration ([p53] = [MDM2] = 7.1 mM), the resulting four-state kinetic model can be used to extrapolate the binding fluxes at any desired concentrations. In all cases, we find that binding is dominated by an induced-fit mechanism, consistent with “fly-casting”. The fraction of flux that occurs by an induced-fit mechanism, FIF/(FCS + FIF), is nearly 100% regardless of the concentrations of p53 and MDM2. This is mainly due to the fact that the simulated helicity of p53 is very low (0.11%, Figure 4).
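The fixed simulation concentration follows from the size of the periodic box. As a rough worked example, assuming one copy of each molecule per box (an assumption, since the box dimensions are not stated here), a concentration of 7.1 mM corresponds to a volume V = 1 / (N_A c):

```python
AVOGADRO = 6.02214076e23   # particles per mole
NM3_PER_LITER = 1.0e24     # 1 L = 1 dm^3 = (1e8 nm)^3

def box_volume_nm3(conc_molar, n_copies=1):
    """Volume (nm^3) holding `n_copies` molecules at the given molar concentration."""
    liters = n_copies / (AVOGADRO * conc_molar)
    return liters * NM3_PER_LITER

volume_nm3 = box_volume_nm3(7.1e-3)      # 7.1 mM, as stated in the text
cube_edge_nm = volume_nm3 ** (1.0 / 3.0)  # edge length if the box were cubic
print("volume (nm^3):", round(volume_nm3, 1))
print("cubic edge (nm):", round(cube_edge_nm, 2))
```

This gives a volume of roughly 234 nm^3, or a cubic edge of about 6.2 nm, which illustrates why a kinetic model is needed to extrapolate to the much lower concentrations relevant in experiments.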
Increased residual helicity leads to enhanced p53 binding and a shift toward conformational selection. To estimate the effect of residual helicity of p53 on binding mechanism and affinity, we use a maximum caliber approach to infer how the rates kwt and ktw between unfolded-unbound states (cyan) and folded-unbound states (red) change in response to new helix-coil equilibrium populations (the relation used is given in the Supporting Information). The helicity of p53 predicted by our four-state model is 0.11%. To model the experimental system of Borcherds et al., we increase the helix population of unbound p53 to 28% and 64%, values measured for wild-type and P27A variants of the p53 TAD19. The inferred rates are shown in Table S3.
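Whatever relation is used to assign the individual rates, the new helix-coil populations fix their ratio through detailed balance: kwt / ktw = p_helix / (1 − p_helix). The sketch below illustrates only this constraint, not the maximum caliber inference itself; it assumes that the quoted helicities are the populations of the folded-unbound state and that the temperature is 300 K, which is not stated in this section.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def helix_coil_constant(helix_fraction):
    """Equilibrium constant K = p_helix / p_coil, equal to k_wt / k_tw."""
    return helix_fraction / (1.0 - helix_fraction)

def helix_free_energy(helix_fraction, temperature=300.0):
    """Free energy of helix formation, -RT ln K, in kcal/mol (assumed T)."""
    return -R_KCAL * temperature * math.log(helix_coil_constant(helix_fraction))

# Helicities from the text: simulated (0.11%), wild-type (28%), P27A (64%).
helicities = {"simulated": 0.0011, "wild-type": 0.28, "P27A": 0.64}
constants = {name: helix_coil_constant(h) for name, h in helicities.items()}
for name, h in helicities.items():
    print(name, "K =", constants[name], "dG =", helix_free_energy(h))
```

Raising the helicity from 0.11% to 64% changes kwt / ktw by more than three orders of magnitude, which is why the balance between pathways can shift once the unbound helix becomes populated.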
The four-state kinetic model predicts that increasing residual helicity increases the flux of conformational selection at low MDM2 concentration; however, in the limit of excess of MDM2, induced-fit binding flux increases to almost 100% (Figure 4). The reason for this is the relatively high value, a key feature of intrinsically disordered proteins that we have calculated directly from the MSM model. In the limit of excess of MDM2, would have to be reduced by several orders of magnitude to convert the binding mechanism to conformational selection. In the limit of excess p53, a shift towards a conformational selection binding mechanism is observed, although a strong preference for induced-fit binding pathways (more than 30% of the binding flux) remains even at high levels of residual helicity (64%) and in the excess of p53.
In agreement with experiment, the four-state model predicts a greater apparent binding affinity of P27A p53 TAD compared to wild-type, with absolute and relative binding free energies similar to experimental values (see Figure 5). The predicted apparent ΔG of binding for p53 wild-type and P27A are −7.5 and −9.0 kcal·mol−1, respectively, while the experimental values are −9.1 ± 0.2 and −10.4 ± 0.1 kcal·mol−1, respectively. We predict that the ΔΔG incurred by increasing the helicity of p53 from 28% to 64% is −1.5 kcal·mol−1, which also agrees very well with experiment (−1.3 ± 0.3 kcal·mol−1).
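These free energies can be restated as dissociation constants through Kd = exp(ΔG / RT), relative to a 1 M standard state. The short sketch below does this conversion and recomputes ΔΔG from the quoted values; the temperature of 298.15 K is an assumption, since it is not stated in this section.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def kd_from_dg(dg_kcal, temperature=298.15):
    """Dissociation constant (M) from a binding free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temperature))

# Apparent binding free energies quoted in the text (kcal/mol).
predicted = {"wild-type": -7.5, "P27A": -9.0}
experimental = {"wild-type": -9.1, "P27A": -10.4}

ddg_predicted = predicted["P27A"] - predicted["wild-type"]
ddg_experimental = experimental["P27A"] - experimental["wild-type"]
# Fold change in Kd implied by the predicted ddG (wild-type relative to P27A).
kd_fold_change = kd_from_dg(predicted["wild-type"]) / kd_from_dg(predicted["P27A"])

print("predicted ddG:", ddg_predicted)
print("experimental ddG:", ddg_experimental)
print("predicted Kd ratio (wild-type / P27A):", kd_fold_change)
```

At this assumed temperature, the predicted ΔΔG of −1.5 kcal·mol−1 corresponds to a roughly 13-fold tighter Kd for P27A than for wild-type.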
Discussion
Recently, Zwier et al. have reported efficient implicit-solvent simulations of p53 TAD peptide binding to MDM2 carried out using a weighted-ensemble path sampling strategy42 on 3500 CPU cores of TACC Stampede for 15 days, with an aggregate simulation time of 120 μs.43 The authors report similarly accurate predictions of on-rates, and a mechanism whereby diffusion-controlled formation of a specific encounter complex is the rate-limiting step. While their study predicts a high helical propensity for the p53 TAD peptide, our study predicts less helicity, possibly due to differences in the force field and solvent model used, as well as differences in initial starting conformations.
An advantage of our approach of parameterizing a four-state binding flux model from a detailed MSM is the ability to extrapolate differences in binding mechanisms that result from various helical propensities of p53 TAD peptide and various ligand and receptor concentrations. At the effective concentrations used in our simulation, residual helicity exerts a large influence on dominant binding flux, but has less influence in excess MDM2 (see Figure 4).
Notably, both our study and the Zwier et al. study suggest a bright future for using adaptive sampling simulations to model protein-peptide binding. With a high-quality MSM of p53 binding now constructed, we aim to exploit new MSM-based adaptive sampling approaches to model binding rates and mechanisms for multiple sequences44. A remaining challenge, of course, is to efficiently sample off-rates as well as on-rates. With the advent of multiple-ensemble MSM techniques45, this too may be within reach in the near future. We expect that the mechanistic detail provided by MSM approaches may suggest new ways to design inhibitors that compete with natural substrates.
Conclusion
We have used ab initio binding simulations and Markov State Models to construct a detailed kinetic network model of p53 TAD peptide binding to MDM2. The MSM predicts binding on-rates in agreement with experiment, as well as an ensemble of encounter complex structures that control the overall binding pathways and rates. Predicted MSM rates, along with experimental affinities, were used to parameterize a four-state kinetic model, which predicts an induced-fit “fly-casting” mechanism over a wide range of concentrations, and shows increased binding affinity for p53 variants with higher amounts of residual structure, in agreement with recent experiments. This work demonstrates how combining detailed all-atom MSMs and simple few-state kinetic models can be very useful in understanding how disordered protein domains bind their target receptors. The results also suggest new ways to design inhibitors that compete with natural substrates, by rationalizing how specific binding modes may modulate key rate processes, in the context of physiological concentrations.
Acknowledgments
The authors thank the participants of Folding@home, without whom this work would not be possible. We thank Dr. Lillian Chong for very helpful feedback on our manuscript. This research was supported in part by the National Science Foundation through major research instrumentation grant number CNS-09-58854 and MCB-1412508.
Footnotes
a) Electronic mail: voelz{at}temple.edu