Abstract
Intrinsically disordered proteins/regions (IDPs/IDRs) are prevalent in allosteric regulation. It was previously thought that intrinsic disorder is favorable for maximizing the allosteric coupling. Here, we propose a comprehensive ensemble model to compare the roles of order-order and order-disorder transitions in the allosteric effect. The model reveals that the MWC pathway (order-order transition) has a higher probability of producing allostery than the EAM pathway (order-disorder transition), suggesting a complicated role of IDPs/IDRs in regulatory proteins. In addition, we obtain an analytic formula for the maximal allosteric coupling response, which shows that a state that is too stable or too unstable is unfavorable for allostery; this result should be helpful for the rational design of allosteric drugs.
Author Summary
The allosteric effect is an important regulatory mechanism in biological processes, in which the binding of a ligand at one site of a protein influences the function of a distinct site. Conventionally, allostery was thought to originate from structural transitions. In recent years, however, intrinsically disordered proteins (IDPs) were found to be widely involved in allosteric regulation despite their lack of ordered structure under physiological conditions. It remains unclear why IDPs are prevalent in allosteric proteins and how they differ from ordered proteins in allostery. Here, we propose a comprehensive ensemble model that includes both ordered and disordered states of a two-domain protein, and investigate the role of various state combinations in the allosteric effect. By sampling the parameter space, we conclude that disordered proteins are less competitive than ordered proteins in performing allostery from a thermodynamic point of view. The prevalence of IDPs in allosteric regulation is likely determined by all of their advantages, not only by their capacity to endow allostery.
Introduction
Allosteric regulation is intrinsic to the control of many metabolic and signal-transduction pathways.(1) It is described as the effect by which the binding of a ligand at one site of a protein influences the function of a distinct site that binds the substrate.(2) Historically, several models have been proposed to explain possible mechanisms of allostery. The classical MWC (Monod-Wyman-Changeux) model(3) explained the allosteric effect on the basis of a cooperative conformational transition of protein oligomers. Taking oxygen binding by hemoglobin as an example [see Fig. 1(a)], the MWC model assumes that the four subunits of hemoglobin are simultaneously in either a relaxed state (R state) or a tense state (T state), and that oxygen binds preferentially to the R state, which shifts the R-T equilibrium. With such a simple assumption, the MWC model nicely explained how the binding of oxygen at one site promotes binding at a remote site. Later, the KNF (Koshland-Nemethy-Filmer) model(4) considered finite subunit interactions and proposed a progressive, step-by-step conformational transition of each domain [Fig. 1(b)]. Both models imply that allosteric processes are closely associated with ligand-driven conformational changes that propagate between the allosterically coupled binding sites. With the development of structural biology, a description of allostery in terms of structural changes was derived,(5) and was used to study allosteric proteins such as lactate dehydrogenase.(6) The structural paradigm also led to the search for specific atomic pathways that connect allosteric sites.(7, 8) Nevertheless, the discovery of dynamic structures and multiple conformations of proteins, such as the multiple orientations of the DNA-binding domains of DNA-binding proteins in the absence of DNA(9) and the intermediate conformation of hemoglobin in solution,(10) suggests more possibilities beyond the simple two-state models.
The discovery of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) has challenged the conventional “structure-function” paradigm.(12–15) IDPs/IDRs do not have ordered structures in the free state under physiological conditions, but they are important in biological signaling and regulation.(16–23) IDPs/IDRs possess some advantages over ordered proteins,(24) such as high specificity coupled with low affinity, which is useful for reversible signaling interactions,(25–28) binding to multiple partners,(29, 30) and rapid turnover, which allows a sensitive response to environmental changes.(12, 19, 31) Therefore, they play crucial roles in a wide range of protein categories,(22) e.g., scaffold proteins,(32) RNA and protein chaperones,(33) transcription factors,(20) and regulators of cellular pathways.(34) In particular, IDPs/IDRs were found to be widely involved in allosteric regulation despite their lack of ordered structures.(35–42) Representative examples include the enzyme aminoglycoside N-(6’)-acetyltransferase II (AAC), which shows local unfolding and switches from positive to negative cooperativity as the temperature changes;(37) and the Doc/Phd toxin-antitoxin system, in which intrinsic disorder gives rise to a complex “conditional cooperativity” that depends on the Doc/Phd ratio.(38, 42)
How can IDPs/IDRs implement an allosteric effect without ordered structures? And why are they so prevalent in allosteric regulation? The answer is related to an emerging view of allostery based on the general landscape theory of protein structure, in which ligand binding stabilizes specific states and shifts the conformational ensemble.(11, 43, 44) The Ensemble Allostery Model (EAM) used the ensemble view to explain the allostery of IDPs,(45–49) see Fig. 1(c). As an example, it described a two-domain system as a four-state ensemble in which each domain has an ordered (R) and a disordered (I) state. The allosteric ligand (A) binds only to the R state of the first domain, while the substrate (B) binds only to the R state of the second domain, i.e., the disordered states have no affinity for ligand or substrate. When the interface-interaction free energy between the two ordered domains is negative, binding of the ligand A stabilizes the RR state and thus facilitates the binding of the substrate B, resulting in a positive allosteric effect. Similarly, a negative allosteric effect arises when the interface interaction is unfavorable. The EAM also provided insight into why IDPs/IDRs are so prevalent in allosteric regulation: it was shown that high allosteric intensity is accompanied by a high probability of disordered (I) states.(45) However, the EAM considers only the order-disorder (R-I) transition and lacks the order-order (R-T) transition that underlies the allostery of ordered proteins in the MWC model. Therefore, with separate EAM and MWC models, it is impossible to determine whether disordered or ordered proteins are more advantageous in allosteric regulation. To obtain a full view of the competition between ordered and disordered proteins in the allosteric effect, here we propose a comprehensive ensemble model that considers both order-disorder and order-order transitions. In this comprehensive model, the EAM and MWC mechanisms become two pathways for the allostery of the system, and thus their roles can be quantitatively evaluated.
Models
The comprehensive ensemble model
Our proposed model describes a two-domain protein system, see Fig. 2. It combines components of both the MWC and the EAM models. Each domain has three states: R (Relaxed), T (Tense) and I (Disordered). Consistent with the MWC model, R and T are incompatible, and thus the combinations “RT” and “TR” are forbidden in the resulting protein states. As in the EAM model, the I state of a domain is disordered and does not have any interface interaction with the adjacent domain, and it does not bind any ligand or substrate because it lacks ordered structure. As a result, there are seven possible combinations of protein states, which are listed in Fig. 2 together with the formulas for their free energy, statistical weight and probability in the absence of ligand and substrate. Six free-energy parameters (ΔGR1, ΔGR2, Δgint,R, Δgint,T, ΔGRT1, ΔGRT2) are the basic parameters of the model and determine the ensemble distribution. The substrate B binds only to the R state of one (yellow) domain. The allosteric ligand A binds to the other (blue) domain, but there are two binding modes: in the A-R binding mode, A binds only to the R state of the blue domain, while in the A-T binding mode, A binds only to the T state of the blue domain. The two binding modes are taken into account here to enable both positive and negative allosteric effects for ordered proteins (MWC mechanism), which makes a comparison between the roles of ordered and disordered proteins possible. For example, in a subsystem consisting of the RR and TT states, binding of A in the A-R binding mode increases the fraction of the RR state and thus enhances the subsequent binding of B (activation), while binding of A in the A-T binding mode weakens the binding of B (inhibition).
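The state space described above can be enumerated directly. The following is a minimal sketch, not the authors' code; it assumes that the first letter of a state label refers to the blue (A-binding) domain and the second letter to the yellow (B-binding) domain, which matches the labels ARI and IR used in the text. The names `allowed_states`, `BINDS_B`, `BINDS_A_R_MODE` and `BINDS_A_T_MODE` are hypothetical.

```python
from itertools import product

DOMAIN_STATES = ("R", "T", "I")  # Relaxed, Tense, Disordered


def allowed_states():
    """Enumerate two-domain states, excluding the forbidden RT and TR pairs."""
    forbidden = {("R", "T"), ("T", "R")}
    return ["".join(pair)
            for pair in product(DOMAIN_STATES, repeat=2)
            if pair not in forbidden]


STATES = allowed_states()

# Assumed labeling: first letter = blue domain (ligand A),
# second letter = yellow domain (substrate B).
BINDS_B = [s for s in STATES if s[1] == "R"]          # B binds only to R
BINDS_A_R_MODE = [s for s in STATES if s[0] == "R"]   # A-R binding mode
BINDS_A_T_MODE = [s for s in STATES if s[0] == "T"]   # A-T binding mode

print(STATES)
```

Excluding the two forbidden pairs from the nine possible combinations leaves the seven states of the comprehensive model.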
Definitions of contribution of ordered and disordered protein pathways to allostery of the comprehensive ensemble model
Adding ligand A to the system redistributes the probabilities of the protein ensemble. The allosteric effect is directly related to the change, caused by adding A, in the probability of the states that can bind substrate B. Following the EAM model,(45) we define the allosteric coupling response (CR) as
$$\mathrm{CR} = \frac{P_{X,[A]} - P_{X,[A]=0}}{-\Delta g_{\mathrm{Lig},A}/RT} \tag{1}$$
to quantitatively measure the allosteric intensity of a given system. Here, X denotes the states that can bind B, so PX,[A] is the probability of the states that can bind B in the presence of ligand A, and PX,[A]=0 is the corresponding probability when A is absent. In the comprehensive ensemble model proposed here, for the A-R binding mode we have PX,[A] = PARR + PRR + PIR, and for the A-T binding mode we have PX,[A] = PRR + PIR. ΔgLig,A is the stabilizing free energy that adding ligand A contributes to the states that can bind A, which is determined as
$$\Delta g_{\mathrm{Lig},A} = -RT \ln\left(1 + K_{a,A}[A]\right) \tag{2}$$
where Ka,A is the intrinsic equilibrium constant of the binding reaction for A. For example, in the A-R binding mode, Ka,A is the association constant for the reactions A + RR = ARR and A + RI = ARI, which gives the equilibrium distributions
$$\frac{P_{ARR} + P_{RR}}{P_{RR}} = \frac{P_{ARI} + P_{RI}}{P_{RI}} = 1 + K_{a,A}[A] = \exp\left(-\frac{\Delta g_{\mathrm{Lig},A}}{RT}\right) \tag{3}$$
clearly demonstrating the nature of the stabilizing free energy ΔgLig,A. In our study, we fixed ΔgLig,A = −3.0 kcal/mol at a physiological temperature of T = 310.15 K unless otherwise specified.
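The redistribution of the ensemble upon adding A can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it takes CR to be the change in the probability of B-binding states divided by −ΔgLig,A/RT (a form that reproduces the limit of about 0.172 quoted in the Results), and it merges each ligand-bound form (e.g., ARR) into its parent state (RR) by lowering the free energy of every A-binding state by ΔgLig,A. The function names and the dictionary interface are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def probabilities(free_energies, dg_lig=0.0, binds_a=(), temperature=310.15):
    """Boltzmann probabilities; states in binds_a are shifted by dg_lig (<= 0)."""
    rt = R_KCAL * temperature
    weights = {
        state: math.exp(-(g + (dg_lig if state in binds_a else 0.0)) / rt)
        for state, g in free_energies.items()
    }
    z = sum(weights.values())
    return {state: w / z for state, w in weights.items()}


def coupling_response(free_energies, dg_lig, binds_a, binds_b, temperature=310.15):
    """Assumed definition: CR = (P_X with A - P_X without A) / (-dg_lig / RT)."""
    rt = R_KCAL * temperature
    before = probabilities(free_energies, 0.0, binds_a, temperature)
    after = probabilities(free_energies, dg_lig, binds_a, temperature)
    delta_px = sum(after[s] - before[s] for s in binds_b)
    return delta_px / (-dg_lig / rt)


# Two-state (RR/TT) illustration in the A-R binding mode, energies in kcal/mol.
two_state = {"RR": 1.5, "TT": 0.0}
cr = coupling_response(two_state, dg_lig=-3.0, binds_a=("RR",), binds_b=("RR",))
print(round(cr, 3))
```

The same two functions accept all seven states of the comprehensive model; only the sets `binds_a` and `binds_b` change between the A-R and A-T binding modes.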
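Because the comprehensive model includes all the states of the MWC model and the EAM model, we can also view the comprehensive system as consisting of three subsystems: the MWC subsystem, the EAM subsystem and the Others subsystem, and thus the allosteric effect can be approximately decomposed into three pathways (Fig. 3). The MWC pathway is an order-order transition involving the states RR and TT, the EAM pathway is a disorder-order transition involving the states RR, RI, IR and II, and the Others pathway is an extra component of the comprehensive model involving RR and the remaining states (TI and IT) that are neglected in the MWC and EAM pathways. The allosteric coupling response (CR) of each subsystem can be defined and calculated separately. Taking the MWC subsystem as an example (under the A-R binding mode), we have
$$\mathrm{CR}^{(\mathrm{MWC})} = \frac{P^{(\mathrm{MWC})}_{ARR} + P^{(\mathrm{MWC})}_{RR} - P^{(\mathrm{MWC})}_{RR,[A]=0}}{-\Delta g_{\mathrm{Lig},A}/RT} \tag{4a}$$
where the superscript “(MWC)” indicates that the related probabilities of states are defined (normalized) within the MWC subsystem. Similarly, CR for the EAM subsystem and the Others subsystem are determined by
$$\mathrm{CR}^{(\mathrm{EAM})} = \frac{P^{(\mathrm{EAM})}_{ARR} + P^{(\mathrm{EAM})}_{RR} + P^{(\mathrm{EAM})}_{IR} - P^{(\mathrm{EAM})}_{RR,[A]=0} - P^{(\mathrm{EAM})}_{IR,[A]=0}}{-\Delta g_{\mathrm{Lig},A}/RT} \tag{4b}$$
$$\mathrm{CR}^{(\mathrm{Others})} = \frac{P^{(\mathrm{Others})}_{ARR} + P^{(\mathrm{Others})}_{RR} - P^{(\mathrm{Others})}_{RR,[A]=0}}{-\Delta g_{\mathrm{Lig},A}/RT} \tag{4c}$$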
With a set of values of the basic parameters (ΔGR1, ΔGR2, Δgint,R, Δgint,T, ΔGRT1, ΔGRT2), it is thus straightforward to calculate the probabilities of all the states with and without ligand A, as well as CR for the whole system (CRtot) and for the subsystems (CRMWC, CREAM, CROthers). The contribution of a pathway to the total allostery of the comprehensive system depends not only on CR of the corresponding subsystem, but also on the proportion of the subsystem states in the whole system. Therefore, the contribution ratio of the MWC pathway to the allostery of the comprehensive system, WeightMWC, is approximately defined in Eq. (5a) from CRMWC, the proportion of the MWC states (RR and TT) in the whole system, and CRtot.
It stands for the weight of the MWC pathway in the allosteric effect. When there are only RR and TT states before adding ligand A, the comprehensive model degenerates to the MWC model and Eq. (5a) gives WeightMWC = 1. WeightEAM and WeightOthers are defined similarly for the EAM and the Others pathways.
Note that WeightMWC, WeightEAM and WeightOthers are metrics for the contributions of the three pathways to the allosteric effect of the comprehensive system, but their sum is not necessarily equal to 1.0, although the deviation is usually small. The related equations under the A-T binding mode can be found in the Supporting Information.
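The decomposition into pathways can be sketched as follows. This is a hypothetical illustration rather than the authors' formula: it assumes that the weight of a pathway is the CR of its subsystem (with probabilities renormalized inside the subsystem), multiplied by the proportion of the subsystem states in the whole system before adding A, and divided by CRtot. That choice reproduces the stated limit WeightMWC = 1 when only RR and TT are populated, but the exact normalization used in Eq. (5a) is an assumption here. All function names are made up for the example, and the A-R binding mode is the default.

```python
import math

R_KCAL = 1.987204e-3  # kcal/(mol K)
SUBSYSTEMS = {
    "MWC": ("RR", "TT"),
    "EAM": ("RR", "RI", "IR", "II"),
    "Others": ("RR", "TI", "IT"),
}


def boltzmann(g, dg_lig, binds_a, rt):
    """Probabilities of the states in g; A-binding states are shifted by dg_lig."""
    w = {s: math.exp(-(e + (dg_lig if s in binds_a else 0.0)) / rt)
         for s, e in g.items()}
    z = sum(w.values())
    return {s: v / z for s, v in w.items()}


def cr(g, dg_lig, binds_a, binds_b, rt):
    """Assumed CR: change in P(B-binding states) divided by -dg_lig/RT."""
    p0 = boltzmann(g, 0.0, binds_a, rt)
    p1 = boltzmann(g, dg_lig, binds_a, rt)
    return sum(p1[s] - p0[s] for s in binds_b if s in g) / (-dg_lig / rt)


def pathway_weights(g, dg_lig, binds_a=("RR", "RI"), binds_b=("RR", "IR"),
                    temperature=310.15):
    """Return CR_tot and the assumed weight of each pathway."""
    rt = R_KCAL * temperature
    p0 = boltzmann(g, 0.0, binds_a, rt)
    cr_tot = cr(g, dg_lig, binds_a, binds_b, rt)  # caller must ensure cr_tot != 0
    weights = {}
    for name, members in SUBSYSTEMS.items():
        sub = {s: g[s] for s in members}          # renormalized inside cr()
        share = sum(p0[s] for s in members)       # assumed: proportion at [A] = 0
        weights[name] = share * cr(sub, dg_lig, binds_a, binds_b, rt) / cr_tot
    return cr_tot, weights


# Limiting case: only RR and TT are populated (other states at +50 kcal/mol).
g_mwc_only = {"RR": 1.5, "TT": 0.0, "RI": 50.0, "IR": 50.0,
              "II": 50.0, "TI": 50.0, "IT": 50.0}
cr_tot, weights = pathway_weights(g_mwc_only, dg_lig=-3.0)
print(cr_tot, weights)
```

Because RR belongs to all three subsystems, the three weights computed this way need not sum to exactly 1, in line with the remark above.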
Results
Limits for the maximal allosteric response
With a given set of parameters for protein state stability (ΔGR1, ΔGR2, ΔGRT1, ΔGRT2, Δgint,R, Δgint,T) and protein-ligand interaction (ΔgLig,A) of the proposed comprehensive ensemble model, we can calculate the ensemble distribution, the allosteric coupling response (CR) and the contributions of different pathways with the formalism described above. CR as a function of Δgint,R and Δgint,T is shown in Fig. 4(a,b) as an example with the other parameters fixed. It reveals that a suitable combination of Δgint,R and Δgint,T is required to maximize the allosteric effect. Under the A-R binding mode, the model can afford both positive (CR > 0) and negative (CR < 0) allosteric effects, while only negative effects occur under the A-T binding mode. The highest CR achieved is about 0.17. To inspect globally how likely allostery is to occur, we assume that the stability free-energy parameters (ΔGR1, ΔGR2, ΔGRT1, ΔGRT2, Δgint,R, Δgint,T) vary randomly within [−8, +8] kcal/mol, and determine the distribution of CR for the two binding modes with ΔgLig,A = −3 kcal/mol [Fig. 4(c,d)]. For the majority of parameter sets the resulting allostery is weak, giving a sharp peak at CR = 0 for both binding modes [Fig. 4(c,d)]. Indeed, only 6.3% of parameter sets produce |CR| > 0.1 under the A-R binding mode. Remarkably, CR has boundaries at around ±0.172. In other words, no matter how the state stabilities of the protein are optimized, it is impossible to achieve a CR value higher than 0.172.
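The random sampling described above can be sketched as a small Monte Carlo loop. Two points are assumptions, not taken from the paper: the mapping from the six parameters to state free energies (Fig. 2 is not reproduced here, so the function `state_free_energies` below is a hypothetical parameterization with II as the zero of free energy), and the form of CR (change in the probability of B-binding states divided by −ΔgLig,A/RT). The sketch is meant to show the procedure, not to reproduce the reported percentages.

```python
import math
import random

R_KCAL = 1.987204e-3          # kcal/(mol K)
RT = R_KCAL * 310.15          # kcal/mol at 310.15 K


def state_free_energies(dG_R1, dG_R2, dG_RT1, dG_RT2, dg_int_R, dg_int_T):
    """ASSUMED parameterization (hypothetical): II is the reference state."""
    return {
        "II": 0.0,
        "RI": dG_R1,
        "IR": dG_R2,
        "RR": dG_R1 + dG_R2 + dg_int_R,
        "TI": dG_R1 + dG_RT1,
        "IT": dG_R2 + dG_RT2,
        "TT": dG_R1 + dG_RT1 + dG_R2 + dG_RT2 + dg_int_T,
    }


def coupling_response(g, dg_lig, binds_a, binds_b=("RR", "IR")):
    """Assumed CR: change in P(B-binding states) divided by -dg_lig/RT."""
    def probs(shift):
        w = {s: math.exp(-(e + (shift if s in binds_a else 0.0)) / RT)
             for s, e in g.items()}
        z = sum(w.values())
        return {s: v / z for s, v in w.items()}
    p0, p1 = probs(0.0), probs(dg_lig)
    return sum(p1[s] - p0[s] for s in binds_b) / (-dg_lig / RT)


def sample_cr(n, dg_max=8.0, dg_lig=-3.0, binds_a=("RR", "RI"), seed=0):
    """Draw n parameter sets uniformly in [-dg_max, +dg_max] and return their CR."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        params = [rng.uniform(-dg_max, dg_max) for _ in range(6)]
        values.append(coupling_response(state_free_energies(*params), dg_lig, binds_a))
    return values


cr_ar = sample_cr(2000)                          # A-R binding mode
cr_at = sample_cr(2000, binds_a=("TT", "TI"))    # A-T binding mode
strong_fraction = sum(abs(c) > 0.1 for c in cr_ar) / len(cr_ar)
print(max(cr_ar), min(cr_ar), max(cr_at), min(cr_at), strong_fraction)
```

Whatever parameterization is chosen, the sampled values should stay inside the analytic bounds discussed next, and the A-T binding mode should give only non-positive CR.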
The boundary limits of CR can be well explained analytically. Taking the MWC model as a simplified example, there are two states (RR and TT) with only one stability parameter (ΔGi = GRR − GTT), which determines the probability of the RR state without ligand to be
$$P_{RR} = \frac{1}{1 + \exp\left(\Delta G_i/RT\right)} \tag{6}$$
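CR can then be written as a function of PRR and ΔgLig,A as
$$\mathrm{CR} = \frac{RT}{-\Delta g_{\mathrm{Lig},A}} \left[ \frac{P_{RR}\, e^{-\Delta g_{\mathrm{Lig},A}/RT}}{1 - P_{RR} + P_{RR}\, e^{-\Delta g_{\mathrm{Lig},A}/RT}} - P_{RR} \right] \tag{7}$$
under the A-R binding mode. The relations among PRR, ΔGi and CR are plotted in Fig. 5(a) for ΔgLig,A = −3 kcal/mol. CR is equal to 0 at either PRR = 0 or PRR = 1, i.e., an RR state that is too stable or too unstable is unfavorable to allostery. CR reaches its maximum of about 0.172 at PRR ≈ 0.081. PRR depends on ΔGi in a switch-like manner. A great many ΔGi values give PRR close to 0 or 1, and result in small CR and weak allostery. This provides a clue for understanding the dominant peak at CR = 0 in Fig. 4(c,d). Based on Eq. (7), the maximization of CR can be solved analytically with
$$\frac{\partial\, \mathrm{CR}}{\partial P_{RR}} = 0$$
to give
$$\mathrm{CR}_{\max} = \frac{RT}{-\Delta g_{\mathrm{Lig},A}} \cdot \frac{e^{-\Delta g_{\mathrm{Lig},A}/2RT} - 1}{e^{-\Delta g_{\mathrm{Lig},A}/2RT} + 1} \tag{8}$$
at the optimized PRR of
$$P_{RR}^{\mathrm{opt}} = \frac{1}{1 + e^{-\Delta g_{\mathrm{Lig},A}/2RT}}$$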
Eq. (8) remains valid for the comprehensive ensemble model (see Supporting Information). CRmax is plotted in Fig. 5(b) as a function of −ΔgLig,A/RT (note that ΔgLig,A < 0). It decreases with increasing −ΔgLig,A/RT, and reaches a value of 0.172 at ΔgLig,A = −3 kcal/mol and T = 310.15 K, consistent with the observation in Fig. 4. Eq. (8) gives an analytical result for the limits of CR when the state stabilities of the protein are optimized, and should be useful in studying the allosteric capacity of proteins.
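The numbers quoted for the two-state case can be checked with a few lines of code. The sketch below assumes the two-state form of CR used in this section (change in PRR upon binding divided by −ΔgLig,A/RT); under that assumption the maximum can be written compactly as tanh(x/4)/x with x = −ΔgLig,A/RT, which is algebraically the same as the ratio of exponentials. Function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # kcal/(mol K)


def cr_two_state(p_rr, dg_lig, temperature=310.15):
    """CR of the RR/TT model in the A-R binding mode for a given ligand-free P_RR."""
    x = -dg_lig / (R_KCAL * temperature)
    k = math.exp(x)                       # statistical-weight gain of RR upon binding A
    p_with_a = p_rr * k / (1.0 - p_rr + p_rr * k)
    return (p_with_a - p_rr) / x


def cr_max(dg_lig, temperature=310.15):
    """Closed-form maximum of cr_two_state over P_RR."""
    x = -dg_lig / (R_KCAL * temperature)
    return math.tanh(x / 4.0) / x


def p_rr_opt(dg_lig, temperature=310.15):
    """Ligand-free P_RR at which CR is maximal."""
    x = -dg_lig / (R_KCAL * temperature)
    return 1.0 / (1.0 + math.exp(x / 2.0))


# Brute-force check of the closed form on a fine grid of P_RR values.
grid_max = max(cr_two_state(i / 100000.0, -3.0) for i in range(1, 100000))
print(round(cr_max(-3.0), 3), round(p_rr_opt(-3.0), 3), round(grid_max, 3))
```

In this form the limit for weak binding (x → 0) is 1/4, and the maximum falls off as the ligand becomes more stabilizing, which matches the decreasing trend described for Fig. 5(b).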
The weight of MWC pathway is significantly higher than that of EAM pathway
The weights of the three pathways (MWC, EAM and Others) in the allostery of the comprehensive system were analyzed numerically with the stability free-energy parameters (ΔGR1, ΔGR2, ΔGRT1, ΔGRT2, Δgint,R, Δgint,T) varying randomly within [−8, +8] kcal/mol. The resulting average weights are shown in Fig. 6(a) as functions of CR. For a positive allosteric effect (CR > 0), the weight of the MWC pathway is much larger than that of the EAM pathway, indicating that the MWC pathway holds an advantage over the EAM pathway in this case. For a negative allosteric effect, CR under the A-R binding mode mainly comes from the EAM pathway, while under the A-T binding mode CR mainly comes from the MWC and Others pathways. The reason is that when A binds to R, a decrease of the RR state is not allowed in the MWC subsystem, so its weight is almost zero or even negative according to Eq. (5a), while an IR→RI transition of the EAM pathway dominates the negative allosteric response. On the other hand, when A binds to T, it has no effect on the state distribution in the EAM subsystem, so the weight of the EAM pathway is always zero.
The capacity of the MWC or the EAM pathway for allostery depends not only on its weight in a comprehensive system [Fig. 6(a)] but also on the probability that the system affords an allosteric effect [P(CR), see Fig. 4(b)]. Therefore, the probability PMWC(CR) of an allosteric effect with a given CR being undertaken by the MWC pathway is calculated by combining P(CR) with the weight of the MWC pathway at that CR.
It describes the probability that a randomly chosen parameter set possesses an allosteric effect CR via the MWC pathway. The formulas for the EAM and Others pathways can be written similarly. The calculated results are shown in Fig. 6(b). PMWC(CR) and POthers(CR) have sharp peaks near the positive allostery limit CRmax in the A-R binding mode and near the negative allostery limit −CRmax in the A-T binding mode, which will be discussed in detail below. More importantly, if we take the simplified approach of adding the curves of the A-R and A-T binding modes for each pathway, PMWC(CR) is much larger than PEAM(CR) for strong allosteric effects. Therefore, according to the comprehensive ensemble model, the MWC pathway is more important in allosteric effects than the EAM pathway.
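One plausible way to write the quantity discussed above, given here only as an assumption consistent with the verbal description (the distribution of CR combined with the average pathway weight at that CR), is the product form below. The angle brackets denote an average over sampled parameter sets that yield the same CR; this notation is introduced for the illustration and is not taken from the paper.

```latex
P_{\mathrm{MWC}}(\mathrm{CR}) \;\approx\; P(\mathrm{CR})\,
  \big\langle \mathrm{Weight}_{\mathrm{MWC}} \big\rangle_{\mathrm{CR}},
\qquad
P_{\mathrm{EAM}}(\mathrm{CR}) \;\approx\; P(\mathrm{CR})\,
  \big\langle \mathrm{Weight}_{\mathrm{EAM}} \big\rangle_{\mathrm{CR}},
\qquad
P_{\mathrm{Others}}(\mathrm{CR}) \;\approx\; P(\mathrm{CR})\,
  \big\langle \mathrm{Weight}_{\mathrm{Others}} \big\rangle_{\mathrm{CR}}
```

With this reading, a pathway can dominate at a given CR only if such CR values are reachable at all and the pathway carries most of the weight there, which is the two-factor argument made in the text.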
Probability of strong allostery first increases and then decreases when the ΔGi range increases
The distribution of allostery and the pathway contributions were investigated above with the free-energy parameters (ΔGR1, ΔGR2, ΔGRT1, ΔGRT2, Δgint,R, Δgint,T) of the comprehensive model varying randomly within [−8, +8] kcal/mol. The results may change for a different range. In Fig. 7(a), the probabilities of an allosteric effect with a given CR being undertaken by the three pathways are plotted for various ranges [−ΔGmax, +ΔGmax] of the free-energy parameters. The sharp peaks of PMWC and POthers near the positive allostery limit (CRmax = 0.172) observed previously are absent when the variation range (ΔGmax) is small, e.g., ΔGmax = 1 kcal/mol. In Fig. 7(b), the probabilities of CR > 0.171 for the three pathways are plotted as functions of ΔGmax. It clearly shows that the MWC and the Others pathways have a similar tendency: the probability is zero below a critical ΔGmax (which is smaller for the MWC pathway), then increases quickly, and finally decreases slowly.
The feature observed in Fig. 7 can be qualitatively explained with the simplified two-state model (Fig. 5). The maximal CR is achieved at PRR = 0.081, which corresponds to a free-energy difference of ΔGi (= GRR − GTT) = −ΔgLig,A/2 = 1.5 kcal/mol. When the variation range of the free-energy parameters is small, the resulting ΔGi cannot reach the optimal value for the maximal CR, giving the zero value in Fig. 7(b) and the absence of the sharp peak near CRmax in the panel with ΔGmax = 1 kcal/mol in Fig. 7(a). When the variation range of the free-energy parameters is large enough, the optimal value of ΔGi can always be reached by some parameter sets, but the total number of possible parameter sets increases with the variation range. The probability of maximal CR, defined as the ratio of the number of optimal parameter sets to the total number, therefore decreases with increasing variation range, as observed in Fig. 7(b).
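The rise-then-fall behavior can be reproduced in the two-state model alone. The sketch below is an illustration under the assumptions already used for the two-state case (Boltzmann PRR from ΔGi, and CR equal to the change in PRR divided by −ΔgLig,A/RT); it scans ΔGi uniformly over [−ΔGmax, +ΔGmax] on a grid instead of sampling six parameters, so it is a simplified stand-in for Fig. 7(b), not a reproduction of it. Function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3          # kcal/(mol K)
RT = R_KCAL * 310.15          # kcal/mol at 310.15 K


def cr_two_state(dG_i, dg_lig=-3.0):
    """CR of the RR/TT model (A-R binding mode) as a function of dG_i = G_RR - G_TT."""
    x = -dg_lig / RT
    p = 1.0 / (1.0 + math.exp(dG_i / RT))   # ligand-free P_RR
    k = math.exp(x)
    return (p * k / (1.0 - p + p * k) - p) / x


def fraction_strong(dG_max, threshold=0.171, n=20001):
    """Fraction of dG_i values, uniform on [-dG_max, +dG_max], with CR > threshold."""
    grid = [-dG_max + 2.0 * dG_max * i / (n - 1) for i in range(n)]
    return sum(cr_two_state(g) > threshold for g in grid) / n


for dG_max in (1.0, 2.0, 3.0, 5.0, 8.0, 13.0):
    print(dG_max, fraction_strong(dG_max))
```

Below a critical range the window of near-optimal ΔGi is simply out of reach; once it is reachable, its fixed width is diluted by the growing range, so the fraction decays roughly as 1/ΔGmax.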
Two-state transition is the main mechanism for strong allostery
The comprehensive ensemble model includes seven states and three subsystems/pathways. How do they coordinate in producing the allosteric effect? For example, do the pathways repel each other in a system? How many states play a significant role in a system? Here, we investigate the interplay between different states and different subsystems/pathways in the allosteric process.
To measure the extent of mixing of subsystems and pathways, we classify each system (with a certain set of ΔGi values) into one of four categories: single subsystem with single pathway (S,S), single subsystem with mixing pathways (S,M), mixing subsystems with single pathway (M,S), and mixing subsystems with mixing pathways (M,M). If the sum of the state probabilities of any one subsystem is larger than 0.99 both before and after adding ligand, the system is classified as a single subsystem; otherwise it belongs to the mixing-subsystems class. A single pathway is defined as the case where the weight of one pathway is larger than 0.99 and the absolute values of the weights of the other pathways are less than 0.01; otherwise the system belongs to the mixing-pathways class. For example, if a system contains only RR and TT states, then it simply belongs to the (S,S) category. The results are shown in Fig. 8(a). When the variation range (ΔGmax) of the free-energy parameters is small, mixing subsystems with mixing pathways (M,M) dominate in most cases. When ΔGmax is larger, the proportion of single subsystem with single pathway (S,S) increases while that of the (M,M) type decreases. More importantly, the (S,S) proportion increases with increasing |CR|. The system tends to behave as a pure subsystem with a pure pathway mechanism at strong allostery.
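The classification rule above translates almost literally into code. The following is a minimal sketch with a hypothetical interface: `sub_before` and `sub_after` map each subsystem name to the summed probability of its states before and after adding ligand, and `weights` maps each pathway name to its weight. The thresholds are those stated in the text.

```python
def classify(sub_before, sub_after, weights, p_cut=0.99, w_hi=0.99, w_lo=0.01):
    """Return ('S' or 'M' for subsystems, 'S' or 'M' for pathways)."""
    # Single subsystem: one subsystem holds > p_cut of the population
    # both before and after adding ligand.
    single_subsystem = any(
        sub_before[name] > p_cut and sub_after[name] > p_cut
        for name in sub_before
    )
    # Single pathway: one weight > w_hi and all other |weights| < w_lo.
    single_pathway = any(
        weights[name] > w_hi
        and all(abs(w) < w_lo for other, w in weights.items() if other != name)
        for name in weights
    )
    return ("S" if single_subsystem else "M", "S" if single_pathway else "M")


# A system that contains only RR and TT states.
pure_mwc = classify(
    sub_before={"MWC": 1.0, "EAM": 0.08, "Others": 0.08},
    sub_after={"MWC": 1.0, "EAM": 0.92, "Others": 0.92},
    weights={"MWC": 1.0, "EAM": 0.0, "Others": 0.0},
)
# A system with population and weight spread over several subsystems.
mixed = classify(
    sub_before={"MWC": 0.6, "EAM": 0.5, "Others": 0.4},
    sub_after={"MWC": 0.7, "EAM": 0.6, "Others": 0.5},
    weights={"MWC": 0.5, "EAM": 0.3, "Others": 0.2},
)
print(pure_mwc, mixed)
```

Because RR is shared by all three subsystems, the subsystem probabilities in the inputs can add up to more than 1; the rule only asks whether any one of them exceeds the cutoff at both time points.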
A clearer view is obtained by looking at the proportion of systems that implement allostery via a simple two-state transition mechanism. Here we specify that a system has a two-state transition mechanism if the summed probability of two particular states of the system is larger than 0.99 both before and after binding ligand A. The possible two-state transitions for a positive allosteric effect include “II➔RR”, “TT➔RR”, “TI➔RR” and “IT➔RR”. For a negative allosteric effect, the only possible two-state transition is “IR➔RI”. The proportion of systems with a simple two-state transition is shown in Fig. 8(b). With larger ΔGmax, the proportion of two-state transitions is higher. The proportion has a sharp peak at ±CRmax. Therefore, the two-state transition is the major mechanism for strong allostery even in the comprehensive ensemble model.
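The two-state criterion can be tested with a short search over state pairs. This sketch assumes that ligand-bound forms are merged into their parent states (for example, ARR is counted as RR), which the text does not spell out; the function name `two_state_pair` is hypothetical.

```python
from itertools import combinations


def two_state_pair(p_before, p_after, cutoff=0.99):
    """Return the first pair of states whose summed probability exceeds the
    cutoff both before and after adding ligand, or None if no pair does.

    Assumption: ligand-bound forms are merged into their parent state.
    """
    for a, b in combinations(sorted(p_before), 2):
        if (p_before[a] + p_before[b] > cutoff
                and p_after[a] + p_after[b] > cutoff):
            return (a, b)
    return None


# A TT -> RR switch: almost all population sits in RR and TT at both time points.
before = {"RR": 0.080, "TT": 0.915, "RI": 0.001, "IR": 0.001,
          "II": 0.001, "TI": 0.001, "IT": 0.001}
after = {"RR": 0.915, "TT": 0.080, "RI": 0.001, "IR": 0.001,
         "II": 0.001, "TI": 0.001, "IT": 0.001}
switch = two_state_pair(before, after)

# A diffuse ensemble in which no pair of states dominates.
diffuse = {s: 1.0 / 7.0 for s in before}
none_found = two_state_pair(diffuse, diffuse)
print(switch, none_found)
```

A system in which a single state already holds more than 0.99 of the population also passes this test with any partner state; whether such cases should count as two-state transitions is a choice left open here.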
The existence of two-state transitions and of single subsystems/pathways is also reflected in the state distribution patterns. The distributions of the RR state and of the states of the three subsystems are shown in Fig. 8(c) for systems with CR ≈ 0.16. The distribution of PRR has two obvious peaks, labeled <1> and <3>. For convenience, in Fig. 8(c) we also plot the theoretical CR ~ PRR curve of the two-state model. The crossing points between the CR ~ PRR curve and the horizontal line CR = 0.16 give the PRR values that achieve an allosteric effect of CR = 0.16 in the two-state model. The PRR values of the crossing points coincide with the peak positions <1> and <3> of the simulated PRR distribution, suggesting that the strong allostery (with CR = 0.16) of the comprehensive model mainly occurs via a two-state mechanism (note that RR exists in all possible two-state transitions for positive CR, including “II➔RR”, “TT➔RR”, “TI➔RR” and “IT➔RR”). There is also some nonzero PRR distribution (<2>) between the two peaks, which would be expected to have CR higher than 0.17 in the two-state model. The reason is that introducing additional IR and RI populations decreases CR (see Supporting Information). This also explains the intriguing result that there is no distribution outside <1> and <3>, because PRR values outside this range cannot give a CR as large as 0.16. When ΔGmax increases to 13 kcal/mol, the PRR distribution is enriched at <1> and <3> and reduced at <2>, suggesting an enrichment of the two-state transition mechanism. Similarly, for the distribution of the MWC pathway states, the PRR + PTT peaks at <1> and <3> correspond to systems dominated by other pathways (EAM or Others), so that PRR + PTT = PRR and the peak positions are identical to those for PRR. At <5>, PRR + PTT = 1 corresponds to systems dominated by the MWC pathway. <2> and <4> represent hybrid cases. The results for the population distributions of the EAM and Others subsystems are similar (data not shown). They confirm that strong allostery in the comprehensive ensemble model is dominated by a single pathway and the two-state transition mechanism.
Discussion
Possible reasons for the prevalence of IDPs/IDRs in allosteric regulation
IDPs/IDRs are much more abundant in regulatory proteins,(20, 23) and are also widely involved in allosteric processes.(35–40, 42) A possible explanation for the prevalence of IDPs/IDRs in allosteric regulation was provided by the EAM model, which suggested that intrinsic disorder can maximize the capacity for allosteric coupling.(45) However, our comprehensive ensemble model reveals that the order-disorder transition (EAM mechanism) is actually less competitive than the order-order transition (MWC mechanism) in affording allosteric effects, especially strong allostery. This shows that the reasons for the prevalence of IDPs/IDRs in allosteric regulation are more complicated than previously thought. Our work does not give a complete answer, but we provide some discussion and comments here.
Firstly, in our study we assumed that the free-energy parameters of conformational change and domain-domain interaction (ΔGR1, ΔGR2, ΔGRT1, ΔGRT2, Δgint,R, Δgint,T) vary randomly with an equal probability density within [−ΔGmax, +ΔGmax]. This need not be the case in real proteins. The difficulty (probability) of modifying order-order and order-disorder transitions is likely different. Specifically, tuning the stability difference between two similar ordered structures (R and T in the MWC model) via mutation would be more difficult than tuning the stability difference between ordered and disordered structures (R and I in the EAM model), because the latter can be accomplished by breaking or strengthening a residue-residue interaction that is present in the ordered structure but absent in the disordered structure. Therefore, a possible reason for the prevalence of IDPs/IDRs in allosteric regulation is the ease with which their state stability can be modified.
Secondly, IDPs/IDRs possess various advantages over ordered proteins,(24, 50) such as saving genome resources via multi-binding patterns or large binding surfaces, overcoming steric effects in binding, accelerating binding, achieving high specificity with low affinity, and facilitating posttranslational modifications. The prevalence of IDPs/IDRs in allosteric regulation is determined by all of these advantages, not only by their capacity to endow allostery. Work combining experimental data and bioinformatics analyses would be helpful for comparing the importance of ordered and disordered proteins in allosteric regulation.
Lastly, an allosteric effect with maximal CR may not be the goal being pursued. Allostery of different strengths would have different applications. For example, an allosteric effect that is not too strong is beneficial for ensuring safer dosing.(51)
Conclusions
In this work, we proposed a comprehensive ensemble model to study the roles of order-order and order-disorder transitions in the allosteric effect. An analytic equation for the maximal allosteric coupling response (CR) was derived, which shows that a state that is too stable or too unstable is unfavorable for achieving allostery. By sampling the parameter space, we found that the order-order transition (MWC) mechanism has a higher probability of producing allostery than the order-disorder transition (EAM) mechanism. In addition, the two-state transition is the primary mechanism when allostery is strong, although there are seven states in the model. The work not only provides insight into the prevalence of IDPs/IDRs in allosteric regulation, but is also helpful for the rational design of allosteric drugs.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (grant 21633001) and the Ministry of Science and Technology of China (grant 2015CB910300). The authors thank Huaiqing Cao, Miao Yu and Hao Ruan for helpful discussions.