Abstract
Surfactant micelles are often utilized as membrane mimetics for structure determination and functional analysis of membrane proteins. Although curved-surface effects of the micelle can perturb their structure, it is difficult to assess such effects and membrane mimetic artifacts by experimental and theoretical methods. Here, we propose an implicit micelle model (IMIC) to be used in molecular dynamics (MD) simulations of membrane proteins. IMIC is an extension of the IMM1 implicit membrane model by introducing a super-ellipsoid approximation to represent the curved-surface effects. Most of the parameters for IMIC are obtained from all-atom explicit solvent MD simulations of twelve membrane proteins in various micelles. In simulations of the HIV envelop protein gp41, M13 major coat protein gp8, and amyloid precursor protein (APP) dimer, curved-surface and compact hydrophobic-core effects are exhibited. The MD simulations with IMIC provide accurate structure predictions of membrane proteins in various micelle environments quickly with smaller computational cost than that necessary for explicit solvent/micelle model.
INTRODUCTION
Membrane proteins play important roles in cellular processes like substrate transport, signal transduction, membrane fusion, and cell adhesion. Cell membranes are mainly composed of phospholipids, which spontaneously assemble into bilayers, with hydrophilic head groups exposed to water and hydrophobic tail groups buried inside the membrane. Experiments on membrane proteins frequently use surfactant as a membrane mimetic agent to purify and stabilize the proteins, where the surfactant covers the transmembrane (TM) domain by forming a micelle.1 X-ray crystallography often detects electron densities of bound surfactants,2 and recent cryo-electron microscopy (cryo-EM) experiments clearly discern an ellipsoidal micelle around the TM domain.3-6 In solution NMR, proteins are frequently reconstituted into micelles, bicelles, or nanodiscs,7 while in solid-state NMR they are in multilamellar vesicles or oriented lipid bilayers.8,9
It is commonly recognized that the surfactant micelle may not mimic a lipid bilayer environment faithfully, in that, compared to lipid bilayers, the micelle has a highly curved surface and compact hydrophobic core region.10,11 Many membrane proteins have been shown to have different conformations in micelles and bilayers. Examples include the LLP-3 domain in the HIV-1 envelope glycoprotein gp41,12 bacteriophage M13 major coat protein gp8,13 influenza M2 proton channel,14 phospholamban,15,16 epidermal growth factor receptor (EGFR),17 and amyloid precursor psorein (APP).18-20 Specifically, the LLP-3 peptide forms a curved α-helix on the micelle surface, but is straight on the bicelle surface.12 The M13 major coat protein gp8 forms various kinked-helix conformations in micelles, while a continuous α-helix predominates in fully hydrated vesicles.13 This is in contrast to the glycophorin A dimer, the conformation of which is identical in micelles and lipid bilayers, presumably because the two monomers are tightly bound through GxxxG motif interactions.21 Therefore, among experimentally determined structures, discrimination between native structures and membrane mimetic artifacts is extremely valuable.
Molecular dynamics (MD) simulation has been useful for investigation of the structural and dynamic properties of membrane proteins in surfactant micelles as well as in lipid bilayers.22-25 Recent development of automated system builders, such as CHARMM-GUI Micelle builder,26 makes it easy to model a protein-micelle complex for all-atom MD simulations in explicit solvent and micelle. One difficulty, though, is that the aggregation number of surfactant around the protein is usually unknown, and there is limited experimental data available in this regard.27-30 To avoid this problem, protocols for the spontaneous formation of the protein-micelle complex have been proposed, where the MD simulation starts from a random distribution of surfactants in the simulation box.31-33 Since such simulations require long relaxation times (typically, several hundred nanoseconds), course-grained models or implicit water models have been employed to enhance surfactant self-assembly.34-37 Even if the optimal aggregation number of surfactant could be determined experimentally or theoretically, there is large computational cost for simulation of a protein-micelle complex in explicit solvent.
The implicit solvent model can significantly reduce computational cost in biomolecular simulations.38 In this model, the solvation free energy of solute is incorporated into a molecular mechanics potential energy function as the effective energy term. The solvation free energy is traditionally decomposed into electrostatic and nonpolar contributions. The electrostatic term is computed by numerically solving the Poisson-Boltzmann (PB) equation,39 or by using screened pairwise interactions between partial charges of solute atoms based on the Generalized Born theory (GB).40 The nonpolar term can be calculated from the solvent accessible surface area (SASA).41 In other methods such as hydration shell models, the solvation free energy is assumed to be proportional to the solvent accessible volume of the first hydration shell.42 In the EEF1 model proposed by Lazaridis and Karplus,43 the solvent excluded volume is estimated from the volume of neighboring solute atoms.
Implicit solvent models have been also extended to the membrane environment,44 and most of them are based on the GB method.45-49 They introduce a low dielectric slab (ε = 1-4) at z = 0, and parameters are optimized to reproduce PB calculations or all-atom MD simulations. The EEF1 model was also extended to the membrane model (IMM1), where the solvation free energy is calculated by a combination of the solvation free energies in water and cyclohexane.50 The IMM1 model has been further extended to anionic lipid bilayers,51 mixed-lipid bilayers,52 and bilayers with a membrane potential,53 aqueous pore,54 and large curvature.55 Although many implicit lipid bilayer models have been developed, to our knowledge “micelle models” have not been proposed.
In this study, we propose a new implicit solvent model, named the IMIC model, which treats the micelle environment implicitly. It is an extended form of the IMM1 model. We introduce a super-ellipsoid approximation to define micelle shape, and the parameters are derived from all-atom MD simulations of pure micelles and protein-micelle complexes in explicit solvent. We validate the IMIC model by comparing structures and dynamics of selected membrane proteins with those in all-atom MD simulations. In application studies, we simulate the HIV-1 envelope glycoprotein gp41, bacteriophage M13 major coat protein gp8, and APP dimer to examine whether the IMIC/IMM1 models identify membrane mimetic artifacts in solution NMR structures in micelles.
THEORY
Approximation of micelle shape
First, we propose a “super-ellipsoid” approximation to define micelle shape. For simplicity, the micelle is centered at the origin of the system (x, y, z) = (0, 0, 0). A pure micelle has been typically approximated as an “ordinary” ellipsoid.56-58 As observed in recent cryo-EM experiments,3-6 micelle shape can be deformed in the presence of membrane proteins, resulting in a flat region near the micelle center where the protein is contained. Both simple and deformed shapes can be described with the super-ellipsoid function: a, b, and c are the semi-axes of the super-ellipsoid, and m1 and m2 determine the shape of the cross section in the super-ellipsoid (see Figure S1).59,60 In the case of m1 = m2 = 1, eq 1 gives an ordinary ellipsoid. If 0 < m1 < 1 and m2 = 1, the cross section in a plane perpendicular to the XY-plane is expanded, keeping the semi-axes at the given lengths, and the shape also resembles a bicelle or nanodisc. If m1 = 1 and 0 < m2 < 1, the cross section in a plane parallel to the XY-plane is expanded. The shape becomes close to rectangle as both m1 and m2 decrease. Note that m < 0 or m > 1 is not allowed, because it produces a non-micelle-like shape resembling an octahedron.
Implicit micelle model (IMIC)
The implicit micelle model (IMIC) creates an ellipsoidal hydrophobic core region in the system using the super-ellipsoid function. The IMIC model is an extension of the EEF1/IMM1 models,43,50 where the effective energy W of a solute molecule is defined as the sum of molecular mechanics potential energy EMM and solvation free energy ΔGsolv, given by Where rij is the distance between atoms i and j, and Vj is the volume of the j-th atom. The function gi is the density of the solvation free energy of the i-th atom, defined with the van der Waals radius Ri and thickness of the first hydration shell λi · is the solvation free energy of the atom when it is fully exposed to solvent. is similar to , but is determined to satisfy the zero solvation energy of deeply buried atoms.43
As in the IMM1 model, is defined as a combination of the solvation free energies of the i-th solute atom in water and cyclohexane: where f(d) is a function that describes the transition between water and cyclohexane phases. In the IMIC model, we use the following function: where d is the depth of the solute atom from the super-ellipsoid surface (Figure 1a), and s controls the steepness of the micelle-water interface. If a solute atom is inside the micelle, d has a negative value, and vice versa. d is computed with an iterative minimization scheme (see Supporting Information).61 The function f produces a sigmoidal curve, and it gives 0.5 at the super-ellipsoid surface (d = 0) (Figure 1b). In our model, we assume that the surfactant headgroup has the same properties as aqueous solution,50 and thus, the midpoint f = 0.5 corresponds to the boundary between surfactant hydrocarbon and polar headgroups. The derivatives of the solvation free energy with respect to atomic position are also calculated (see Supporting Information).
As in the IMM1 model, we use a distance-dependent dielectric constant for electrostatic interactions. The dielectric constant depends on the positions of interacting atoms,50 defined as where rij is the distance between the i-th and j-th atoms, and p is an empirical parameter to adjust strength of the interactions (p = 0.85 for CHARMM1950 and 0.91 for CHARMM3662,63) Far from the micelle surface, the dielectric constant ε is close to r, corresponding to the EEF1 model, while in the micelle center, it provides strengthened interactions. Eqs 1 and 5 are the only differences between the IMIC and IMM1 models. The IMM1 model utilizes f (z′) = z′n/(1 + z′n) for eq 5, where z′ = |z|/(T/2) and T is the membrane thickness. The IMIC model is nearly equivalent to the IMM1 model when a and b → ∞ and c is half membrane thickness.
We implemented the IMIC, EEF1, and IMM1 models into GENESIS,64,65 and these were parallelized with hybrid MPI/OpenMP protocols. Computational cost of the IMIC model is ∼1.9-times that of the vacuum simulations, and ∼1.2-times that of IMM1 simulations, which is mainly due to the iterative minimization scheme for the computation of d. In this study, we use the IMM1-p36 parameter set for ΔGref, ΔGfree, λ, and V, which is an updated version of the CHARMM19-based IMM1 model for the CHARMM C36 force fields.62 We used top_all36_prot_lipid_eef1.1.rtf and solvpar22.inp for topology and parameter files, respectively, both of which are available in the CHARMM program package.66
METHODS
In order to carry out MD simulations of membrane proteins in the IMIC model with a given number of surfactant molecules, we should estimate the micelle size and shape in advance. Let us consider a structural change of the micelle upon protein insertion (Figure 2), where the surfactant aggregation number is assumed to be invariant. If the protein is inserted into the micelle along the Z-axis as in lipid bilayers, the semi-axis c will be changed to accommodate hydrophobic mismatch between the protein and micelle (c → c′), and the semi-axes a and b are expanded or shrunk (a → a′ and b → b′), keeping the volume of the surfactant hydrophobic core region constant. m1 and m2 in eq 1 and s in eq 5 may also change (, and s → s′). Hereafter, the prime indicates a parameter in the presence of membrane proteins. To estimate these variables we propose a general scheme below, in which empirical parameters were derived from all-atom MD simulations of protein-micelle complexes as well as pure micelles.
All-atom MD simulations of pure micelles
We carried out all-atom MD simulations of pure micelles in explicit solvent. We chose decyl-phosphocholine (Fos10), dodecyl-phosphocholine (DPC or Fos12), tetradecyl-phosphocholine (Fos14), sodium dodecyl sulfate (SDS), lauryldimethylamine N-oxide (LDAO), and dihexanoyl-phosphatidylcholine (DHPC) to examine how parameters a, b, c, m1, m2, and s depend on surfactant headgroup type and hydrocarbon length. We modeled the micelles with various aggregation numbers of surfactant by using the CHARMM-GUI micelle builder (Table SI),26 and carried out a 100-ns MD simulation in the NPT ensemble at 300 K and 1 atm for each system (for detailed simulation conditions, see Supporting Information).
As reported previously,67,68 micelle structure was highly dynamic, and its instantaneous shapes were irregular. We found that the averaged shape was an ordinary ellipsoid, especially when the micelle was composed of a typical aggregation number of surfactant. Figure 3 shows the mass density profile of 60DPC in the X and Y dimensions. By fitting a super-ellipsoid function to the three-dimensional (3D) density profile, we obtained a = 19.2 Å, b = 16.5 Å, c = 14.3 Å, m1 = 0.98, m2 = 1.00, and s = 0.50. In most other cases, m2 was close to 1.0 (see Table SI). m1 slightly decreased as Nsurf increased, presumably because the flat region increased as in bicelles. Both a and b increased as Nsurf increased, while c did not seem to exceed a certain length (e.g., 15.1 Å for DPC micelles). The ratio of a and b was 1.15-1.21 in most cases. The moments of inertia of the ellipsoid calculated from a, b, and c were comparable to those reported in recent simulation studies.58 s was slightly dependent on the surfactant headgroup type, where s = ∼0.5 for DPC, Fos12, and Fos14 and ∼0.4 for SDS, LDAO, and DHPC. The volume per surfactant (Vsurf) was approximately proportional to the number of hydrocarbon atoms, and determined as Vsurf = 322 Å3 for DPC, SDS, and LDAO, 268 Å3 for Fos10, 379 Å3 for Fos14, and 488 Å3 for DHPC.
All-atom MD simulations of protein-micelle complexes
All-atom MD simulations were again carried out for the twelve selected membrane-proteins (Hemagglutinin fusion peptide, Integrin β3, Glycophorin A, TMEM14A, AchR β2, BcTSPO, semiSWEET, D3 receptor, KcsA, OmpX, OmpA, and TtoA) in micelles with various aggregation numbers of surfactant (see Figure 4a). We employed 36 systems in total (see Table SII), and performed a 100-200 ns run for each in the NPT ensemble at 300 K and 1 atm (for detailed simulation conditions, see Supporting Information). Hereafter, we call these systems “test sets”, and the results were repeatedly used in this study for comparison with the MD simulations in the IMIC model. We analyzed how protein size and surfactant aggregation number affect parameters , and s′ based on the mass density profile of the surfactant hydrocarbon group.
Figure 4b illustrates the density profile obtained in the OmpX-60DPC complex, where the black solid line is the averaged backbone structure of OmpX in the reoriented micelle. As expected, the whole micelle structure showed an ellipsoidal shape, and the protein was located near the micelle center. Interestingly, the hydrophobic core width of the micelle near the protein along the Z-axis was close to the membrane thickness in the OPM database (23.6 Å) (dotted lines in Figure 4b). The averaged micelle thickness in the XY-plane was 11.7 Å. Similar results were obtained in the other test systems (Figure S2).
We calculated the super-ellipsoid that fits the micelle-water interface of the mass density profile. In the case of OmpX-60DPC, we obtained a′ = 23.4 Å, b′ = 20.3 Å, c′ = 11.8 Å, , and s′ = 0.51. Among the 36 systems, we found some common features and obvious tendencies in the parameters (see Table SIII): 1) a′/b′ = ∼1.15, , and s′ = 0.35-0.5 in most cases. 2) decreases as the protein size in the X-Y dimension increases or hydrophobic thickness decreases. (3) Surfactant headgroup type can affect s, where SDS showed a smaller s′ than DPC, Fos10, and Fos14. (4) In the case of membrane bound peptides, the micelle size is approximately the same as that of the pure micelle. (5) In the case of large membrane proteins, micelle shape can be strongly deformed due to the protein shape (Figure S2), and thus, may not be applicable to large proteins.
General scheme to estimate the micelle size and shape
Based on the above results, we propose a scheme to estimate the micelle size and shape in the presence of membrane proteins without performing all-atom MD simulations. As shown in Figure 2, the total volume of the protein-surfactant complex (Vcomplex), excluding the extra-micellar region, can be described as a sum of the volumes of the TM domain (VTMD) and surrounding surfactants: Nsurf is the aggregation number of surfactant, and Vsurf is the volume per surfactant molecule. Vcomplex can be calculated by Where B(x,y) is the so called beta function. Vsurf can be estimated from the volume of a pure micelle composed of Nsurf surfactants (Figure 2a), given by Now, we approximate the embedded protein as a cylindrical shape with hydrophobic (or TMD) thickness Lz and effective radius r (see Figure 2b). To fully solvate TMD with the micelle, c′ is set to half of the hydrophobic thickness of the protein: The effective radius of the protein is calculated by Micelle width l on the XY-plane can be roughly estimated by assuming that the protein is surrounded by surfactants whose cross-section in a plane perpendicular to the protein surface is semi-ellipsoid and its semi-axes are l and Lz/2. Based on the Pappus-Guldinus theorem about Z-rotation, l is readily calculated as For the estimation of , we introduce an empirical formula: where P1 and P2 are the parameters that decide the vertical width of the super-ellipsoid at x = r + l − Δl and y = 0. We use Δl = 2 Å, P1 = 2.761 and P2 = 0.259, which were derived from a micelle surface curvature calculated from the all-atom MD simulations of protein-micelle complexes. Note that if the estimated exceeds the lower limit 0 or upper limit 1, it must be fixed to the limit.
Finally, a′, b′, and are estimated empirically. As described above, our MD simulations for various protein-micelle complexes demonstrated that a′/b′ = ∼1.15 and are applicable to many cases, especially when the protein is adequately solvated with surfactant. Accordingly, is calculated with eq 9, and thereby, a′ and b′ are obtained with eq 8.
Practical estimation of the micelle size and shape
Once Nsurf, Vsurf, Lz, and VTMD are decided for a target system, we can estimate , and from eqs 7-14. In the case of OmpX-60DPC, we specify Nsurf = 60, Vsurf = 322 Å3, Lz = 23.6 Å, and VTMD = 10,801 Å3, where TMD was defined based on the OPM database,69 and VTMD was calculated from the heavy atoms in TMD using the 3V program.70 We obtain a′ = 22.5 Å, b′ = 19.6 Å, c′ = 11.8 Å, , and , whose shape is illustrated as the red line in Figure 4b. Our estimation showed good agreement with the density profile in the all-atom MD simulation. Similar results were obtained in the other protein-micelle complexes (Table SIV and Figure S2). Among the 36 systems, the standard error of the estimate for a′, b′, and was only 1.5 Å, 1.3 Å, and 0.08, respectively (Figure S3). These results indicate that our scheme is entirely appropriate for estimating the micelle size and shape in the presence of membrane proteins with a given aggregation number of surfactant molecules. As for s′, we use 0.35 for SDS and 0.45 for DPC, Fos10, and Fos14. The estimated parameters can now be used for simulating membrane proteins in the IMIC model.
RESULTS
Comparison between the IMIC and all-atom models
To validate the IMIC model, we carried out MD simulations for the 36 test systems (see Table SIV), and compared structural and dynamic properties of membrane proteins with those in the all-atom MD simulations. To prepare the initial structure in the IMIC model, the X-ray crystal or NMR structure was placed near the micelle center with the same orientation as in OPM, and rotated about the Z-axis so that the PC1 and PC2 axes of the X,Y-coordinates of the TM atoms were aligned with the X,Y-axes. We conducted a short energy minimization, followed by a 100-200-ns MD simulation at 300 K using the Langevin thermostat. The equations of motion were integrated with the leapfrog method with a time step of 2 fs, where the SHAKE algorithm was used for bond constraint. We used a cutoff distance of 25 Å for the non-bonded interactions. To prevent the embedded proteins from large lateral shifting, we applied a weak restraint on the distance from the center of mass of the TM domain to the Z-axis using a flat-bottom restraint potential, where we used 1.5 Å for the switching distance between flat and harmonic functions, and 1.0 kcal/mol·Å2 for the force constant.
Here, we focus on OmpX-60DPC. Figure 5a shows the snapshot of OmpX at 200 ns in the IMIC60DPC model. The orientation of OmpX with respect to the Z-axis was similar to that in the all-atom MD simulation (compare with Figure 4b). Figure 5b shows the time courses of the root-mean-square deviation (RMSD) for the all Cα atoms with respect to the X-ray crystal structure (PDB entry: 1QJ8). In both IMIC and all-atom models, the RMSD was less than 3 Å over 200 ns, indicating that OmpX was stable in the 60DPC micelle, and this is also consistent with experiment.71 Figure 5c shows the root-mean-square fluctuation (RMSF) of the Cα atoms. The RMSF in the IMIC model is in good agreement with that in the all-atom model. The averaged difference between the two RMSF profiles, which we define as excluding the N- and C-terminal two residues, was 0.134 Å in the transmembrane secondary structure regions, and 0.639 Å in the other regions.
We also analyzed the degree of burial of OmpX in the micelle by calculating the solvent accessibility (SA) of each residue. In the IMIC model, we calculated the averaged solvent accessible surface area using a Monte Carlo integration scheme, where the water-SA and surfactant-SA were calculated individually according to the position of the probe particle (1.4 Å in radius). In the all-atom model, we calculated the averaged number of solvent atoms around the residues using a cutoff distance of 4 Å. Note that the contacts with the surfactant head groups were included in the water-SA. Figure 5d compares the SA profiles in the IMIC (upper panel) and all-atom models (lower panel). Both two models showed similar profiles, where the β-strands and loops were mainly solvated with surfactant hydrocarbon groups and water, respectively. Correlation coefficients (c.c.) between the two profiles were 0.93 for water-SA, and 0.95 for surfactant-SA, demonstrating that the IMIC model can reasonably mimic a micelle environment.
Similar results were obtained in most other cases of the test sets (see Figure S4a). In both IMIC and all-atom models, the RMSD for the Cα atoms with respect to the initial structure was 4-6 Å in helix bundle proteins with 2-5 TM helices (e.g., GpA, TMEM14A, and AChR β3), while 2-3 Å in the bigger or β-barrel proteins (e.g., D3 receptor, KcsA, and TtoA). ΔRMSF for the secondary structure region in TMD was typically less than 0.5 Å, and large ΔRMSF (> 1.5 Å) was only observed in the non-secondary structure region of AChR β2 and OmpA, both of which have large extra-micelle loops. The solvent accessibility showed high c.c. (> 0.8) in most cases except for GpA-100DPC (c.c. = 0.48 in surfactant-SA) and TMEM14A-45DPC (c.c = 0.41 in surfactant-SA). This is presumably because 100DPC is excessive for GpA, and 45DPC insufficient for TMEM14A, which might cause a different solvent exposure of proteins in the IMIC and all-atom models. Note that in these cases c.c. is still positive and the profiles were similar to each other. Overall, the IMIC model can reasonably reproduce the all-atom model.
Glycophorin A in the IMIC, IMM1, and all-atom models
We further focus on glycophorin A (GpA). Experiments have demonstrated that GpA forms an identical conformation in the DPC micelles,72 DMPC:DHPC bicelles,21 and monoolein lipidic cube phase (LCP).73 Among these conditions, the bicelles and LCP are closer to a lipid bilayer environment. An aggregation number of ∼80 for C12 surfactants around GpA was also determined.28 We simulated GpA in the IMIC, IMM1, and all-atom explicit micelle/membrane models, and compared them with experiments. We employed 60DPC, 85DPC, and 100DPC micelles, and DMPC, DOPC, and POPC bilayers (for detailed simulation conditions, see Supporting Information).
Figure 6a shows a representative snapshot of GpA in the IMIC85DPC model. There were tight G79xxxG83 motif interactions between monomers over 200 ns. The averaged inter-helical distance dHH was 6.1 Å, and the averaged inter-helical crossing angle θ was −43.0°. These results were in agreement with those in the all-atom 85DPC model (Figure 6b) (dHH = 5.8 Å and θ = −44.1°) as well as experiments in DPC micelles (dHH = 6.3 Å and θ = −41.2° in PDB: 2KPE).21 The obtained structures in the DPC micelles were similar to those in the DMPC bilayers (Figures 6c and 6d and Table I), again consistent with experiments.21 In both micelles and bilayers, θ slightly decreased as the micelle size or membrane thickness increased (see Table II), which is mainly due to hydrophobic matching between GpA and micelles or bilayers. The effective energy W in the IMIC model was lower than that in the IMM1 model, where we obtained W = 209.1 kcal/mol in IMIC85DPC, while W = 213.8 kcal/mol in IMM1DMPC (see also Table SV). This is presumably because GpA can easily rotate inside the micelle to accommodate the hydrophobic matching, resulting in a less strained conformation compared to that in the lipid bilayers.
Curved surface effects on protein dynamics
To examine curved-surface effects of micelles on membrane protein structure, we simulated the HIV-1 envelope glycoprotein 41 (gp41). This protein has an important role in membrane fusion between the viral envelope and cellular membranes.74 It has a long C-terminal cytoplasmic tail, consisting of three amphipathic α-helical portions: LLP (lentiviral lytic peptide)-1, 2, and 3. Fluorescence and infrared spectroscopy demonstrated that LLP-3 tightly binds to lipid bilayers, and the helix axis is nearly parallel to the membrane plane (∼70°).75 Moreover, NMR experiments suggested that the α-helix of LLP-3 is almost straight on the DHPC:DMPC bicelle surface, while it is slightly curved on the DHPC micelle, indicating a curved-surface effect of micelles.12
We carried out MD simulations of LLP-3 using the IMIC and IMM1 models, and compared the structures in the two environments. We employed a 30DHPC micelle and DMPC lipid bilayer, and performed long MD runs (500 ns in IMIC and 150 ns in IMM1 at 300 K) to search the optimal position and orientation of the peptide on the micelle/bilayer surface, starting from the ideal α-helical conformation. Figure 7a illustrates a representative structure of LLP3 in the IMIC model, whose probability of spatial position and orientation was highest according to Euler angle analysis (Figure S5). LLP-3 adopted an α-helix on the micelle surface, and the side-chains of Trp5, Leu9, Trp12, and Leu16 were fully buried in the hydrophobic core region. Similarly, in the IMM1 model, LLP-3 bound to the bilayer surface (Figure 7b), and the averaged tilt angle was ∼86°, which is reasonably consistent with the experimental data (∼70°).75
To investigate dynamic properties of LLP-3 on the micelle and bilayer surfaces, we carried out principal component analysis (PCA) for the MD trajectories. In both IMIC and IMM1 models, the first and second principal components (PC1 and PC2) were bending motions of the α-helix, which were perpendicular to each other. These two modes contributed about 48% to the total fluctuations. We found that the sum of the PC1 and PC2 vectors gave a bending motion from the hydrophilic side of the helix towards the hydrophobic side, and vice versa (Figure 7c), and the direction appeared to be perpendicular to the micelle/bilayer surface. We projected the trajectories onto the (PC1+PC2) vector obtained from the IMM1 simulation, and calculated their distributions (Figure 7d). In the IMM1 model we obtained a Gaussian distribution with zero mean, while in the IMIC model it was shifted towards positive values. Here, the averaged bending angle of the helix at and around 0.0, 0.2, 0.5, and 1.0 on the (PC1+PC2) axis was 175, 173, 169, and 162°, respectively. These results indicate that the structural fluctuation of LLP3 on the micelle surface is biased to the “+curved” conformation, in which the hydrophilic side of the α-helix is relatively extended (insets of Figure 7d), consistent with NMR experiments.12 The IMIC model faithfully induces the curved α-helix conformation on the micelle surface.
Compact hydrophobic-core effects on protein conformation
We simulated the bacteriophage M13 major coat protein gp8. This is a 50-residue protein that assembles around the phage particle to encapsulate the circular single-stranded DNA, and has an important role in interacting with the host membrane in the infection process.76 Site-directed labeling (SDL) experiments revealed that gp8 adopts a continuous α-helix (I-shape) in fully hydrated DOPC:DOPG vesicles.77 Solid-state NMR showed a helix-turn-helix conformation (L-shape), consisting of the TM and juxtamembrane (JM) helices, in oriented POPC:POPG lipid bilayers.78 Solution NMR showed various conformations in SDS micelles, including not only I- and L-shapes but also U-shape.79 These experiments suggest that I- and L-shapes can commonly exist in micelle and bilayer environments, while U-shape is an artifact in the membrane mimetic environments.
To examine these characteristics, we performed umbrella sampling of gp8 in the IMIC60SDS and IMM1POPC models with the reaction coordinates of kink angle between the TM and JM helices. The amino-acid sequence of gp8 is shown in Figure 8a. We employed 18 window potentials, where the angle between the centers of Cα atoms of Ala9-Phe11, Thr19-Tyr21, and Lys44-Thr46 was restrained at 10° intervals from 10 to 180° with the force constant of 40 kcal/mol·rad2 for each window. We used U-shape (Model 8 in PDB entry: 2CPS) for the initial structure, and performed a 50-ns equilibration, followed by 50-ns production run for each window. The free energy profile was computed with the weighted histogram analysis method (WHAM).80,81
Figure 8b shows the potential of mean force (PMF) at 298.15 K as the function of the kink angle. In both IMIC and IMM1 models, the global energy minimum state was I-shape, where the kink angle was ∼180° (Basin I). In the IMIC model, there was another deep and wide basin around ∼64° (Basin II), which was mainly composed of L-shape. Free energy difference between I- and L-shapes was 3.0 kcal/mol. Conformational transition between L- and U-shapes (∼51°) may occur due to thermal fluctuation in this basin. In the IMM1 model, L-shape was also found in a local energy minimum at 90° (Basin III), but the free energy difference from I-shape was very large, suggesting that I-shape is predominant in lipid bilayers. As suggested by Vos et al., L-shape in membranes might be specifically stabilized in the oriented lipid bilayers, where the stacked bilayers can press the JM domain into the membrane-water interface.13
In the stable L-shape obtained in the IMIC model, most hydrophobic residues of the JM helix penetrated into the micelle surface, while a few hydrophobic residues in the IMM1 model. This is presumably because the TM domain can easily rotate or tilt inside the micelle, compared to the lipid bilayers, to accommodate hydrophobic matching in the TM domain and search optimal interactions between the JM helix and curved micelle surface. U-shape seems to be energetically unfavorable in lipid bilayers, since the N-terminal hydrophilic residues in the JM helix must be deeply inserted into the hydrophobic core region of the membrane. In micelle environments, it can be avoided because of the compact hydrophobic core space of the micelle. We suggest that accommodation of the interactions between the JM helix and micelle/membrane surface can make a structural difference of membrane proteins in different membrane environments.
Prediction of the transmembrane domain structure of APP dimer
Amyloid precursor protein (APP) is an integral membrane protein that plays an important role in the pathophysiology of Alzheimer’s disease.82 It has been suggested that the APP dimer forms different conformations in lipid bilayers and surfactant micelles. Solid-state NMR experiments demonstrated that APP adopts a right-handed Gly-in conformation in mixed DMPC:DMPG lipid bilayers, where GxxxG motif interactions are formed at the interface between the monomers.18 On the other hand, solution NMR experiments in DPC micelles showed left-handed Gly-out (PDB entry: 2LOH)19 or right-handed Gly-out conformations (PDB entry: 2LZ3),20 where GxxxA motif interactions exist at the interface. In computational studies, Miyashita et al. predicted Gly-in by using the REMD method83 with the IMM1-C19 model.84 Dominguez et al. performed multi-scale MD simulations combining coarse-grained and all-atom models, and suggested that Gly-in is stable in thick lipid bilayers, while Gly-out in micelles or thin lipid bilayers.85,86 They also reported a Gly-side conformation, in which the two helices adapt a parallel orientation to form a zipper-like interface.
To examine whether the IMIC and IMM1 models can reproduce these characteristics, we predicted the transmembrane domain structures of the APP dimer using the REMD method, and compared them with the NMR structures. We employed 60DPC and 85DPC micelles in IMIC, and POPC lipid bilayers with the membrane thickness of 28.8 and 32.0 Å in IMM1. In the initial structure, the two helices were perpendicular to the XY-plane, and each helix was rotated around its helical axis at 90° intervals in each replica, resulting in 16 individual conformations. We performed a 100-ns REMD simulation with 16 replicas (1.6 µs in total) for each system. The temperatures were distributed exponentially from 287.63 to 521.39 K, and replica exchange was attempted every 2,000 step (1 step = 2 fs) (for detailed simulation conditions, see Supporting Information).
During the REMD simulations, the two helices were associated and dissociated repeatedly. Both right-handed and left-handed configurations were sampled, where the right-handed ones were frequently observed (see Figure S6). Figure 9a shows the 2D-PMF at 300 K and 447 K along the RMSDs with respect to the left-handed and right-handed Gly-out conformations. The NMR structures of PDB entries: 2LOH and 2LZ3 are shown as blue squares and red circles, respectively, and the structures predicted by the PREDDIMER server87 are also plotted as triangles. In all systems, there were mainly two basins at 300 K (Basins I and II). Basin I was composed of Gly-in, which was close to the PREDDIMER structure, and Basin II was Gly-side (see Figure 9b). We found that the averaged effective energy of Gly-in was 2-3 kcal/mol larger than that of Gly-side in IMIC, while the tendency was opposite in IMM1 (see Table SVI). Interestingly, there was another basin (Basin III) at higher temperatures (382.58-521.39 K) in IMIC60DPC, IMIC85DPC, and IMM128.8Å, while not in IMM132.0Å. Basin III was mainly composed of Gly-out, and close to the NMR structures in DPC micelles.20 These results indicate that Gly-out is not so stable compared to Gly-in or Gly-side, but has a possibility to exist in micelles or thinner lipid bilayers.
The curved surface of the micelle is likely to induce formation of Gly-out, since the dimer can adapt an X-shaped structure due to hydrophobic matching. In fact, the averaged helix-crossing angle of Gly-out in IMIC was larger than those in IMM1 [55.6° in IMIC85DPC, 57.2° in IMIC60DPC, and 49.4° in IMM128.8Å (see Table SVI)]. Dominguez et al. suggested that water molecules that penetrate deeply into micelles or thin lipid bilayers could stabilize the hydrophilic interface in Gly-out.86 Such explicit interactions between water and APP might significantly shift the conformational equilibrium from Gly-in to Gly-out in micelles. Consequently, our simulations using the IMIC and IMM1 models are reasonably consistent with the NMR experiments and previous computational studies.
DISCUSSION
In this study, we developed a new implicit solvent model, named the IMIC model, which treats micelle environments implicitly to simulate protein-micelle complexes at low computational cost. We extended the lipid bilayer model (IMM1) to the micelle system by simply converting the hydrophobic slab to the ellipsoidal shape. This is based on the assumption that the parameters used for the calculation of the solvation free energy in the lipid bilayer model are directly applicable to a micelle model. This is reasonable because the micelle and lipid bilayer are composed of similar chemical compounds. Defining micelle shape with a super-ellipsoid function enables us to create various-shaped systems such as spherical micelles, rod-like micelles, bicelles, and nano-discs merely by changing a few parameters in the function. Our model will be further extended to the PB- and GB-based implicit micelle models.
Because the model proposed here is based on simple formulae and approximations, there is room for improvement. In the current model, only water and cyclohexane phases are considered. As in the IMM1-GC model,51 the Gouy-Chapman theory can be combined with the IMIC model, which can introduce one layer between the two phases to describe electrostatic interactions between surfactant headgroups and amino acids. A multiple-layered model has been also proposed for the GB-based implicit membrane model.47 It could also be important to incorporate a dynamic feature like local deformation of the micelle into the model, which could be realized by introducing a deformation energy of the micelle into the Hamiltonian, as in the DHDGB model.48
The IMIC model will be useful for structure determination of proteins by NMR techniques, especially when the experiments are conducted in micelles. The structure models are usually calculated from simulated annealing MD with geometrical restraints derived from the NMR spectra. It has been demonstrated that the NMR structure calculation in an implicit solvent model is comparable with that in the explicit solvent model.88 In addition, Tian et al proposed using the IMM1 model for effective structure determination of membrane proteins.89 If the target proteins were reconstituted into micelles, it would be reasonable to utilize IMIC rather than IMM1, since the protein structure might be perturbed by the curved surface of the micelles.
Another useful application will be a cryo-EM flexible fitting of membrane proteins.90 Cryo-EM is a powerful tool to determine 3D structures of biomolecules with near atomic resolution. In flexible fitting, MD simulation is usually carried out to guide the initial structure towards the target EM density map using a biasing potential. In such protocols, the implicit solvent model is useful to reduce computational cost, and the protein structure can be quickly changed and converged compared to the explicit solvent model.91 Experimentally, an ellipsoidal micelle can be clearly observed around the TM domain of membrane proteins in the density maps,3-6 enabling us to easily define the hydrophobic core region for the IMIC model. Recently, we have implemented a flexible fitting module into GENESIS,92 which is available with not only the EEF1/IMM1/IMIC models but also enhanced sampling algorithms like the replica-exchange MD method.93 The integrated methods are capable of realizing reliable structure modeling of membrane proteins in an experimental environment with low computational cost.
SUMMARY
In this study, we proposed an implicit micelle model (IMIC), which treats micelle environments implicitly. The IMIC model is one of the hydration shell models, and is an extended form of the implicit membrane model IMM1. We introduced a super-ellipsoid function to define the hydrophobic core region of the micelle, and the parameters were derived from all-atom MD simulations of protein-micelle complexes as well as pure micelles. We demonstrate that the structural and dynamic properties of membrane proteins in the IMIC model are similar to those in the explicit solvent model. We compared the conformations of selected membrane proteins (glycophorin A, HIV envelope glycoprotein gp41, bacteriophage M13 major coat protein gp8, and APP dimer) in the IMIC and IMM1 models, and obtained good agreement with NMR experiments. We conclude that the IMIC model faithfully reproduces curved surface effects and compact hydrophobic-core effects of micelles on membrane protein structures, and the combined use of IMIC and IMM1 models enables us to discriminate membrane mimetic artifacts from native structures among experimentally determined NMR structures.
ACKNOWLEDGMENTS
We would like to thank Drs. Kiyoshi Yagi, Hiraku Oshima, and Donatas Surblys at RIKEN for helpful comments and discussions. This work was supported by JSPS KAKENHI Grant Numbers JP15H05594 and JP26119006, a grant from Innovative Drug Discovery Infrastructure through Functional Control of Biomolecular Systems, Priority Issue 1 in Post-K Supercomputer Development (hp170254), and the RIKEN Pioneering Projects, Integrated Lipidology and Dynamic Structural Biology. MD simulations were partially carried out on HOKUSAI GreatWave and BigWaterFall at RIKEN.
REFERENCES
- (1).↵
- (2).↵
- (3).↵
- (4).
- (5).
- (6).↵
- (7).↵
- (8).↵
- (9).↵
- (10).↵
- (11).↵
- (12).↵
- (13).↵
- (14).↵
- (15).↵
- (16).↵
- (17).↵
- (18).↵
- (19).↵
- (20).↵
- (21).↵
- (22).↵
- (23).
- (24).
- (25).↵
- (26).↵
- (27).↵
- (28).↵
- (29).
- (30).↵
- (31).↵
- (32).
- (33).↵
- (34).↵
- (35).
- (36).
- (37).↵
- (38).↵
- (39).↵
- (40).↵
- (41).↵
- (42).↵
- (43).↵
- (44).↵
- (45).↵
- (46).
- (47).↵
- (48).↵
- (49).↵
- (50).↵
- (51).↵
- (52).↵
- (53).↵
- (54).↵
- (55).↵
- (56).↵
- (57).
- (58).↵
- (59).↵
- (60).↵
- (61).↵
- (62).↵
- (63).↵
- (64).↵
- (65).↵
- (66).↵
- (67).↵
- (68).↵
- (69).↵
- (70).↵
- (71).↵
- (72).↵
- (73).↵
- (74).↵
- (75).↵
- (76).↵
- (77).↵
- (78).↵
- (79).↵
- (80).↵
- (81).↵
- (82).↵
- (83).↵
- (84).↵
- (85).↵
- (86).↵
- (87).↵
- (88).↵
- (89).↵
- (90).↵
- (91).↵
- (92).↵
- (93).↵