Abstract
The circumsporozoite protein (CSP) on the surface of Plasmodium falciparum sporozoites is important for parasite development, motility, and host hepatocyte invasion. However, intrinsic disorder of the NANP repeat sequence in the central region of CSP has hindered its structural and functional characterization. Here, the cryo-EM structure at ∼3.4 Å resolution of a recombinant shortened CSP construct (rsCSP) with the variable domains (Fabs) of a highly protective monoclonal antibody reveals an extended spiral conformation of the central NANP repeat region surrounded by antibodies. This unusual structure appears to be stabilized and/or induced by interaction with an antibody where contacts between adjacent Fabs are somatically mutated and enhance the interaction. Such maturation in non-antigen contact residues may be an effective mechanism for antibodies to target tandem repeat sequences and provide novel insights into malaria vaccine design.
Summary An unusual spiral conformation is formed for the NANP repeat region in Plasmodium falciparum circumsporozoite protein (CSP) in complex with antibodies generated by the RTS,S vaccine and is stabilized by affinity-matured inter-Fab interactions.
With an estimated 445,000 deaths and 216 million cases in 2016, malaria poses a significant threat to public health (1). Emerging resistance to current front-line anti-malarials and to insecticides has heightened the need for an effective malaria vaccine (2). The pre-erythrocytic stage of the Plasmodium falciparum life cycle is an ideal target for the development of a vaccine that breaks the cycle of infection. After a bite from an infected mosquito, P. falciparum sporozoites (PfSPZs) migrate from the skin to the hepatocytes. Immunization with irradiated PfSPZs can induce strong protective immune responses in mice, monkeys and humans (3). For many years, the leading target for vaccine design has been the major surface protein of sporozoites, the circumsporozoite protein (PfCSP), which contains a central region consisting of multiple NANP repeats (4) that vary in number (from 25 to 49) among different P. falciparum isolates (5, 6). In addition, PfCSP contains a flexible N-terminal domain with a heparan sulfate binding site for hepatocyte attachment (7) and a structured C-terminal domain with a thrombospondin-like type I repeat (αTSR) (8). The most advanced malaria vaccine to date is RTS,S, formulated in GlaxoSmithKline's adjuvant AS01. RTS,S contains part of PfCSP, including 19 NANP repeats and the αTSR domain, fused with hepatitis B surface antigen (HBsAg) (Fig. 1E) such that virus-like particles are formed when co-expressed with free HBsAg in yeast (9). The RTS,S vaccine has been shown to confer reasonable protection against clinical malaria in children (5-17 months old), with 51% protection over the first year of follow-up after a 0, 1, 2 month vaccination schedule [95% CI 48-55%]. Efficacy was seen to wane to 26% over a 48-month follow-up period [95% CI 21-31%]; however, if a fourth (booster) dose was administered at month 20 post vaccination, efficacy was 39% [95% CI 34-43%] (10-12).
These results indicate that, while the RTS,S vaccine is promising, an important objective in current malaria research is to improve and extend vaccine efficacy. The R21 vaccine is an alternative PfCSP-based approach, which is composed of a single subunit with the same region of PfCSP fused to HBsAg (13). Clinical testing is underway using Matrix-M adjuvant as well as AS01.
One approach to improving vaccine designs and increasing the potency of antibody responses involves investigation, at the structural and functional level, of the monoclonal antibody (mAb) responses generated through either whole PfSPZ or RTS,S immunization. Recent X-ray structures of protective human antigen-binding fragments (Fabs) in complex with PfCSP repeat peptides revealed similarities and differences in how these repeats are recognized (14-17). Namely, the peptides are organized into NPNA structural units that can adopt type I β-turns and pseudo 3₁₀ turns, as originally observed for free peptides in solution and in peptide crystal structures (18, 19). One of these antibodies, mAb311, was isolated from a protected volunteer in a phase IIa RTS,S/AS01B controlled human malaria infection (CHMI) clinical trial (20) and inhibited parasite development in the liver by ∼97% as assessed by mouse challenge experiments with engineered P. berghei SPZs that express PfCSP (14). Interestingly, a low-resolution negative-stain electron microscopy (nsEM) reconstruction of a recombinant shortened PfCSP construct (rsCSP, Fig. 1E) in complex with Fabs of mAb311 (Fab311) gave the first insight into the organization of the NANP repeats with bound antibodies. However, a high-resolution structure would provide valuable information for optimal display of protective epitopes in a vaccine setting.
Cryo-EM structure of CSP and architecture of rsCSP-Fab311 complex
To decipher the architecture of the rsCSP-Fab311 complex at high resolution, we used single particle cryo-electron microscopy (cryo-EM). A final dataset of 206,991 particles was refined asymmetrically, resulting in an ∼3.4 Å resolution reconstruction (fig. S1). Eleven copies of the crystal structure of Fab311-(NPNA)3 could be fit into the EM map and the rsCSP-peptide complex was then assembled in COOT to generate an initial model. This model was subjected to multiple rounds of refinement into the EM density map using RosettaRelax (Fig. 1A, figs. S1 and S2, table S1).
The repeat region of rsCSP is well defined with continuous cryo-EM density (Fig. 1F and Fig. 2C) and forms an unusual extended spiral structure (Fig. 1, A and D), from which multiple Fab311 antibodies radiate tangentially in a pseudo-helical arrangement (Fig. 1B), consistent with a previous negative-stain EM reconstruction (14). In the cryo-EM map, however, two additional Fabs were observed, demonstrating that 11 Fabs can bind simultaneously to rsCSP (Fig. 1C), although the density for the N- and C-terminal Fabs was sparse (table S1). In addition, no density was observed for the N-terminal or C-terminal αTSR domains of rsCSP, likely due to flexibility. Even though the αTSR domain has been observed to be structured by itself (8), it is connected to the NANP repeats through a disordered linker that is devoid of epitopes for Fab311. The angular twist between adjacent Fab variable domains is ∼77°, so that ∼4.7 Fabs (360°/77°) are required to complete one full turn of the spiral (Fig. 1C).
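The twist arithmetic above can be illustrated with a short calculation. The following is only a sketch that assumes an idealized, perfectly uniform twist of 77° between neighboring Fabs; the variable names are illustrative and are not taken from the refinement workflow.

```python
# Idealized sketch: uniform twist between adjacent Fabs (an assumption).
TWIST_DEG = 77.0   # approximate twist between adjacent Fab variable domains (from the text)
N_FABS = 11        # Fabs bound to rsCSP in the cryo-EM map (from the text)

# Number of Fabs needed to complete one 360 degree turn of the spiral.
fabs_per_turn = 360.0 / TWIST_DEG

# Azimuthal position of each Fab around the spiral axis, folded into [0, 360).
azimuths = [(i * TWIST_DEG) % 360.0 for i in range(N_FABS)]

# Rotation accumulated between the first and the last Fab, in full turns.
turns_spanned = (N_FABS - 1) * TWIST_DEG / 360.0

print(f"Fabs per turn: {fabs_per_turn:.2f}")
print(f"Turns spanned by {N_FABS} Fabs: {turns_spanned:.2f}")
print("Azimuths (deg):", [round(a, 1) for a in azimuths])
```

Under this idealization the sixth Fab sits only 25° away from the first in projection along the spiral axis, which is in line with Fabs one turn apart stacking above or below one another.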
Helical conformations for NANP repeats have been proposed for CSP using computational methods. Gibson et al. used a modified buildup procedure to explore possible helical conformations for the (NANP)6 peptide, assuming that tandem repeats are likely to display helical or near-helical conformations driven by cooperative interactions (21). Two lowest-energy helices were identified, with radii of 3.5 Å and 3.7 Å and pitches of 10 Å and 7 Å, respectively. Brooks et al. proposed an alternative, much wider 12₃₈ helix with a pitch of 4.95 Å using molecular dynamics (MD) calculations (22). Interestingly, another model previously suggested a stem-like superhelix for the complete NANP repeat region with a width of 15 Å (radius 7.5 Å), length of 180 Å, and pitch of 7 NPNA repeats (23). Each of these helical predictions is very different from our structure. We observe a wider radius of 13.4 Å, a length of 145 Å, and a much larger pitch of 49 Å (9.5 NPNA repeats, Fig. 1, D and F). To complete a full turn of the spiral, a fifth Fab partially packs underneath the first Fab, thereby making the pitch similar to the width of Fab311 along the longitudinal axis of the spiral (44.7 Å (24)). Our structure also differs from a recent model for an anti-NANP mouse antibody, 2A10, in complex with NANP repeats. In that model, which was derived from MD simulations, antibodies are proposed to bind a narrow helix of repeats that adopt type I β-turns (25).
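To put the reported spiral parameters in perspective, the contour length of one turn can be estimated from the radius and pitch. This is a minimal sketch that treats the repeat region as an ideal circular helix, which is an assumption and not a result of the refinement; the function name `helix_path_per_turn` is hypothetical.

```python
import math

# Values quoted in the text for the rsCSP spiral.
RADIUS_A = 13.4          # spiral radius in angstroms
PITCH_A = 49.0           # rise along the spiral axis per full turn, in angstroms
REPEATS_PER_TURN = 9.5   # NPNA repeats per turn

def helix_path_per_turn(radius: float, pitch: float) -> float:
    """Contour length of one turn of an ideal circular helix.

    Unrolling one turn gives a right triangle with sides 2*pi*radius and pitch.
    """
    return math.hypot(2.0 * math.pi * radius, pitch)

path_per_turn = helix_path_per_turn(RADIUS_A, PITCH_A)

# Axial rise and contour length contributed by a single NPNA repeat.
rise_per_repeat = PITCH_A / REPEATS_PER_TURN
path_per_repeat = path_per_turn / REPEATS_PER_TURN

print(f"Contour length per turn: {path_per_turn:.1f} A")
print(f"Axial rise per NPNA repeat: {rise_per_repeat:.2f} A")
print(f"Contour length per NPNA repeat: {path_per_repeat:.2f} A")
```

In this idealization each four-residue NPNA unit covers roughly 10 Å of contour length, which is consistent with a compact, turn-rich backbone rather than a fully extended chain.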
Fab311 epitope on rsCSP
Traditionally, the repeat region of PfCSP has been described by the number of NANP repeats. However, NMR and X-ray crystallographic evidence shows that the repeats are likely organized as NPNA structural motifs (14-19). Hence, we adopt the NPNA nomenclature when discussing the epitope instead of the more general NANP notation. The Fab311 epitope was proposed to consist of a minimum of two to three NPNA repeats based on the crystal structure of Fab311 with the (NPNA)3 peptide and isothermal titration calorimetry (ITC) affinity measurements (14). Here, the cryo-EM structure shows unambiguously that the epitope consists of only two NPNA repeats. In fact, the Fabs are so closely packed against one another that their two epitopes are seamlessly stitched together without the need for an additional repeat as a spacer. Furthermore, we observed that Fab311 is able to bind the NVDP repeats, thereby increasing the available epitopes on rsCSP from 15 (NPNA only) to 22 (including the DPNA and NPNV repeats). Remarkably, the only two sequence differences in DPNANPNV from NPNANPNA occur on the edge of the epitope and thus are likely minimally inhibitory to Fab311 binding (Fig. 2). The Asp at the N-terminus is in a similar conformation to the Asn, and the Val projects out into solvent. The calculated buried surface area (BSA) when taking two adjacent Fabs as one binding unit is 972 Å² on the Fabs and 843 Å² on the (NPNA)4 peptide. Fabs are positioned such that the groove in which the peptide resides extends from one Fab directly into the other (Fig. 2, A and B). Overall, there is excellent agreement between the epitope in the cryo-EM structure and the first two of the three NPNA repeats observed in the crystal structure (Fig. 2E). The two repeats of the NPNA epitope adopt a type I β-turn followed by a pseudo 3₁₀ turn (Fig. 2D), a pattern that repeats throughout the length of the spiral structure.
Each pseudo 3₁₀ turn has its asparagine (i) sidechain hydrogen bonding with the backbone amide of the next asparagine (i+2). Due to this unique repetition of the (NPNA)2 epitope in rsCSP, the proline residues consistently point away from the center of the spiral, serving as anchor points to which the Fabs latch on (Fig. 1F). Notably, CH/π interactions of the prolines with Trp52 and Phe59 alternate (Fig. 2B), with Cα-Cα distances of 9 Å and 12 Å between each consecutive proline pair (Fig. 2C). Trp52 provides key contacts with the peptide and may account for the frequent selection of germline VH3-33 (and related VH3-30) for recognition of the NANP repeats (16, 17, 26).
Inter-Fab contacts stabilize the CSP spiral structure
It is unlikely that free PfCSP is predominantly present as a well-defined spiral on the surface of the PfSPZ, since the repeat region is predicted to be disordered (27), and atomic force microscopy (AFM) and single-molecule force microscopy experiments indicate that PfCSP can adopt multiple conformations (28, 29). Thus, binding of Fab311 may induce and stabilize the rigid spiral structure in the NANP repeat region of PfCSP. Surprisingly, neighboring Fabs that bind adjacent (NPNA)2 epitopes contribute 319 Å² and 340 Å² of BSA to a novel interface between the Fabs (Fig. 3B). Taking into account these additional contacts, the total BSA on each Fab with rsCSP and neighboring Fabs becomes 1145 Å² ((972 Å²/2) + 319 Å² + 340 Å²), which increases the original Fab-peptide BSA more than 2-fold. Close inspection reveals that the inter-Fab interface between two Fabs (A and B) binding successive epitopes of the rsCSP spiral (interface 1) consists of polar contacts that are made between CDR L3 of Fab B and CDR H3 of Fab A, and between CDR H2 of Fab B and CDR H1 of Fab A (Fig. 3C). Interestingly, many residues that are involved in inter-Fab contacts correlate with affinity maturation from the IGHV3-33*01 and IGLV1-40*01 germline genes for the heavy and light chain, respectively (Fig. 3, G and H). Notably, salt bridges are made between Asp99 of CDR H3 (Fab A) and Arg93 and Arg94 of CDR L3 (Fab B); additionally, a cation-π interaction is found between Arg94 of CDR L3 (Fab B) and Tyr98 of CDR H3 (Fab A), where Arg94 Nε and the center of the aromatic tyrosine ring are 4.2 Å apart (Fig. 3D). Furthermore, Asn31 of CDR H1 (Fab A) and Arg56, Asn57 and Glu64 of CDR H2 (Fab B) form an extensive hydrogen bonding network, which would be abrogated if reverted to the germline sequence (Fig. 3E). Most of these residues do not contact the NPNA repeat motifs, except for Asn31.
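The buried surface area bookkeeping in this paragraph can be reproduced in a few lines. The sketch below only restates the arithmetic with the values quoted in the text; the variable names are illustrative, and equal partitioning of the 972 Å² between the two Fabs of a binding unit is the same simplification used in the text.

```python
# Values quoted in the text, in square angstroms.
PEPTIDE_BSA_TWO_FABS = 972.0    # BSA on two adjacent Fabs taken as one binding unit
INTER_FAB_BSA = (319.0, 340.0)  # BSA contributed to the interfaces with neighboring Fabs

# Fab-peptide BSA attributed to a single Fab (simple halving, as in the text).
peptide_bsa_per_fab = PEPTIDE_BSA_TWO_FABS / 2.0

# Total BSA on one Fab: peptide contacts plus both inter-Fab contributions.
total_bsa_per_fab = peptide_bsa_per_fab + sum(INTER_FAB_BSA)

# Increase relative to the Fab-peptide BSA alone.
fold_increase = total_bsa_per_fab / peptide_bsa_per_fab

print(f"Fab-peptide BSA per Fab: {peptide_bsa_per_fab:.0f} A^2")
print(f"Total BSA per Fab: {total_bsa_per_fab:.0f} A^2")
print(f"Fold increase: {fold_increase:.2f}")
```

The ratio comes out at roughly 2.4, in agreement with the statement that the inter-Fab contacts increase the buried surface more than 2-fold.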
Affinity maturation of Ser31 to Asn31 is likely driven by inter-Fab contacts, since Asn31 hydrogen bonds with the repeats using its main-chain atoms, while simultaneously forming a hydrogen bond with a neighboring Fab using its side chain (Fig. 3F). Other Fabs in close proximity are those that bind four epitopes away (B and F) such that they complete a full spiral turn and are either above or underneath the Fab of interest. Although some BSA is present between these two Fabs (interface 2), there are no direct contacts as assessed by CONTACSYM (Fig. 3A).
Mutagenesis of the Fab311 interface
To investigate the specificity of the interactions between adjacent Fabs, affinity-matured residues that engage with neighboring Fabs were mutated to the inferred germline sequence (Fab311 inter-Fab contact residue reverted, Fab311R). Specifically, four residues were mutated in the heavy chain (N31S, R56N, N57K and E64K) and two in the light chain (R93S and R94S). First, we assessed whether Fab311R can still bind to the (NPNA)2 peptide using ITC affinity measurements (fig. S3, table S2) and found that its binding is unperturbed, indicating that few if any mutations are required for high-affinity peptide binding. Next, we determined whether the germline reversion mutagenesis abrogated formation of the rsCSP spiral using nsEM. Surprisingly, the 2D class averages revealed a new phenotype with varying stoichiometries for the rsCSP-Fab311R complex, in which a well-defined long-range spiral was absent. Nevertheless, the rsCSP-Fab311R particles still adopted curved conformations in which the Fabs can still bind relatively closely together, indicating that some form of inter-Fab contacts may be encoded in the germline (Fig. 4). Because of this heterogeneity, the particles did not converge into a stable 3D reconstruction that could be refined further. By comparison, 2D class averages of the wild-type rsCSP-Fab311 complex show a much more homogeneous and compact complex, providing further evidence that the affinity-matured inter-Fab residues play a crucial role in stabilizing the spiral architecture of rsCSP and presumably help to increase avidity for CSP.
Conservation of the spiral architecture
To visualize how Fab311 might bind to the more physiologically relevant PfCSP on the PfSPZ surface, we expressed a full-length PfCSP (flCSP, based on the 3D7 strain) for nsEM studies with Fab311. The amino-acid sequence of flCSP is identical to rsCSP with the exception of the repeat region, which has 38 NANP repeats and 4 NVDP repeats, of which three are located at the N-terminus and one in the middle of the NANP repeat region (Fig. 1E). The 3D reconstruction of flCSP-Fab311 revealed an identical helical architecture to the rsCSP-Fab311 complex (Fig. 5A, figs. S4 and S5). Since the number of NANP repeats is doubled in flCSP compared to rsCSP, we were expecting >20 bound Fabs. However, the total Fab count in the flCSP-Fab311 complex is only 14. One possible explanation is that the additional NVDP repeat in the center of the NANP repeat region breaks up the NPNA registry and rigidity of the structure, since the affinity for NVDP is approximately 5-fold less than for NANP repeats (14). Nonetheless, these results provide evidence that the spiral architecture can also be formed by PfCSP with a widely different repeat length.
To answer the question of whether an individual IgG is capable of binding to two epitopes within the same rsCSP molecule and further stabilizing the spiral, we prepared an rsCSP-IgG311 complex for nsEM studies. A significant amount of aggregation was observed upon addition of the IgG version of mAb311 (IgG311) to rsCSP as a result of crosslinking of rsCSP molecules, which has also been termed the CSP reaction (30). After removal of aggregates by spin filtration (0.22 µm) and subsequent size-exclusion chromatography (SEC), we were able to separate the sample into soluble aggregates, rsCSP-IgG311 complex, and unbound IgG311 fractions (fig. S4). In the nsEM 2D classes of the rsCSP-IgG311 complex, the Fc domains appeared as diffuse densities radiating from the Fabs that did not converge in the 3D reconstruction (Fig. 5B). The 3D reconstruction closely matched the nsEM map of rsCSP-Fab311, but with a subtle difference in the helical twist. Comparison of the top views of the two reconstructions shows that Fab311 binds rsCSP in partially eclipsed orientations along the length of the spiral, while the two Fab domains of each bound IgG311 lie on top of one another (Fig. 5, A and B, figs. S4 and S5). Notwithstanding, the rsCSP still adopts a spiral structure of identical radius with the IgG, despite the additional geometric constraints that the hinge region of IgG311 imposes on binding. A total of 5 IgGs (10 Fabs) were bound to rsCSP, in comparison to only 9 Fab311 in the nsEM 3D reconstruction (14). Thus, although IgG311 likely crosslinks PfCSP on the surface of PfSPZs, analysis of this minor population of single particles indicates that, just as for Fab311, IgG311 can bind with its Fab arms closely together and still accommodate inter-Fab domain contacts between two different IgG molecules.
The two Fabs that contribute to one IgG are then oriented such that the heavy and light chains are arranged light-heavy_light-heavy and are not symmetric (light-heavy_heavy-light) as depicted in cartoons in most textbooks (Fig. 5C).
Structural ramifications and implications for vaccine design
The cryo-EM reconstruction of rsCSP saturated with Fab311 at 3.4 Å demonstrates an unprecedented open spiral structure of rsCSP, which is still present with IgG or with flCSP. This structure differs substantially from previously predicted helical models for the NANP repeat region. Unexpectedly, the Fab domains not only make specific interactions with the NANP repeat region, but also with neighboring Fabs along the NANP spiral surface. These inter-Fab contact residues have undergone somatic hypermutation and are crucial for spiral formation. This finding provides strong evidence for antigen-induced maturation of inter-Fab interactions for human antibodies, which may prove to be a common mechanism for increasing affinity against the PfCSP repeat region and for tandem repeat sequences in general. Recently, heavy chain antibody fragments (nanobodies) derived from alpacas against a pentameric antigen were observed to have inter-nanobody contacts, suggesting that this mechanism may be present in certain antibodies across different animal species (31). A previous structure of antibody 2G12 to HIV Env revealed a novel domain swap within the Fabs of a single IgG molecule, where the heavy chain from one Fab paired with the light chain of the other Fab, such that a new VH-VH interface was formed that was also subject to affinity maturation (32). However, that configuration differs from the Fab arrangement here, where we observe instead affinity maturation between Fabs that are connected to different IgG molecules. We do not know whether spiral formation correlates with protection, since mAb317 is bound in a less regular way to rsCSP, while being of similar efficiency to mAb311 in reducing the parasite liver load in mouse experiments (14, 33) (fig. S6). Interestingly, however, 2D class averages of Fab317 bound to rsCSP are topologically similar to the more ordered mAb311 classes.
Thus, it is likely that parts of the spiral may be present in the PfCSP conformational ensemble, perhaps even in the form of successive type I β-turns and pseudo 3₁₀ turns, which in effect may code for the spiral preference in the presence of Fab311. If protection is correlated with recognition of a particular conformation of the PfCSP repeat region (20), inter-Fab maturation and spiral formation could lead to higher avidity and potentially more protective anti-malaria antibodies.
A recently described human antibody (MGG4) bound to the N-terminal junction peptide (KQPADGNPDPNANP) showed binding to an NPDP repeat in the junction region just prior to the repeat region (16), of which Fab311 is also capable based on our cryo-EM structure. Additionally, Fab311 and MGG4 have identical heavy-chain germline gene (VH3-33/30) usage and mode of binding through CH/π interactions between a proline in a pseudo 3₁₀ turn and the conserved Trp52 (Fig. 1a). Since Fab311 is derived from a volunteer immunized with RTS,S and MGG4 from a volunteer immunized with irradiated PfSPZs, the similarities between the two imply that the previously reported potent public antibody lineage, from which MGG4 originates (16), can be accessed using the RTS,S vaccine candidate. This intriguing cryo-EM structure may provide the basis for the design of novel immunogens that take into account the three-dimensional spiral architecture of the CSP repeat region rather than information derived solely from Fab-peptide studies with smaller numbers of repeats.
ACKNOWLEDGEMENTS
We thank B. Anderson for maintaining the microscopes and H.L. Turner, C.A. Bowman, and G. Ozorowski for technical assistance. We thank Kelsey Mertes and Ashley Birkett of PATH MVI for critical reading and comments on the manuscript. This work was funded by PATH's Malaria Vaccine Initiative and the Bill and Melinda Gates Foundation (grant no. OPP1170236) under collaborative agreements with The Scripps Research Institute.