Abstract
Single-particle cryogenic electron microscopy (cryo-EM) provides a powerful methodology for structural biologists, but the resolutions typically attained with experimentally determined structures have lagged behind microscope capabilities. Here, we have exploited several technical solutions to improve resolution, including sub-Angstrom pixelation, per-particle CTF refinement, and most notably a correction for Ewald sphere curvature. The application of these methods on micrographs recorded on a base model Titan Krios enabled structure determination at ∼1.86-Å resolution of an adeno-associated virus serotype 2 variant (AAV2), an important gene-delivery vehicle.
Single-particle cryo-EM has become a powerful tool for macromolecular structure determination, owing largely to numerous technical advances over the past decade1. Whereas near-atomic resolution (~3–4 Å) can now be obtained almost routinely, achieving resolutions below ~2.5 Å remains challenging, and only one experimental cryo-EM structure has broken the nominal 2 Å barrier2. We sought to address a number of factors limiting the resolution of structure determination by single-particle cryo-EM. In these analyses, we studied a variant of adeno-associated virus (AAV) serotype 2 containing a single amino-acid substitution, L336C. The AAV2L336C variant is of particular biological interest, as it is defective in genome packaging and is associated with reduced infectivity3, 4. AAVs are single-stranded DNA viruses that infect vertebrates5 and are thereby attractive vehicles for gene delivery5, 6, with AAV2 being one of the most popular serotypes for such applications. The AAV viral capsid is formed by an icosahedral (T=1)5 arrangement of 60 viral protein (VP) monomers, and has a molecular weight of ~3.9 MDa and a shell diameter of ~250 Å. The three related capsid proteins, VP1, VP2, and VP3, share a common core sequence and occur in a predicted 1:1:10 ratio7. AAV was particularly suited to our cryo-EM studies because: 1) it is relatively small for a virus and can be packed across cryo-EM grid holes in reasonably thin ice; 2) it can be stably assembled into homogeneous virus-like particles (VLPs) devoid of genomic material; and 3) it has icosahedral symmetry, which increases the number of asymmetric subunits in the dataset by 60-fold for each particle imaged.
To image AAV2L336C particles, we used a base model Titan Krios operating at 300 keV with a K2 Summit detector, without the use of newer technologies such as phase plates8, Cs correctors9, and energy filters2. For data collection, parameters such as aperture choice, beam size, magnification, camera settings, choice of stage shift for targeting, and defocus range build upon optimal conditions elucidated in previous high-resolution single-particle collections2, 10 and are elaborated in Supplementary Note 1. After recording a Zemlin tableau from a gold-coated cross-grating replica calibration grid, which revealed a coma-free aligned beam and evidence for 1.44 Å gold diffraction spots (Supplementary Fig. 1), we proceeded with collecting cryo-EM micrographs of AAV2L336C (Supplementary Fig. 2). For data processing, multiple procedures resulted in statistically significant improvements in resolution, as evidenced by changes across most frequency ranges within Fourier Shell Correlation (FSC) curves, summarized in Figure 1a. Improvements are described in spatial frequency shells, since at higher resolutions, statistically significant gains are characterized by incrementally smaller increases in nominal resolution values. First, we removed particle images with the greatest angular uncertainty based on conventional scoring criteria in either Relion11 or Frealign12 and adjusted the weights for how different particles contribute to the reconstruction. We then performed per-particle CTF estimation using GCTF13 and subsequently refined these values in cisTEM14, providing a cumulative gain of 46 resolution shells (~0.3 Å). Correcting for magnification anisotropy (estimated at ~1%)15 provided gains in resolution by 35 shells (0.21 Å). Notably, map resolution increased by 15 shells (0.09 Å) after correcting for the curvature of the Ewald sphere16, an effect that has been predicted but not previously demonstrated with experimental single-particle cryo-EM data (discussed further below).
Additionally, per-frame reconstructions allowed us to determine which of the 70 frames contained the most information content. Reconstructions from individual frames (each receiving a dose of 0.32 e−/Å2) provided maps with resolutions ranging from 2.1 to 3.4 Å (Supplementary Fig. 3). Discarding the first 4 frames, which contained the largest beam-induced movement, improved the map by 9 resolution shells (0.05 Å), and frames 5–19 could also be combined to produce a largely identical reconstruction to one composed from frames 5–70 (Fig. 1a and Supplementary Fig. 4). Finally, we found that correcting for the rotational particle movement through the course of the movie by refining the orientations of groups of five-frame averages improved low spatial frequency FSC values and the quality of the map, although the nominal resolution value remained largely unchanged. Cumulatively, the above procedures resulted in a total gain of 71 resolution shells (~0.4 Å). As previously demonstrated17, the summation of individual gains is not equal to the cumulative improvement, as the effects are not additive.
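The relationship between resolution shells and the Å values quoted above can be illustrated with a short sketch. It assumes that shells are counted as integer Fourier radii in a cubic box of 800 pixels at 0.788 Å per pixel (the box and pixel size given in the Online Methods); the text does not state which box the shell counts refer to, so this is an assumption, and the function names are purely illustrative.

```python
def shell_to_resolution(shell, box_pixels, pixel_size_angstrom):
    """Resolution (Å) of a Fourier shell at radius `shell` (in shell units)."""
    return box_pixels * pixel_size_angstrom / shell


def resolution_to_shell(resolution_angstrom, box_pixels, pixel_size_angstrom):
    """Fourier shell radius (possibly fractional) for a resolution in Å."""
    return box_pixels * pixel_size_angstrom / resolution_angstrom


def resolution_gain(final_resolution, shells_gained, box_pixels, pixel_size_angstrom):
    """Å gained when a map improves by `shells_gained` shells to `final_resolution`."""
    final_shell = resolution_to_shell(final_resolution, box_pixels, pixel_size_angstrom)
    start_resolution = shell_to_resolution(
        final_shell - shells_gained, box_pixels, pixel_size_angstrom
    )
    return start_resolution - final_resolution


# Assumed box: 800 pixels at 0.788 Å/pixel (values taken from the Online Methods).
BOX_PIXELS = 800
PIXEL_SIZE = 0.788

ewald_gain = resolution_gain(1.86, 15, BOX_PIXELS, PIXEL_SIZE)
print(f"15 shells ending at 1.86 Å span about {ewald_gain:.2f} Å")
```

Because resolution is inversely proportional to shell radius, a fixed number of shells spans fewer Å at higher resolution, which is why gains are reported in shells rather than in nominal resolution alone.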
The above results revealed that correcting for the curvature of the Ewald sphere is pertinent to experimental reconstructions in high-resolution single-particle cryo-EM analysis, warranting further investigation. Most 3D reconstruction algorithms assume that images correspond to direct projections of the 3D object, in accordance with the central slice projection theorem18. However, several aspects of cryo-EM data acquisition invalidate this approximation at resolutions approaching true atomic19, 20. Most notably, imaged objects have finite thickness along the optical axis of the microscope, which results in an inherent focus gradient during imaging. The focus gradient alters the phases and amplitudes associated with each Fourier coefficient, and the effects become progressively more pronounced at higher resolutions, lower accelerating voltages, or for thicker specimens21. Various schemes to estimate and correct for the curvature of the Ewald sphere have been developed16, 19, 20, 22, but no experimental reconstruction from a single-particle macromolecular sample has to date demonstrated improvements from taking Ewald curvature into account. The “simple-insertion” method for Ewald curvature correction implemented in Frealign9 will insert the data for each particle image twice into correct Fourier coefficients related by Friedel symmetry16. This procedure, performed during reconstruction, resulted in an increase of 15 resolution shells within our final map (Fig. 1a). We then evaluated the effect of the correction at lower resolution by reducing the number of particles in the reconstruction. Randomly selected subsets of the data containing an approximately equal defocus range were used to perform reconstructions with incrementally smaller numbers of particles. 
As few as ~60 particles (3,600 asymmetric units) were sufficient to produce a ~3.5 Å map, whereas ~120 particles (7,200 asymmetric units), and all larger subsets, were sufficient for <3 Å reconstructions (Supplementary Fig. 5). These maps could be used to evaluate Ewald curvature effects as a function of resolution for the ~250 Å diameter particle. Noticeable gains appeared at ~2.4–2.3 Å, with a final improvement of 15 shells (~0.1 Å) for the best reconstruction (Fig. 1b). The gains follow an increasing trend at higher resolution, as the effects of Ewald curvature become more pronounced at higher electron scattering angles19. Furthermore, the correct handedness of a reconstruction can be explicitly determined when accounting for the effects of the Ewald sphere16. Specifically, the handedness of the reconstruction defines how the Fourier coefficients are substituted in the reconstruction and whether an inversion operation must be applied to the Fourier coefficients (Supplementary Fig. 6).
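To give a sense of why the focus gradient matters more at higher resolution, lower voltage, and greater specimen thickness, the following sketch evaluates the standard defocus term of the CTF phase, π·λ·Δz·s², for a height difference Δz along the optical axis. This is a back-of-the-envelope illustration under stated assumptions, not the correction implemented in Frealign9: spherical aberration is ignored, the relativistic wavelength formula and physical constants are textbook values, and the function names are hypothetical.

```python
import math

HC_EV_ANGSTROM = 12398.42            # Planck constant x speed of light, in eV*Å
ELECTRON_REST_ENERGY_EV = 510998.95  # electron rest energy, in eV


def electron_wavelength_angstrom(accelerating_voltage_v):
    """Relativistic electron wavelength (Å) for a given accelerating voltage (V)."""
    energy = accelerating_voltage_v  # kinetic energy in eV
    return HC_EV_ANGSTROM / math.sqrt(energy * (energy + 2 * ELECTRON_REST_ENERGY_EV))


def focus_gradient_phase_rad(wavelength_angstrom, height_difference_angstrom,
                             resolution_angstrom):
    """Phase difference (rad) from the defocus term pi * lambda * dz * s^2."""
    spatial_frequency = 1.0 / resolution_angstrom
    return (math.pi * wavelength_angstrom * height_difference_angstrom
            * spatial_frequency ** 2)


wavelength_300kv = electron_wavelength_angstrom(300e3)
wavelength_200kv = electron_wavelength_angstrom(200e3)

# Top-to-bottom phase difference across a ~250 Å particle at two resolutions.
phase_high_res = focus_gradient_phase_rad(wavelength_300kv, 250.0, 1.86)
phase_low_res = focus_gradient_phase_rad(wavelength_300kv, 250.0, 3.5)

print(f"wavelength at 300 kV: {wavelength_300kv:.5f} Å")
print(f"phase difference at 1.86 Å: {phase_high_res:.2f} rad")
print(f"phase difference at 3.5 Å:  {phase_low_res:.2f} rad")
```

The phase term scales with the square of spatial frequency and linearly with wavelength and thickness, which is consistent with the trend described above: gains from the correction grow toward higher resolution and should be larger at lower accelerating voltages.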
The final map of AAV2L336C had a global resolution of 1.86 Å, with a largely homogeneous local resolution distribution within the core of the capsid shell that drops to >1.92 Å at the solvent exposed surfaces (Fig. 1c-d and Supplementary Table 1). Using this map, an atomic model was derived for the common region of the VP monomer, residues 226 to 735 (VP1 numbering), which was symmetry expanded by icosahedral matrix multiplication to produce the full 60-mer viral capsid. As in previously reported AAV structures, the VP1u, VP1/2 common sequence, and the N-terminus of common VP3 are disordered. The final model corresponded closely to the map, with good statistics (Supplementary Table 1), including a high EM-Ringer23 score of 8.49 and a correlation coefficient of 0.849 following model refinement in Phenix24. The EM-Ringer score reflects accuracy of fit between model and map based on side-chain rotameric positions. At this resolution, the map is of sufficient quality to see numerous features with unprecedented detail (Fig. 2 and Supplementary Movie 1) including: 1) the backbone tracing with well-defined carbonyls; 2) explicit structure to most side-chains, rotamers, holes in aromatic residues, as well as prolines and associated puckers; 3) ordered solvent throughout the structure, including primary and secondary hydration shells; and 4) the distinct appearance of density for individual oxygens of carboxylate groups, which occasionally begin to show traces of H-bonding geometry.
Efforts to improve the AAV gene delivery system have focused on structure-function analysis of the viral life cycle and of engineered capsids to improve therapeutic efficacy. The 1.86-Å resolution structure of AAV2L336C represents the most accurately interpreted AAV capsid model thus far (Fig. 3). This particular variant also exhibits specific structural changes that are clearly captured in the density map, and these changes may be associated with infectivity defects (Supplementary Note 2 and Supplementary Fig. 7). High-resolution structural information can aid the annotation of: 1) water networks required to stabilize the capsid structure assembly and involved in its function; 2) the protonation states of acidic and histidine residues important for interactions in the endo/lysosomal pathway; 3) capsid interactions with the transcription machinery and during capsid assembly; and 4) precise receptor and antibody interactions. Details from such analyses can guide the engineering of AAVs at specific residues to eliminate interactions, such as those with pre-existing host immune system molecules, or improve function, such as specific tissue targeting. Notably, the number of known sites of interaction between the AAV ligand and host receptor/antibody far exceeds the number of experimentally derived AAV models. For this reason, the fact that high-resolution structures of AAV variants can be derived with as few as ~100 particles (Supplementary Fig. 5) is noteworthy and will accelerate the compilation of a comprehensive structural understanding of AAV:host interactions5.
The methods described herein provide a feasible route toward true atomic resolution in cryo-EM single-particle analysis. While we used a well-behaved sample for this work, the modest amount of time (3.5 days collection) and equipment (a base model Titan Krios and K2 Summit camera) used for this reconstruction would make our strategy generally applicable, even though the relative gains will differ by specimen (Supplementary Fig. 8). Correcting for the curvature of the Ewald sphere should be incorporated into reconstruction algorithms and may have particular relevance for improving resolution for samples collected at lower microscope accelerating voltage25. Finally, our sub-2 Å resolution reconstruction of AAV2L336C also provides new insights into AAV life cycle and biology that will be invaluable for improving the effectiveness of AAV as a delivery vehicle in gene therapy applications.
Accession codes and deposition
All raw movie frames, micrographs, the particle stack and relevant metadata files will be deposited into EMPIAR. The electron density map will be deposited into EMDB. The model will be deposited into PDB.
Author Contributions
J.G. generated the baculovirus construct. J.A.H. purified the sample. D.L. and S.A. vitrified the sample and collected the data. Y.Z.T., D.L., and S.A. processed the data. D.L., M.M., Y.Z.T. and S.A. built and refined the model. M.A.M., R.J.S., T.S.B. and R.M. conceived the variant study. D.L. and Y.Z.T. conceived the high-resolution study. T.S.B., M.A.M. and D.L. supervised throughout the experiment. All authors read and contributed to the manuscript.
Conflict of Interest Statement
M.A.M. is a SAB member for Voyager Therapeutics, Inc., and AGTC, has a sponsored research agreement with AGTC and Voyager Therapeutics, and is a consultant for Intima Biosciences, Inc. M.A.M. is a co-founder of StrideBio, Inc. This is a biopharmaceutical company with interest in developing AAV vectors for gene delivery application. R.J.S. is the scientific founder of Bamboo Therapeutics, Asklepios Biopharmaceutics, Chatham Therapeutics, and Merlin. These companies also have interest in the development of AAV for gene delivery applications.
Supplementary Note 2 | Structural Comparison of AAV2L336C and AAV2WT
The density for C336 is clearly ordered in the AAV2L336C map, and the model is considerably improved compared with the prior structure of AAV2WT (Fig. 3)35. There is a 1.4 Å shift of the main-chain of C336 and neighboring residues, compared to AAV2WT (Supplementary Fig. 7a). This results in a 0.8 Å widening at the base of the 5-fold channel formed by five symmetry related DE loops (the loop between the βD and βE strands). In addition, the AAV2WT structure is ordered from residue 217 to 735, with the additional N-terminal residues compared to AAV2L336C (residues 226 to 735) occupying the base of the interior opening of the 5-fold channel (Supplementary Fig. 7b). The AAV2L336C variant displays a 23-fold defect in genome packaging compared to AAV2WT and lacks PLA2 activity, resulting in a defect in infectivity3, 4. This defect was proposed to be due to the inability to expose the PLA2 and potential structural differences to AAV2WT. The annotated differences in AAV2L336C support these possibilities. An altered location of the PLA2 domain due to the N-terminal disorder would abrogate its externalization via the 5-fold pore and thus its function.
Online Methods
Statistics
For calculations of Fourier shell correlations (FSC), the 0.143 cut-off criterion38 was used.
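As an illustration of how an FSC curve and a 0.143 cut-off can be evaluated, the sketch below computes the correlation between two half-maps in integer-radius Fourier shells using NumPy. It is a minimal example under stated assumptions (cubic maps of equal size, no masking, resolution reported at the first shell that falls below the threshold) and is not the implementation used by Relion, Frealign, or cisTEM; the function names are hypothetical.

```python
import numpy as np


def fourier_shell_correlation(half_map_1, half_map_2):
    """FSC per integer-radius shell for two cubic maps of identical shape."""
    assert half_map_1.shape == half_map_2.shape
    assert len(set(half_map_1.shape)) == 1, "maps must be cubic"
    n = half_map_1.shape[0]

    f1 = np.fft.fftn(half_map_1)
    f2 = np.fft.fftn(half_map_2)

    # Integer frequency index along each axis, then the shell radius per voxel.
    freq = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)

    n_shells = n // 2
    keep = shell < n_shells          # ignore the corners beyond Nyquist
    idx = shell[keep]

    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    power_1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    power_2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)

    denom = np.sqrt(power_1 * power_2)
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)


def resolution_at_threshold(fsc, box_size_angstrom, threshold=0.143):
    """Resolution (Å) of the first shell whose FSC falls below `threshold`.

    Returns None if the curve never drops below the threshold.
    """
    below = np.nonzero(np.asarray(fsc)[1:] < threshold)[0]
    if below.size == 0:
        return None
    first_shell = below[0] + 1       # +1 because shell 0 was skipped
    return box_size_angstrom / first_shell


# Sanity check: a map compared with itself correlates perfectly in every shell.
rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))
fsc_identical = fourier_shell_correlation(volume, volume)
```

In practice the two inputs would be independently refined half-maps, and a solvent mask is normally applied before the calculation; both are outside the scope of this sketch.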
Production and purification of AAV2L336C virus-like particles
The AAV2L336C substitution was created within the AAV2 cap gene encoding all three viral proteins, VP1, VP2, and VP3, as previously described4. A recombinant baculovirus, encoding the AAV2 cap gene with the L336C substitution, was created using the Bac-to-Bac system (Thermo Fisher). A plaque-purified and titered baculovirus stock was used to infect Sf9 insect cells at a multiplicity of infection of 5 to generate virus-like particles (VLPs). The harvested pellet (from lysed cells and polyethylene glycol precipitated supernatant) was freeze/thawed three times with Benzonase (EMD Millipore Cat#712053) treatment. After the third thaw, the resulting clarified supernatant was purified using a step iodixanol gradient followed by anion exchange39 and then dialyzed into 50 mM HEPES, pH 7.4 with 2 mM MgCl2, 150 mM NaCl. The sample concentration was determined by optical density assuming an extinction coefficient of 1.7 mg/(mL-cm) for AAV2 VLPs. The VLP purity and integrity were confirmed by sodium dodecyl sulfate polyacrylamide gel electrophoresis and negative stain EM on an FEI Spirit TEM, respectively.
Single-Particle CryoEM Vitrification and Data Collection
Double blotting was used to increase particle concentration40. 2.5 μl of AAV2L336C sample at 2.5 mg/ml was added to a plasma-cleaned (Gatan Solarus) 1.2 μm hole, 1.3 μm spacing holey gold grid (Quantifoil UltrAuFoil) and blotted away using Whatman grade 4 filter paper after 20s wait time. 2.5 μl of the same sample was then re-applied to the grid and blotted after 20s wait time and then vitrified in liquid ethane using a manual plunger. All operations were performed in a 4°C cold room at >80% humidity to minimize evaporation and sample degradation.
Data Acquisition
Images were recorded on a Titan Krios electron microscope (FEI) equipped with a K2 Summit direct detector (Gatan) at 0.394 Å per pixel in super-resolution counting mode (0.788 Å for the physical pixel size) using the Leginon software package34. Data collection was performed using a dose of ~22.5 e−/Å2 across 70 frames (50 msec per frame) at a dose rate of ~4.0 e−/pix/sec, using a set defocus range of −0.6 μm to −2 μm. On our microscope, the 100 μm objective aperture would allow for transmission of information up to ~1.4 Å, but could not be aligned to produce a coma-free diffractogram. In contrast, the 70 μm aperture would truncate information at the ~2 Å limit. For this reason, the objective aperture was removed to prevent physical truncation of the most widely scattering electrons - and thus the highest-resolution information. Removal of the objective aperture in this case has the benefit of eliminating this aperture as a potential source of image astigmatism. A total of 1,317 micrographs were recorded over a single 3.5-day collection.
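The acquisition parameters above are mutually consistent, as the following arithmetic sketch shows. It assumes that the quoted dose rate of ~4.0 e−/pix/sec refers to the physical pixel (0.788 Å) rather than the super-resolution pixel; this interpretation is an assumption, and the variable and function names are illustrative.

```python
def dose_per_frame(dose_rate_e_per_pixel_s, frame_time_s, pixel_size_angstrom):
    """Dose per frame in e-/Å^2 from a per-pixel dose rate and the pixel size."""
    pixel_area = pixel_size_angstrom ** 2  # Å^2 covered by one pixel
    return dose_rate_e_per_pixel_s * frame_time_s / pixel_area


DOSE_RATE = 4.0          # e-/pixel/s (assumed to be per physical pixel)
FRAME_TIME = 0.050       # s per frame (50 msec)
PHYSICAL_PIXEL = 0.788   # Å per physical pixel
N_FRAMES = 70

per_frame = dose_per_frame(DOSE_RATE, FRAME_TIME, PHYSICAL_PIXEL)
total_dose = per_frame * N_FRAMES
exposure_time = FRAME_TIME * N_FRAMES

print(f"dose per frame: {per_frame:.2f} e-/Å^2")
print(f"total dose:     {total_dose:.1f} e-/Å^2")
print(f"exposure time:  {exposure_time:.1f} s")
```

Under this assumption the sketch reproduces the quoted ~0.32 e−/Å2 per frame and ~22.5 e−/Å2 total dose, over an exposure of 3.5 s per movie.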
Data Processing
Movie frames were aligned using MotionCor241 with 5 by 5 patches, a grouping of 3, a B-factor of 100, and Fourier space binning of 2 (resulting in a pixel size of 0.788 Å/pixel) through the Appion software package36. Micrograph CTF estimation was performed using both CTFFind442 for whole micrographs and GCTF13 for individual particles within the Appion software package. A subset of 8 micrographs was first used for particle picking using Gautomatch (Kai Zhang, unpublished), and particles were extracted and analyzed by 2D classification in Relion 2.111. 2D class averages that showed clear structural details were used as templates for template-based picking using Gautomatch on all 1,317 micrographs. A total of 78,194 particles were then extracted using a box size of 800 pixels and subjected to two initial rounds of 2D classification (binned by 4) to identify and discard false positives such as ice and other obvious contaminants. Following 2D classification, 36,620 particles were re-extracted with the re-centering option in Relion.
3D refinements were performed first using Relion and finishing in cisTEM14, with the initial model generated by CryoSPARC43. Icosahedral symmetry was imposed during all 3D refinement steps, based on prior knowledge5 of AAV2 structure. All conversions between Relion, CryoSPARC, and cisTEM were performed using Daniel Asarnow’s pyem script (unpublished). An initial 3D refinement was performed using 7 rounds of auto-refinement and 2 rounds of local refinement with binning of 2. Particles were discarded based on analysis of the “score” values in cisTEM, leading to the removal of a distinct subset of particles with low scores (below 6 in this dataset). This resulted in 30,515 particles that were re-extracted, unbinned and used for all subsequent operations. Per-particle CTF refinements were performed within cisTEM. All final refinements used a ring-shaped mask with an inner diameter of 75 Å and an outer diameter of 150 Å to specifically include only the capsid density and exclude remaining solvent. For this dataset, and after applying the stack-filtering procedures described above, 3D classifications did not produce any noticeable further gains.
Plotting the defocusV against defocusU values44 showed a systematic scaling of the difference between these two values as a function of their magnitude. Using the mag_distortion_estimate software15 and micrographs collected from a gold-coated cross grating replica grid (Supplementary Fig. 1), a magnification anisotropy of 1.10% was calculated. The appropriate correction for magnification anisotropy was applied during frame alignment (see above). The particle stack that was re-extracted from magnification-anisotropy-corrected frame sums reached a resolution of 1.97 Å after derivation of an ab initio model in CryoSPARC and refinement in cisTEM.
The aligned movie frame stack was also split into individual frames and, using the best Euler angles and shifts from above, reconstructions were computed using Frealign9. Frames 5–19, each of which independently exceeded a resolution of 2.24 Å, were summed and used for subsequent manual refinement (including CTF refinement) within cisTEM to obtain a reconstruction at 1.93 Å. The final reconstruction at 1.84 Å was computed after correcting for the curvature of the Ewald sphere using Frealign9. Rotational motion correction was performed in cisTEM by splitting each particle sum into groups of 5 frames (frames 5–9, 10–14, and 15–19), and refining each group-of-5 as if it were a single particle. Particle-frame-averages with score lower than 3 were removed, resulting in a final stack of 87,781 “particles” that refined to 1.86 Å resolution. Although the nominal resolution at 0.143 cut-off was worse than without rotational frame alignment, inspection of the FSC curves and visual comparison of the two maps suggested that this procedure provided minor benefits. Notably, information at lower spatial frequencies was slightly improved within the reconstruction following rotational frame alignment.
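The bookkeeping behind the rotational motion correction can be sketched as follows: frames 5–19 are divided into consecutive groups of five, and each group average is treated as an independent "particle". The sketch assumes that every one of the 30,515 particles contributes one average per group before score filtering; the text does not state this count explicitly, and the helper name is hypothetical.

```python
def group_frames(first_frame, last_frame, group_size):
    """Split an inclusive frame range into consecutive (start, end) groups."""
    return [
        (start, min(start + group_size - 1, last_frame))
        for start in range(first_frame, last_frame + 1, group_size)
    ]


groups = group_frames(5, 19, 5)

PARTICLES = 30515        # particles retained after score filtering (from the text)
FINAL_STACK = 87781      # particle-frame-averages kept after removing scores < 3

# Assumption: one frame-group average per particle per group.
candidates = PARTICLES * len(groups)
removed = candidates - FINAL_STACK

print(groups)
print(f"{candidates} candidate averages, {removed} removed by the score cut-off")
```

Under this assumption, roughly 4% of the particle-frame-averages would have been discarded by the score threshold.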
To generate maps of the opposite handedness, Euler angles of the particles were changed from (phi, theta, psi) to (-phi, 180-theta, psi). Ewald sphere curvature corrected reconstructions of same and opposite handedness were done by setting IEWALD to either 1 or −1 respectively in Frealign9.
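The Euler angle transformation described above can be written as a one-line function. This is a minimal sketch of the stated mapping only, with angles in degrees; it does not reproduce Frealign9 parameter-file handling, and the function name is illustrative.

```python
def flip_handedness(phi, theta, psi):
    """Map (phi, theta, psi) to (-phi, 180 - theta, psi), angles in degrees."""
    return (-phi, 180.0 - theta, psi)


original = (30.0, 45.0, 60.0)
mirrored = flip_handedness(*original)
restored = flip_handedness(*mirrored)

print(mirrored)
print(restored)
```

Applying the mapping twice returns the original angles, so the same function converts between the two hands in either direction.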
AAV2L336C model refinement
For model refinement of the AAV2L336C variant, the deposited structure of AAV2 (PDB-ID: 1LP3) was used as a starting template. A 60mer capsid model downloaded from VIPERdb45 (http://viperdb.scripps.edu) was docked into the map using the ‘Fit-in-map’ function in the Chimera46 program. To optimize the correlation coefficient (CC) between the model and map, the voxel (pixel) size of the map was adjusted. From the fitted 60mer, a monomer was extracted for model building. For model building and real-space refinement in the Coot47 program, the map was converted from the Purdue Image Format (PIF) to the XPlor format using the e2proc3D.py subroutine in the EMAN248 application and finally to the CCP4 format using MAPMAN49. In Coot47, L336 in AAV2WT was substituted with a cysteine, and the side-chain and main-chain atoms, including those of neighboring residues, were adjusted to better fit the experimental density map using the real-space-refinement subroutine. After the manual refinement of the monomer was completed, a 60mer was regenerated in VIPERdb by T=1 icosahedral matrix multiplication, and the model was refined against the cryo-EM map using the real-space and B-factor refinement subroutines in the Phenix24 program. The CC and refinement statistics, including root mean square deviations (RMSD), bond lengths and angles, were analyzed by Phenix. Model adjustment and refinement were performed iteratively in Coot and Phenix, and the statistics were examined using Molprobity50 until no further improvements were observed. The final map and model were then validated using 1) EMRinger23 to compare map to model, 2) SPARX28 to calculate map local resolution and 3) the 3DFSC program suite27 to calculate the degree of directional resolution anisotropy through the 3DFSC.
Acknowledgements
Molecular graphics and analyses were performed with the UCSF Chimera package (supported by NIH P41 GM103331). We thank Bill Anderson at TSRI for help with EM data collection. We also thank David DeRosier and Gordon Louie for critical reading of the manuscript. The work was supported by Agency for Science, Technology and Research Singapore (to Y.Z.T.); NIH R01 GM109524 (to R.M. and M.A.M.), R01GM033050 (to T.S.B.), and NIH R01AI136680-01 (to D.L.). We acknowledge consultations with Bridget Carragher and Clinton S. Potter. Some of the work was performed at the National Resource for Automated Molecular Microscopy at the Simons Electron Microscopy Center which is supported by National Institute of General Medical Sciences (GM103310), Simons Foundation (SF349247) and NYSTAR.