Abstract
Bacterial type II secretion systems (T2SS) translocate virulence factors, toxins and enzymes across the cell outer membrane (OM). An assembled T2SS has not yet been isolated. Here we use a fusion of negative stain and cryo-electron microscopy (EM) to reveal the core architecture of an assembled T2SS from the pathogen Klebsiella pneumoniae. We show that 7 proteins form a stoichiometric ∼2.5 MDa complex that spans the cell envelope. The outer membrane complex (OMC) includes the secretin PulD and the pilotin PulS. The inner membrane assembly platform (AP) includes PulE and the cytoplasmic domain of PulL. These components combine to form a flexible hexameric hub consistent with an inactive state. The OMC and AP are coupled by PulC linkers across the periplasm with the PulC HR domain bound at the secretin base. Our results show the T2SS to have a highly dynamic modular architecture with implications for pseudo-pilus assembly and substrate loading.
The bacterial T2SS is found in human pathogens such as Acinetobacter baumannii, Chlamydia trachomatis, Escherichia coli and Vibrio cholerae1. It secretes a broad repertoire of substrates including digestive enzymes and infective agents like the cholera and heat-labile (LT) toxins2. Between 12 and 15 genes in a single operon usually encode the majority of T2SS components. Whilst the soluble domains for many of these proteins have been solved by X-ray crystallography3,4, their stoichiometry, binding partners, and temporal coordination for assembling a functional secretion apparatus are still poorly understood.
The protein GspD forms a 15-fold rotationally symmetric pore termed the secretin that inserts into the OM and provides a conduit for substrate into the external environment. OM insertion is usually dependent on a lipidated pilotin5, which binds to the GspD C-terminal S-domain with 1:1 stoichiometry6,7. The pilotin gene is often chromosomally discrete from the main T2SS operon. Multiple recent high-resolution cryo-EM structures report the partial secretin architecture8-10, and in complex with the pilotin7. However, the entire secretin has not yet been fully resolved due to instability in the periplasmic N0 and N1 domains.
Within the inner membrane (IM), GspL and GspM are bitopic and monotopic membrane proteins, respectively, that together form homo- and hetero-dimers11. Combined with the polytopic membrane protein GspF12 and the ATPase GspE, these proteins constitute an assembly platform (AP) for the pseudo-pilus13. The relative stoichiometry and overall ultrastructure of the AP are unknown. GspE is a cytoplasmic AAA+ ATPase that energises the T2SS and drives pilin assembly and pseudo-pilus formation. The active state is considered to be a hexameric ring as ATP turnover is significantly upregulated in an artificially oligomerized GspE-Hcp1 fusion14. The homologous ATPases PilB and PilT in the closely related type IV pilus (T4P) system also function as hexamers15,16. However, the functional oligomeric state of GspE has not yet been determined. The GspE N-terminal N1E domain connects to the N2E domain with an extended linker. A shorter linker, known to be flexible, connects the N2E domain to the C-terminal CTE ATPase domain17. Such inherent flexibility within GspE is predicted to facilitate key large-scale conformational changes. The N1E domain of GspE forms a 1:1 stoichiometric complex with the cytoplasmic domain of GspL17,18. These two proteins contact GspF13,19, which is predicted to reside centrally. Concerted interplay between GspE, GspL and GspF is thought to be crucial for coupling GspE conformational changes to the mechanical loading of pilin subunits within the pseudo-pilus assembly20-23. As GspL is a bitopic membrane protein, the direct contact between GspE and GspL also represents a mechanism for enabling cross-talk across the inner membrane to other periplasmic components such as GspM. The coupling of the AP and OMC across the cell envelope is mediated by GspC, where the GspC N-terminus associates with GspL and GspM within the inner membrane11. The C-terminus of GspC must then span the periplasm as the GspC HR domain binds the GspD N0 domain24.
Here we isolate an assembled T2SS so that both OM and IM components are captured together. Using a fusion of cryo and negative stain EM, combined with stoichiometry measurements, we provide a reconstruction of the entire OMC and a model for the cytoplasmic components of the AP. Combined they reveal the core ultrastructure of this cell envelope spanning nanomachine.
Purification and EM analysis of PulCDELMNS
The T2SS from the human pathogen K. pneumoniae comprises 12 genes in a single unidirectional operon termed PulC through to PulN (Fig. 1a). Note that Pul and Gsp nomenclature relate to equivalent proteins in homologous T2SS systems. The pilotin PulS is located in a separate position within the chromosome. These 13 genes were cloned and over-expressed in Escherichia coli. Using affinity chromatography tags positioned on the cytoplasmic ATPase PulE and the periplasmic pilotin PulS, a complex containing 7 components was purified by two successive pulldowns. Glutaraldehyde stabilization was included after the initial pulldown. The complex comprised PulC, PulD, PulE, PulL, PulM, PulN and PulS, and is here termed PulCDELMNS (Fig. 1b). Visualisation of PulCDELMNS by negative stain EM yielded particles ∼40 nm long and 17-22 nm wide (Fig. 1c). The OMC PulD secretin was readily identifiable within 2D class averages. Hanging beneath the OMC and separated by a 5-10 nm gap the inner membrane AP was observed. Highly flexible linkers connect the OMC and AP so that these two assemblies effectively constitute independent particles tethered together. The PulCDELMNS complex was vitrified on thin carbon film and imaged by cryo-EM (Fig. 1c). 2D class averages of PulCDELMNS yielded well resolved side views of the OMC. All domains of PulD were identifiable with additional densities observed at the base of the secretin where the PulC HR domain was expected to bind to the N0 domain24, and where PulS decorates the exterior of the secretin core7 (Fig. 1c). The AP was not structured here due to high flexibility and averaging effects. In the absence of glutaraldehyde, the OMC sometimes separated from the AP and yielded top views, which confirmed PulD and PulS in 1:1 stoichiometric ratio7 with C15 symmetry (Fig. 1c). The stoichiometry of the other components within PulCDELMNS was determined by SDS-PAGE band integration using PulS for calibration (Fig. 1b). 
Band integration for PulD was avoided as its propensity to multimerize impeded analysis by SDS-PAGE despite phenol treatment. The relative stoichiometry of PulE:L:M:N was 5.8:6.7:6.3:3.4. PulN bound weakly to the complex so that after all purification stages it was likely under-represented. Overall, the data support a model where PulELMN constitute a 6:6:6:6 complex (Fig. 1c). The stoichiometry of PulC was 10.7 ± 4.3, suggesting an equilibrium between monomer and dimer so that between 6 and 12 copies are bound per PulCDELMNS complex.
PulD secretin structure determination
Focused refinement of the OMC yielded a reconstruction with an overall resolution of 4.3 Å (Fig. 2 and Supplementary Fig. 1). All domains of PulD were resolved with sufficient map quality (Supplementary Fig. 2a) to build a complete model of the monomer and the secretin, excluding the amino acids (aa) in loops 288-303, 462-470 and 632-637 (Fig. 3 and Supplementary Fig. 3 and 4a). The PulD fold is similar to partial Escherichia coli K12 and H10407 GspD models (RMSD Cα = 3.2 Å and 3.3 Å) where the secretin core and N3 domain have been described, along with homology modelled N2 and N1 domains7,10 (Supplementary Fig. 4b). The entire secretin is 20 nm long with an external diameter of 15 nm at the base. It includes an occluding central gate and N3 domain constriction sites within the secretin channel. It lacks a Vibrio cholerae cap gate10 (Fig. 2b and Fig. 3). The N1, N2 and N3 domains pack tightly (Supplementary Fig. 5) with a diagonal offset of 36°. N0 is positioned almost directly below the N1 domain and does not maintain the diagonal offset. The N0 fold is similar to that described in crystal structures 24-26 with a core of 2 helices flanked on each side by β-sheets.
However, its position relative to the N1 domain within the secretin is significantly different to these crystal structures (Supplementary Fig. 4c). The N0 and N1 domains are connected by loop 7, which constitutes a substantial 26 aa linker. The N-terminus of loop 7 forms a wedge that packs between neighboring N1 domains. Its C-terminus partially envelops the proximal N1 domain whilst making additional secondary contacts with N0 domain helix 2 (Fig. 3a and Supplementary Fig. 5d). The N0 domains form a tightly packed ring with alternating stacked β-sheets sandwiched between helices 2 and 4 (Supplementary Fig. 5d). Failure to stabilize the N0 domain and to promote formation of the loop 7 wedge likely accounts for the previously reported N0, N1 and N2 domain flexibility in other systems 7-10,27. Overall, a single PulD monomer has an azimuthal span twisting around the secretin long axis of 130° (Fig. 3c).
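The C15 geometry described above can be illustrated with a minimal Python sketch. It assumes the secretin long axis lies along z and uses a single hypothetical coordinate placed at a 7.5 nm radius (half the 15 nm base diameter). The function name `cn_copies` and the example point are illustrative only and are not taken from the deposited model or the processing pipeline.

```python
import numpy as np

def cn_copies(coords, n=15):
    """Return n symmetry-related copies of coords (N x 3), rotated about z."""
    coords = np.asarray(coords, dtype=float)
    copies = []
    for k in range(n):
        a = 2.0 * np.pi * k / n
        rot = np.array([[np.cos(a), -np.sin(a), 0.0],
                        [np.sin(a),  np.cos(a), 0.0],
                        [0.0,        0.0,       1.0]])
        copies.append(coords @ rot.T)
    return np.stack(copies)

# Angular step between neighbouring subunits in a C15 ring.
step_deg = 360.0 / 15

# Number of subunit positions covered by the 130 degree azimuthal span
# reported for a single monomer.
span_in_steps = 130.0 / step_deg

# Hypothetical point at 75 Å radius (assumption, for illustration only).
ring = cn_copies([[75.0, 0.0, 0.0]])
```

Under these assumptions neighbouring subunits are related by a 24° rotation, so a monomer with a 130° azimuthal span reaches across roughly five neighbouring subunit positions.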
PulC HR domain binds at the secretin base
Hanging beneath the N0 domains in the OMC map, additional globular densities at 7-8 Å resolution protrude from the secretin base (Fig. 2 and Supplementary Fig. 1b). Focused refinements28 failed to markedly improve resolution. These densities were predicted to be the PulC HR domain given its known interaction with the PulD N0 domain in homologous systems24,29. Using the GspC HR domain and GspD N0 domain crystal structure24 as a reference, a homology model of PulC HR domain was fitted as a rigid body (Supplementary Fig. 4d). The model closely follows the surface envelope of the map in this region, with a pair of triple β-sheets opposed around a central cavity (Fig. 2c). Based on the quality of this fit, the PulC HR domain was assigned to each globular protrusion. Given the stoichiometry of PulC (Fig. 1b), not every PulD N0 domain was expected to bind a PulC HR domain in the assembled PulCDELMNS complex despite there being no steric clash between bound PulC HR domains. In this way, any symmetry mismatch may be readily overcome between the OMC and the inner membrane AP. Overall, the binding of the PulC HR domain to the PulD N0 domain appears to be a critical factor for the correct positioning of the N0 domain within the secretin and the subsequent stabilization of the N1 and N2 domains. Additional stabilization is derived from a substantial plug that occludes the lumen of the secretin at the level of the N0-N1 domains (Fig. 2b). The plug was observed in both the 2D class averages (Fig. 1b) and the 3D reconstruction. Note that high resolution plug ultrastructure was not resolved likely due to symmetry mismatch with the C15 averaged OMC. Attempts to resolve the plug structure through refinement using lower symmetries yielded reconstructions of insufficient quality and resolution. A speculative candidate for plug formation is the PulC PDZ domain given its role in substrate recruitment 30 and positive modulation by the PulD N1 domain31. 
Truncation of the PulC PDZ domain destabilized the PulCDELMNS complex leading to aggregated and poorly assembled particles.
The PulS pilotin decorates the secretin core
Decorating the outside of the secretin core proximal to the PulD S-domain in the map, globular densities were observed in a position consistent with the pilotin AspS relative to GspD in ETEC7 (Fig. 2). For these densities, map resolution was limited to ∼7 Å (Supplementary Fig. 1) and focused refinements28 did not markedly improve resolution. A homology model of the PulS pilotin in complex with the PulD S-domain helix 15 based on the equivalent structure in Dickeya dadantii (PDB 4K0U32) was fitted as a rigid body (Supplementary Fig. 2b). Compared to AspS7, the position of PulS differs by a 12° azimuthal rotation around the secretin long axis (Supplementary Fig. 4b). Loop 38 between S-domain helix 14 and helix 15 bound to PulS constitutes the lone contact point between the secretin core and PulS (Supplementary Fig. 2b). No additional contacts were observed, in contrast to AspS-GspD where the secretin core helix α11 forms extensive secondary contact with the pilotin7. The lack of equivalent secondary contacts between PulS and PulD likely accounts for the apparent flexibility between these proteins.
PulC links the OMC to the inner membrane AP
Whilst the PulC HR domain binds to the base of the secretin, its N-terminus is located within the inner membrane AP11 so that PulC is predicted to span the periplasm and link the AP and OMC. To verify the presence and positioning of the PulC N-terminus within the AP, a hexahistidine tag was inserted after aa 61 where PulC was predicted to exit the inner membrane and enter the periplasm. Ni-NTA gold labelling showed beads localize exclusively to the AP and not the OMC (Fig. 4a and Supplementary Fig. 6). Given the PulC HR domains bind to the base of the secretin, PulC therefore spans the periplasmic gap between the OMC and AP (Fig. 1c).
PulE and PulL form a flexible hexameric hub
Cryo-EM 2D class averages of the PulCDELMNS complex revealed the ultrastructure of the AP positioned beneath the OMC. A 20-22 nm outer ring is coupled to a 10-12 nm inner ring by six radial linkers (Fig. 4b). Focused alignments of the AP where the OMC was masked out show the outer ring to be comprised of weakly associating non-contiguous globular densities with the inner ring exhibiting overall C6 symmetry (Fig. 4c and Supplementary Fig. 7a). This concentric ring structure is highly flexible and represents the preferred single orientation of the AP so that 3D structure determination was impeded. The addition of non-hydrolysable ATP analogues made no obvious change to the PulCDELMNS complex under conditions tested. In order to dissect the observed AP ultrastructure, a sub-assembly constituting PulE, PulL, PulM, and PulN (PulELMN) was purified by 2-step affinity chromatography. GraFix33 stabilized the complex and reduced significant particle heterogeneity. As indicated by the PulCDELMNS stoichiometry measurements, PulN bound weakly to the complex and only trace quantities were observed by SDS-PAGE within the PulELMN complex after GraFix (Supplementary Fig. 7b). The resultant PulELM complex was analyzed by cryo and negative stain EM on continuous carbon film (Fig. 4d and Supplementary Fig. 7b and 7c). Under vitreous conditions, PulELM yielded a preferred orientation concentric ring structure that was similar to the AP within the PulCDELMNS complex, with equivalent dimensions and overall C6 symmetry. By negative stain, the same ultrastructure was observed although the sample was compacted so that the outer and inner rings have dimensions of 16-18 nm and 8-9 nm, respectively. Compaction was likely a consequence of sample flexibility and drying effects during the negative stain procedure. 
Purification and GraFix stabilization of PulE alone or PulE in complex with the cytoplasmic domain of PulL17 (aa 1-235 and termed PulELcyto), both isolated from the membrane fraction, yielded negative stain 2D class averages similar to PulELM with the same single preferred orientation (Supplementary Fig. 8a and 8b). PulE therefore constitutes the bulk of the observed AP concentric ring structure. The N2E and CTE ATPase domains of PulE homologues constitute flexible rings with 11-15 nm diameter depending on crystal packing14-16. Our data are consistent with a model of PulE where the N2E and CTE domains constitute the loosely packed outer ring, whilst the N1E domain locates to the inner ring (Supplementary Fig. 8a). Such architecture is reminiscent of the hexameric Hsp104 chaperone whose N and NBD1 domains form a similar concentric ring structure with equivalent dimensions34. The PulE model suggests that within the PulELcyto complex, the PulLcyto subunits will be located within the inner ring given PulLcyto and PulEN1E form a 1:1 stoichiometric complex17,18 (Fig. 4f). To demonstrate this, PulEN1E and PulLcyto were purified as a complex (termed PulE-N1E/Lcyto) by 2-step affinity chromatography with 1.1:1 stoichiometry (Supplementary Fig. 8c). Negative stain EM 2D class averages of PulE-N1E/Lcyto showed 8-9 nm diameter rings with C6 and sometimes C5 symmetry, which was consistent with the diameter and symmetry of the inner ring in the PulELcyto and PulELM complexes when imaged by negative stain EM (Fig. 4f and Supplementary Fig. 8c). PulEN1E and PulLcyto therefore constitute the inner ring of the AP concentric ring structure. The existence of a PulE-N1E/Lcyto ring has previously been speculated based on a Vibrio vulnificus GspE and GspL cytoplasmic domain co-crystal structure17.
Discussion
Our results support a model where the core cytoplasmic components of the T2SS AP constitute a PulE-N1E/Lcyto ring-like hub with 6:6 stoichiometry and C6 symmetry. The PulE N2E and CTE domains form a dynamic and weakly interacting ring hanging from the PulE-N1E/Lcyto hub (Fig. 4g). This conformation is consistent with a T2SS relaxed or inactive state. In some instances, C5 symmetry class averages in the PulE-N1E/Lcyto hub and in the PulELM complex were observed suggesting that the AP may be able to assemble and function using both C5 and C6 symmetries. The periplasmic components of the PulCDELMNS and PulELM complexes, which include the periplasmic domains of PulL, PulM and PulN were not resolved here likely due to relatively small size and disorder induced by the absence of membrane. Other T2SS components such as the pseudo-pilus may be essential for stabilizing these periplasmic domains. The cryo-electron tomography structure of the closely related T4P system 35 indicates that these periplasmic components will form a ring or shaft collar that acts as a support for the pseudo-pilus.
PulC spans the periplasm to recruit the OMC with the PulC N-terminus located within the inner membrane and the HR domain bound to the PulD N0 domain at the secretin base. Although the secretin has 15 available PulC HR domain binding sites, stoichiometry measurements suggest PulC exists in a monomer-dimer equilibrium with a copy number between 6 and 12. In an activated T2SS the copy number likely shifts towards 6 so as to be in equal stoichiometric ratio with other AP components36. This means that PulC constitutes a punctuated cage that spans the periplasm so that substrate has the potential to gain access to the secretin lumen in positions where PulC is absent. Such a cage is reminiscent of the virB10 N-terminus which spans the periplasm in the type IV secretion system37. The PulC PDZ domain is closely linked to the PulC HR domain and therefore represents a speculative candidate for the plug domain. Given the PulC PDZ domain has a potential role in substrate recruitment30, its potential positioning within the secretin lumen may provide a natural mechanism for coupling secretin gating with substrate loading.
The PulE-N1E/Lcyto ring is dynamic and its 3-4 nm central lumen, as observed under vitreous conditions, is sufficient to accommodate the cytoplasmic domains of the polytopic membrane protein PulF12, which is a key driver of pseudo-pilus assembly. PulF homologues are known to contact PulE and PulL in other T2SS systems13,19. The equivalent protein PilC in the T4P system promotes pilus assembly through a possible rotary mechanism driven by large scale conformational changes in the PilB and PilT hexameric ATPases15,16,38,39. For the T2SS, PulE activates the secretion system by ATP hydrolysis, which requires association of the N2E and CTE domains and formation of a closely packed hexamer14. During the formation of this hexamer, our data support a model where the closer packing between the N2E and CTE domains shifts them centrally beneath the PulE-N1E/Lcyto ring17 so that direct contact with PulF is facilitated (Fig. 4g). Such an arrangement is broadly consistent with the ultrastructure of the T4P system with PilB or PilT bound, as described by cryo-electron tomography35. Note that since the PulE N1E domain is absent in T4P PilB and PilT, it is not expected that these ATPases are recruited to the T4P and form a T2SS-like relaxed or inactive state as described here (Fig. 4g). Overall, our results reveal the ultrastructure of an assembled T2SS and show the core architecture to be different to other known secretion systems (Supplementary Fig. 9).
Author Contributions
A.C. and H.L. designed experiments. H.L. initially cloned and purified the PulCDELMNS complex.
A.C. and H.L. cloned and purified proteins, collected and processed data, and solved the structure.
A.C. built the OMC structure. A.C. determined stoichiometry and undertook gold labelling. H.L. wrote the paper with contributions from A.C.
Author Information
The authors declare no competing interests. Correspondence and requests for materials should be addressed to H.L. (h.low@imperial.ac.uk).
Data Availability
3D cryo-EM density maps produced in this study have been deposited in the Electron Microscopy Data Bank with accession code EMD-0193. Atomic coordinates have been deposited in the Protein Data Bank (PDB) under accession code 6HCG.
Methods
Cloning, protein expression and purification
All clones were generated using a modified version of the Gibson isothermal DNA assembly protocol44. To obtain the PulCDELMNS complex, the Klebsiella pneumoniae T2SS operon encoding genes from pulC to pulO was cloned into pASK3c vector (IBA-GO) with a StrepII tag at the C-terminus of PulE. The pulS gene was cloned into pCDF-duet vector with a C-terminal Flag tag. These vectors were co-transformed into Escherichia coli C43 (DE3) electro-competent cells (Lucigen) modified here to incorporate a pspA gene knockout (PspA is a common contaminant induced by PulD over-expression). Cells were grown on selective LB-agarose plates with chloramphenicol (30 μg/ml) and spectinomycin (50 μg/ml). 2xYT media was inoculated and cells grown at 36°C until induction at OD600 = 0.5-0.6 with anhydrotetracycline (AHT, 0.2 mg/L) and isopropyl β-D-1-thiogalactopyranoside (IPTG, 0.24 g/L). Cells were grown for ∼15 hr at 19 °C and processed immediately. Pellets were re-suspended in ice-cold buffer 50 mM Tris-HCl pH 7.5, treated with DNase I, lysozyme and sonicated on ice. The lysate was clarified by centrifugation at 16,000g for 20 min. The membrane fraction was collected by centrifugation at 142,000g for 45 min. Membranes were mechanically homogenized and solubilized in 50 mM Hepes-NaOH pH 7.5, 150 mM NaCl, 1 % w/v DDM (Anatrace) and 5 mM EDTA at room temperature for 30-40 min. The suspension was clarified by centrifugation at 132,000g for 15 min. The supernatant was loaded onto a StrepTrap HP column (GE Healthcare) and washed with 50 mM Hepes-NaOH pH 7.5, 150 mM NaCl, 0.06 % w/v DDM and 5 mM EDTA (Buffer W) at 4 °C. All prior buffers were supplemented with EDTA-free cOmplete protease inhibitor tablets (Roche). The protein sample was eluted in Buffer W supplemented with 2.5 mM desthiobiotin (IBA) but with protease inhibitors removed. Peak fractions were pooled, 0.05 % glutaraldehyde (Sigma-EM grade) added and incubated on ice for 10 min before quenching with 100 mM Tris-HCl pH 7.5. 
The sample was batch incubated with Flag resin (Sigma) for 1 hr. Flag resin was washed with Buffer W and then eluted with the same buffer supplemented with 3xFlag peptide. Peak fractions were collected and used immediately.
To obtain purified PulE, pulE was cloned into pASK3c vector to include an N-terminal StrepII tag and C-terminal Flag tag. The same initial purification strategy was then followed as for the PulCDELMNS complex with the exception that no glutaraldehyde or Tris quenching buffer were added subsequent to elution from the Strep column. After elution from the Flag column, due to sample heterogeneity as judged by negative stain EM, GraFix33 was undertaken. Using Beckman Ultra-Clear 4.2 ml 11×60 mm ultracentrifugation tubes 2.1 ml of 50 mM Hepes-NaOH, 150 mM NaCl, 0.06 % w/v DDM, 30 % v/v glycerol, 5 mM EDTA and 0.1 % glutaraldehyde was loaded under 2.1 ml of the equivalent but with 10 % v/v glycerol. A continuous gradient was made using a BioComp Gradient Master cycle set for 66 seconds at 83° tilt and 22 rpm. The PulE sample was loaded and spun at 71,000g for 16 hr at 4°C using a Beckman Ti 60.1 swing rotor. 150 μl fractions were collected manually and analysed.
To obtain the PulE-N1E/Lcyto or the PulELcyto complex, the N1E domain comprising aa 1-108 from pulE, or the full-length gene, were cloned into pASK3c vector to include a C-terminal StrepII tag. For PulLcyto, aa 1-235 relating to the cytoplasmic domain of PulL were cloned into pCDF-duet vector with a C-terminal Flag tag. To obtain the PulELM complex, the full-length pulE gene was cloned into pASK3c vector to include an N-terminal StrepII tag. The pulL, pulM and pulN region of the T2SS operon was cloned into pCDF-duet vector with a PulN C-terminal Flag tag. Desired combinations of vectors were co-transformed, and the same 2-step affinity chromatography followed by GraFix purification strategy was then followed for all these complexes as for PulE.
Gold labelling
This was performed on the PulCDELMNS complex modified to include a hexahistidine tag within PulC after aa 61. Purification was the same as for the PulCDELMNS complex but without the addition of fixative. 5 nm Ni-NTA-Nanogold (Nanoprobes) pre-washed in Buffer W was added to 10 μl of the protein sample and incubated for 30 min at 4 °C. A homemade continuous carbon grid was deposited on the 10 μl sample for 3 min, blotted and washed 2 times in Buffer W supplemented with 10 mM imidazole, then 2 times in Buffer W before being stained with 3 drops of 2 % uranyl acetate.
Stoichiometry determination
Purified PulCDELMNS complex from 4 independent purifications was extracted using phenol to disrupt PulD multimerization45. The sample was precipitated with an equal volume of ice cold phenol, vortexed and then 4 volumes of ice cold acetone were added. The mixture was kept overnight at −20°C. The precipitate was pelleted in a bench-top centrifuge at 14,000 g for 30 min at 4 °C, washed once with ice cold acetone, dried under vacuum, resuspended in Buffer W with SDS loading buffer, and analyzed by SDS-PAGE and Sypro stain. The stoichiometry was calculated by band integration using ImageJ having normalized all intensities against the known stoichiometry of PulS. Despite phenol treatment, PulD stoichiometry was variable with PulD often under-represented relative to PulS and other components within the complex. This was likely due to incomplete phenol extraction or multimerization effects occurring during SDS-PAGE. PulD band integration was therefore not undertaken and its stoichiometry, as with PulS, assigned from EM studies undertaken in this work and carried out previously7,10. To determine the stoichiometry of the PulE-N1E/Lcyto complex, gels from 2 independent experiments with duplicate lanes run were stained with Simply Blue (Invitrogen) and then quantified as above.
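The normalization step above can be sketched in Python as follows. This is a minimal illustration and not the ImageJ workflow used here. It assumes that stain signal is proportional to protein mass, so each integrated intensity is divided by molecular mass before comparison with the reference band. The function name, band intensities and masses are hypothetical. The reference copy number of 15 follows from the 1:1 PulD:PulS ratio with C15 symmetry.

```python
def copies_per_complex(intensity, mass_kda, ref_intensity, ref_mass_kda, ref_copies):
    """Estimate copy number from an integrated band intensity.

    Assumption: stain signal scales with protein mass, so intensity / mass
    is proportional to the number of molecules in the band.
    """
    molar_signal = intensity / mass_kda
    ref_molar_signal = ref_intensity / ref_mass_kda
    return ref_copies * molar_signal / ref_molar_signal

# Hypothetical values for illustration only (not measured data).
band_ref = {"intensity": 1500.0, "mass_kda": 10.0}  # reference band, 15 copies
band_x = {"intensity": 3000.0, "mass_kda": 50.0}    # band of interest

n_x = copies_per_complex(
    band_x["intensity"], band_x["mass_kda"],
    band_ref["intensity"], band_ref["mass_kda"],
    ref_copies=15,
)
```

Averaging such estimates over independent purifications would give a mean and spread comparable in form to the values reported in the main text.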
Electron microscopy sample preparation and data collection
For outer membrane complex (OMC) structure determination, 4 μl of purified PulCDELMNS complex solution was incubated for 30 seconds on glow discharged homemade continuous thin carbon grids before vitrification in liquid ethane using a Vitrobot Mark IV (FEI). Data was collected at 300 kV on a Titan Krios (M02 beamline at eBIC Diamond, UK) equipped with a Gatan Quantum K2 Summit detector. Images were acquired at a magnification of 28,090 yielding 1.78 Å/pixel using EPU software. Images were dose-weighted over 40 frames with 12 second exposures. Total dose was ∼50 e/Å2.
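As a worked check of the acquisition parameters above, the short Python sketch below derives the Nyquist limit, the per-frame dose and the dose rate from the stated pixel size, frame count, exposure time and approximate total dose. The variable names are illustrative. The physical detector pixel size is back-calculated from the stated magnification and is an inference rather than a reported value.

```python
pixel_size_A = 1.78        # Å per pixel at the specimen level
magnification = 28090
n_frames = 40
exposure_s = 12.0
total_dose = 50.0          # approximate total dose, e/Å^2

# Finest spacing that can be represented at this sampling (Nyquist limit).
nyquist_A = 2.0 * pixel_size_A

# Dose delivered per frame and per second of exposure.
dose_per_frame = total_dose / n_frames
dose_rate = total_dose / exposure_s
frame_time_s = exposure_s / n_frames

# Implied physical pixel size on the detector, in micrometres (1 µm = 1e4 Å).
detector_pixel_um = pixel_size_A * magnification / 1.0e4
```

With these numbers the Nyquist limit is 3.56 Å, so the 4.3 Å reconstruction sits at about 83 % of the sampling limit, and each frame receives about 1.25 e/Å2.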
All other cryo and negative stain datasets were collected in-house at 200 kV on a Tecnai F20 microscope equipped with Falcon II direct electron detector. For cryo data, PulCDELMNS complex (Fig. 4b-c and Supplementary Fig. 7a) was vitrified as described above. Other GraFix treated samples required glycerol removal so that 4 μl of sample was loaded onto the glow discharged continuous carbon EM grid and after 1 min incubation was washed 4 times in Buffer W before plunge freezing. Images were acquired at a magnification of 90,909 yielding 1.65 Å/pixel using EPU software. Images were collected over 54 frames with 3 second exposures. Total dose was ∼50 e/Å2. For negative stain data, 4 μl of PulCDELMNS complex was loaded onto the glow discharged continuous carbon EM grid, after 40 seconds the grid was washed with 3 drops of distilled water and stained with 3 drops of 2 % uranyl acetate. Other GraFix treated samples were similarly incubated on EM grids, washed iteratively with 4 drops of 15 μl Buffer W and then negatively stained as above. Images were acquired at a magnification of 90,909 using EPU software. Single frames were collected with 1 second exposure and total dose ∼15 e/Å2.
Image processing
For OMC structure determination, individual movie image frames were aligned with MotionCor246 and the contrast transfer function estimated using Gctf 1.0647. Low quality images were discarded and 3427 micrographs used for subsequent reconstruction in Relion 2.148. Initial manual particle picking was focused on the OMC/secretin region of the PulCDELMNS complex. For particle extraction a box and mask diameter were chosen so that contributions from the inner membrane assembly platform (AP) were excluded. In this way, low-resolution 2D class averages of just the OMC were used as a template for auto-picking. OMC side views only were prevalent in this dataset, which provided a sufficiently even equatorial band distribution for a reliable reconstruction49. Low quality particles were removed by 4 rounds of 2D classification resulting in a stack of 36,240 particles. A single round of 3D classification was undertaken generating 10 classes. The Vibrio cholerae GspD reconstruction EMDB-1763 was used as an initial model filtered to 40 Å50. C15 symmetry was applied based on top views of the OMC (obtained in an alternative PulCDELMNS complex purification) and an unambiguous 15 peaks observed from the rotation auto-correlation function calculation (Fig. 1c). A single class containing 7284 particles was used for the final refinement, which attained 4.4 Å resolution. Post-processing yielded 4.3 Å resolution with an auto-estimated B-factor 51 of −142 Å2 applied to sharpen the final 3D map for model building. A locally sharpened map was also generated using LocScale40 once initial models were built. Further particle polishing and 3D refinement did not yield a marked increase in resolution. Resolutions reported are based on gold standard Fourier shell correlations (FSC) = 0.143. Statistics for data collection and 3D refinement are included in Table 1. Local 3D refinements with various particle subtraction strategies focusing on PulS or PulC HR domain did not markedly improve resolution.
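The FSC = 0.143 criterion can be illustrated with a minimal Python sketch. It assumes an FSC curve between two independently refined half-maps has already been computed, and reports the resolution at which the curve first drops below the threshold. Linear interpolation between shells is a simplification chosen here and does not reproduce the Relion implementation. The function name and the example curve are hypothetical.

```python
import numpy as np

def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Resolution (Å) at which the FSC curve first drops below threshold.

    freqs: spatial frequencies in 1/Å, increasing; fsc: FSC value per shell.
    Linear interpolation is used between the two shells that bracket the
    crossing point.
    """
    freqs = np.asarray(freqs, dtype=float)
    fsc = np.asarray(fsc, dtype=float)
    below = np.nonzero(fsc < threshold)[0]
    if below.size == 0:
        raise ValueError("FSC never drops below the threshold")
    i = int(below[0])
    if i == 0:
        raise ValueError("FSC is below the threshold in the first shell")
    f0, f1 = freqs[i - 1], freqs[i]
    c0, c1 = fsc[i - 1], fsc[i]
    f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
    return 1.0 / f_cross

# Hypothetical three-shell curve, for illustration only.
example_freqs = [0.10, 0.20, 0.30]
example_fsc = [1.00, 0.50, 0.00]
example_resolution = resolution_at_threshold(example_freqs, example_fsc)
```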
To generate all other 2D class averages both in cryo and negative stain conditions as for PulE, PulE-N1E/Lcyto, PulELcyto, PulELM, and PulCDELMNS complexes, the following protocol was followed. Working initially within Relion 2.0 or 2.1, Gctf 1.06 was used for estimating the CTF. Negative stain micrographs were phase flipped. Low quality micrographs were discarded. Initial 2D class averages were generated from a manually picked stack to yield templates for autopicking. 3-4 rounds of 2D classification were then undertaken to remove low quality particles. Using ‘relion_stack_create’, a cleaned image stack was generated for further processing. For cryo-EM images the stack was created from phase flipped particles. In Imagic52, particles were normalised, band pass filtered, centred and subjected to reference-free MSA and classified. The best classes, typically judged by lowest variance, were used as references for multi-reference alignment (MRA) in Spider53 followed by MSA and classification in Imagic. This cycle of MRA and MSA was typically iterated a further 2-3 times. For PulE, 805 micrographs yielded 92,507 extracted particles and a cleaned stack of 7061 particles for subsequent MSA and MRA. Note that in vitro PulE self-assembled loosely to yield particles with variable size and copy number. This, combined with the inherent flexibility between the N1E, N2E and CTE PulE domains, yielded heterogeneous particle datasets. GraFix significantly improved the quality of the particles and facilitated particle averaging. Generally, for PulE, PulE-N1E/Lcyto, PulELcyto and PulELM(N) complexes, relatively large datasets were therefore required to yield 2D class averages of sufficient quality to robustly determine 2D ultrastructure. Due to PulE, particles from all these datasets usually adhere to continuous carbon EM grids with a single preferred orientation consistent with a bottom or end view. 
For PulE-N1E/Lcyto, 363 micrographs yielded 108,324 extracted particles and a cleaned stack of 26,341 particles. For PulELcyto, 333 micrographs yielded 101,245 extracted particles and a cleaned stack of 49,534 particles. For negative stain PulELM(N) data, 2746 micrographs yielded 81,727 extracted particles and a cleaned stack of 63,014 particles. For cryo PulELM(N) data, 467 micrographs yielded 4937 hand-picked particles, and all particles were used for subsequent MSA and MRA. For PulCDELMNS, 3059 micrographs yielded 89,381 extracted particles and a cleaned stack of 66,855 particles for subsequent MSA and MRA (Fig. 4b). From this same dataset, 1148 micrographs were then used to hand-pick 4020 AP-focused particles (Fig. 4c and Supplementary Fig. 7a). Rotation auto-correlation functions were calculated using Imagic.
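As an illustration of how a rotation auto-correlation function reveals symmetry order, the following minimal sketch correlates an angular intensity profile with rotated copies of itself and counts the resulting peaks. It is not the Imagic implementation. The function names and the synthetic 15-fold profile are assumptions made for the example, and a real analysis would first sample the profile from a top-view class average.

```python
import math


def synthetic_profile(fold, n_samples=360):
    """Hypothetical angular intensity profile with `fold` equally spaced maxima."""
    return [1.0 + math.cos(fold * 2.0 * math.pi * i / n_samples)
            for i in range(n_samples)]


def rotational_autocorrelation(profile):
    """Normalised circular autocorrelation of an angular profile.

    Element s is the correlation of the profile with itself rotated by
    s angular samples; element 0 is 1 by construction.
    """
    n = len(profile)
    mean = sum(profile) / n
    centred = [v - mean for v in profile]
    norm = sum(v * v for v in centred)
    if norm == 0:
        raise ValueError("profile is constant; autocorrelation is undefined")
    return [sum(centred[i] * centred[(i + s) % n] for i in range(n)) / norm
            for s in range(n)]


def count_peaks(acf):
    """Count strict local maxima over one full turn, treating the curve as circular."""
    n = len(acf)
    return sum(1 for i in range(n)
               if acf[i] > acf[i - 1] and acf[i] > acf[(i + 1) % n])


if __name__ == "__main__":
    acf = rotational_autocorrelation(synthetic_profile(15))
    print(count_peaks(acf))
```

On noisy experimental data, a simple local-maximum count like this would need smoothing or a peak-height cutoff before the number of peaks could be read as the symmetry order.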
Model building
A PulD homology model was generated with I-Tasser54 using the E. coli GspD PDB 5WQ7 as a template. This yielded a starting model for the secretin core, and N3, N2 and N1 domains. E. coli GspD shares 57 % sequence identity with PulD. A homology model for the N0 domain was generated using Swissmodel55 and the relevant part of 3OSS as template (51 % sequence identity). Homology models were rigid body fitted into the map using the Chimera56 Fit in Map function. Using these models as a starting guide and the side chain detail from bulky residues to confirm sequence register, Coot57 was used to manually build a complete model for PulD aa 27-652 excluding aa 288-303, 462-470 and 632-637. The model was further refined using real-space refinement in Phenix58 with secondary structure, geometry and NCS restraints applied. For the low-resolution regions specific to the PulC HR domain, a homology model based on the ETEC GspC HR domain PDB 3OSS (27 % sequence identity) was generated using Swissmodel. For fitting this homology model, the PDB 3OSS structure, which includes the ETEC GspD N0 domain, was first superimposed onto the PulD N0 domain. The PulC HR domain homology model was then superimposed onto the ETEC GspC HR domain from PDB 3OSS, resulting in a near perfect fit within the map. The Chimera Fit in Map function was then applied to the PulC HR domain, resulting in a minor shift so that the PulC HR-PulD N0 domain complex has an RMSD Cβ = 1.4 Å when aligned to PDB 3OSS (Supplementary Fig. 4d). For the low-resolution regions specific to the PulD S-domain C-terminus (aa 638-652) in complex with the PulS pilotin, a homology model was generated using Swissmodel based on the equivalent structure from Dickeya dadantii PDB 4K0U (>50 % sequence identity for both chains). The map was low pass filtered to 8 Å and the homology model was initially fitted manually so that the PulS lipidated N-terminus was orientated towards the membrane. The Chimera Fit in Map function was then used for final positioning.
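For readers unfamiliar with the RMSD metric quoted above, a minimal sketch of the calculation follows. It assumes the two structures have already been superimposed and that atoms are paired one-to-one in the same order. The function name and coordinates are hypothetical, and the alignment step performed by the fitting software is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired, pre-aligned (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                  for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    reference = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
    print(rmsd(model, reference))
```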
Cross-validations were carried out as previously described10,59 using the auto-estimated B-factor sharpened map. Briefly, the PulD secretin model was displaced randomly by 0.2 Å and then refined against a map reconstructed from one of the independent data halves (Half map 1). FSC curves were then calculated using the resulting model and Half map 1 (FSCwork). FSC curves were also calculated between this same model and another reconstruction generated from the other independent data half (Half map 2 and FSCfree). The similarity between the FSCwork and FSCfree curves indicates an absence of overfitting within the PulD secretin model (Supplementary Fig. 4d). The final models were assessed using Molprobity60 and statistics are outlined in Table 1.
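The random displacement step in this cross-validation can be sketched as below. The sketch assumes that each atom is shifted by exactly 0.2 Å in an independent, uniformly random direction. The text does not specify whether 0.2 Å is a fixed per-atom shift or an RMS value, so this interpretation, the function name and the coordinates are illustrative only and do not reproduce the tool used in the cited protocol.

```python
import math
import random


def displace_randomly(coords, magnitude=0.2, rng=None):
    """Shift each (x, y, z) coordinate by `magnitude` Å in a random direction.

    Assumption: a fixed per-atom shift length. An RMS-based perturbation
    would instead draw the shift lengths from a distribution.
    """
    rng = rng or random.Random()
    displaced = []
    for x, y, z in coords:
        # A normalised 3D Gaussian vector gives a direction uniform on the sphere.
        while True:
            dx, dy, dz = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
            norm = math.sqrt(dx * dx + dy * dy + dz * dz)
            if norm > 0:
                break
        scale = magnitude / norm
        displaced.append((x + dx * scale, y + dy * scale, z + dz * scale))
    return displaced


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    atoms = [(10.0, 12.5, -3.2), (11.4, 13.0, -2.9)]
    print(displace_randomly(atoms, 0.2, random.Random(0)))
```

Perturbing the model before refining against Half map 1 removes any memory of the full-map fit, so that agreement with Half map 2 reflects the model rather than fitted noise.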
Acknowledgements
We thank eBIC for cryo-EM data collection support, particularly Kyle Dent and Yuriy Chaban; Tillmann Pape and Paul Simpson for in-house EM support; Francesca Gubellini for gold labelling advice; and Arjen Jakobi and Carsten Sachse for LocScale support. This work was funded by a Wellcome Trust Career Development Fellowship Enhancement Award (200074/Z/15/Z) to H.L.