Abstract
In bacteria and archaea, several distinct types of CRISPR-Cas systems provide adaptive immunity through broadly similar mechanisms: short nucleic acid sequences derived from foreign DNA, known as spacers, engage in complementary base pairing against invasive genetic elements, setting the stage for nucleases to degrade the target DNA. A hallmark of type I CRISPR-Cas systems is their ability to acquire spacers in response to both new and previously encountered invaders (naïve and primed acquisition, respectively). In this work, we leverage the power of Legionella pneumophila, a genetically tractable, gram-negative bacterium and the causative agent of Legionnaires’ disease, to examine CRISPR array dynamics and the interplay between two extremely similar type I-F systems present in a single isolate. Using an established transformation efficiency assay, we show that the type I-F system in L. pneumophila is a highly protective system, with prominent spacer loss occurring in some transformed populations for both plasmid and chromosomal systems. Turning to next-generation sequencing, we demonstrate that, during a primed acquisition response, both systems acquire spacers in a strand-biased and directional manner, consistent with the patterns observed for previously studied type I-F systems in other bacterial species. We also show that the two systems can undergo cross-priming, whereby a target for one system can stimulate a primed acquisition response in the second. Finally, we combine these experimental data with bioinformatic analyses to propose a model in which cross-priming may replenish a depleted CRISPR array following a mass spacer deletion event.
IMPORTANCE Legionella pneumophila is an aquatic bacterium that causes Legionnaires’ disease, an often-fatal pneumonia. Many L. pneumophila strains possess one or more bacterial immune systems (CRISPR-Cas) that protect them from potentially harmful genetic elements. The genetic tractability of L. pneumophila, together with the diversity of CRISPR-Cas systems found within the species, makes these bacteria attractive model systems within which to study bacterial defenses. In particular, key strengths are the ability to compare the functionality of different systems in otherwise identical genetic backgrounds and the cross-talk between multiple systems present within a single isolate. In this work, we characterized two nearly identical systems in a single L. pneumophila isolate and propose a model whereby cross-talk may restore functionality to otherwise defenseless systems.
Introduction
Microorganisms have evolved over millions of years to survive in harsh environments, and their prosperity can be attributed in part to immune strategies that protect against antagonistic genetic elements, such as viral phages and foreign DNA elements (1). Clustered regularly interspaced short palindromic repeats (CRISPR) when coupled with associated cas genes form a potent adaptive immune response in numerous prokaryotic species (2–4). These systems have been classified into six major types, which are further divided into various sub-types, based on their mechanism of action and Cas protein content (5–7).
A CRISPR response to invading DNA occurs in three distinct phases: adaptation, expression and interference (2–4). In the adaptation phase, the CRISPR-Cas system acquires a DNA sequence (spacer) from the invader and integrates it into an array of spacers interspersed with repetitive sequences (2, 8–11). The spacers are generally derived from foreign elements whose infection was unsuccessful, such as defunct phage (12), and form the basis of immunological memory for the bacterium. During the expression phase, the array is transcribed and processed to form CRISPR RNA (crRNA) molecules that recruit Cas proteins to form a surveillance complex (3, 13). Infection by a previously encountered invader initiates the interference step, wherein the surveillance complex recognizes and binds the foreign DNA via base-pairing with the complementary crRNA, and cleaves it using a double stranded break, effectively neutralizing the threat to the host (3, 14–16).
Although there are many differences between CRISPR-Cas systems, Cas1 and Cas2 are present in all known systems (5–7), and are the only Cas proteins necessary for adaptation in Escherichia coli type I-E systems (17, 18). When an invading element has not been previously encountered by the bacterium, “naïve” acquisition occurs (17, 19), in which Cas1 and Cas2 form a complex (20–22) that binds a dsDNA “pre-spacer” substrate (23), which is processed and integrated into the CRISPR array on the leader-proximal end (23, 24).
Despite the sophistication of CRISPR-Cas systems, phages and foreign DNA elements can still escape CRISPR-Cas targeting. A common mechanism of escape is the accumulation of random mutations, which can prevent complementary base pairing with crRNAs during interference (18, 25, 26). Although this escape route is effective, the CRISPR-Cas system can overcome it by simply acquiring a new spacer; in fact, imperfect CRISPR targeting often leads to a highly efficient “primed” acquisition response, providing an intrinsic mechanism to protect against mutational escape (18, 27–30). Primed acquisition has been studied in type I-B (31–33), I-C (34), I-E (18, 27–29, 35) and I-F (30, 36, 37) CRISPR-Cas systems, and a model has been proposed in which the interference complex is recruited to the targeted sequence and subsequently “slides” away from the site in a 3′-5′ direction (30, 31, 37). When it recognizes an appropriate protospacer adjacent motif (PAM) sequence, the complex recruits Cas1 and Cas2 to extract the spacer and integrate it into the array (30, 31, 37). Interference-driven acquisition, or targeted acquisition, has also been observed in type I-C (34) and I-F systems (37), wherein a primed acquisition response occurs against a target with a perfect match to a spacer already within the array.
Most isolates of Legionella pneumophila, a genetically tractable gram-negative bacterium and the causative agent of Legionnaires’ disease, possess any of three different CRISPR-Cas systems: types I-C, I-F and/or II-B (38, 39). We recently showed that the type I-C system actively acquires spacers to protect against invasion (39) and characterized its targeted acquisition response (34). One strength of L. pneumophila as a model is the frequent presence of multiple CRISPR-Cas loci in one isolate, allowing for the study of interplay between different systems. For instance, in L. pneumophila str. Lens, two type I-F CRISPR-Cas systems are present: one on its chromosome and one on an endogenous 60-kb plasmid (38, 39). The two systems have a 97.6% Cas protein identity and the repeat units between the spacers in the CRISPR array differ by only a single nucleotide (39). The CRISPR arrays themselves are of different lengths (64 spacers for chromosomal Lens and 53 spacers for plasmid Lens) and each array contains a set of non-overlapping, unique spacer sequences (38, 39). The presence of two remarkably similar I-F systems in L. pneumophila str. Lens provided us with an opportunity to examine targeted spacer acquisition in both of these largely uncharacterized CRISPR-Cas systems and the interplay between them.
Results
The two type I-F CRISPR-Cas systems in L. pneumophila str. Lens can undergo targeted spacer acquisition and spacer loss
In previous studies, we established that the L. pneumophila type I-C CRISPR-Cas system is active (39) and relatively permissive, allowing targeted spacer acquisition when challenged with a protospacer matching the most recently acquired spacer in the CRISPR array (34). To similarly lay the groundwork for type I-F study in L. pneumophila, we sought to determine the appropriateness of perfectly matched protospacer-containing plasmids for driving spacer acquisition in these systems. As a first step, we performed an established transformation efficiency assay (4) to assess CRISPR-Cas activity in both Lens systems using two different targeted protospacer sequences: one matching the most recently acquired spacer and one matching a spacer from the middle of the array. (Unless otherwise stated, all targeted protospacer sequences used to investigate spacer acquisition were located on the DNA minus (-) strand.) When normalized to a scrambled plasmid control transformation, the protospacer matching the most recently acquired spacer (spacer 1) exhibited a ~100-fold reduction in transformation efficiency compared to the protospacer matching a mid-array spacer (chromosomal spacer 23 and plasmid spacer 50) (Fig. 1).
To determine whether spacer acquisition occurs within the context of a perfectly matched protospacer target, we pooled the transformed populations, passaged them on an automated liquid handler for 20 generations without selection, extracted their genomic DNA and screened the leader end of the CRISPR array by PCR and agarose gel electrophoresis. Notably, while the populations transformed with plasmids encoding either protospacer 23 (chromosome) or protospacer 50 (plasmid) exhibited spacer acquisition in both Lens systems (Fig. 2), the populations transformed with protospacer 1 plasmids exhibited spacer loss, with spacer acquisition undetectable on a gel. While spacer loss has been noted previously in the literature (34, 40–44), its prominence in our populations stands in stark contrast to our observations on the L. pneumophila type I-C system, which is relatively permissive and highly adaptive, even in the context of a perfectly matched protospacer (34). Given this observation, we proceeded to use the mid-array targeted protospacer sequences for the remainder of our experiments on L. pneumophila type I-F adaptation.
Targeted spacer acquisition in the plasmid Lens CRISPR-Cas system
To characterize the patterns of targeted spacer acquisition in the plasmid Lens CRISPR-Cas system, we amplified the leader-proximal region of the plasmid Lens CRISPR array from the populations transformed with the protospacer 50 plasmid. We sequenced these PCR products on an Illumina platform and used an established bioinformatics pipeline (34) to identify newly acquired spacer sequences within each read (Table 1). We mapped the protospacer locations on the priming plasmid for the newly acquired spacers and visualized these patterns with Circos (45) using an average of three replicates (Fig. 3A), although the individual distributions for all three replicates were consistent (Fig. S1). Similar to the patterns of primed and targeted spacer acquisition observed in the Pectobacterium atrosepticum type I-F CRISPR-Cas system (30, 37), the plasmid Lens CRISPR-Cas system exhibited a biased distribution of acquired spacers. The majority of the acquired protospacers clustered around the priming sequence on the plasmid (Fig. 3A). Furthermore, the non-primed strand of DNA, in this case the plus (+) strand, contained ~3/4 of the newly targeted protospacers. A similar distribution skew was observed moving in the 3′ and 5′ directions from the priming protospacer, as the 3′ direction contained ~2/3 of the new protospacers, consistent with the aforementioned sliding model (30, 31, 37). One prediction of the sliding model is that swapping the strand on which the protospacer resides should result in a “mirror-reflection” pattern of acquisition (30, 37). To test this prediction, we repeated the above experiment with a protospacer 50 plasmid that targeted the (+) strand instead of the (-) strand. As expected, we observed that the distribution of new protospacers mirrored the distribution observed when the (-) strand contained the targeted protospacer (Fig. 3B).
We next sought to determine the length distribution of the acquired spacers and the PAM sequences associated with the new protospacers. When the (-) strand contained the targeted protospacer, the predominant length for the acquired spacers was 32 nt (~95%), which is the only spacer length found in the wild-type plasmid Lens CRISPR array (Fig. 3C). The most prevalent PAM for the new protospacers was the canonical GG PAM found in type I-F systems (30, 36, 37, 46, 47), which accounted for ~95% of new protospacer PAMs (Fig. 3D and 3E). In the mirrored (+) strand targeted samples, the spacer length and PAM distributions are comparable with those of the (-) strand targeted samples (Fig. S2). Taken together, these data suggest that, in the plasmid Lens CRISPR-Cas system, the distribution bias of new protospacers is influenced by the strand containing the targeted protospacer, while the spacer length and PAM distributions are not, consistent with the results reported by Staals and colleagues for targeted acquisition in a P. atrosepticum type I-F CRISPR-Cas system (37).
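The strand bias and clustering described above can be summarized numerically once each new protospacer has been mapped to the priming plasmid. The following is a minimal Python sketch, not the pipeline used in this study: the input format (a list of position and strand pairs), the function names, the 200-bp window and the toy coordinates are all assumptions made for illustration. Offsets are measured in (+)-strand coordinates on a circular molecule, so the sign of an offset says only which side of the priming site a protospacer falls on in those coordinates.

```python
def circular_offset(pos, origin, plasmid_len):
    """Signed shortest offset from origin to pos on a circular plasmid.

    Positive values lie at higher (+)-strand coordinates than the origin.
    """
    d = (pos - origin) % plasmid_len
    return d - plasmid_len if d > plasmid_len / 2 else d


def summarize_hits(hits, priming_pos, plasmid_len, window=200):
    """Summarize mapped protospacers given as (position, strand) pairs.

    Returns the fraction on the (+) strand, the fraction with a positive
    offset from the priming site, and the fraction within `window` bp of it.
    The window size is a hypothetical choice, not a value from the study.
    """
    if not hits:
        raise ValueError("no mapped protospacers")
    n = len(hits)
    offsets = [circular_offset(p, priming_pos, plasmid_len) for p, _ in hits]
    return {
        "plus_strand": sum(1 for _, s in hits if s == "+") / n,
        "positive_offset": sum(1 for o in offsets if o > 0) / n,
        "near_priming_site": sum(1 for o in offsets if abs(o) <= window) / n,
    }


# Toy coordinates on a hypothetical 10,000-bp plasmid with the priming site at 0.
toy_hits = [(100, "+"), (150, "+"), (9950, "+"), (300, "-")]
summary = summarize_hits(toy_hits, priming_pos=0, plasmid_len=10000)
```

Applied to the (+) strand-targeted plasmid, the same summary would be expected to show the mirrored pattern if the sliding model holds.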
Targeted spacer acquisition in the chromosomal Lens CRISPR-Cas system
After surveying the plasmid Lens CRISPR-Cas system for targeted spacer acquisition, we turned our attention to exploring this phenomenon in the chromosomal Lens CRISPR-Cas system. We amplified the leader-proximal region of the chromosomal Lens CRISPR array from the populations transformed with the protospacer 23 plasmid, and subsequently analyzed targeted spacer acquisition as described for the plasmid Lens system.
Unsurprisingly, given how similar the chromosomal Lens and plasmid Lens systems are on a Cas protein sequence level, the distribution of new protospacers for the chromosomal Lens system resembled that of the plasmid Lens system (Table 1, Fig. 4A). The predominant spacer length was 32 nt, accounting for ~90% of acquired spacers (Fig. 4B), and the canonical GG PAM (30, 36, 37, 46, 47) also accounted for ~90% of new protospacer PAMs (Fig. 4C and 4D). Taken together, these results suggest that the chromosomal and plasmid Lens CRISPR-Cas systems operate in a highly comparable manner during targeted spacer acquisition.
The chromosomal Lens and plasmid Lens CRISPR-Cas systems can undergo cross-priming
Since the plasmid Lens CRISPR-Cas system and the chromosomal Lens CRISPR-Cas system function in a very similar manner during targeted acquisition, we speculated that cross-priming between the two systems could occur; that is, a targeted protospacer sequence for one CRISPR-Cas system could initiate a primed acquisition response in the second CRISPR-Cas system. To test this hypothesis, we analyzed spacer acquisition in the chromosomal CRISPR array in populations transformed with the protospacer 50 plasmid (complementary to the plasmid mid-array spacer) and analyzed spacer acquisition in the plasmid CRISPR array in populations transformed with the protospacer 23 plasmid (complementary to the chromosomal mid-array spacer). We observed strikingly similar patterns of distribution for the new protospacers in the two populations (Fig. 5), which were comparable with those seen in the previous targeted acquisition experiments, indicating that the two CRISPR-Cas systems undergo a high degree of cross-priming. There were some slight, but noticeable, differences in protospacer distribution on the (+) strand on the 5′ end of the priming protospacer for the chromosomal Lens primed, plasmid Lens amplified sample. However, the peaks were not large enough for us to postulate that they are “hotspot” regions of spacer acquisition, and we did not investigate them further.
Bioinformatic analysis of L. pneumophila I-F CRISPR-Cas systems suggests cross-priming can re-populate depleted CRISPR arrays
We next aimed to further explore the implications of our observation that perfectly targeted protospacer 1 plasmids result in populations enriched for spacer loss. While selecting for maintenance of an efficiently targeted plasmid in the context of a wild-type CRISPR-Cas system is a laboratory construct, such observations may have real-world implications as CRISPR-Cas systems are known to acquire self-targeting spacers at a low, but detectable rate (17, 18, 28, 34, 36, 37). In such instances where a system accidentally acquires the ability to cleave its resident genome, our data suggest that loss of one or more spacers (sometimes the entire array) might be a mechanism by which to escape the dire consequences of such an event. Given our observations that a protospacer 1 priming plasmid promoted spacer loss instead of spacer acquisition (Fig. 2), and that cross-priming was occurring between the two Lens CRISPR-Cas systems (Fig. 5), we bioinformatically tested the hypothesis that cross-priming between two related CRISPR-Cas systems could be a way to re-populate a depleted CRISPR array.
In total, we analyzed five chromosome-based systems and three plasmid-based type I-F CRISPR-Cas systems present in different L. pneumophila isolates, using data collected from our previous study (39) and from GenBank (accessed September 2017). We evaluated three different criteria in each CRISPR array: the repeat sequence, any mutations present in the last repeat, and the number of spacers in the array (Table 2). Our analyses showed that 7/8 of the strains share the same repeat sequence and the same mutated last repeat sequence, with the exception of a C to T single nucleotide polymorphism present at position 12 in all repeats of the three plasmid-based systems. Notably, the remaining chromosome-based system, in L. pneumophila str. Alcoy, has no mutations in its last repeat. However, its repeat sequence is identical to the mutated last repeat sequence present in the other chromosome-based systems. One intriguing interpretation of these data is that Alcoy underwent a whole CRISPR array deletion through homologous recombination between the first and last repeat sequences, leaving it with only the mutated last repeat. This would have been followed by array replenishment, since the array contains 56 spacers, but no mutations have emerged in the repetitive sequences, suggesting this was a relatively recent event.
The spacer sequences in the Alcoy array are unique and many of the spacer targets are unknown. However, one spacer corresponds to a foreign plasmid element known as Legionella mobile element-1 (LME-1), that was discovered as a common target for CRISPR-Cas in many L. pneumophila strains (39). Together, our observations suggest that in strains with a depleted CRISPR array, if a plasmid harboring a related CRISPR-Cas system was horizontally transferred to the array-less strain, it could re-populate the CRISPR array through cross-priming when it comes into contact with a widespread foe, such as LME-1. Subsequent loss of this plasmid would leave little trace of such an event, other than a potential modification of the consensus repeat sequence.
Discussion
We previously showed that type I-C CRISPR-Cas in Legionella pneumophila is highly permissive, protects against a mobile genetic element, and is adaptive (34, 39). The patterns and fidelity of primed spacer acquisition that we observed for L. pneumophila type I-C were consistent with the previous observations of type I-F spacer acquisition in other bacterial species, including Pseudomonas aeruginosa (36), Escherichia coli (36) and Pectobacterium atrosepticum (30, 37). One strength of L. pneumophila as a model for studying CRISPR-Cas is the diversity of system types present in this species and the frequent coexistence of multiple CRISPR-Cas systems within the same isolate. We have bioinformatically identified eight distinct type I-F systems in Legionella, and experimentally shown activity for 3 of them: L. pneumophila str. Lens (plasmid and chromosome) and str. Mississauga-2006 (plasmid) (39). Each system contains nearly identical cas genes but different spacer arrays. As we previously hypothesized the diversification of type I-F arrays in L. pneumophila could emerge from extensive spacer acquisition (39), we sought to directly test the adaptability of two of these arrays, both present in L. pneumophila str. Lens.
The patterns of targeted acquisition observed in both the plasmid Lens and the chromosomal Lens type I-F systems are remarkably similar to both primed and targeted acquisition in other type I-F systems (30, 36, 37) (Figs. 3 and 4). Consistent with the similarity of the cas genes, these two Lens systems undergo cross-priming, where the targeted sequence for one system stimulates a primed acquisition response in the second system (Fig. 5). Regardless of the source of priming, our data support the sliding model of primed acquisition, in which the interference complex translocates away from the targeted sequence in a 3′ to 5′ manner, and recruits Cas1 and Cas2 to capture a new spacer for array integration after recognizing an appropriate PAM (30, 31, 37).
Our bioinformatic analyses of CRISPR arrays from type I-F systems in eight strains of L. pneumophila showed that with the exception of a C to T polymorphism present at position 12 in the three examined plasmid systems, the repetitive sequences are the same across all eight arrays (Table 2). Additionally, 7/8 of the strains possessed a mutation in the last repeat of the array. Based on these data, we hypothesize that the I-F system in L. pneumophila was horizontally acquired from a plasmid and that this common ancestor has subsequently diverged based on the spacer content and repeat sequences found in the varying arrays. Since the majority of the examined arrays harbor mutations in the last repeat, it is plausible that genetic drift has occurred since the acquisition of the I-F system to form the consensus repeat found in the remainder of the array. This could be used to compare the timing of acquisition events within the array, as one might expect other mutations to arise over time in the repeat sequences due to genetic drift.
Combining our bioinformatic analyses with our experimental data, we propose that L. pneumophila str. Alcoy (which has a consensus repeat that matches the mutated last repeat of other type I-F systems) underwent a mass spacer loss event followed by subsequent array replenishment. We hypothesize that cross-priming between two CRISPR-Cas systems could be yet another mechanism to not only protect against spacer loss, as spacers can be acquired at a more frequent rate, but also to aid the system in quickly and efficiently replenishing an array that has undergone a mass loss event.
Many of the I-F systems in L. pneumophila have different array lengths, ranging from 24 spacers to 74 spacers, with an average length of 54 spacers (Table 2). Toms and Barrangou recently performed a global analysis of class I CRISPR arrays and found that the average array length for type I-F systems was 33 spacers, with statistically significant differences between the array lengths of different type I subtypes (48). Accordingly, if spacer acquisition is a driving force in array divergence, it is likely coupled to spacer loss. Close examination of the mechanisms driving spacer loss in these systems, combined with comparative genomics of otherwise related strains, will be crucial to further testing the model of array diversification in L. pneumophila.
Methods and Materials
Bacterial strains, plasmids and oligos used
The bacterial strains and plasmids used in this study are listed in Table S1, and the oligos used in this study are listed in Table S2.
The priming plasmids were created by annealing oligos (see Table S2) to create the protospacer insert with the canonical GG PAM (30, 36, 37, 46, 47) and subsequently ligating the insert into an ApaI/PstI-cut pMMB207 vector (49). The scrambled control plasmid was created in the same manner, except it contained a 32-nt scrambled sequence in place of a targeted protospacer sequence.
Transformation efficiency assay and population pool generation
The transformation efficiency assay was performed as we have previously described (39) with some modifications. Briefly, overnight cultures of L. pneumophila str. Lens were grown in ACES-buffered yeast extract (AYE) medium to an OD600 of ~4.0 using two-day patches that were grown on charcoal buffered ACES yeast extract (CYE) plates. Pellets from 4.0 OD600 of culture underwent three washing steps: twice with 1 mL of ice-cold ultrapure water and once with 1 mL of ice-cold 10% glycerol. The pellet was then re-suspended in 200 μL of ice-cold 10% glycerol and for every 50 μL of cell suspension, 100 ng of plasmid was added to the sample. The solution was transferred to an ice-cold electroporation cuvette with a 2 mm gap and electroporated with the following settings: 2.5 kV, 600 Ω and 25 μF. After electroporation, 800 μL of AYE medium was added to each sample and the samples recovered for 3 hours at 37°C at 600 RPM in a shaking incubator. The samples were plated in a dilution series on CYE plates supplemented with 5 μg mL−1 of chloramphenicol and incubated at 37°C for 3 days. The relative transformation efficiency for each targeted plasmid was calculated as a percentage of the transformation efficiency obtained from the scrambled control plasmid. Three biological replicates were performed for each transformation efficiency assay.
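The normalization described above reduces to a simple ratio. The following minimal Python sketch illustrates it; the function name and the colony counts are hypothetical and are not data from this study, and it assumes that the targeted and scrambled-control transformations within a replicate were plated and counted under identical conditions so that the counts are directly comparable.

```python
from statistics import mean


def relative_efficiency(targeted_cfu, scrambled_cfu):
    """Targeted-plasmid transformants as a percentage of the scrambled control."""
    if scrambled_cfu <= 0:
        raise ValueError("scrambled control must yield at least one transformant")
    return 100.0 * targeted_cfu / scrambled_cfu


# Hypothetical (targeted, scrambled) counts for three biological replicates.
replicates = [(50, 5000), (40, 4000), (90, 6000)]
percentages = [relative_efficiency(t, c) for t, c in replicates]
mean_percentage = mean(percentages)
```

Each replicate is normalized to its own control before averaging, which keeps day-to-day differences in competence from inflating the spread.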
Population pools were generated by mixing together ≥ 50 colonies per population from the CYE plates supplemented with 5 μg mL−1 of chloramphenicol using AYE medium supplemented with 5 μg mL−1 of chloramphenicol. Population pools were made in triplicate for each transformed plasmid.
Serial passaging on an automated liquid handler
The serial passaging of transformed L. pneumophila str. Lens populations was performed as described previously (39). Briefly, overnight cultures of the population pools in AYE medium supplemented with 5 μg mL−1 of chloramphenicol for plasmid maintenance were grown to an OD600 of ~2.0. The culture was then back diluted to an OD600 of ~0.0625 and grown in a flat-bottom 48-well plate (Greiner) in a shaking incubator at 37°C. A Freedom Evo 100 liquid handler (Tecan) connected to an Infinite M200 Pro plate reader (Tecan) measured the optical density of the plate every 20 minutes, until an OD600 of ~2.0 was reached. The cultures were then automatically back diluted to an OD600 of ~0.0625 in the adjacent well to continue growth, and the remaining culture was transferred to a 48-well plate that was kept at 4°C. In this manner, each saved culture represented ~5 generations of growth. The passaging was done without selection in AYE medium to allow for plasmid loss during passaging.
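The figure of ~5 generations per saved culture follows from the dilution scheme: growth from an OD600 of ~0.0625 to ~2.0 is a 32-fold increase, or log2(32) = 5 doublings. A minimal Python sketch of this arithmetic is given below; it assumes that OD600 is proportional to cell number over this range, and the function name is introduced here for illustration only.

```python
import math


def generations_per_passage(od_final, od_start):
    """Number of doublings needed to grow from od_start to od_final."""
    if od_start <= 0 or od_final <= 0:
        raise ValueError("optical densities must be positive")
    return math.log2(od_final / od_start)


per_passage = generations_per_passage(2.0, 0.0625)  # 32-fold increase
passages_for_20_generations = 20 / per_passage
```

Under the same assumption, the 20 generations of passaging reported in the Results correspond to four such back-dilution cycles.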
Genomic DNA extraction, PCR and agarose gel screen
Genomic DNA was extracted from the passaged cultures using the Macherey-Nagel NucleoSpin Tissue kit as per the kit protocol. The extracted samples were used as a template in a 30-cycle PCR with EconoTaq Polymerase (Lucigen) to amplify the leader end of the CRISPR array using primers listed in Table S2. The PCR products were then separated on a 3% agarose gel to determine if spacer acquisition or spacer loss had occurred based on the presence of an upper or lower band, respectively, relative to the control sample.
Nextera library prep and Illumina sequencing
The extracted genomic DNA was prepared for leader-end array sequencing by performing a 20-cycle PCR using Kapa HiFi Polymerase (Kapa Biosystems) and the primers listed in Table S2. The PCR products were purified using a Macherey-Nagel NucleoSpin Gel and PCR Clean-up kit as per the manufacturer’s instructions and normalized to 1 ng using PicoGreen. The DNA was then tagmented using the Nextera XT tagmentation kit as per the manufacturer’s instructions. The tagmented products were sequenced with a paired-end (2 × 150 bp) sequencing run on an Illumina NextSeq platform at the Centre for the Analysis of Genome Evolution and Function (CAGEF) at the University of Toronto.
Bioinformatic analyses
The bioinformatic analysis of the Illumina sequence data was performed as described previously (34). Briefly, the raw paired-end reads were merged using FLASH (50), and any unpaired reads were subsequently quality trimmed using Trimmomatic (51). These processed reads were then combined and analyzed using a Perl script (available upon request) that annotated existing spacers (S), newly acquired spacers (X), repetitive sequences (R) and the downstream sequence (D). The newly acquired spacers were aligned to the priming plasmid, the L. pneumophila str. Lens chromosome or the L. pneumophila str. Lens plasmid using BLASTN. The results from the BLASTN alignment for the priming plasmid were then processed to obtain the coverage per nucleotide, and plotted on the reference sequence using Circos (45). For the PAM analyses, the flanking sequence of each new spacer was extracted and plotted using WebLogo (52).
For the bioinformatics analyses of the L. pneumophila type I-F CRISPR arrays, the repetitive sequence in L. pneumophila str. Lens was subjected to a BLAST search against other L. pneumophila strains in GenBank (accessed September 2017). The hits were processed using CRISPRFinder (53) to determine if there was a CRISPR system present in the strain and its type; only the eight strains with a type I-F system were examined further, noting the repeat sequences, the number of spacers present in each array and whether the system was on a chromosome or on a plasmid. Mutations in the last repeat of each array were noted, as were any mutations between the consensus repeat sequences of the different strains.
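The idea behind the read annotation step can be illustrated with a short Python sketch. This is not the Perl script referred to above. It is a simplified stand-in that requires exact repeat matches, labels sequence outside the first and last repeat as flank (F) rather than reproducing the downstream (D) annotation, and uses made-up toy sequences; a real implementation would need to tolerate sequencing errors and the mutated last repeat.

```python
def annotate_read(read, repeat, known_spacers):
    """Label the segments of a read as flank (F), repeat (R),
    existing spacer (S) or newly acquired spacer (X).

    Only exact matches to `repeat` are recognized (a simplifying assumption).
    Returns a list of (label, sequence) tuples in read order.
    """
    parts = read.split(repeat)
    last = len(parts) - 1
    annotated = []
    for i, segment in enumerate(parts):
        if i in (0, last):
            label = "F"  # sequence outside the outermost repeats
        elif segment in known_spacers:
            label = "S"  # spacer already present in the wild-type array
        else:
            label = "X"  # spacer absent from the wild-type array
        if segment:
            annotated.append((label, segment))
        if i != last:
            annotated.append(("R", repeat))
    return annotated


# Toy example with a made-up 6-nt repeat and 4-nt spacers.
toy_repeat = "GTTCAC"
toy_known = {"AAAA"}
toy_read = "TT" + toy_repeat + "CCCC" + toy_repeat + "AAAA" + toy_repeat + "GG"
toy_annotation = annotate_read(toy_read, toy_repeat, toy_known)
toy_pattern = "".join(label for label, _ in toy_annotation)
toy_new_spacers = [seq for label, seq in toy_annotation if label == "X"]
```

The segments labeled X are the sequences that would then be passed to an aligner such as BLASTN for mapping.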
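Noting mutations between repeats amounts to a position-by-position comparison of equal-length sequences. A minimal Python sketch of such a comparison follows; the function name and the 16-nt sequences are made up for illustration and are not the L. pneumophila repeats, although the toy variant is built to carry a C to T change at position 12 to mirror the polymorphism reported for the plasmid-based systems.

```python
def mismatch_positions(repeat, reference):
    """Return (position, reference_base, observed_base) for each difference.

    Positions are 1-based. Sequences must be the same length; no alignment
    is attempted, so insertions or deletions are outside this sketch.
    """
    if len(repeat) != len(reference):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, repeat), start=1)
        if ref != obs
    ]


# Made-up sequences, not real repeats.
toy_reference = "ACGTACGTACGCACGT"
toy_variant = "ACGTACGTACGTACGT"
toy_differences = mismatch_positions(toy_variant, toy_reference)
```

The same function can be run on the last repeat of an array against that array's consensus repeat to flag last-repeat mutations.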
Data Accessibility
The raw Illumina reads have been deposited into the NCBI sequence read archive under the BioProject PRJNA433194.
Acknowledgements
The authors thank Griffin Deecker (a volunteer high school student) for his assistance in bioinformatically examining the diversity of I-F repeat sequences in L. pneumophila. We also thank the Centre for the Analysis of Genome Evolution and Function (CAGEF) at the University of Toronto for performing Illumina sequencing. We thank members of the Ensminger laboratory for their suggestions and careful reading of the manuscript, in particular Beth Nicholson and Malene Urbanus. SRD is supported by a fellowship from the Department of Biochemistry, University of Toronto. This work was supported by a Project Grant from the Canadian Institutes of Health Research (PHT-148819), the Connaught Fund (NR-2015-16), and an infrastructure grant from the Canada Foundation for Innovation and the Ontario Research Fund (30364) to AWE.