Abstract
CRISPR-Cas9, which imparts adaptive immunity against foreign genomic invaders in certain prokaryotes, has been repurposed for genome engineering applications. More recently, another RNA-guided CRISPR endonuclease called Cpf1 was identified and is also being repurposed. Little is known about the kinetics and mechanism of Cpf1 DNA interaction and how sequence mismatches between the DNA target and guide-RNA influence this interaction. We have used single-molecule fluorescence imaging and biochemical assays to characterize DNA interrogation, cleavage, and product release by three Cpf1 orthologues. Like Cas9, Cpf1 initially binds DNA in search of PAM (protospacer-adjacent motif) sequences, verifies the target sequence unidirectionally from the PAM-proximal end and rapidly rejects any targets that lack a PAM or that are poorly matched with the guide-RNA. Cpf1 requires a ~17 bp sequence match for both stable binding and cleavage, in contrast to Cas9, which requires 9 bp for stable binding and ~16 bp for cleavage. Unlike Cas9, which does not release the DNA cleavage products, Cpf1 rapidly releases the PAM-distal cleavage product, but not the PAM-proximal product. Our findings have important implications for Cpf1-based genome engineering and manipulation applications.
In prokaryotes, CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) acts as an adaptive defense system against foreign genetic elements1. The system achieves adaptive immunity by storing short sequences of invader DNA into the host genome, which get transcribed and processed into small CRISPR RNA (crRNA). These crRNAs form a complex with a CRISPR nuclease to guide the nuclease to complementary foreign nucleic acids (protospacers) for cleavage. Binding and cleavage also require that the protospacer be adjacent to a protospacer adjacent motif (PAM)2,3. CRISPR-Cas9, chiefly the Cas9 from Streptococcus pyogenes (SpCas9), has been repurposed to create an RNA-programmable endonuclease for gene knockout and editing4–6. Nuclease deficient Cas9 has also been used for tagging genomic sites in wide-ranging applications4–6. This repurposing has revolutionized biology and sparked a search for other novel CRISPR-Cas enzymes7,8. One such search led to the discovery of the Cas protein Cpf1, with some of its orthologues reporting highly specific cleavage activities in mammalian cells9–12.
Compared to Cas9, Cpf1 has an AT rich PAM (5’-YTTN-3’ vs. 5’-NGG-3’ for SpCas9), a longer protospacer (24 bp vs. 20 bp for Cas9), creates staggered cuts distal to the PAM vs. blunt cuts proximal to the PAM by Cas99, and is an even simpler system than Cas9 because it does not require a trans-activating RNA for nuclease activity or guide-RNA maturation13. Off-target effects remain one of the top concerns for CRISPR-based applications but Cpf1 is reportedly more specific than Cas910,11. However, its kinetics and mechanism of DNA recognition, rejection, cleavage and product release as a function of mismatches between the guide-RNA and target DNA remain unknown. Precise characterization of differences amongst different CRISPR enzymes should help in expanding the functionalities of the CRISPR toolbox.
Here, we have used single-molecule imaging and biochemical assays to understand how mismatches between the guide-RNA and DNA target modulate the activity of three Cpf1 orthologues from Acidaminococcus sp. (AsCpf1), Lachnospiraceae bacterium (LbCpf1) and Francisella novicida (FnCpf1)9. Single-molecule methods have been helpful in the study of CRISPR mechanisms14–23 because they allow real-time detection of multiple and distinct steps of varying time lengths, i.e. from transient to long-lived24.
Results
Real-time DNA interrogation by Cpf1-RNA
We employed a single-molecule fluorescence resonance energy transfer (smFRET) binding assay25,26. DNA targets (donor-labeled, 82 bp long) were immobilized on a polyethylene glycol (PEG) passivated surface and Cpf1 pre-complexed with acceptor-labeled guide-RNA (Cpf1-RNA) was added. Cognate DNA and guide-RNA sequences are identical to the Cpf1 orthologue-specific sequences that were previously characterized biochemically9 with the exception that we used canonical guide-RNA of AsCpf1 for FnCpf1 analysis because guide-RNAs of AsCpf1 and FnCpf1 are interchangeable9 (Supplementary Fig. 1). Locations of donor (Cy3) and acceptor (Cy5) fluorophores were chosen such that FRET would report on interaction between the DNA target and Cpf1-RNA27 (Fig. 1a and Supplementary Fig. 1). Fluorescent labeling did not affect cleavage activity of Cpf1-RNA (Supplementary Fig. 2). We used a series of DNA targets containing different degrees of mismatches relative to the guide-RNA referred to here with nPD (the number of PAM-distal mismatches) or nPP (the number of PAM-proximal mismatches) (Fig. 1b). Cognate DNA target in the presence of 50 nM Cpf1-RNA gave two distinct populations with FRET efficiency E centered at 0.4 and 0. Using instead a non-cognate DNA target (nPD of 24 and without PAM) or guide-RNA only without Cpf1 gave a negligible E=0.4 population, allowing us to assign E~0.4 to a sequence-specific Cpf1-RNA-DNA complex where the labeling sites are separated by 54 Å27 (Fig. 1c and Supplementary Fig. 1). The E=0 population is a combination of unbound states and bound states but with an inactive or missing acceptor. smFRET time trajectories of the cognate DNA target showed a constant E~0.4 value within measurement noise (Fig. 1c).
Cpf1-RNA titration experiments yielded dissociation constants (Kd) of 0.27 nM (FnCpf1), 0.1 nM (AsCpf1) and 3.9 nM (LbCpf1) in our standard imaging condition and 0.13 nM (LbCpf1) in a reducing condition (Supplementary Fig. 3). Binding is much tighter than the 50 nM Kd previously reported for FnCpf113. We performed purification and biochemical experiments in buffer containing dithiothreitol (DTT) as per previous protocols9 but did not include DTT in the standard imaging condition because DTT causes severe fluorescence intermittency of Cy528. DTT did not affect FnCpf1 or AsCpf1 DNA binding but made binding >20-fold tighter for LbCpf1 (Supplementary Fig. 3). Cleavage by AsCpf1 is most effective at pH 6.5-7.0 (Supplementary Fig. 4). Therefore, we used pH 7.0 for AsCpf1 and standard pH 8.0 for FnCpf1 and LbCpf1.
E histograms obtained at 50 nM Cpf1-RNA show the impact of mismatches on DNA binding (Fig. 2). The apparent bound fraction fbound, defined as the fraction of DNA molecules with E > 0.2, remained unchanged when nPD increased from 0 to 7 (0 to 6 for LbCpf1 in non-reducing conditions) (Fig. 2, 3d). Binding was ultra-stable for nPD ≤ 7 because fbound did not change even 1 hour after washing away free Cpf1-RNA (Fig. 3a). fbound decreased steeply when nPD exceeded 7 for FnCpf1 and LbCpf1 but the decrease was gradual for AsCpf1 and for LbCpf1 in the reducing condition (Fig. 2, 3d). For all Cpf1 orthologues, ultra-stable binding required nPD ≤ 7, corresponding to a 17 bp PAM-proximal sequence match. This is much larger than the 9 bp PAM-proximal sequence match required for ultra-stable binding of Cas918. PAM-proximal mismatches are highly deleterious for Cpf1 binding because fbound dropped by more than 95% if nPP ≥ 2 (Fig. 2, 3d). In comparison, Cas9 showed a more modest ~50% drop for nPP = 2 18. Overall, Cpf1 is much better than Cas9 in discriminating against both PAM-distal and PAM-proximal mismatches for stable binding.
Single molecule time-trajectories of all Cpf1 orthologues for nPD ≤ 7 showed a constant E~0.4 value within noise, limited only by photobleaching. For nPD > 7, we observed reversible transitions in E likely due to transient binding (Supplementary Fig. 5-7). Dwell-time analysis as a function of Cpf1-RNA concentration confirmed that E fluctuations are due to binding and dissociation, not conformational changes (Fig. 3b-c and Supplementary Fig. 3). We used hidden Markov modeling analysis29 to segment the time traces to bound and unbound states. Average lifetime of the bound state, τavg, was > 1 hour for nPD ≤ 7 but decreased to a few seconds with nPD > 7 or any PAM-proximal mismatches (Fig. 3e). The unbound state lifetime differed between orthologues but was nearly the same among most DNA targets, indicating that initial binding has little sequence dependence. The bimolecular association rate kon was 2.37 × 10^6 M^−1 s^−1 (FnCpf1), 0.87 × 10^6 M^−1 s^−1 (LbCpf1) and 1.33 × 10^7 M^−1 s^−1 (LbCpf1 in reducing conditions) (Fig. 3c, f). Much longer apparent unbound state lifetimes with PAM-proximal mismatches or DNA targets without PAM are likely due to binding events shorter than the time resolution (0.1 s).
These results show that Cpf1-RNA has dual binding modes. It first binds DNA non-specifically (mode I) in search of PAM and upon detection of PAM, RNA-DNA heteroduplex formation ensues (mode II) and if it extends ≥ 17 bp, Cpf1-RNA remains ultrastably bound to the DNA. Some reversible transitions in E were observed even for DNA with nPD = 7, indicating that multiple short-lived binding events take place before the one resulting in ultra-stable binding (Supplementary Fig. 5-7). RNA-DNA heteroduplex extension is likely unidirectional from PAM-proximal to PAM-distal end because any PAM-proximal mismatch prevented stable binding.
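The two binding modes can be caricatured in a few lines of Python. This is an illustrative sketch only, not analysis code used in this study: the function names are hypothetical, the PAM test is a plain string check against 5'-YTTN-3', and the hard 17 bp cutoff stands in for what is really a kinetic competition between heteroduplex extension and dissociation. It also assumes that matches are counted contiguously from the PAM-proximal end, which is a simplification, since only terminal mismatches were examined here.

```python
# Illustrative sketch only: a deterministic caricature of the two-step search.
# Assumption: spacer and protospacer are both written in the DNA alphabet,
# 5'->3' along the non-target strand, with index 0 next to the PAM.

STABLE_MATCH_BP = 17  # PAM-proximal matches reported in the text for stable binding


def has_pam(pam):
    """True if a 4-nt string fits 5'-YTTN-3' (Y = C or T, N = any base)."""
    pam = pam.upper()
    return (len(pam) == 4 and pam[0] in "CT"
            and pam[1:3] == "TT" and pam[3] in "ACGT")


def pam_proximal_matches(spacer, protospacer):
    """Count matching positions from the PAM-proximal end up to the first mismatch."""
    count = 0
    for guide_base, target_base in zip(spacer.upper(), protospacer.upper()):
        if guide_base != target_base:
            break
        count += 1
    return count


def classify_target(pam, spacer, protospacer):
    """Return a coarse label for how Cpf1-RNA is expected to treat a target."""
    if not has_pam(pam):
        return "rejected: no PAM"
    if pam_proximal_matches(spacer, protospacer) >= STABLE_MATCH_BP:
        return "stable binding"
    return "transient binding"
```

In this picture a target with nPD = 7 still reaches 17 contiguous matches and is labeled stable, whereas a single PAM-proximal mismatch stops extension at zero matches and the target is labeled transient.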
Consistent with dual binding modes, survival probability distributions of bound and unbound state were best described by a double and single exponential decay, respectively (Supplementary Fig. 8).
DNA cleavage by Cpf1 as a function of mismatches
Next, we performed gel-based experiments using the same set of DNA targets to measure cleavage by Cpf1. Cleavage was observed at a wide range of temperatures (4-37 °C), required divalent ions (Ca2+ could substitute for Mg2+), and showed a pH dependence. AsCpf1 is most active only at slightly acidic to neutral pH (6.5-7.0) whereas FnCpf1 has more activity at pH 8.5 than pH 8.0 (Supplementary Fig. 9-11). Cleavage required 17 PAM-proximal matches, corresponding to nPD ≤ 7, (Fig. 4a, Supplementary Fig. 9-10) which is identical to the threshold for stable binding (Fig. 2, 3). This contrasts with Cas9, which requires only 9 PAM-proximal matches for stable binding18 but 16-18 PAM-proximal matches for cleavage3,14.
We measured the time it takes to cleave DNA, τcleavage (Supplementary Fig. 12). τcleavage remained approximately the same among DNA with 0 ≤ nPD ≤ 6 for FnCpf1 (30-60 s) but steeply increased upon increasing nPD to 7 (Fig. 4b, c). AsCpf1 showed a more complex nPD dependence with a minimal τcleavage value of 8 minutes for nPD = 6 (Fig. 4c). τcleavage is much longer than the 1 to 15 seconds it takes Cpf1-RNA to bind the DNA at the same Cpf1-RNA concentration, suggesting that Cpf1-RNA-DNA undergoes additional rate-limiting steps after DNA binding and before cleavage. These additional steps are likely conformational rearrangements of the Cpf1-RNA-DNA complex that position the nuclease domains and DNA strands for cleavage, as has been described in structural analyses of the Cpf1-RNA-DNA complex27,30.
Because of the finite τcleavage we can infer that the ultra-stable binding (lifetime > 1 hr) for nPD ≤ 7 is that of Cpf1-RNA binding to the cleaved product, and it is in principle possible that cleavage stabilizes Cpf1-RNA binding. In order to test this possibility, we purified catalytically dead FnCpf1 (dFnCpf1) and performed DNA interrogation experiments. dFnCpf1 binding was ultra-stable for cognate DNA but showed a substantial dissociation after 5-10 min for nPD=6 or 7 (Supplementary Fig. 13). Therefore, cleavage can further stabilize Cpf1-RNA binding to DNA. Cleavage was negligible for DNA targets that showed transient binding. Therefore, transient binding and dissociation we observed is not to and from a cleaved DNA product.
Fate of cleaved DNA
For an efficient addition of a new piece of DNA at a cleaved site, the cleaved site needs to be exposed31. To investigate the fate of DNA targets post cleavage, we relocated the Cy3 label to a PAM-distal DNA segment that would depart the imaging surface if the Cpf1 releases cleavage product(s) (Fig. 4d, Supplementary Fig. 14). The number of fluorescent spots decreased over time (Fig. 4e), suggesting the cleavage product is released under physiological conditions, which is in stark contrast to Cas9, which holds onto the cleaved DNA and does not release except in denaturing condition 14,18. Cpf1 releases only the PAM-distal cleavage product, however, because when Cy3 is attached to a site on the PAM-proximal cleavage product, the number of fluorescence spots did not decrease over time ( Fig. 1-3). The average time for fluorescence signal disappearance ranged from ~30 s to 30 min depending on the PAM-distal mismatches and Cpf1 orthologues. By subtracting the time it takes to bind and cleave, we estimated the product release time scale (τrelease) (Fig. 4f), which showed a dependence on nPD. Therefore, PAM-distal mismatches can also affect product release.
Discussion
The two-step mechanism of sampling for PAM followed by unidirectional RNA-DNA heteroduplex extension (Fig. 5) is shared between Cas9 and Cpf1, suggesting this to be a general target identification mechanism of these CRISPR systems. Ultra-stable binding of Cpf1 requires the same extent of sequence match (17 bp PAM-proximal matches) as target cleavage. This contrasts with Cas9, which requires only 9 bp and 16 bp PAM-proximal matches for ultra-stable binding and cleavage respectively18,32,33. Therefore, Cpf1 can be more sequence specific in experiments involving the use of catalytically dead CRISPR for imaging, tracking and transcription regulation purposes34. The binding specificity of engineered Cas9s (eCas935 & Cas9-HF136) is still much lower than that of Cpf133. Therefore, Cpf1 has the potential to be a better alternative to all current Cas9 variants.
Cleavage rate is reduced with increasing PAM-distal mismatches (Fig. 4c) even when the mismatches do not affect stable binding (Fig. 3), suggesting that shorter RNA-DNA heteroduplexes result in slower conformational changes required for cleavage activation. Previous studies on Cas9 revealed that mismatches alter the kinetics of DNA unwinding, RNA-DNA heteroduplex extension, and nuclease and proof-reading domain movements19,21,32,33.
For cognate DNA target, RNA-DNA heteroduplex extension would require unwinding of the parental DNA duplex. In the crystal structure of AsCpf1-RNA-DNA complex, four PAM-distal base pairs are unwound but not involved in RNA-DNA heteroduplex27, hinting that DNA unwinding does not necessarily cause a concomitant annealing with the RNA. To test the relative importance of parental DNA duplex unwinding and annealing with RNA in cleavage activation, we performed cleavage experiments using DNA in which the PAM-distal mismatched region was pre-unwound. Cpf1 needed far fewer PAM-proximal matches to cleave if the mismatched region was pre-unwound (Supplementary Fig. 15), indicating that DNA unwinding is likely more important than RNA-DNA heteroduplex formation in activating cleavage. Accordingly, ssDNA can also be cleaved by Cpf1 (Supplementary Fig. 15). Therefore, the role of RNA may primarily be in keeping the DNA unwound through annealing with the target strand.
CRISPR enzymes bend DNA to cause a local kink near the PAM, which acts as a seed for unwinding and heteroduplex extension27,37,38. Perturbation of DNA bending by introducing a nick near the PAM slowed down cleavage, underscoring the importance of DNA bending for Cpf1 induced cleavage (Supplementary Fig. 16). Cas9 causes a larger DNA bend than Cpf127,37, possibly contributing to its higher tolerance of PAM-proximal mismatches in binding and cleavage activity.
The shorter and simpler guide-RNA9 of Cpf1 could make it less tolerant of engineering or extension of the kind done for Cas9’s guide-RNA39. For example, an extra 5’ guanine in the guide-RNA was extremely deleterious for LbCpf1 (Supplementary Fig. 17). This feature could affect applications where guide-RNAs are transcribed using U6/T7 RNA polymerase systems that require the first nucleotide of the transcribed RNA to be a guanine40,41.
Cas9 has provided a highly efficient and versatile platform for DNA targeting, but the efficiency of gene knock-in is low42. Amongst the possible reasons is the inability of Cas9 to release and expose cleaved DNA ends. In contrast, the ability of Cpf1 to release a cleavage product readily, combined with staggered cuts it generates, could in principle increase the knock-in efficiency. Although it remains to be seen how this property affects the downstream processing in vivo, we can also envision a scenario where product release by Cpf1 can be detrimental to genome engineering applications. Applying positive twist to the DNA in a Cas9-RNA-DNA complex can release Cas9-RNA from DNA by promoting rewinding of parental DNA duplex15. Positive supercoiling is generated ahead of a transcribing RNA polymerase43 and Cas9 holding onto the double strand break product may help build the torsional strain required to eject Cas9-RNA. If the PAM-distal cleavage product is released prematurely as in the case of Cpf1, transcription-induced positive supercoiling cannot build up and the Cpf1-RNA would remain bound stably to the PAM-proximal cleavage product, hiding the cleaved end and preventing efficient knock-in.
High specificity of adaptive immunity by Cpf1 against hypervariable genetic invaders is a little paradoxical. But Cpf1 and Cas9 systems co-exist in many species and thus they likely provide immunity suited to their features, effectively broadening the scope of immunity. Overall, our results establish major differences and commonalities between Cpf1 and Cas9, which can also be useful for broadening genome engineering applications.
Author contributions
D.S., T.H. designed the experiments. D.S. performed single molecule experiments. J.M. performed radio-labeled gel electrophoresis experiments. D.S. performed gel electrophoresis experiments involving SYBR staining of nucleic acids. D.S., J.M., R.T. expressed and purified Cpf1. D.S. prepared DNA and RNA substrates. D.S. wrote the MATLAB package for data analysis and performed the analysis with help from A.P., Y.W. A.P. assisted with the PEG passivation of some slides. O.Y., Y.W. assisted with some experiments. D.S., T.H., S.B. discussed the data. D.S., T.H. wrote the manuscript.
Authors declare no competing financial interests. Correspondence: T.H. (tjha{at}jhu.edu).
Materials and methods
DNA targets for smFRET analysis of DNA interrogation
Single-stranded DNA (ssDNA) oligonucleotides were purchased from Integrated DNA Technologies. ssDNA target and non-target (labeled with Cy3) strands and a biotinylated adaptor strand were mixed. Excess target strand was used to ensure near complete hybridization of non-target strand with the target strand. Upon surface immobilization of the assembled DNA target, any free target strand can be washed away because it does not contain biotin. The non-target strand was created by ligating two component strands, one with Cy3 and the other containing the protospacer region to avoid having to synthesize modified oligos for each mismatch construct. For schematics, see Supplementary Fig. 1a.
Fully duplexed DNA targets but with a nick were also used. The oligonucleotide containing Cy3 is referred to as “Cy3 oligo” and is, in part, complementary to a “biotin oligo”. Hybridization of the two oligos results in a biotin-Cy3 adaptor, which has a 14 nt overhang complementary to the “target oligo” that contains the protospacer region. Finally, the “non-target oligo” complementary to the target oligo was used to complete the duplexed DNA target (Supplementary Fig. 1a). DNA targets were prepared by mixing all four component oligos in buffer containing 50 mM NaCl, 20 mM Tris-HCl (pH 8.0), which was then heated to 90 °C followed by slow-cooling to room temperature over 3 hr. The mixing ratio of component oligos was 1:1:2:3 for Cy3 oligo: biotin oligo: target oligo: non-target oligo. An excess of target and non-target oligo was used to ensure that any Cy3 oligo detected on the surface is in complex with the three other oligos. The Cy3 fluorophore is located 4 bp upstream of the protospacer adjacent motif (PAM: 5’-YTTN-3’) and was conjugated via Cy3 N-hydroxysuccinimido (Cy3-NHS; GE Healthcare) to the Cy3 oligo at an amino group attached to a modified thymine through a C6 linker (amino-dT) using NHS ester linkage. smFRET experiments were done with both sets of DNA targets (with or without a nick) and no significant differences were found between them. Supplementary Table 1 shows all DNA targets used.
DNA targets for real time single-molecule assay for interrogating fate of cleaved DNA
For single-molecule cleavage product release experiments, a non-target strand with the Cy3 relocated in a different position was used. Cy3 label was conjugated onto the amine modification (amino-dT) using Cy3-NHS, as described above. Schematic of these DNA targets is in the Supplementary Figure 14 and their sequences in Supplementary Table 5.
DNA targets for gel electrophoresis experiments
DNA targets for gel electrophoresis were prepared and hybridized as described above. For radio-labeled gel electrophoresis experiments, the target strand was 5′ radiolabeled with T4 polynucleotide kinase (New England BioLabs) and γ-32P ATP (Perkin Elmer). The target and non-target strands were annealed with the non-target strands in excess.
Guide-RNA
For single molecule experiments, guide-RNA was purchased from IDT with modifications for Cy5 labeling as described in Supplementary Table 5. Cy5 was conjugated via Cy5 N-hydroxysuccinimido (Cy5-NHS; GE Healthcare) to the RNA as described previously18,44. For all other experiments, unmodified guide-RNA was used, either transcribed in vitro or purchased from IDT. Guide-RNA sequences used in this study are available in Supplementary Table 5.
Preparation of Cpf1-RNA
The Cpf1-RNA was freshly prepared prior to each experiment by mixing the guide-RNA (50 nM) and Cpf1 in a 1:3.5 ratio in one of the following reaction buffers and incubating for at least 10 min at room temperature: 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 10 mM MgCl2 (FnCpf1 and LbCpf1), or 50 mM HEPES (pH 7.0), 100 mM NaCl, 10 mM MgCl2 (AsCpf1). 5 mM DTT was included in the buffer only when specified. For single-molecule imaging experiments, the reaction buffers additionally contained 0.2 mg/ml bovine serum albumin (BSA), 1 mg/ml glucose oxidase, 0.04 mg/ml catalase, 0.8% dextrose and saturated Trolox (>5 mM). Excess Cpf1 was used to maximize complexation of the available guide-RNA, and the concentration of guide-RNA was taken as the concentration of Cpf1-RNA. Cpf1 activity with similar guide-RNAs and on DNA targets with the same protospacer and PAM has been characterized previously9. Fluorophore labeling of either DNA targets or guide-RNA did not impair Cpf1 activity (Supplementary Fig. 2).
Expression and purification of Cpf1
The methods of Cpf1 protein expression and purification were adapted from a protocol described previously9. A codon-optimized Cpf1 gene sequence in a bacterial expression vector (6-His-MBP-TEV-FnCpf1, a pET based vector) was cloned in house or purchased from GenScript. The vector was transformed into Rosetta (DE3) pLysS (EMD Millipore) cells, and the cells were plated onto LB-Kanamycin agar plates and grown at 37 °C overnight. A single colony from the agar plates was then cultured overnight in 10 ml of SOC medium (Thermo Fisher Scientific). The overnight miniculture of Rosetta (DE3) pLysS cells containing the Cpf1 expression construct was inoculated (1:500 dilution) into 4 liters of Terrific Broth (Sigma Aldrich) growth media containing 50 μg/ml Kanamycin. The inoculated growth media was incubated at 37 °C in a shaker at 100 rpm until the cell density reached 0.2 OD600, at which point the temperature was lowered to 21 °C. Growth was continued and 6-His-MBP-TEV-Cpf1 protein expression was induced when cells reached 0.6 OD600 by addition of IPTG (Sigma) to 0.5 mM final concentration in the growth media. The induced culture was kept for 14–18 hr at 21 °C, after which the cells were harvested by centrifugation at 5000 rpm for 30 min at 4 °C. The harvested cells were quickly stored at −80 °C until further purification.
The harvested cells were then suspended in 200 ml of lysis buffer (50 mM HEPES [pH 7], 2 M NaCl, 5 mM MgCl2, 20 mM imidazole) supplemented with protease inhibitors (Roche complete, EDTA-free) and lysozyme (Sigma Aldrich) and incubated at 4 °C for 30-45 minutes. After homogenization, cells were further lysed by sonication (Fisher Model 500 Sonic Dismembrator; Thermo Fisher Scientific) at 30% amplitude in 3 cycles of 2 s sonicate-2 s relax mode, each cycle lasting 1 min. Following lysis, the cell suspension was centrifuged at 15,000 × g for 30-45 minutes, the cellular debris was discarded and the supernatant was collected. The clear lysate was then incubated at 4 °C with Ni-NTA slurry (Qiagen) for 45 min in a shaker at 30 rpm. The lysate with the Ni-NTA slurry was then applied to a column and multiple cycles of lysis buffer were used to wash the Ni-NTA slurry through the column. The 6-His-MBP-TEV-Cpf1 was eluted in a single step with 300 mM imidazole buffer (50 mM HEPES [pH 7], 2 M NaCl, 5 mM MgCl2, 300 mM imidazole). TEV protease (Sigma Aldrich) was then added, and the sample was dialyzed using Slide-A-Lyzer™ dialysis cassettes (Thermo Fisher Scientific) overnight into a buffer suitable for TEV protease activity (500 mM NaCl, 50 mM HEPES [pH 7], 5 mM MgCl2, 2 mM DTT). TEV protease cleaved 6-His-MBP-TEV-Cpf1 into 6-His-MBP and Cpf1, which was confirmed by SDS-PAGE. The free 6-His-MBP was removed by another round of Ni-NTA chromatography, resulting in a solution containing only Cpf1. The sample was then injected onto a HiLoad 26/600 S200 size exclusion column equilibrated with gel filtration buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, 10 mM MgCl2, 5 mM DTT). Fractions containing Cpf1 were pooled, concentrated, and then flash frozen in liquid nitrogen. The final sample was stored at −80 °C until used in experiments.
Single-molecule detection and data analysis
Neutravidin-biotin interaction was used to immobilize the biotinylated Cy3-labeled DNA targets on the polyethylene glycol (PEG) passivated flow chamber surface prepared following protocols reported previously26 or purchased from Johns Hopkins Microscope Supplies Core. Cy5-labeled Cpf1-RNA or unlabeled Cpf1-RNA (both referred to as Cpf1-RNA for brevity) was added to the flow chamber. The flow chamber was then illuminated with a green laser and imaged with two-color total internal reflection fluorescence microscopy. A buffer suitable for single-molecule imaging and Cpf1 activity was used and is referred to as the imaging-reaction buffer (50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 10 mM MgCl2, 0.2 mg/ml bovine serum albumin (BSA), 1 mg/ml glucose oxidase, 0.04 mg/ml catalase, 0.8% dextrose and saturated Trolox (>5 mM)) (Supplementary Fig. 18). 50 mM HEPES (pH 7.0) was used in place of 50 mM Tris-HCl (pH 8.0) for AsCpf1 experiments only, unless stated otherwise. 5 mM DTT was added to these buffers only for experiments stated to be in reducing conditions (chiefly LbCpf1). Technical details of single-molecule imaging, data acquisition and analysis have been described previously26. Video recordings obtained using an EMCCD camera (Andor) were processed to extract single molecule fluorescence intensities at each frame and custom written scripts were used to calculate FRET efficiencies. Data acquisition and analysis software can be downloaded from https://cplc.illinois.edu/software/. FRET efficiency of the detected spot was approximated as E = IA/(ID + IA), where ID and IA are the background and leakage corrected emission intensities of the donor and acceptor respectively. For single-molecule cleavage experiments, series of snapshots of different imaging areas were taken at different time points, under the same green laser illumination via total internal reflection. The snapshots were then analyzed to estimate the changing number of Cy3-labeled DNA targets on the surface.
Time resolution for all the experiments was 100 ms unless stated otherwise.
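As a minimal sketch of the per-frame calculation E = IA/(ID + IA), the function below applies a simple background subtraction and donor-leakage correction before forming the ratio. The correction scheme and all parameter names are assumptions made for illustration; the corrections actually applied are those of the analysis software referenced above and may differ in detail.

```python
def fret_efficiency(donor_raw, acceptor_raw,
                    donor_bg=0.0, acceptor_bg=0.0, leakage=0.0):
    """Apparent FRET efficiency E = IA / (ID + IA) for a single frame.

    donor_bg, acceptor_bg: background counts in each channel (assumed constant).
    leakage: fraction of the donor signal that bleeds into the acceptor channel
             (a hypothetical, instrument-specific value).
    """
    i_d = donor_raw - donor_bg
    i_a = acceptor_raw - acceptor_bg - leakage * i_d
    total = i_d + i_a
    if total <= 0:
        raise ValueError("non-positive total intensity; no usable signal in this frame")
    return i_a / total
```

For instance, raw intensities of 700 (donor) and 500 (acceptor) with backgrounds of 100 and 40 and 10% leakage give ID = 600 and IA = 400, hence E = 0.4.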
FRET efficiency histograms and Cpf1-RNA bound DNA fraction
A smFRET time-trajectory is a series of E values every 100 ms. First five E values of each single-molecule trace were pooled together to build single molecule E histograms. Cpf1-RNA bound DNA fraction (fbound) was calculated as a ratio between the number of molecules with E > 0.2 and the total number of molecules in the E histograms. E histograms shown in Figure 2 were constructed by combining data from two independent experiments (except for AsCpf1; PAM-less DNA).
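The histogram construction and fbound calculation described above can be sketched as follows. The function names are hypothetical and the traces are assumed to be plain lists of E values sampled every 100 ms; this is not the MATLAB package used for the analysis.

```python
def pooled_e_values(traces, n_frames=5):
    """Pool the first n_frames E values of every single-molecule trace."""
    return [e for trace in traces for e in trace[:n_frames]]


def bound_fraction(e_values, threshold=0.2):
    """Fraction of pooled E values above the bound-state threshold (E > 0.2)."""
    if not e_values:
        raise ValueError("no E values to analyze")
    return sum(e > threshold for e in e_values) / len(e_values)
```

Traces shorter than five frames simply contribute fewer values in this sketch; how such traces are handled in practice is not specified in the text.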
Determination of binding kinetics
For DNA targets that showed real-time reversible binding/dissociation of Cpf1-RNA, idealization of smFRET traces via Hidden Markov Model29 analysis yielded two predominant FRET states, an unbound state (E < 0.2) and a bound state (E > 0.2). Lifetime of the unbound state, τunbound, was calculated by fitting the survival probability of dwell-times of the unbound state (E < 0.2) vs time to a single-exponential decay (exp[−t/τunbound]). The survival probability of the bound state required a double-exponential decay for adequate fitting (A*exp[−t/τ1] + [1−A]*exp[−t/τ2]), and the average lifetime was calculated as τavg = Aτ1 + (1−A)τ2.
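The quantities in this paragraph can be written out explicitly. The sketch below builds an empirical survival curve from a list of dwell-times and evaluates the two decay models and the amplitude-weighted average lifetime; it does not perform the fitting itself, and the function names are hypothetical.

```python
import math


def survival_curve(dwell_times):
    """Empirical survival probability: S(t) = fraction of dwells longer than t."""
    n = len(dwell_times)
    return [(t, sum(d > t for d in dwell_times) / n)
            for t in sorted(set(dwell_times))]


def single_exp(t, tau):
    """Unbound-state model: exp(-t / tau_unbound)."""
    return math.exp(-t / tau)


def double_exp(t, a, tau1, tau2):
    """Bound-state model: A*exp(-t/tau1) + (1 - A)*exp(-t/tau2)."""
    return a * math.exp(-t / tau1) + (1.0 - a) * math.exp(-t / tau2)


def average_lifetime(a, tau1, tau2):
    """Amplitude-weighted average lifetime: tau_avg = A*tau1 + (1 - A)*tau2."""
    return a * tau1 + (1.0 - a) * tau2
```

As a purely hypothetical example, A = 0.7, τ1 = 0.5 s and τ2 = 5 s give τavg = 0.35 s + 1.5 s = 1.85 s.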
The bimolecular association rate constant kon, binding rate kbinding and dissociation rate koff were calculated as follows.
kbinding = 1/τunbound
koff = 1/τbound
kon = kbinding / [Cpf1-RNA]
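A short worked example of these three relations, under the assumption that lifetimes are in seconds and the Cpf1-RNA concentration is in molar units (function names are hypothetical):

```python
def rates_from_lifetimes(tau_unbound_s, tau_bound_s, conc_molar):
    """Return (k_binding in s^-1, k_off in s^-1, k_on in M^-1 s^-1)."""
    k_binding = 1.0 / tau_unbound_s
    k_off = 1.0 / tau_bound_s
    k_on = k_binding / conc_molar
    return k_binding, k_off, k_on


def expected_unbound_lifetime(k_on, conc_molar):
    """Mean waiting time before binding: 1 / (k_on * [Cpf1-RNA])."""
    return 1.0 / (k_on * conc_molar)
```

With the kon of 2.37 × 10^6 M^−1 s^−1 reported for FnCpf1 and 50 nM Cpf1-RNA, the pseudo-first-order binding rate is about 0.12 s^−1, corresponding to an unbound-state lifetime of roughly 8.4 s.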
Due to under-sampled binding events, τavg of FnCpf1 for PAM-less DNA and DNA with nPP = 2 was calculated as the arithmetic mean of the E > 0.2 dwell-times.
Cy5 labeling efficiency of guide-RNA was ~90% and thus fbound and τunbound were appropriately corrected. Due to high noise, the smFRET traces from experiments involving AsCpf1 could not be idealized with high accuracy, which prevented analysis of their koff and kon.
Estimation of dissociation constant (Kd)
To estimate Kd, Cpf1-RNA bound DNA fraction (fbound) vs Cpf1-RNA concentration (c) was fit using fbound = M × c / (Kd + c), where M is the maximum observable fbound. M is typically less than 1 because of inactive or missing acceptors or because not all of the DNA on the surface is capable of binding Cpf1-RNA.
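One dependency-free way to fit this binding isotherm is sketched below. It is not necessarily the fitting routine used for the reported Kd values: here Kd is scanned on a logarithmic grid and, for each trial Kd, the amplitude M is obtained in closed form by linear least squares. The grid bounds and the function name are assumptions, and Kd is returned in the same units as the input concentrations.

```python
def fit_kd(concs, fbound, kd_lo=1e-3, kd_hi=1e3, n_grid=2001):
    """Least-squares fit of fbound = M * c / (Kd + c); returns (Kd, M).

    Kd is scanned on a log-spaced grid between kd_lo and kd_hi (assumed bounds);
    for each trial Kd the best M follows from linear least squares.
    """
    best = None
    for i in range(n_grid):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / (n_grid - 1))
        x = [c / (kd + c) for c in concs]
        m = sum(f * xi for f, xi in zip(fbound, x)) / sum(xi * xi for xi in x)
        sse = sum((f - m * xi) ** 2 for f, xi in zip(fbound, x))
        if best is None or sse < best[0]:
            best = (sse, kd, m)
    _, kd_fit, m_fit = best
    return kd_fit, m_fit
```

The resolution of the returned Kd is limited by the grid spacing (about 0.7% per step with the defaults), which is usually finer than the experimental uncertainty of a titration.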
Overall lifetime of release of cleavage products
Single-molecule experiments were used to estimate the lifetime of the release of cleavage products by fitting the decreasing number of Cy3 spots (loss of spots due to Cpf1-RNA induced cleavage and release) to a single-exponential decay. The time of binding (1/(kon × 50 nM)) and the time of cleavage (τcleavage) were subtracted from the obtained lifetime to get the true lifetime of the release (τrelease) of cleavage products. Because τcleavage was not measured for LbCpf1, its reported τrelease does not include the subtraction of τcleavage and the time of binding.
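The subtraction can be illustrated with a small sketch, taking the time of binding as the reciprocal of the pseudo-first-order binding rate kon × [Cpf1-RNA]. The log-linear fit assumes a decay to zero with no residual baseline, which is a simplification, and the function names and example numbers are hypothetical.

```python
import math


def fit_decay_lifetime(times_s, spot_counts):
    """Lifetime tau of N(t) = N0 * exp(-t / tau) from a log-linear least-squares fit."""
    logs = [math.log(n) for n in spot_counts]
    t_mean = sum(times_s) / len(times_s)
    y_mean = sum(logs) / len(logs)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
             / sum((t - t_mean) ** 2 for t in times_s))
    return -1.0 / slope


def release_lifetime(tau_observed_s, k_on, conc_molar, tau_cleavage_s):
    """tau_release = observed lifetime - time of binding - time of cleavage."""
    tau_binding_s = 1.0 / (k_on * conc_molar)
    return tau_observed_s - tau_binding_s - tau_cleavage_s
```

For a hypothetical observed spot-loss lifetime of 120 s, kon = 2.37 × 10^6 M^−1 s^−1 at 50 nM (binding time of about 8.4 s) and an assumed τcleavage of 45 s, the estimated τrelease is about 66.6 s.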
Gel electrophoresis experiments
All the biochemical experiments were done in the following reaction buffers: 50 mM Tris-HCl (pH 8.0) 100 mM NaCl, 10 mM MgCl2, 5 mM DTT (FnCpf1 and LbCpf1) and 50 mM HEPES (pH 7.0) 100 mM NaCl, 10 mM MgCl2, 5 mM DTT (AsCpf1).
Gel electrophoresis experiments involving visualization of nucleic acid bands via SYBR Gold II staining
All experiments were conducted by mixing DNA targets and Cpf1-RNA in a 1:5 ratio in the reaction buffer. The reaction was incubated for 4.5-5 hr (unless stated otherwise) before being resolved by 4% native/denaturing agarose gel electrophoresis and SYBR Gold II staining of nucleic acids using the precast gels containing SYBR Gold II, purchased from Thermo Fisher Scientific. For native gel electrophoresis, the reaction aliquots were directly loaded onto the gels. Reactions were incubated at room temperature, 37 °C or 4 °C, as indicated in the presentation of their results. The gel electrophoresis was run at room temperature for experiments incubated at room temperature or 37 °C and at 4 °C for experiments incubated at 4 °C. The cleaved and uncleaved DNA targets, with or without bound Cpf1-RNA, along with other nucleic acids were stained by SYBR Gold II and imaged by blue laser illumination (480 nm; GE Amersham Molecular Dynamics Typhoon 9410 Molecular Imager and 488 nm; Amersham Imager 600). For all of these experiments, the concentration of the DNA targets ranged from 20 nM to 60 nM and consequently the effective concentration of Cpf1-RNA ranged from 100 nM to 300 nM respectively. Volume of aliquots used for gel loading ranged from 10 to 20 μL per lane. For the time-lapse denaturing gel electrophoresis experiments, the acquired gel-images were quantified using ImageJ45. The entire panel of DNA targets used in these gel-electrophoresis experiments is available in Supplementary Table 3, 4. Tris-HCl at pH 8.0 was used in the reaction buffers for all the experiments except for the ones reported in Supplementary Figure 2, 9 where Tris-HCl at pH 8.5 was used.
Gel electrophoresis experiments and autoradiography
Experiments containing radiolabeled DNA substrates were performed as above. However, samples were quenched in buffer containing 95% formamide, 0.01% SDS, 0.01% bromophenol blue, 0.01% xylene cyanol, and 1 mM EDTA and incubated at 95 °C for 5 min, then on ice for 2 min. Volume ratio of quenching buffer to reaction was 5:1. Samples were loaded onto denaturing polyacrylamide gels (10% acrylamide, 50% (w/v) urea) and allowed to separate. Amount of sample loaded onto the gel was normalized to 10,000 counts per sample. Gels were imaged via phosphor screens. The entire panel of DNA targets used in these gel-electrophoresis experiments is available in Supplementary Table 2.
Acknowledgements
The project was supported by grants from the National Science Foundation (PHY-1430124 to T.H.) and the National Institutes of Health (GM065367; GM112659 to T.H. and GM097330 to S.B.); T.H. is an investigator with the Howard Hughes Medical Institute. J.M. is supported by the National Institutes of Health Chemical Biology Interface training program (T32GM080189).