Abstract
In recent years, CRISPR has evolved from “the curious sequence of unknown biological function” into a functional genome editing tool. The CRISPR/Cas9 technology is now delivering novel genetic models for fundamental research, drug screening, therapy development, rapid diagnostics and transcriptional modulation. Despite the apparent simplicity of the CRISPR/Cas9 system, the outcome of a genome editing experiment can be substantially impacted by technical parameters as well as biological considerations. Here, we present guidelines and tools to optimize CRISPR/Cas9 genome targeting efficiency and specificity. The nature of the target locus, the design of the single guide RNA and the choice of the delivery method should all be carefully considered prior to a genome editing experiment. Different methods can also be used to detect off-target cleavages and decrease the risk of unwanted mutations. Together, these optimized tools and proper controls are essential to the assessment of CRISPR/Cas9 genome editing experiments.
Introduction
Engineered nucleases, from zinc-finger nucleases to TALENs and CRISPRs, form a powerful class of genome editing tools. Among these, the CRISPR/Cas9 system has become the most popular, owing to its ease of use and rapidity. The CRISPR/Cas system was discovered in prokaryotes, where it provides adaptive immunity against foreign elements (Mojica et al., 2005; Barrangou et al., 2007; Deltcheva et al., 2011; Gasiunas et al., 2012). In 2013, the CRISPR/Cas9 system from Streptococcus pyogenes (SpCas9, hereafter referred to as Cas9) was successfully adapted for genome editing in eukaryotic cells (Cong et al., 2013; Cho et al., 2013; Jinek et al., 2013; Mali et al., 2013). Since then, the technique has become extremely popular as it can modify the genome of a large variety of organisms with unprecedented ease.
Despite the potential of the CRISPR technology, not all genome editing experiments work equally well, and the technology is not as easy to use as was once assumed. Despite significant improvements, it remains difficult to predict whether the CRISPR system will effectively target a given region of interest. This aspect is of particular importance in the context of CRISPR/Cas9-based screens in model organisms and is related to the definition of the target site and the sequence of the single guide RNA (sgRNA). Another major hurdle common to all engineered nucleases is the risk of unwanted mutations at sites other than the intended on-target site (off-target effects). Off-target mutations are the consequence of sgRNA binding to DNA sites with less than perfect complementarity (Fu et al., 2013; Hsu et al., 2013). Current strategies to increase targeting specificity notably include refinements in guide RNA selection, enzyme and guide engineering, and improvements in the delivery method. Here, we describe a series of guidelines to optimize CRISPR/Cas9 efficiency and specificity.
Analysis of the target locus
Careful determination of target sites is essential. For many applications, a loss of function may be desirable or even required. Targeting of functional protein domains was recently shown to result in higher proportions of loss-of-function mutations (Shi et al., 2015). A common strategy is to select sgRNAs that will target Cas9 nuclease to the N-terminal-coding exons of protein-coding genes. After the action of Cas9 nuclease, the introduction of indels by the error-prone Non-Homologous End Joining repair of double-strand breaks introduces frame-shift mutations and subsequent premature stop codons, leading to mRNA elimination by nonsense-mediated mRNA decay. Genome editing experiments to generate knockouts should be designed to disrupt exons that are shared by all transcript variants of a given gene. This strategy can also be applied to whole gene families using a sgRNA against exons that are conserved between all family members (Endo M et al., 2015). The CRISPys algorithm aims at designing the optimal sequence to target multiple members of a gene family (Hyams et al., 2018).
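The requirement that a knockout guide should fall within an exon shared by all transcript variants can be checked programmatically. The following is a minimal sketch in Python, assuming that exon coordinates are already available as half-open (start, end) intervals for each transcript and that exons within one transcript do not overlap. The function name and the toy coordinates are hypothetical and do not come from any annotation.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end)


def shared_exonic_regions(transcripts: Dict[str, List[Interval]]) -> List[Interval]:
    """Return the regions that are exonic in every transcript variant."""
    if not transcripts:
        return []
    exon_lists = [sorted(exons) for exons in transcripts.values()]
    common = exon_lists[0]
    for exons in exon_lists[1:]:
        overlap = []
        # Keep only the parts of the current common regions that this
        # transcript also covers with an exon.
        for a_start, a_end in common:
            for b_start, b_end in exons:
                start, end = max(a_start, b_start), min(a_end, b_end)
                if start < end:
                    overlap.append((start, end))
        common = sorted(overlap)
    return common


# Hypothetical gene with three transcript variants (toy coordinates).
toy_gene = {
    "variant_1": [(100, 200), (300, 400), (500, 600)],
    "variant_2": [(150, 200), (300, 400)],
    "variant_3": [(100, 200), (320, 400), (700, 800)],
}
print(shared_exonic_regions(toy_gene))  # [(150, 200), (320, 400)]
```

Candidate sgRNAs would then be restricted to target sites lying entirely within the returned regions.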
The high frequency with which CRISPR-induced mutations can be directed to target genes enables easy isolation of homozygous gene knockouts. Paradoxically, a potential caveat is found in this high efficiency. This holds particularly true in cell lines upon targeting genes essential for cell viability and fitness. In this regard, two distinct genome-wide CRISPR-Cas9-based screens have identified ≈2000 essential genes in the human genome (Hart et al., 2015; Wang T et al., 2015). More recently, Lenoir and colleagues published a database of pooled in-vitro CRISPR knockout library essentiality screens that can be searched to identify genes which are essential across different human tissues (Lenoir et al., 2018).
Genetic screens in zebrafish and mouse have estimated that as many as 30% of genes are embryonic lethal (Driever et al., 1996; Haffter et al., 1996; Ayadi et al., 2012). The functional characterization of such essential genes requires the generation of heterozygous knockouts. The generation of hypomorphic alleles with the CRISPR system has been reported by different groups (Challa et al., 2016; Goto et al., 2016) but the method is not, at the moment, commonly used.
RNAi or CRISPRi (Qi et al., 2013) are efficient alternative loss-of-function methods and their effects can be directly evaluated at the transcriptome level. In addition, the development of inducible CRISPR tools provides a solution for genome editing with tight temporal control (Zetsche et al., 2015b; Cao et al., 2016). They additionally circumvent the mechanisms of genetic compensation that not infrequently mask the phenotypes of knockout but not knockdown models (El-Brolosy and Stainier, 2017).
Genetic polymorphism in the target region should be carefully assessed as it might have a profound influence on CRISPR/Cas9 efficacy. Although base mismatches (up to 5) may be tolerated between the sgRNA and the targeted sequence, the PAM and its proximal sequence adhere more strictly to the consensus (Zheng et al., 2017). When a sgRNA is selected, the potential presence of a single-nucleotide polymorphism (SNP) in the PAM and the sgRNA-binding site should be verified as it can abolish Cas9 binding and cleavage. Of note, commonly used laboratory cell lines such as HeLa present a variant spectrum that slightly differs from the one found in the human population (Landry et al., 2013). In general, sequences found in genomic databases may not exactly correspond to the DNA sequences of the model used for the genome editing experiment. Sequencing the target locus prior to sgRNA design avoids this potential pitfall. Conversely, this PAM constraint can be exploited to target and disrupt heterozygous single-nucleotide mutations in certain dominant autosomal disorders, while leaving the wild-type allele intact (Courtney et al., 2016; Li et al., 2016). Cell line ploidy is an additional consideration. Many common laboratory cancer cell lines carry four or more copies of a chromosome. Full knockouts would then require the introduction of mutations in all copies of the target gene. In practice, it is strongly advised to sequence the target loci to verify homozygous knockout when generating mutant clonal cell lines (see further section).
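The SNP check described above can be illustrated with a short script. This is a minimal sketch, assuming a 20-nt protospacer followed by an NGG PAM and scanning the forward strand only. The function names, the 0-based coordinate convention and the toy sequence are assumptions made for illustration and are not part of any published tool.

```python
import re
from typing import List, NamedTuple


class Site(NamedTuple):
    start: int        # 0-based start of the protospacer
    protospacer: str
    pam: str


def find_ngg_sites(seq: str, guide_len: int = 20) -> List[Site]:
    """List forward-strand protospacers that are followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [Site(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def snp_location(site: Site, snp_pos: int) -> str:
    """Report whether a 0-based SNP position hits the PAM, the protospacer or neither."""
    pam_start = site.start + len(site.protospacer)
    if pam_start <= snp_pos < pam_start + len(site.pam):
        return "PAM"
    if site.start <= snp_pos < pam_start:
        return "protospacer"
    return "outside"


# Toy locus: 2 nt of flank, a 20-nt protospacer, a TGG PAM, 2 nt of flank.
toy_locus = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
sites = find_ngg_sites(toy_locus)
print(sites)
print(snp_location(sites[0], 23))  # a variant inside the PAM
```

In practice the sequence would come from sequencing of the target locus in the model actually used, and the reverse strand would be scanned as well.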
Besides the influence of the sequence features, chromatin states also strongly impact Cas9 binding and nuclease activity in vertebrates. Nucleosomes constitute fundamental units of chromatin and their positioning directly impedes Cas9 binding and cleavage in vitro and in vivo. Highly active sgRNAs for Cas9 are found almost exclusively in regions of low nucleosome occupancy (Horlbeck et al., 2016; Isaac et al., 2016). Higher order chromatin structure (i.e. organization beyond the level of the linear array of nucleosomes) also influences Cas9 binding and enzymatic activity. Several authors showed that Cas9 cleavage efficiency positively correlates with open chromatin based on DNase I hypersensitivity. Along the same line, the activity of Cas9 can be significantly hindered by compact heterochromatin in cells (Daer et al., 2017). Interestingly, the engineered Cas9 variants designed to improve specificity, Cas9-HF1 (Kleinstiver et al., 2016) and eSpCas9(1.1) (Slaymaker et al., 2016), might be even more impacted than Cas9 by chromatin-related factors (Chen et al., 2016; Jensen et al., 2017; Chen et al., 2017). While some gene editing applications have the option to select easy-to-cleave targets, such practice may not be feasible for gene corrections and other potential therapeutic applications. Many CRISPR genome editing experiments focus on gene targeting and the study of the phenotypic consequences. In these applications, the gene of interest is usually transcriptionally active and the associated chromatin is relatively accessible to Cas9. Nevertheless, chromatin compactness can vary considerably between different genomic sites and from one cell type to another. Gene targeting in model organisms presents an additional challenge as the chromatin landscape is under constant change to ensure coordinated growth and differentiation during early development.
Atlases of transcriptional activity (RNA-Seq) and of chromatin accessibility (ATAC-Seq, ChIP-Seq, …) are valuable resources of information (see notably the ENCODE project: https://www.encodeproject.org) to predict sgRNA efficiency (Uusi-Makela et al., 2018) and have been used to elaborate a predictive algorithm for zebrafish sgRNA selection taking into account chromatin accessibility (Chen et al., 2017). Gene editing in mouse and human cells has been greatly facilitated by the publication of the genome-wide Brie and Brunello libraries (Doench et al., 2016). These optimized sgRNA libraries respectively target the mouse and human genomes, and provide 3-4 sgRNA sequences per gene with predicted high on-target efficiency and low off-target effects.
For more challenging applications such as the editing of heterochromatin embedded sequences, chromatin manipulation might enhance the CRISPR targeting efficiency. While treatment with chromatin-disrupting drugs does not appear sufficient, transient overexpression of a targeted transcriptional activator might be an effective method to enhance Cas9 editing at closed chromatin regions (Daer et al., 2017).
Delivery methods
Introduction of the CRISPR/Cas9 components into cultured cells is often achieved by DNA-based delivery systems such as transfection of plasmids encoding nuclear targeted Cas9 and sgRNA. Transduction with viral particles is also commonly used; it is typically more efficient than plasmid transfection and is applicable to many cell types including primary cells. Plasmid transfection and viral transduction methods lead to a prolonged or a permanent expression of Cas9, respectively. Extended expression of Cas9 in cells can lead to accumulation of off-targeting events (Kim et al., 2014). Indeed, constitutive expression of lentiviral-based Cas9 and sgRNAs leads to an enrichment of predicted off-target sites over time. Reducing the concentration of delivered plasmid during transfection was shown to decrease off-targeting (Hsu et al., 2013). These data support the idea that controlling the expression of Cas9 and the sgRNAs in order to limit the time of action can reduce genome-wide off-targeting. A doxycycline-inducible promoter allows for transient Cas9 expression and is compatible with lentiviral delivery of the nuclease (Wang et al., 2018). Because gene editing results in a permanent change in the genome, CRISPR-mediated editing can be achieved using Cas9 protein/sgRNA ribonucleoprotein (RNP) complexes. sgRNAs can be rapidly synthesized in vitro or ordered from various sources. Recombinant nuclear targeted Cas9 protein can also be produced in house or obtained from different commercial suppliers. RNP complexes can be delivered by a variety of techniques such as lipid-mediated transfection (Liang et al., 2015; Zuris et al., 2015), electroporation (Kim et al., 2014), induced transduction by osmocytosis and propanebetaine (iTOP) (D’Astolfo et al., 2015), micro-injection (Gagnon et al., 2014) or cell-penetrating peptide-mediated delivery (Ramakrishna et al., 2014a). Uncoupling administration of the sgRNA and Cas9 protein (e.g. in the context of genome-scale screens) can lead to successful gene editing in human primary cells (Shifrut et al., 2018) and appears to be more efficient upon delivery of Cas9 protein complexed with a non-targeting gRNA (Ting et al., 2018). Finally, biolistic transfer of Cas9/sgRNA RNP complexes or of Cas9- and sgRNA-encoding plasmids appears as an attractive alternative for cells resistant to other delivery methods such as plant cells (Svitashev et al., 2016; Hamada et al., 2018).
The sgRNA-Cas9 RNPs were shown to cleave the target chromosomal DNA between 12 and 24 h after delivery and the frequency of gene editing reaches a plateau after one day. For plasmid expression of Cas9 and sgRNA, equivalent gene editing levels were only achieved three days after delivery (Kim et al., 2014). Furthermore, the Cas9 protein has been shown to be degraded rapidly in cells, within 24–48 h after delivery, compared to several days when continuously expressed from a plasmid (Kim et al., 2014; Liang et al., 2015). Several authors showed that the ratio of the indel frequency at the on-target site to off-target sites strongly increases when RNPs are transfected in comparison with plasmid (Ramakrishna et al., 2014a; Liang et al., 2015). While off-target effects may be less of a concern in screening applications since any identified “hits” will be confirmed through follow-up experiments, constitutive expression or high stability of Cas9 nuclease and/or sgRNA may be undesirable for many applications, such as generation of clonal cell lines for a phenotypical study of a specific gene knockout. In addition to the increased potential for off-target effects due to prolonged or constitutive expression of components of the CRISPR-Cas9 system, unwanted incorporation of the plasmid DNA into the cell genome is not uncommon. When the DNA repair pathways are activated after Cas9-mediated double-strand breaks, the risk of foreign DNA integration is increased. Delivering RNPs, which involves no transgene, eliminates this risk of unintended DNA integration.
The delivery of RNP complexes also has the major advantage of being easily applicable to a wide range of model organisms and cell types. In in vivo contexts, the functionality of the Cas9-sgRNA RNP complexes has been reported as being superior to other delivery methods. In zebrafish, mutagenesis can be performed through micro-injection of Cas9-encoding mRNA or of Cas9 protein together with sgRNA into fertilized embryos. Contrary to Cas9-encoding mRNA, RNPs are immediately active upon microinjection and are generally more effective (Burger et al., 2016 and Figure 1). This is of significant importance as the first cell division in zebrafish occurs very rapidly (40 min after fertilization) and mutagenesis occurring after this first division is more likely to lead to mosaicism. The fact that sgRNAs can be easily synthesized in vitro makes it possible to use multiple sgRNAs simultaneously to achieve multigenic targeting (Liang et al., 2015; Song et al., 2017). The DNA-free system also suppresses the variability that can arise from the choice of promoter used to drive expression from vector-based CRISPR-Cas9 systems. It is well known that not all promoters are functional in every cell type or cell line, so delivery of Cas9 protein or Cas9 mRNA avoids incompatibilities of certain promoters in specific cells. Codon usage patterns also vary between species and Cas9 derived from DNA or mRNA expression may not yield the expected result as every organism has its own codon bias. Optimization of codon usage is a routine process but can be relatively time-consuming. Codon optimization becomes unnecessary when using Cas9 protein instead of a DNA- or mRNA-based delivery method. Independent of the delivery mode, specific anti-Cas9 antibodies can be used to measure Cas9 expression level by western blot and to confirm Cas9 presence in the nucleus (Figure 2).
Lentiviral or plasmid delivery of Cas9 and sgRNA often utilizes a selection gene encoding either a drug selectable marker (hygromycin, blasticidin, puromycin, …) or a reporter protein (GFP, NGFR, …) to isolate cells that are successfully transduced or transfected. When RNPs are transfected or electroporated, alternative strategies can be used such as surrogate reporters (Ramakrishna et al., 2014b; He et al., 2016; Wu et al., 2017). However, these methods are inefficient for assessing sgRNA efficiency at a large scale because it is both time- and labour-consuming to construct a specific reporter for each individual sgRNA. To avoid specific cloning, the transfection efficiency can also be indirectly evaluated with a dTomato reporter assay (D’Astolfo et al., 2015). Fluorescent versions of Cas9 such as Cas9-GFP (Mircetic et al., 2017) or Cas9-Cy3 (Kim et al., 2017) can also be used to sort RNP-transfected cells. These latter methods focus on the physical separation of edited cells from unedited cells. An important aspect to consider is that CRISPR experiments lead to genetic heterogeneity due to the random nature of DNA repair by the NHEJ pathway. As this genetic heterogeneity could yield phenotypic heterogeneity, monoclonal populations should be isolated prior to phenotypic analysis. The first step is to determine the editing efficiency of the entire cell population. This information can indicate how many individual clones should be isolated and checked for editing. If limiting dilution is used to isolate individual cells, it should be performed as soon as possible after the editing process is complete, as non-edited cells could potentially outgrow edited cells.
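The link between the editing efficiency of the population and the number of clones to isolate can be made explicit with a simple probability estimate. The sketch below is an illustration under stated assumptions, not a validated protocol: it assumes that clones are independent, that each clone carries the desired genotype with the same probability, and (for the second helper) that every gene copy is edited independently with the same per-allele efficiency. These are simplifications introduced here and are not claims from the studies cited above. All names are hypothetical.

```python
import math


def all_copies_edited(per_allele_efficiency: float, copies: int) -> float:
    """Probability that every copy of the gene is edited in one cell,
    assuming independent editing of each copy (a simplifying assumption)."""
    return per_allele_efficiency ** copies


def clones_to_screen(p_desired: float, confidence: float = 0.95) -> int:
    """Smallest number of clones giving at least `confidence` probability
    of recovering one or more clones with the desired genotype."""
    if not 0.0 < p_desired <= 1.0:
        raise ValueError("p_desired must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if p_desired == 1.0:
        return 1
    # P(at least one hit among n clones) = 1 - (1 - p)^n >= confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_desired))


# Example: 50% per-allele editing in a diploid versus a tetraploid locus.
for copies in (2, 4):
    p = all_copies_edited(0.5, copies)
    print(copies, "copies:", clones_to_screen(p), "clones")
```

Under these assumptions, the number of clones to screen rises steeply with copy number, which is consistent with the ploidy consideration raised earlier in the text.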
Gene editing in in vivo mouse models was greatly facilitated by the generation of a knock-in (KI) transgenic mouse in which a Cre-inducible Cas9-P2A-GFP was inserted in the Rosa26 locus (Platt et al., 2014). Cre-recombination leads to cell- or tissue-specific Cas9 expression, as evidenced by GFP expression. Apart from allowing for gene editing following in vivo delivery of sgRNAs, this model can also be used to efficiently edit the genome of primary cells ex vivo.
Guide RNA efficiency and specificity
The performance of sgRNAs targeting the same gene can vary dramatically. This was recently highlighted in a novel approach to CRISPR genomics where expression of sgRNAs was coupled with specific protein barcodes, allowing for simultaneous multidimensional phenotypic analysis of several dozens of knockouts at single-cell resolution (Wroblewska et al., 2018). In a pooled parallel analysis of gene editing efficiency for 10 genes (3-4 sgRNAs per gene), the authors demonstrated that the gene KO at the protein level was highly variable depending on the sgRNAs used. There are many bioinformatic tools available for sgRNA design and some of these tools also apply filters or show ‘scores’ related to predicted effectiveness. sgRNAs with potential for weak secondary structures are likely to be more efficient than alternatives with strong secondary structures (Thyme et al., 2016). Nevertheless, no computational tool can guarantee the efficacy of a sgRNA and, when possible, several sgRNAs should be tested. Endonuclease cleavage assays can be used to characterize the in vitro efficacy of a particular sgRNA. Experimental validation of sgRNAs before practical application is particularly important to minimize wasted experiments on sgRNAs with poor activity. In these in vitro assays, the target DNA site, including its PAM motif, is either inserted into a plasmid or provided in the form of a PCR product. The Cas9 recombinant protein and the sgRNA are pre-incubated in a 1:1 molar ratio in the cleavage buffer to reconstitute the Cas9-sgRNA complex prior to the addition of target DNA. Cleavage of plasmid or PCR substrates is monitored by agarose gel electrophoresis with an intercalating dye (Figure 3).
The reaction rate can vary strongly depending on DNA source and length (PCR product versus plasmid, circular plasmid versus linear plasmid). Optimal enzyme and substrate concentrations, as well as reaction time points, need to be determined empirically (Anders and Jinek, 2014). This in vitro test validates the intrinsic capacity of a sgRNA to form cleavage-competent complexes; however, it does not guarantee in vivo effectiveness, which also greatly depends on chromatin accessibility as previously mentioned.
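The 1:1 molar ratio used to reconstitute the Cas9-sgRNA complex translates into quite different masses of protein and RNA. The helper below is a minimal sketch of that conversion. The molecular weights are approximate values assumed here for illustration (roughly 160 kDa for SpCas9 and about 330 g/mol per RNA nucleotide) and are not taken from the text; the values given by the supplier or computed from the actual sequences should be used instead. Function and constant names are hypothetical.

```python
# Approximate molecular weights: assumptions for illustration only.
CAS9_MW_G_PER_MOL = 160_000.0   # SpCas9 protein, roughly 160 kDa
RNA_NT_MW_G_PER_MOL = 330.0     # rough average mass of one RNA nucleotide


def pmol_to_ng(pmol: float, mw_g_per_mol: float) -> float:
    """Convert picomoles to nanograms for a given molecular weight."""
    return pmol * mw_g_per_mol / 1000.0


def rnp_masses(pmol_cas9: float, sgrna_len_nt: int = 100,
               sgrna_per_cas9: float = 1.0) -> dict:
    """Masses of Cas9 and sgRNA needed for a given molar ratio (default 1:1)."""
    sgrna_mw = sgrna_len_nt * RNA_NT_MW_G_PER_MOL
    return {
        "cas9_ng": pmol_to_ng(pmol_cas9, CAS9_MW_G_PER_MOL),
        "sgrna_ng": pmol_to_ng(pmol_cas9 * sgrna_per_cas9, sgrna_mw),
    }


# 1 pmol of complex with a hypothetical 100-nt sgRNA.
print(rnp_masses(1.0))
```

With these assumed weights, an equimolar mix requires roughly five times more protein than RNA by mass, which is why mixing equal masses would not give a 1:1 complex.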
The targeting specificity of Cas9 is believed to be tightly controlled by the 20-nt guide sequence of the sgRNA and the presence of a PAM adjacent to the target sequence in the genome. Nevertheless, off-target cleavage can still occur at DNA sequences with as many as three to five base pair mismatches in the PAM-distal part of the sgRNA-guiding sequence (Fu et al., 2013). Of note, shortening of the sgRNA guide sequence to 17-18 nucleotides was shown to improve target specificity (Zhang et al., 2016; Fu et al., 2014). Numerous online tools are available to assist in sgRNA design but the correlation between the predictions and the actual measurements varies considerably since sequence homology alone is not fully predictive of off-target sites (Haeussler et al., 2016). These tools also suggest probable off-target sites but the appropriate number of potential sites to experimentally assay remains unclear. Moreover, there are still contradictory conclusions as to the prevalence of off-target effects, from low (Kim et al., 2015) to high levels of off-targeting (Tsai et al., 2015).
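The distinction between PAM-distal and PAM-proximal mismatches can be made concrete with a small comparison routine. This is a minimal sketch, assuming that guide and candidate site are written 5' to 3' with the PAM on the 3' side, and that the PAM-proximal region is the last 10 nucleotides of the guide. That cut-off is a hypothetical parameter chosen for illustration, not a value from the cited studies, and the sequences are invented.

```python
from typing import Dict


def mismatch_profile(guide: str, site: str, proximal_len: int = 10) -> Dict[str, int]:
    """Count mismatches between a guide and a candidate site, split into
    PAM-distal (5' end) and PAM-proximal (3' end, next to the PAM) regions."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    guide, site = guide.upper(), site.upper()
    positions = [i for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    proximal_start = len(guide) - proximal_len
    proximal = sum(1 for i in positions if i >= proximal_start)
    return {
        "total": len(positions),
        "pam_distal": len(positions) - proximal,
        "pam_proximal": proximal,
    }


# Invented 20-nt guide and a candidate site differing at positions 0, 3 and 5.
guide = "GACGTTACCGGATCAAGCTT"
candidate = "TACATCACCGGATCAAGCTT"
print(mismatch_profile(guide, candidate))
# {'total': 3, 'pam_distal': 3, 'pam_proximal': 0}
```

A site such as this one, with all mismatches in the PAM-distal part, is the kind of candidate that would be prioritized for experimental off-target testing.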
Cleavage at on- and off-target sites can be assessed using various methods which include mismatch-sensitive enzymes (Surveyor or T7 endonuclease I assay), restriction fragment length polymorphism (RFLP) analysis, High Resolution Melting curve Analysis (HRMA) or PCR amplification of the locus of interest followed by sequencing. Surveyor and T7 endonuclease I specifically cleave heteroduplex DNA at mismatches. The T7 endonuclease I assay outperforms the Surveyor nuclease in terms of sensitivity with deletion substrates, whereas Surveyor is better for detecting single nucleotide changes. The limit of sensitivity for the T7 endonuclease I assay is around 5% (Vouillot et al., 2015). HRMA exploits the difference between the melting curves of the heteroduplex and the mutant homoduplex. A recent report demonstrates that techniques such as targeted Next Generation Sequencing (NGS), Tracking Indels by Decomposition (TIDE) and Indel Detection by Amplicon Analysis (IDAA) outperform nuclease-based methods to detect Cas9-mediated editing in pools of cells (Sentmanat et al., 2018). Ultimately, Sanger sequencing of DNA from individual clones is the gold standard for confirming the presence of indels at the on-target site but is not easily applicable to off-target detection. Overall, these indel detection methods are relatively straightforward but are low throughput and interrogate one locus at a time.
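For mismatch-cleavage assays, gel band intensities are often converted into an estimated indel frequency. The sketch below uses a commonly applied approximation, indel % = 100 × (1 − √(1 − f)), where f is the fraction of the PCR product that is cleaved. This formula is not taken from the present text and rests on the assumption of random re-annealing of wild-type and mutant strands; the band intensities are invented and the function name is hypothetical.

```python
import math
from typing import Sequence


def indel_percent(uncut: float, cut_bands: Sequence[float]) -> float:
    """Estimate indel frequency (%) from band intensities of a
    mismatch-cleavage assay (e.g. T7 endonuclease I)."""
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cut / total
    # Heteroduplexes form when mutant and wild-type strands re-anneal,
    # hence the square-root correction.
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Invented densitometry values: one parental band and two cleavage products.
estimate = indel_percent(uncut=64.0, cut_bands=[18.0, 18.0])
print(round(estimate, 1))  # 20.0
```

Estimates close to the sensitivity limit of about 5% reported above should be interpreted with caution and, where possible, confirmed by sequencing.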
Unbiased off-target analysis requires the detection of mutations generated in the target cells by the CRISPR/Cas9 system outside their target locus. In theory, Whole Genome Sequencing (WGS) of cells before and after editing could be used to study CRISPR/Cas9 specificity. In a clonal population, off-target sites can be determined by the analyses of the new mutations that have been generated outside the intended locus. However, WGS faces its own challenges and might not be easily applicable to the detection of off-target mutations. While sequencing costs continue to drop, a certain degree of bioinformatic expertise is necessary to detect small indels and separate signal from noise. In fact, many spontaneous new mutations may appear during clonal expansion and it might not be possible to distinguish them from off-target effects. WGS of individual induced Pluripotent Stem Cells clones reveals a large number of indels in the genome that are not the result of Cas9 activity, but rather a consequence of clonal variation or technical artefacts (Smith et al., 2014). To circumvent these limitations, several methods have been recently developed to measure Cas9 off-target activity across the genome such as BLESS (labeling of double strand breaks followed by enrichment and sequencing) (Ran et al., 2015), HTGTS (high throughput genome-wide translocation sequencing) (Frock et al., 2015), GUIDE-Seq (genome-wide unbiased identification of double-strand breaks enabled by sequencing) (Tsai et al., 2015), Digenome-Seq (in vitro Cas9-digested whole genome sequencing) (Kim et al., 2015), IDLV (detection of off-targets using integrase-deficient lentiviral vectors) (Wang X et al., 2015) and most recently, SITE-Seq (a biochemical method that identifies DNA cut sites) (Cameron et al., 2017) and CIRCLE-Seq (an in vitro method for identifying off-target mutations) (Tsai et al., 2017). 
Overall, these unbiased methods tend to be less sensitive and have a lower throughput than biased targeted sequencing, in addition to typically requiring higher sequencing coverage and much more complex protocols. These techniques also require manipulation of the genome and might be difficult to apply on some samples (primary cells, in vivo…).
Chromatin Immunoprecipitation followed by Next Generation Sequencing (ChIP-Seq) is a technique of choice for studying protein-DNA interactions. ChIP has been used to pull down the Cas9 nuclease protein together with the DNA fragments to which the nuclease was bound (Kuscu et al., 2014; Wu et al., 2014). The immunoprecipitation of Cas9 bound to the genome is technically challenging due to the nuclease activity of Cas9. However, the introduction of two amino-acid changes (D10A and H840A) in the Cas9-coding sequence results in a nuclease-inactive DNA-binding protein named “dead Cas9” (dCas9). Specific enrichment of dCas9 at on-target regions can be evaluated by ChIP-qPCR using ChIP-grade Cas9 antibodies (Figure 4A). Moreover, this approach can be extended to the unbiased analysis of off-target sites by ChIP-Seq (Figure 4B). dCas9-based ChIP-PCR/Seq is thus a powerful approach to score several sgRNAs at once thanks to its rapidity, reduced sequencing cost and high coverage. Moreover, it has predictive value for sgRNA performance upon association with catalytically active Cas9 (Kuscu et al., 2014), although Cas9 DNA-binding and cleavage activities are sometimes uncoupled (Wu et al., 2014). As no single method guarantees a complete coverage of off-target sites, multiple approaches should ideally be combined. For instance, sequence-based in silico prediction combined with genome-wide ChIP-Seq dCas9-binding analysis can efficiently identify off-target sites (O’Geen et al., 2015).
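Enrichment of dCas9 at an on-target region in ChIP-qPCR is typically expressed as a percentage of input and compared with a control region. The following is a minimal sketch of the standard percent-input calculation, assuming 100% amplification efficiency (a doubling per cycle) and a known input fraction. The Ct values are invented for illustration and the function names are hypothetical; they do not correspond to the data in Figure 4.

```python
def percent_input(ct_input: float, ct_ip: float, input_fraction: float = 0.01) -> float:
    """Percent of input recovered in the IP, assuming perfect PCR doubling.

    input_fraction is the share of chromatin kept as input (0.01 = 1%).
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


def fold_enrichment(target_pct: float, control_pct: float) -> float:
    """Enrichment at the target region relative to a control region."""
    return target_pct / control_pct


# Invented Ct values for an on-target amplicon and a control amplicon.
on_target = percent_input(ct_input=25.0, ct_ip=24.0)   # 2.0 % of input
control = percent_input(ct_input=25.0, ct_ip=28.0)     # 0.125 % of input
print(on_target, control, fold_enrichment(on_target, control))  # 2.0 0.125 16.0
```

The same calculation can be repeated for each sgRNA and each predicted off-target amplicon to rank candidate guides before moving to ChIP-Seq.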
Variants of dCas9 have recently been generated that allow repurposing of the system to a variety of applications. Fusing dCas9 to various transcriptional activating or repressing modules proved to be a potent way of regulating gene expression (Gilbert et al., 2013; Tanenbaum et al., 2014; Konermann et al., 2015; Yeo et al., 2018). This approach has also been used to identify enhancers of key loci (Simeonov et al., 2017). Moreover, dCas9 can be fused to domains that regulate the epigenetic landscape at endogenous loci (O’Geen et al., 2017). It can also be used to label endogenous loci for live visualization (Neguembor et al., 2017) or to edit a single base in the genome (Komor et al., 2016). In those applications, the binding specificity of dCas9 fused to various effectors could be tested by dCas9 ChIP-Seq as we describe here.
Perspectives
Since the first description of Cas9 derived from Streptococcus pyogenes for gene editing in 2013, remarkable progress has been made to optimize and adapt its use in a wide range of applications. Structural studies of Cas9 led to the generation of several variants such as enhanced specificity Cas9 (eSpCas9), high fidelity Cas9 (Cas9-HF1) and hyper-accurate Cas9 (HypaCas9) which display increased specificity due to reduced DNA-binding affinity (eSpCas9 and Cas9-HF1) (Slaymaker et al., 2016; Kleinstiver et al., 2016) or locking of the nuclease domain upon guide/target mismatches (HypaCas9) (Chen et al., 2017). In addition, Cas9 nickases (Cas9n) were developed by inactivating the cleavage activity on either the target or the non-target DNA strand and have been demonstrated to nick only one DNA strand instead of generating a double-strand break (DSB). DSBs are generated only upon recruitment of a Cas9n pair with two sgRNAs that target opposite strands in close proximity (Hsu et al., 2013; Cho et al., 2014), thereby increasing specificity by double selection. A similar strategy was used to develop a fusion of dCas9 with the catalytic domain of FokI nuclease (fCas9) which induces DSBs only upon dimerization of the FokI domains by sgRNA pairing to complementary strands (Tsai et al., 2014). With the same aim of reducing Cas9 off-target activity, several Cas9 variants whose editing activity can be irreversibly or reversibly programmed are now also available (Adli M, 2018; Wu et al., 2018 for review). Finally, engineered Cas9 variants with novel PAM specificities extend the editing spectrum to previously inaccessible sites (Kleinstiver et al., 2015 and 2016).
Several limitations have yet to be addressed to promote Cas9 use in gene therapy. First, the sources of Cas9 nucleases, i.e. S. pyogenes and S. aureus, are common human pathogens. Recent reports have highlighted pre-existing immunity towards both SpCas9 and SaCas9 in the human population, with a high prevalence of both Cas9-reactive T cells and antibodies (Wagner et al., 2018; Simhadri et al., 2018; Charlesworth et al., 2018). Although it is still unclear whether AAV delivery of Cas9 leads to the immune rejection of transduced cells in vivo, strategies to control the anti-Cas9 T cell responses, such as transient immunosuppression or engineering Cas9 proteins with mutated T cell epitopes, are being considered (Crudele and Chamberlain, 2018; Ferdosi et al., 2018). Another limitation of Cas9 for its use in gene therapy resides in its rather large size, which is incompatible with efficient packaging into Adeno-associated virus (AAV) vectors, the most commonly used delivery systems in gene therapy. Although this hurdle can be overcome by the separation of the recognition lobe from the nuclease lobe into two separate vectors (Truong et al., 2015), the emergence of Cas9 orthologs of smaller size might provide more efficient alternatives (Cebrian-Serrano et al., 2017, for review). Besides solving the delivery problem, CRISPR effectors from other bacterial and archaeal species offer different substrate specificities or operate according to different mechanisms. This is notably the case of Cas12a (Cpf1) which is structurally different from Cas9, has no requirement for tracrRNA, recognizes a T-rich (TTTN) PAM sequence lying 5’ of the target sequence, and uses a different mechanism for target recognition and cleavage (Zetsche et al., 2015a). Cpf1 also possesses the ability to cleave RNA and generate multiple crRNAs from a single pre-crRNA array.
This capacity has been harnessed to achieve multiplex gene editing using a single pre-crRNA array, which can either increase KO efficiency (when using multiple crRNAs targeting the same locus) or easily KO multiple genes with a single construct (Zetsche et al., 2017). Moreover, gene editing by Cpf1 results in lower off-target effects than Cas9, as evidenced by genome-wide analysis of edited cells (Kim et al., 2016). Finally, the discovery of Cas13a and CasRx as RNA-guided nucleases targeting RNA paves the way to new therapeutic approaches based on RNA editing (Shmakov et al., 2017; Konermann et al., 2018).
While a number of solutions and guidelines to harness CRISPR-Cas9 based gene targeting have been provided, it is expected that therapeutic, industrial, and research applications will still place high demand on improving the specificity and efficiency of the CRISPR/Cas9 system. As CRISPR-based gene targeting technology continues to become more sophisticated and diverse, optimized procedures and quality control guidelines should be established.
Acknowledgements
We thank Romuald Soin and Nadège Delacourt for technical support. Claude Van Campenhout was funded by a FIRST Entreprise grant from DGO6 from the Walloon Region, Belgium.