Abstract
CRISPR-Cas9 genome editing creates targeted double strand breaks (DSBs) in eukaryotic cells that are processed by cellular DNA repair pathways. Co-administration of single stranded oligonucleotide donor DNA (ssODN) during editing can result in high-efficiency (>20%) incorporation of ssODN sequences into the break site. This process is commonly referred to as homology directed repair (HDR) and here referred to as single stranded template repair (SSTR) to distinguish it from repair using a double stranded DNA donor (dsDonor). The high efficacy of SSTR makes it a promising avenue for the treatment of genetic diseases1,2, but the genetic basis of SSTR editing is still unclear, leaving its use a mostly empiric process. To determine the pathways underlying SSTR in human cells, we developed a coupled knockdown-editing screening system capable of interrogating multiple editing outcomes in the context of thousands of individual gene knockdowns. Unexpectedly, we found that SSTR requires multiple components of the Fanconi Anemia (FA) repair pathway, but does not require Rad51-mediated homologous recombination, distinguishing SSTR from repair using dsDonors. Knockdown of FA genes impacts SSTR without altering break repair by non-homologous end joining (NHEJ) in multiple human cell lines and in neonatal dermal fibroblasts. Our results establish an unanticipated and central role for the FA pathway in templated repair from single stranded DNA by human cells. Therapeutic genome editing has been proposed to treat genetic disorders caused by deficiencies in DNA repair, including Fanconi Anemia. Our data imply that patient genotype and/or transcriptome profoundly impact the effectiveness of gene editing treatments and that adjuvant treatments to bias cells towards FA repair pathways could have considerable therapeutic value.
Main Text
The type II CRISPR endonuclease Cas9 and engineered guide RNA (gRNA) form a ribonucleoprotein (RNP) complex that introduces double stranded breaks (DSBs) at DNA sequences complementary to the 23 bp protospacer-PAM sequence. This activity stimulates two major types of DNA repair within a host cell that are relevant to genome editing: genetic disruption, which creates insertions or deletions (indels) at the cut site and can disrupt functional sequences; and genetic replacement, which incorporates exogenous donor DNA sequences at the cut site, allowing the correction of dysfunctional elements or insertion of new information3. Efficient and targeted genetic replacement is particularly exciting, as it holds great promise for the cure of myriad genetic diseases.
Despite the rapid adoption of CRISPR-Cas9 genome editing, relatively little is known about which cellular DSB repair pathways underlie Cas9-mediated genetic replacement. This lack of clarity has complicated efforts to better understand and rationally improve the process of genome editing. The pathways responsible for genetic replacement are frequently referred to in aggregate as HDR, which includes DSB repair programmed from dsDonors (both linear and plasmid) requiring several kilobases of homology to the targeted site, as well as synthetic ssODNs with only 100-200 bases of homology to the target4. Repair from dsDonors is relatively inefficient in most cell types5 and is assumed to utilize a repair mechanism paralleling meiotic homologous recombination (HR)6. By contrast, SSTR is highly effective in human cells (>20% of alleles)1,5,7 and broadly conserved among metazoans8, but very little is known about the mechanism responsible. While screening human cancer cell lines, we found that SSTR-based genome editing at a given locus can vary from completely ineffective (0% SSTR) to extremely efficient (30% SSTR) depending on the cell background [Extended Data Figure 1]. This implies genetic or transcriptional differences that up- or down-regulate gene editing in different contexts.
To map the pathways involved in SSTR, we developed a coupled inhibition-editing screening platform that combines individual CRISPR inhibition (CRISPRi) of thousands of genes with Cas9 editing at a single-copy genomically integrated BFP reporter [Figure 1A]. Each cell in the screening pool stably expresses a dCas9-KRAB CRISPRi construct as well as a gRNA targeting the transcription start site (TSS) of a single gene. This pool is then nucleofected with preformed Cas9-gRNA ribonucleoprotein complex (RNP) targeting the BFP reporter, as well as an ssODN that programs a 3 base pair codon-swap that converts BFP to GFP7. The femtomolar affinity between S. pyogenes Cas9 and the gRNA9, along with the transient nature of the Cas9 RNP10, strongly disfavors guide swapping between Cas9 molecules, preserving separation between CRISPRi and targeted gene editing. Editing outcomes in each cell are separated by fluorescence activated cell sorting (FACS), and next generation sequencing is used to determine genes whose knockdown leads to enrichment or depletion from each sorted population [Figure 1A].
To enable discovery of relatively low frequency events, we created a focused CRISPRi lentiviral library containing 2,000 genes (10,000 gRNAs, 5 gRNAs targeting each primary gene transcript) with gene ontology terms related to DNA processing [Document S2 GUIDES]11. This library was stably transduced at low multiplicity of infection into cells expressing dCas9-KRAB and selected for the gRNA construct for ten days to allow gRNA populations to reach equilibrium and for gRNAs targeting essential genes to drop out of the population. We harvested a sample of cells at this point as a control for comparison with previously published essentiality screens. We then electroporated cells with Cas9 RNP and the BFP-to-GFP ssODN7. Under unperturbed conditions, this combination of reporter, RNP, and ssODN yields ∼70% gene disruption (no longer BFP+) and ∼20% SSTR (BFP edited to GFP) [Extended Data Figure 2]. We harvested another sample of cells seven days after electroporation to identify genes whose knockdown is synthetic lethal with a Cas9-induced DSB, as measured by depletion only after introduction of Cas9. To identify genes involved in editing events, we used FACS to separate cells into unedited (BFP+/GFP-), Indel (BFP-/GFP-) and SSTR edited (BFP-/GFP+) populations [Figure 1A, Extended Data Figure 2]. We used Illumina sequencing to measure gRNA abundances in each population, and compared these distributions to the edited unsorted cell population to reveal which target genes promote (gRNA depleted from edited population) or restrict (gRNA enriched from edited population) specific genome editing activities.
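The comparison of gRNA abundances between a sorted population and the unsorted edited pool can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function names, the pseudocount, the read-depth normalization, and the median-per-gene summary are assumptions chosen for illustration, and the gRNA and gene identifiers are placeholders.

```python
import math
from collections import defaultdict
from statistics import median


def log2_fold_changes(sorted_counts, unsorted_counts, pseudocount=1.0):
    """Return log2(sorted / unsorted) frequency for each gRNA.

    Counts are normalized to total read depth in each population.
    The pseudocount (an assumption) avoids log(0) for absent gRNAs.
    """
    total_sorted = sum(sorted_counts.values())
    total_unsorted = sum(unsorted_counts.values())
    lfc = {}
    for grna, unsorted_n in unsorted_counts.items():
        sorted_n = sorted_counts.get(grna, 0)
        freq_sorted = (sorted_n + pseudocount) / total_sorted
        freq_unsorted = (unsorted_n + pseudocount) / total_unsorted
        lfc[grna] = math.log2(freq_sorted / freq_unsorted)
    return lfc


def gene_scores(lfc, grna_to_gene):
    """Summarize each gene as the median log2 fold change of its gRNAs."""
    per_gene = defaultdict(list)
    for grna, value in lfc.items():
        per_gene[grna_to_gene[grna]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


# Toy read counts (hypothetical, not data from this study).
unsorted_pool = {"g1": 99, "g2": 99, "g3": 99, "g4": 99}
gfp_positive = {"g1": 24, "g2": 24, "g3": 99, "g4": 99}
guide_map = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}

lfc = log2_fold_changes(gfp_positive, unsorted_pool)
scores = gene_scores(lfc, guide_map)
```

In this toy example the gRNAs for the placeholder GENE_A are depleted from the GFP+ population (negative score), which under the logic above would mark it as a gene that promotes SSTR, whereas a positive score would mark a gene that restricts it.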
To benchmark our screening system, we identified essential genes by comparing the library-infected dCas9-KRAB CRISPRi cells with cells infected with only the gRNA library and no dCas9-KRAB [Extended Data Figure 3A, Document S3 Essential Genes]. Genes that were depleted after 14 days from the functional CRISPRi cells as compared to the gRNA-only control cells were significantly enriched for critical biological processes (DAVID12 analysis: proteasome core complex p=8.6e-14 and DNA replication p=1.7e-11). Furthermore, genes we identified as essential reproduced previously published essentiality screens [Extended Data Figure 4A]11, demonstrating that we had achieved stable gene knockdown and robust hit calling from the cell pools.
We next investigated genes whose knockdown was synthetic lethal with a Cas9-induced DSB. gRNAs targeting these genes should be depleted after Cas9 editing as compared to unedited cells [Extended Data Figure 3B]. While essential genes are progressively lost from the cell population over time [Extended Data Figure 4B], genes whose knockdown is synthetic lethal with DNA damage are lost only after a Cas9 DSB [Extended Data Figure 4C]. Genes required to survive a single Cas9-induced DSB were enriched for GO terms such as cell cycle arrest (p=3.3E-05) and response to DNA damage (p=3.4E-22) and include several factors previously reported to be synthetic lethal with other DNA damaging agents [Document S3 SurviveDSB]. For example, knockdown of MYBBP1A has recently been reported to cause senescence in combination with nonspecific DNA damage13. Our screening cell line contains a single copy of the targeted BFP allele, which suggests that a single DSB is sufficient to trigger genotoxic-induced senescence when these synthetic lethal genes are depleted. Together, our results indicate that our coupled inhibition-editing strategy performs well in identifying not only essential genes, but also genes involved in DNA repair pathways required to survive a Cas9 DSB. Future investigation of novel genes identified as synthetic lethal with a DSB could provide new insight into mechanisms of genome surveillance.
To identify factors required for SSTR editing, we used FACS to isolate GFP+ cells (that had undergone BFP-to-GFP conversion via SSTR) and measured depletion of gRNAs relative to the unsorted edited pool [Figure 1B]. Strikingly, 70% (28/40) of genes annotated in the Fanconi Anemia (FA) pathway were robustly and consistently depleted from the GFP+ population relative to unsorted edited cells. Gene set enrichment analysis14 verified that DNA repair in general and the FA pathway in particular was a defining feature of SSTR [Figure 1C]. Several distinct functional groups within the FA repair pathway were identified as required for SSTR: multiple components of the FA core complex that senses lesions, FA core regulatory components that activate the FANCD2-FANCI heterodimer, downstream effector proteins that repair lesions, and associated proteins that interact with canonical FA repair factors [Figure 1D]. Our identification of the FA pathway as central to SSTR was striking, as to the best of our knowledge the FA pathway has not previously been investigated for its role in Cas9 gene editing.
We used individual knockdown of FANCA, RAD51, and other DNA repair genes to further investigate the genetic basis of SSTR and dsDonor HDR. Previous reports have indicated the editing outcomes of SSTR are RAD51-independent and ineffective during G2/M15,16, while dsDonor HDR is RAD51-dependent17, FANCA-dependent18, and active during G2/M19. Using Cas9 RNPs to edit the same locus with dsDonors and ssODNs in K562 human erythroleukemia cells, we found approximately four-fold higher gene replacement efficiency with ssODNs. Knockdown of FANCA caused a statistically significant four-fold reduction in SSTR (p<0.05, Welch’s two-sided t-test) and a non-significant two-fold reduction in dsDonor HDR (p=0.22, Welch’s two-sided t-test) [Figure 1E]. These results highlight the unexpected role of FANCA in SSTR and suggest it might also play some role in HDR. As expected, neither SSTR nor HDR required NHEJ (mediated by LIG4) or the related Alternative End Joining (Alt-EJ) pathway (mediated by PARP1). Notably, we found that knockdown of RAD51 abolished dsDonor HDR but had no effect upon SSTR [Extended Data Figure 5]. Taken together, individual knockdown bolsters the hypothesis derived from the primary screen that SSTR is a genetically distinct pathway from dsDonor-mediated HDR.
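For readers unfamiliar with Welch's t-test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the sketch below. The replicate values are hypothetical placeholders, not measurements from this study, and the function name is an assumption for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical percent-SSTR replicates (placeholders, not data from this study).
control = [20.0, 22.0, 21.0]
knockdown = [5.0, 6.0, 4.0]

t_stat, dof = welch_t(control, knockdown)
```

Converting the statistic to a two-sided p-value requires the t distribution; if SciPy is available, `scipy.stats.ttest_ind(control, knockdown, equal_var=False)` performs the same test and returns the p-value directly.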
The FA repair pathway is best understood in its capacity to identify and repair interstrand crosslinks (ICLs) throughout the genome, but has recently gained attention for its role in protecting stalled replication forks20-22. In the presence of ICLs or a stalled fork, the FA core complex (comprised of FANCA, B, C, E, F, G, L, M, FAAP100, FAAP20, and FAAP24) is required for monoubiquitination and activation of the FANCD2-FANCI heterodimer by UBE2T and FANCL. Monoubiquitination then leads to recruitment of downstream factors that repair the lesion via nucleotide excision repair (NER) or specialized homologous recombination sub-pathways. Subsequent to repair, FANCD2/FANCI is recycled through deubiquitination by USP1 and WDR48. Deactivation of FANCD2/FANCI appears to be a key step in restoring homeostasis, as mutants in USP1 and WDR48 phenocopy classical FA mutants with an increased sensitivity to ICL-causing agents. Notably, our screen identified that SSTR depends upon genes that act in every functional category of the FA repair pathway [Figure 1D]. SSTR is therefore likely to be a central activity of the FA repair pathway as opposed to a moonlighting activity of one or more FA genes.
To further explore the genetic basis of SSTR, we used CRISPRi to stably knock down seven separate FA repair genes that operate at different places in the FA pathway and quantified the frequency of Cas9-mediated SSTR at multiple loci. Knockdown of FANCA, FANCD2, FANCE, FANCF, FANCL, HELQ, UBE2T, USP1, and WDR48 all substantially decreased SSTR at a stably integrated BFP reporter, as measured by flow cytometry [Figure 2A]. Stable cDNA re-expression of each factor restored wildtype levels of SSTR, demonstrating that CRISPRi was specific to the targeted gene and that ablation of each gene was solely responsible for the loss of SSTR. Re-expression of an FA factor in the context of its knockdown increased editing efficiency up to 8-fold. These results demonstrate that multiple genes in different parts of the FA repair pathway are required for SSTR editing, that their presence is necessary for efficient SSTR, and that re-expression is sufficient to restore SSTR.
In addition to the FA pathway’s well-characterized roles in ICL repair, there is an emerging view that it plays additional roles in preserving genome stability. FA genes protect against aberrant chromosomal structures and replication stress via specialized subcomplexes that in part depend upon particular helicases, including Bloom’s helicase (BLM) and the 3’-5’ ssDNA helicase HELQ. We found that siRNA knockdown of BLM had no effect on SSTR [Extended Data Figure 7], but knockdown of HELQ markedly reduced SSTR [Figure 2A, Extended Data Figure 7]. BLM and its interaction partner RMI2 exhibited strong phenotypes in the primary screen [Document S3, SSTR]. However, both of these factors were required (p<0.01) for survival of a Cas9-induced DSB [Document S3, SurviveDSB], which suggests a role for the BLM complex in surviving a DSB instead of SSTR itself. While BLM has been linked to FA-mediated resolution of replication stress23, HELQ directly interacts with FANCD2/FANCI with unknown functional significance24. HELQ also interacts with multiple recombination subcomplexes, including BCDX2 (RAD51B, RAD51C, RAD51D, and XRCC2) and CX3 (RAD51C-XRCC3). These complexes could promote recombination between the ssODN and genomic DNA, and we asked if these complexes also impact SSTR.
We found that RAD51C is required for SSTR, but RAD51B and XRCC2 are not. This suggests that BCDX2 does not play a role in SSTR [Extended Data Figure 7]. Conversely, both RAD51C and XRCC3 are required for SSTR, implicating the CX3 complex in SSTR [Extended Data Figure 7]. Intriguingly, CX3 has been reported to act downstream of RAD51 filament formation25, but we found that RAD51 itself is dispensable for SSTR [Figure 1B, Extended Data Figure 5]. We anticipate that future work to characterize how the FA pathway interacts with downstream effectors, especially polymerases and genes that mediate recombination, will provide valuable insights into the mechanism of SSTR and its interaction with other pathways that maintain genome stability.
Inhibition of SSTR by interfering with the FA pathway could work by globally reconfiguring DNA repair pathway preference or by specifically inhibiting SSTR. We investigated how the FA pathway influences repair pathway choice by inhibiting several FA genes and measuring editing outcomes using Illumina sequencing [Extended Data Figure 8]. When editing the endogenous hemoglobin β (HBB) locus at the causative amino acid (Glu6) for sickle cell disease, we found that all seven FA factors are required for SSTR editing [Figure 2B]. Notably, knockdown of FA genes decreased levels of SSTR while simultaneously increasing levels of NHEJ, such that total editing (SSTR + gene disruption) remained relatively constant [Figure 2B, Extended Data Figure 9]. However, when we edited the HBB locus in the absence of an ssODN, we found that knockdown of FA repair genes did not significantly increase NHEJ frequency on its own [Extended Data Figure 9]. We found similar results at the BFP locus when measuring editing outcomes by Illumina sequencing. These results imply that the FA repair pathway acts to divert repair events that would otherwise be repaired by NHEJ into SSTR outcomes. This model parallels proposed roles for the FA pathway in balancing NHEJ and HR repair frequencies during ICL repair26,27 and balancing Alt-EJ, NHEJ, and HDR repair outcomes near DSBs28.
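The bookkeeping behind this comparison (SSTR, indels, and their sum as total editing) can be sketched as follows. This is a deliberately simplified illustration and not the sequencing analysis used here: it assumes each amplicon read is compared by exact string match against the unedited reference and the expected SSTR product, and it calls every other read an indel, so sequencing errors would be miscounted. All sequences and names are toy placeholders.

```python
from collections import Counter


def classify_read(read, reference, sstr_product):
    """Assign a read to one editing outcome by exact sequence match."""
    if read == reference:
        return "unedited"
    if read == sstr_product:
        return "SSTR"
    # Simplifying assumption: anything else is treated as an indel.
    return "indel"


def outcome_fractions(reads, reference, sstr_product):
    """Return the fraction of reads in each outcome plus total editing."""
    if not reads:
        raise ValueError("no reads to classify")
    counts = Counter(classify_read(r, reference, sstr_product) for r in reads)
    n = len(reads)
    fractions = {k: counts.get(k, 0) / n for k in ("unedited", "SSTR", "indel")}
    # Total editing is SSTR plus gene disruption, as defined in the text.
    fractions["total_editing"] = fractions["SSTR"] + fractions["indel"]
    return fractions


# Toy sequences (hypothetical, not the HBB or BFP amplicons).
reference = "ACGTACGTAC"
sstr_product = "ACGTTGCTAC"
deletion = "ACGTAC"
reads = [reference] * 2 + [sstr_product] * 3 + [deletion] * 5

fractions = outcome_fractions(reads, reference, sstr_product)
```

Under this accounting, a knockdown that lowers the SSTR fraction while raising the indel fraction by a similar amount leaves `total_editing` roughly unchanged, which is the pattern described above.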
To determine if FA repair genes are responsible for SSTR in primary human cells, we edited human neonatal dermal fibroblasts at HBB Glu6. These fibroblasts have previously been shown to be capable of SSTR repair, albeit at lower levels than many cell lines15. Untreated or mock siRNA treated fibroblasts exhibited approximately 5% SSTR at the HBB locus, as measured by Illumina sequencing. siRNA knockdown of either FANCA or FANCE led to an approximately five-fold reduction in SSTR [Figure 2C]. Therefore, the FA repair pathway is tightly linked to SSTR in at least one primary human cell type.
The sequence outcomes of genomic disruption (indels) following Cas9-induced DSBs are often nonrandom and surprisingly consistent at individual loci, leading to an emerging model that repair outcomes are determined by the intrinsic repair pathway preferences of the edited cell and the sequence immediately adjacent to the cut site29. To determine how FA pathway disruption affects the characteristic spectrum of indels at a Cas9-induced break, we characterized individual allele frequencies in unperturbed and FA knockdown CRISPRi cell lines using Illumina sequencing at both BFP and HBB Glu6. We also examined SSTR conversion tracts, a function of SNP integration relative to distance from the Cas9-induced break, by following the incorporation of several single nucleotide polymorphisms (SNPs) encoded by the ssODN into the genomic sequence.
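A conversion tract of this kind can be tabulated as the fraction of reads carrying each donor-encoded SNP, indexed by the SNP's distance from the cut site. The sketch below is illustrative only: it assumes ungapped reads of the same length as the reference (reads with indels would need alignment first), and the positions, bases, and reads are hypothetical.

```python
def conversion_tract(reads, donor_snps, cut_site):
    """Return [(distance_from_cut, fraction_of_reads_with_donor_base), ...].

    donor_snps maps a 0-based position in the read to the donor-encoded base.
    Assumes all reads are ungapped and cover every SNP position.
    """
    if not reads:
        raise ValueError("no reads supplied")
    tract = []
    for pos, donor_base in sorted(donor_snps.items()):
        carrying = sum(1 for read in reads if read[pos] == donor_base)
        tract.append((pos - cut_site, carrying / len(reads)))
    return tract


# Toy example (hypothetical positions and reads, not data from this study).
donor_snps = {2: "C", 5: "G", 8: "T"}
cut_site = 5
reads = [
    "AAAAAGAAAA",  # only the SNP at the cut site incorporated
    "AAAAAGAAAA",
    "AACAAGAATA",  # all three SNPs incorporated
    "AAAAAAAAAA",  # no SNP incorporated
]

tract = conversion_tract(reads, donor_snps, cut_site)
```

Comparing such tracts between unperturbed and knockdown cells, after normalizing each to its overall SSTR frequency, is one way to ask whether a perturbation changes the shape of the tract or only its overall height.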
In the absence of an ssODN to program SSTR, neither the overall frequency nor the pattern of indels at the Cas9 cut site was affected by disruption of the FA repair pathway [Figure 3A]. We furthermore observed no change in indel spectra upon FA knockdown when editing cells in the presence of an SSTR-templating ssODN. However, when editing with an ssODN, SSTR decreased dramatically upon disruption of the FA pathway. This decrease was remarkably uniform across SNPs within the ssODN and did not measurably alter the SNP conversion tracts. These results reinforce our earlier observation that FA repair pathway inactivation specifically inhibits SSTR without altering the frequency of indels. Additionally, the molecular sequence outcomes of NHEJ are unaffected by the FA pathway. Instead, the FA pathway is restricted to SSTR repair and the balance between NHEJ and SSTR, but does not play a direct role in error-prone end-joining pathways.
In sum, we have found that multiple functional complexes within the FA repair pathway are necessary for Cas9-mediated SSTR. Genome editing is commonly grouped into two categories, genetic disruption and genetic replacement, based on sequence outcomes2. Our results demonstrate that final genetic replacement outcomes using different templates (ssODN vs dsDonors) are identical at the sequence level but stem from completely different pathways [Figure 4]. Specifically, information from double stranded DNA templates and genomic DNA is incorporated using Rad51-dependent processes, but single stranded DNA templates are incorporated through the FA pathway. A great deal of work has focused on improving HDR during gene editing by activating Rad51-mediated processes, including Rad51 agonist small molecules30 and strategies to stimulate recruitment of Rad51 throughout the cell cycle31. Our results indicate that future efforts to activate the FA pathway could be invaluable during gene editing for research or therapeutic uses.
Cas9-mediated genome editing holds great promise for the treatment of genetic diseases such as sickle cell disease and Fanconi Anemia. High rates of gene editing are typically required for therapeutic editing applications, but editing efficiencies can differ greatly between cells. Without knowledge of the pathways responsible for genetic replacement outcomes and the activity of those pathways in the targeted cell type, it was previously difficult to rationalize why editing might fail in one application while succeeding in another. Our results predict that human cell types with intrinsic repair preferences that impact the FA pathway will be more or less capable of SSTR [Extended Data Figure 1]. The expression level of FA-related factors could in future be useful as a biomarker for patient cell “editability”, and treatments that enhance the activity of the FA pathway could be especially valuable in difficult to edit cells. For example, we found that complementing FA pathway knockdown yields up to 8-fold increase in editing efficiency in cell lines [Figure 2A]. This suggests that reactivating the FA pathway could be valuable in cases where it has been disrupted, such as in Fanconi Anemia itself. Small molecule activators of the FA pathway remain to be identified, but our results suggest that transiently increasing the levels of FA proteins could complement patient-specific defects to enable lasting gene editing cures. More broadly, our results suggest that patient genotype or transcriptome could increase or decrease the effectiveness of therapeutic treatments in previously unanticipated ways. Deeper understanding of the molecular basis of SSTR and dsDonor HDR is likely to suggest new biomarkers to ‘match’ patient genotype with therapeutic editing strategy.
Finally, our data imply that the default repair pathway for DSBs, especially DSBs introduced by Cas9, is end joining, and that the activity of the FA repair pathway determines whether many events will instead be repaired by SSTR [Figure 2A, Figure 4]. Cas9 is very stable on genomic targets7,32, and so it is possible that Cas9 itself is recognized as an interstrand crosslink or roadblock within the genome. However, we disfavor this hypothesis because FA knockdown only impacts SSTR without directly affecting indels. Instead, we hypothesize that Cas9-stimulated repair using an ssODN template mimics some substrate of the FA pathway, such as a stalled replication fork. We note that SSTR is much more efficient than HDR from a double stranded DNA template [Figure 1E], to the extent that in many cell lines, the most common single allele at an edited locus is the product of SSTR [Figure 3B]. This ability raises intriguing questions about genome integrity in the presence of single stranded DNA exposed by R-loops, replication crises, or viral infection. The FA pathway has already been implicated in replication crises, and future work to address remaining questions could provide insight into mechanisms by which human cells maintain their genomes.
Author Contributions
CDR and JEC designed experiments; CDR, AJS, and SF performed pooled screens; CDR, KRK, and SJF performed follow-up experiments; CDR, NLB, and JEC analyzed data; CDR and JEC wrote the manuscript.
Acknowledgements
We thank members of the Corn lab for helpful discussions about the manuscript. This work used the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 OD018174 Instrumentation Grant. We thank the Berkeley Macrolab for support with protein expression and purification. This work was supported by grants from the Li Ka Shing Foundation, the Heritage foundation, and the Fanconi Anemia Research Foundation.