Abstract
RNA interference (RNAi) is sequence-specific mRNA degradation guided by small RNAs (siRNAs) produced from long double-stranded RNA (dsRNA) by RNase Dicer. Proteins executing RNAi are present in mammalian cells but sustain a gene-regulating microRNA pathway while dsRNA-induced innate immunity relies on a sequence-independent interferon response. While striving to benchmark mammalian RNAi analysis, we report that the main RNAi constraint is siRNA production, which integrates Dicer activity, dsRNA structure, and siRNA targeting efficiency. Unexpectedly, increased expression of dsRNA-binding Dicer co-factors TARBP2 or PACT reduces RNAi but not microRNA function. Elimination of Protein Kinase R, a key dsRNA sensor for interferon response, had minimal positive effects in fibroblasts. Without increasing Dicer activity, RNAi can occur when the first Dicer cleavage of an abundant dsRNA produces an efficient siRNA. In mammals, efficient RNAi may effectively employ substrates, which have some features of microRNA precursors, hence bringing the two pathways mechanistically even closer. At the same time, Dicer substrate optimization, which viruses would avoid, represents an opportunity for evolving RNAi, yet unlikely as an antiviral system.
Introduction
Double-stranded RNA (dsRNA), a helical structure formed by complementary antiparallel RNA strands, has important biological effects. dsRNA can arise via (1) basepairing of complementary RNA molecules or by intramolecular RNA pairing or (2) RNA synthesis on an RNA template by an RNA-dependent RNA polymerase (RdRP). While mammals lack endogenous RdRPs (Stein et al., 2003a), dsRNA can still be produced by viral RdRPs in infected cells. dsRNA can enter different pathways in mammalian cells (reviewed in Gantier and Williams, 2007), most relevant for this work being the interferon (IFN) response and RNA interference (RNAi).
The interferon response is a complex innate immunity system where multiple sensors converge on a response involving activation of NFκB transcription factor and interferon-stimulated genes (Geiss et al., 2001). The key dsRNA sensor in the IFN response is protein kinase R (PKR, reviewed in Sadler and Williams, 2007), which is activated by dsRNA and inhibits translation initiation through phosphorylation of α-subunit of the eukaryotic initiation factor 2 (eIF2α) (Farrell et al., 1978; Meurs et al., 1990). PKR response is sequence-independent and affects translation universally although inhibition restricted to specific mRNAs was also observed (Ben-Asouli et al., 2002; Kaufman et al., 1989; Nejepinska et al., 2014). In addition to PKR, other factors sensing dsRNA may contribute to the IFN response, such as RIG-I, which recognizes blunt dsRNA ends (reviewed in Lassig and Hopfner, 2017), or oligoadenylate synthetases (OAS), which yield 2',5'-oligoadenylate triggers for global RNA destabilization by RNase L (reviewed in Kristiansen et al., 2011).
RNAi has been defined as sequence-specific RNA degradation induced by long dsRNA (Fire et al., 1998). During canonical RNAi, long dsRNA is cut by RNase III Dicer into ~22 nt small interfering RNAs (siRNAs), which are bound by an Argonaute endonuclease and guide sequence-specific mRNA recognition and endonucleolytic cleavage in the middle of the base-paired region between siRNA and mRNA molecules (reviewed in Nejepinska et al., 2012a). Additional factors participating in RNAi include dsRNA binding proteins (dsRBPs). In Drosophila, R2D2 dsRBP restricts Dicer specificity to long dsRNA (Cenik et al., 2011; Fukunaga and Zamore, 2014; Nishida et al., 2013). In mammals, TARBP2 and PACT, which interact with Dicer during small RNA loading (Chendrimada et al., 2005; Haase et al., 2005), could hypothetically contribute to routing long dsRNA into RNAi and IFN pathways in vivo.
Mammalian genomes encode all proteins necessary and sufficient for reconstituting canonical RNAi in vitro (MacRae et al., 2008) or in yeast (Suk et al., 2011; Wang et al., 2013). However, mammalian proteins primarily function in a gene-regulating microRNA pathway (reviewed in Bartel, 2018) while only negligible amounts of siRNAs of unclear functional significance are typically found in mammalian cells (reviewed in Svoboda, 2014). At the same time, successful RNAi has occasionally been achieved experimentally in cultured cells with different types of long dsRNA molecules, including transfection of long dsRNA into embryonic stem cells (ESC) and embryonic carcinoma cells (Billy et al., 2001; Paddison et al., 2002; Yang et al., 2001) and expression of various types of long dsRNA in ESCs, transformed, and primary somatic cells (Diallo et al., 2003; Gan et al., 2002; Gantier et al., 2007; Paddison et al., 2002; Shinagawa and Ishii, 2003; Wang et al., 2003; Yi et al., 2003).
One of the factors limiting mammalian RNAi is the mammalian Dicer, which does not produce siRNAs from long dsRNA substrates efficiently (reviewed in Svobodova et al., 2016). One of the ways to achieve high siRNA production is truncation of Dicer at its N-terminus, which increases siRNA generation in vitro (Ma et al., 2008) and in cultured cells (Kennedy et al., 2015). An N-terminally truncated Dicer variant occurs naturally in mouse oocytes (Flemr et al., 2013), the only known mammalian cell type where RNAi is highly active and functionally important. Another factor impeding RNAi is the presence of the IFN pathway as evidenced by enhanced siRNA production and RNAi activity upon inactivation of different components of the IFN pathway, such as dsRNA sensor PKR, RIG-I-like receptor LGP2 (Kennedy et al., 2015; van der Veen et al., 2018), or mediators MAVS or IFNAR1 (Maillard et al., 2016).
The mammalian RNAi has a known biological role in mouse oocytes where it suppresses mobile elements and regulates gene expression (Murchison et al., 2007; Tam et al., 2008; Tang et al., 2007; Watanabe et al., 2008). However, there is scarce and unclear evidence for biological significance of mammalian RNAi elsewhere. A part of Dicer loss-of-function phenotype in murine ESCs has been attributed to the lack of endogenous RNAi (Babiarz et al., 2008).
However, reported ESC endo-siRNAs, such as those derived from a hairpin-forming B1/Alu subclass of SINEs, resemble non-canonical microRNAs rather than a siRNA population produced from a long dsRNA (Flemr et al., 2013). Similarly, hippocampal endosiRNAs emerging from the SyngaP1 locus (Smalheiser et al., 2011) map to a sequence that is nowadays annotated as a microRNA locus (Kozomara and Griffiths-Jones, 2014).
It has been debated whether endogenous RNAi could contribute to mammalian antiviral defense (reviewed in Cullen et al., 2013; Gantier, 2014). In contrast to invertebrates, data supporting direct involvement of mammalian RNAi in antiviral defense are rather inconclusive. While a few studies suggested that RNAi could provide an effective antiviral response in ESCs and in mouse embryos (Li et al., 2013; Maillard et al., 2013; Qiu et al., 2017), other studies did not support that notion. For example, no siRNAs of viral origin have been found in human cells infected with a wide range of viruses (Pfeffer et al., 2005) or the siRNAs present were not sufficient to mediate effective RNAi (Tsai et al., 2018). While it seems unlikely that mammalian RNAi would be a substantial antiviral mechanism co-existing with the long dsRNA-induced IFN response, the conditions under which mammalian RNAi could operate effectively remain unclear.
To bring insights into disparities concerning RNAi in mammalian cells, we analyzed RNAi induction in mouse fibroblasts and ESCs using a set of plasmids expressing different dsRNAs, which target luciferase reporters with complementary sequences. We show that endogenous RNAi in mouse cells is severely restricted at multiple levels. Mouse cells contain minimal amounts of endogenous dsRNA that could be converted into endo-siRNAs and, at the same time, inefficiently convert expressed dsRNA to siRNAs. This is due to low Dicer activity, which is able to yield only limited amounts of siRNA from dsRNA unless the substrate contains a terminus resembling that of a miRNA precursor. In such a case, RNAi can be effective if the first cleavage yields an effective siRNA. While increased Dicer activity stimulates RNAi, increased expression of dsRNA binding proteins TARBP2 and PACT, which are Dicer-binding partners, reduces RNAi without affecting the miRNA pathway. Finally, we show that an optimal long dsRNA substrate can induce RNAi even in cells that do not have high Dicer activity. However, this implies that RNAi plays a secondary, if any, role in antiviral response because viruses could easily avoid RNAi through evolving sequences at their genomic termini to reduce Dicer cleavage, proper siRNA strand selection and siRNA-guided recognition of viral RNA.
Results and Discussion
To investigate mammalian RNAi, we expanded a long RNA hairpin expression system originally developed for transgenic RNAi in mice (reviewed in Malik and Svoboda, 2012). It combines (i) an inverted repeat producing a long (> 400 bp) dsRNA hairpin inserted into the 3’UTR of an EGFP reporter, and (ii) Renilla (RL) and firefly luciferase (FL) reporters for distinguishing sequence-specific and sequence-independent effects (Fig. 1A). The hairpin plasmids were derived from Mos, Elavl2 and Lin28a/b mRNA sequences (Fig. S1A) and, for brevity, are referred to as MosIR, Lin28IR, and Elavl2IR. The long hairpin RNA organization is similar to some naturally occurring long dsRNA hairpins, which give rise to endogenous siRNAs in Caenorhabditis elegans (Morse and Bass, 1999) and mouse oocytes (Tam et al., 2008; Watanabe et al., 2008). Importantly, all three hairpin transcripts could be efficiently immunoprecipitated with an anti-dsRNA antibody (Nejepinska et al., 2014) and their expression induced robust RNAi in oocytes in vivo (Chalupnikova et al., 2014; Flemr et al., 2014; Stein et al., 2003b). In a control plasmid CAG-EGFP-MosMos (Fig. 1A, referred to as MosMos hereafter), the Mos tandem sequence is oriented head-to-tail, hence the plasmid has the same size and nucleotide composition as MosIR but does not produce dsRNA. Targeted RL reporters were derived from a Renilla luciferase plasmid by inserting Mos, Lin28, or Elavl2 sequences in the 3’UTR. A common FL reporter serves as a non-targeted control (in sequence-specific context). dsRNA expression and RNAi activity were analyzed in mouse ESCs and NIH3T3 (referred to as 3T3 hereafter) mouse fibroblasts (Todaro and Green, 1963), which represent undifferentiated and differentiated cell types, respectively.
In a typical experiment, a dsRNA-expressing plasmid (e.g. MosIR) and two luciferase reporters (and, where applicable, another tested factor) were transiently co-transfected and luciferase activities were quantified 48 hours later. Sequence-specific and sequence-independent effects could be distinguished in samples transfected with MosMos (negative control), MosIR (targeting dsRNA), or Elavl2IR (non-targeting dsRNA, a positive control for non-specific dsRNA effects) by comparing RL-Mos (MosIR-targeted) and FL (non-targeted) reporter activities. Importantly, sequence-independent dsRNA effects, which strongly reduce raw activities of both luciferase reporters in transfected cells (Fig. 1B), are not apparent in normalized data, which are typically displayed as a targeted reporter signal divided by the non-targeted reporter signal (here RL-Mos/FL, Fig. 1C, S1B). We have shown previously that reporter expression from co-transfected plasmids is particularly sensitive, in a PKR-dependent manner, to dsRNA expressed from a co-transfected plasmid (Nejepinska et al., 2014). The occurrence of this effect in raw data is thus a good indicator of dsRNA expression.
The absence of an RNAi effect in 3T3 cells (Fig. 1C, S1B) could be expected as RNAi was absent in somatic cells of mice ubiquitously expressing MosIR (Nejepinska et al., 2012b) although MosIR induced a highly specific RNAi effect in mouse oocytes (Nejepinska et al., 2012b; Stein et al., 2003b; Stein et al., 2005). At the same time, expression of a different type of long dsRNA in 3T3 cells was reported to induce RNAi (Wang et al., 2003), suggesting that conditions exist under which RNAi could operate in 3T3 cells. Thus, MosIR expression in 3T3 cell culture seemed to be a good starting model for exploring constraints for functional mammalian RNAi.
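The normalization described above can be illustrated with a minimal sketch. The function names and the raw readings are hypothetical and are not taken from the study; the example only assumes that each well yields one targeted (RL-Mos) and one non-targeted (FL) reading, and that results are expressed relative to the MosMos control.

```python
from statistics import mean


def normalized_ratio(rl, fl):
    """Targeted reporter signal divided by the non-targeted reporter signal."""
    if fl <= 0:
        raise ValueError("non-targeted (FL) signal must be positive")
    return rl / fl


def relative_to_control(sample_wells, control_wells):
    """Mean RL/FL ratio of a sample, expressed relative to the control mean."""
    sample = mean(normalized_ratio(rl, fl) for rl, fl in sample_wells)
    control = mean(normalized_ratio(rl, fl) for rl, fl in control_wells)
    return sample / control


# Hypothetical raw readings (RL-Mos, FL) per well; not data from the study.
mosmos_wells = [(1000.0, 2000.0), (1100.0, 2200.0)]  # negative control
mosir_wells = [(250.0, 500.0), (300.0, 600.0)]       # both reporters reduced ~4x

print(relative_to_control(mosir_wells, mosmos_wells))
```

In this made-up example both raw signals drop about fourfold in the dsRNA-expressing sample, yet the normalized ratio is unchanged. This is why sequence-independent effects have to be read from the raw data rather than from the RL-Mos/FL ratio.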
Inefficient siRNA production from long dsRNA in mouse cells
To examine the cause of inefficient RNAi, we first examined Mos siRNA levels in 3T3 cells transfected with MosIR using small RNA sequencing (RNA-seq). We also co-transfected plasmids expressing either full-length Dicer expressed in somatic cells (denoted DicerS) or the truncated Dicer isoform supporting RNAi in mouse oocytes (denoted DicerO). The experiment yielded reproducible small RNA populations comparable to results of an earlier RNA-seq analysis of ESCs (Flemr et al., 2013) (Fig. S2). MosIR expression in normal 3T3 cells yielded only minimal amounts of 21-23 nt siRNAs (Fig. 2A, ~ 100 RPM when normalizing the read abundance to the entire small RNA library). Relative to 3T3 cells transfected with MosIR, co-expression of DicerS or DicerO increased Mos 21-23nt siRNA production 5.7x or 24.2x, respectively (Fig. 2A). siRNA levels in libraries from DicerO-expressing 3T3 cells were similar to the earlier analysis of DicerO-expressing ESCs (Fig. 2B).
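As a rough illustration of the quantities used here, the sketch below counts 21-23 nt reads, normalizes them to reads per million (RPM) of the whole small RNA library, and computes a fold change. All function names and read counts are hypothetical; the actual mapping and counting pipeline is not described in this section.

```python
def count_sirna_reads(read_lengths, min_len=21, max_len=23):
    """Count reads whose length falls in the siRNA size window (inclusive)."""
    return sum(1 for n in read_lengths if min_len <= n <= max_len)


def rpm(read_count, library_size):
    """Reads per million, normalized to the entire small RNA library."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return read_count * 1_000_000 / library_size


def fold_change(sample_rpm, reference_rpm):
    """Abundance in a sample relative to a reference sample."""
    if reference_rpm <= 0:
        raise ValueError("reference_rpm must be positive")
    return sample_rpm / reference_rpm


# Hypothetical numbers chosen only to illustrate the arithmetic.
library_size = 2_000_000
mosir_only = rpm(200, library_size)         # reads mapping to Mos, MosIR alone
mosir_plus_dicer = rpm(1140, library_size)  # reads mapping to Mos, Dicer added

print(mosir_only, mosir_plus_dicer, fold_change(mosir_plus_dicer, mosir_only))
```

Because RPM is normalized to the whole library, changes in other abundant small RNA classes can shift the values, so fold changes are most meaningful between libraries of similar composition.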
Normal 3T3 cells thus have a minimal capability to produce siRNAs from long dsRNA, which has termini formed of longer single-stranded RNAs or a loop, hence inaccessible for Dicer. This is consistent with the previous observation that human Dicer efficiently cleaves dsRNA with blunt ends or 2 nt 3’-overhangs from its termini and less efficiently inside the duplex (Zhang et al., 2002). siRNA production could be improved by increasing levels of full-length Dicer (Fig. 2A). Remarkably, relative siRNA distribution along the Mos sequence was almost identical in 3T3 cells expressing additional DicerS and cells expressing DicerO (Fig. S2B). The pattern was not specific to 3T3 cells because it was also observed in ESCs expressing DicerO (Fig. S2B). This implies that cells expressing high levels of full-length Dicer could also generate more siRNAs from long dsRNA. Dicer expression varies across cell types. Of note, the highest Dicer mRNA levels were found in oocytes, lymphocytes and mast cells (Wu et al., 2016).
Another notable observation was that normal 3T3 cells essentially lacked endogenous siRNAs (Fig. 2C). We inspected loci in 3T3 cells giving rise to 21-23 nt sequences using the same algorithm as in our previous study in ESCs (Fig. 2D), which revealed a small number of loci giving rise to long dsRNA, which was converted to siRNAs by DicerO (Flemr et al., 2013). However, we did not find any locus in 3T3 cells that would produce an apparent population of endo-siRNAs from long dsRNA like in ESCs. Instead, all genomic loci with higher abundance of perfectly mapped 21-23 nt RNAs in DicerO-expressing cells were reminiscent of miRNA loci (e.g. Fig. S2C). This contrasts with ESCs expressing DicerO, where a distinct population of loci generating siRNA pools >100 RPM was observed (Fig. 2D).
We examined small RNAs derived from repetitive mobile elements separately, grouping all 21-23 nt reads according to mapping to specific retrotransposon groups (Fig. 2E). This analysis showed in several cases (SINE B2, LINE1, ERVK, ERVL) minor increase in DicerO-expressing cells (10-20%, up to several hundred RPM difference). However, it is unclear, what type of RNA substrates was responsible for this increase. We showed that transcribed inverted repeats of Alu/SINE B1 can form substrates producing small RNAs appearing more as non-canonical miRNAs than siRNA pools (Flemr et al., 2013). Furthermore, 21-23nt populations mapping to the same elements in ESCs show almost an order of magnitude higher abundance and much stronger increase in DicerO-expressing cells (Fig. 2F), which likely stems from open chromatin structure and dsRNA production in ESCs (Martens et al., 2005).
Altogether, these data show that 3T3 cells neither produce significant amounts of dsRNA nor possess robust Dicer activity that could process it. Regardless of whether dsRNA absence in 3T3 cells is due to minimal dsRNA production or its efficient removal by other pathways, endogenous RNAi is apparently not operating in 3T3 cells. RNAi can be revived in 3T3 cells through strongly increasing Dicer activity by expressing DicerO and providing a dsRNA substrate (Fig. 2G). However, in contrast to ESCs expressing DicerO, weaker (36%) sequence-specific repression of RL-Mos was observed although Mos siRNA levels in 3T3 cells transfected with MosIR and DicerO-expressing plasmid were comparable to those observed in ESCs stably expressing DicerO (Fig. 2A, 2B).
Notably, the combined ~8000 RPM abundance of 21 and 22 nt Mos siRNAs in DicerO-transfected 3T3 cells (Fig. 2A) would reach RPM values equivalent to highly abundant miRNAs. Although RPM values from RNA-seq data are not a reliable predictor of miRNA abundance because RNA-seq protocols may introduce biases (Linsen et al., 2009), high RPM values generally indicate higher miRNA abundance. Among the most abundant miRNAs in 3T3 cells was the Let-7 family; abundance of all Let-7 miRNAs added up to 11,900 RPM, the most abundant member Let-7f was at ~4,300 RPM, and Let-7a reached ~2,100 RPM. Similarly, abundances of all miR-30 family members added up to 6,100 RPM and the most abundant member miR-30c was at ~2,100 RPM (0.7% of RPM values of the first 30 most abundant miRNAs).
To examine small RNA-mediated cleavage of cognate RNAs in 3T3 cells, we analyzed ability of miR-30 to repress a target with a single perfectly complementary binding site. We previously produced luciferase reporters with a single miR-30c perfectly complementary binding site (miR-30 1xP) for monitoring RNAi-like cleavage by endogenous miRNAs (Ma et al., 2010). Earlier reporter testing included 3T3 cells but those experiments were done with minimal amounts of transfected reporters (1ng/well in a 24-well plate). Thus, it was unclear whether the reporter would be efficiently repressed when higher amounts would be transfected. Accordingly, we titrated the miR-30 1xP reporter up to 100 ng/well in a 24-well plate. As a control, we used a reporter with three mutated miR-30c binding sites (miR-30 3xM), which should not be repressed by endogenous miRNAs. Although we could observe less efficient repression with increasing amount of miR-30 reporters, a single perfect miR-30c binding site was still sufficient to lower the miR-30 1xP reporter to 22% of miR-30 3xM at 100ng/well (Fig. 2H). When considering that just a half of loaded MosIR siRNAs would be antisense siRNAs able to cleave the RL-Mos reporter, it is rather unexpected that ~4,000 RPM of 21-22 nt siRNAs were not sufficient to strongly knock-down the reporter (Fig. 2G).
The supply of AGO proteins is probably not the main limiting factor as endogenous AGO-loaded miR-30 is sufficient for repression of miR-30 1xP reporter. One possible factor is an overestimation of functional Mos siRNA levels in 3T3 RNA-seq data because only a minor fraction of Mos siRNAs may be engaged in RL-Mos repression. In principle, a half of AGO2-loaded Mos siRNAs would be antisense siRNAs complementary to the RL-Mos reporter and only a fraction of the antisense siRNAs would be able to effectively target RL-Mos mRNA, because secondary RNA structures prevent efficient targeting (Ameres et al., 2007).
Furthermore, we estimated Mos siRNA abundance from RNA-seq analysis of transiently transfected 3T3 cells, which have higher transfection efficiency than ESCs. Thus a similar RPM level in 3T3 and ESC RNA-seq libraries would mean that ESCs siRNA levels per cell, which are sufficient for mediating RNAi, are higher than in 3T3 cells.
In any case, the threshold for an efficient level of MosIR-derived siRNA population appears relatively high in cultured cells. MosIR was not sufficient to induce efficient RNAi in normal 3T3 cells or in ESCs. Introduction of DicerO, a truncated Dicer variant supporting RNAi in mouse oocytes, was sufficient to strongly enhance siRNA production and, in the case of ESCs, also to induce a robust RNAi effect (~65% knockdown of RL-Mos reporter activity). At the same time, our data imply that cells expressing full-length Dicer may not be able to mount efficient RNAi in the presence of an excess of dsRNA substrate, which does not have Dicer-accessible termini. Importantly, we also observed sequence-independent dsRNA effects in ESCs transfected with MosIR although ESCs are reported to lack the IFN response (D'Angelo et al., 2017).
Effects of expression of dsRNA binding proteins on RNAi
To further investigate constraints for RNAi in mouse cells, we examined effects of different dsRNA binding proteins on RNAi. These included PKR, which was expected to interfere with RNAi, and Dicer binding partners TARBP2 and PACT (Chendrimada et al., 2005; Haase et al., 2005; Laraki et al., 2008), which could potentially support RNAi. As a negative control, we used expression of LacZ. We did not observe any positive effect of any dsRBP on RNAi (Fig. 3A) while immunoprecipitation showed approximately 50-fold enrichment of TARBP2 and PACT binding to MosIR RNA compared to a control protein (Fig. S3). Remarkably, expression of TARBP2 or PACT counteracted sequence-independent effects of dsRNA expression, which indicates that both expressed proteins bind the MosIR hairpin and compete with endogenous PKR binding (Fig. 3A, lower part). Consistent with this notion, ectopic PKR expression further increased sequence-independent repression of luciferase reporters, including in MosMos-transfected cells (Fig. 3A, lower part). While the MosMos transcript should not fold into dsRNA, it is possible that cells expressing high PKR levels were sensitized to dsRNA such that some cryptic transcription from the plasmid backbone could cause the effect (Nejepinska et al. 2012c).
Effects of ectopic expression of dsRBPs in ESCs expressing normal Dicer (DicerS) were similar to 3T3 cells: none of the dsRBPs had an apparent stimulatory effect on RNAi (Fig. 3B). However, DicerO-expressing cells showed that ectopically-expressed TARBP2 and PACT suppressed RNAi comparably to or even more than PKR (Fig. 3C). This was counterintuitive because a homolog of TARBP2 and PACT is involved in RNAi in Drosophila (Cenik et al., 2011; Fukunaga and Zamore, 2014; Nishida et al., 2013) and TARBP2 was shown to stimulate siRNA production in vitro (Chakravarthy et al., 2010). TARBP2 and PACT could exert negative effects on RNAi in three ways: (1) by competing with Dicer’s own dsRBD in recognition of dsRNA substrate, (2) by a direct inhibition of Dicer, or (3) by squelching dsRNA recognition and Dicer cleavage such that Dicer would have a reduced probability to bind its dsRBP partner bound to dsRNA.
A simple masking appears counterintuitive as TARBP2 and PACT are Dicer binding partners, so they would be expected to recruit Dicer to dsRNA rather than prevent its processing by Dicer. An insight into the phenomenon could be provided by miRNA reporters, which could reveal if TARBP2 or PACT overexpression interferes with miRNA-mediated repression. We did not observe any effect of TARBP2 or PACT on miRNA-mediated repression in either DicerS-expressing ESCs (Fig. 3D) or DicerO-expressing cells (Fig. 3E). These data suggest that RNAi inhibition observed in Fig. 3C is unlikely to involve direct Dicer inhibition and that it affects neither non-processive cleavage of miRNA precursors nor the subsequent small RNA loading. We speculate that, apart from the masking effect (reduced substrate recognition), the inhibition may concern the initial internal dsRNA cleavage by Dicer, or that overexpressed TARBP2 affects Dicer processivity.
Effects of dsRNA sequence and structure on RNAi
Finally, we examined the structural context of expressed dsRNA as it plays a role in efficiency of siRNA production. As mentioned earlier, Dicer preferentially cleaves dsRNA at termini and prefers single-stranded two nucleotide 3’overhangs or blunt ends while longer single-stranded RNA overhangs have an inhibitory effect and siRNA biogenesis requires endonucleolytic cleavage that occurs with lower efficiency than cleavage at dsRNA ends (Provost et al., 2002; Vermeulen et al., 2005; Zhang et al., 2002). dsRNA can be expressed in three ways, which differ in probability of dsRNA formation (1) transcription of an inverted repeat yielding RNA hairpin, (2) convergent transcription of one sequence, and (3) separate transcription of sense and antisense strands. Furthermore, pol II and pol III transcription will yield different RNA termini, which could influence RNA localization and routing into different pathways. Therefore, we prepared a set of constructs for expression of different types of long dsRNA in addition to MosIR (CAG-EGFP-MosIR) and Lin28IR (CAG-EGFP-Lin28IR) plasmids (Fig. 4). CMV-MosIR and CMV-Lin28IR are simple pol II-expressed hairpin transcripts without coding capacity; a similar expression plasmid was successfully used to induce RNAi in human HeLa cells, embryonic carcinoma cells, melanoma cells, and primary fibroblasts (Diallo et al., 2003; Paddison et al., 2002). Separate expression of sense and antisense transcripts was used to induce RNAi in NIH 3T3 and HEK 293 cell lines (Wang et al., 2003). Polyadenylated pol II transcripts would carry a 5’ cap and a single stranded polyA 3’ overhang. To make dsRNAs with blunt (or nearly blunt) ends, we prepared also a set of pol III constructs, which included a hairpin, a convergent transcription system, and a separate sense and antisense expression. 
A pol III-driven hairpin and a separate sense and antisense RNA expression were previously used in MCF-7 mammalian cells with little if any induction of RNAi but strong effects on the IFN response (Gantier et al., 2007).
Experimental design for testing different long dsRNAs in transiently-transfected 3T3 cells was as described above (Fig. 1). In addition, in an attempt to increase efficiency of RNAi, we also examined effects in Pkr−/− 3T3 cells, which were produced using CRISPR-Cas9 (Fig. S4). Using Mos sequences, we only observed slight RNAi effects of U6-MosIR in Pkr−/− background (Fig. 4C). Remarkably, U6-Lin28aIR, which expressed the same type of dsRNA hairpin, had a strong RNAi effect even in normal 3T3 (Fig. 4C). Apart from U6-MosIR and U6-Lin28aIR and U6-driven convergent transcription of Lin28a fragment, none of the other dsRNA substrates showed stronger induction of RNAi than CAG-EGFP-MosIR and CAG-EGFP-Lin28IR (Fig. 4C).
To understand the basis of RNAi induction, we analyzed siRNAs produced in 3T3 cells transiently transfected with U6-MosIR, U6-Lin28aIR, CAG-EGFP-MosIR, or CAG-EGFP-Lin28IR (Fig. 5 and S5). Analysis of siRNAs originating from hairpin transcripts from these plasmids showed that U6 plasmids generate 3-4 times more siRNAs than CAG-EGFP plasmids, whose dsRNA sequence is longer (Fig. 5A). Furthermore, U6 plasmids, which generate RNA hairpins with minimal if any single-stranded overhangs, yielded completely different patterns of siRNAs targeting RL reporters than CAG-driven hairpins, which contain long single-stranded overhangs (Fig. 5B, C). U6-driven hairpins were apparently processed by Dicer from the end of the stem, which is consistent with Dicer activity in vitro (Zhang et al., 2002). The siRNAs produced from U6-driven hairpins also show low processivity of the full-length mouse Dicer, where the majority of reporter-targeting siRNAs come from the first substrate cleavage at the end of the stem (Fig. 5B). In contrast, siRNAs produced from CAG-driven hairpins are distributed along the hairpin, suggesting that their biogenesis required at least two cleavage events: an endonucleolytic cleavage inside the stem, which produces optimal termini for a second, siRNA-producing cleavage. These two modes of siRNA production manifest distinct patterns in the phasing analysis of siRNAs produced from the two types of hairpins (Fig. 5D).
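The idea behind a phasing analysis can be sketched as follows. This is an illustration only and not the method used to produce Fig. 5D, which is not specified in this section. It assumes a hypothetical input of siRNA 5' start positions along the hairpin stem, a 22 nt register period, and a register origin at the stem terminus.

```python
from collections import Counter


def phase_distribution(starts, period=22, origin=0):
    """Fraction of reads in each phasing register.

    starts: 5' start coordinates of siRNA reads along the stem.
    period: assumed register length in nucleotides (hypothetical default).
    origin: coordinate of the stem terminus that defines register 0.
    """
    if not starts:
        return [0.0] * period
    counts = Counter((s - origin) % period for s in starts)
    total = len(starts)
    return [counts.get(register, 0) / total for register in range(period)]


# Hypothetical start positions; not data from the study.
end_processed = [0, 0, 0, 22, 44, 0, 22, 0]     # cleavage counted from the terminus
internally_cut = [3, 17, 30, 41, 58, 64, 9, 75]  # scattered internal entry points

print(max(phase_distribution(end_processed)))
print(max(phase_distribution(internally_cut)))
```

Under these assumptions, end-initiated processing concentrates reads in a single register, whereas processing that starts from variable internal cleavage sites spreads reads across registers.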
Endogenous RNAi triggered by endogenous dsRNA produced in the nucleus would typically involve dsRNA molecules with long single-strand overhangs. Efficient RNAi would thus either require a high Dicer activity, such as the one that evolved in mouse oocytes, or some RNase activity, which would remove the single-strand overhangs, allowing for efficient Dicer cleavage from a terminus. One candidate for such an RNase is Drosha, an RNase III producing Dicer substrates in the miRNA pathway. In this case, resulting siRNAs could be considered non-canonical miRNAs and, consequently, RNAi an extension of the miRNA pathway.
Importantly, when a blunt-end dsRNA triggers RNAi, silencing strongly depends on the first end-derived siRNA, which needs to be efficiently loaded onto AGO2 and effectively interact with the target sequence whose secondary structure can interfere with targeting (Ameres et al., 2007). These requirements argue against a strong antiviral role of mammalian RNAi because mutations would lead to rapid selection of viral variants that avoid producing functional siRNAs from their termini and to co-evolution of cellular components of RNAi. However, vertebrate Dicer and AGO2 are well conserved (Murphy et al., 2008), arguing for their conserved role in the miRNA pathway. The known Dicer modification identified in mouse oocytes (Flemr et al., 2013), where the miRNA pathway is unimportant (Suh et al., 2010), appears to be an isolated event that occurred in the common ancestor of mice and hamsters and that serves to suppress retrotransposons and regulate gene expression; there is no evidence that it would play an antiviral role.
Taken together, we developed and examined a complex system for dsRNA expression in mammalian cells and monitoring sequence-specific and sequence-independent effects. Our plasmid collection and data it generated offer a framework for studies on RNAi. Our results accentuate key constraints, which influence canonical RNAi in mammalian cells and which would shape evolution of mammalian RNAi. In simplest terms, functional mammalian RNAi requires conditions, which are rather rarely met elsewhere than in mouse oocytes: enough substrate, Dicer activity necessary to yield enough effective siRNAs, and avoiding a strong clash with IFN pathway sensors.
Material and Methods
Plasmids
Schematic structures of the relevant parts of plasmid constructs used in the project are shown in Fig. 1A and 4A. Three dsRNA-expressing plasmids (MosIR, Lin28IR, and Elavl2IR), which efficiently induced RNAi in oocytes of transgenic mice (Chalupnikova et al., 2014; Flemr et al., 2014; Stein et al., 2003b), were modified by replacing the oocyte-specific ZP3 promoter with a strong ubiquitous CAG promoter as described previously (Nejepinska et al., 2014). pGL4-SV40 (Promega; for simplicity referred to as FL) and the parental plasmid for targeted Renilla reporter phRL-SV40 (Promega; for simplicity referred to as RL) are commercially available. All plasmids were verified by sequencing. Non-commercial plasmids depicted in Figures 1A and 4A and plasmids expressing HA-tagged TARBP2, PACT, PKR, and LacZ are available from Addgene with details about their construction and sequence. Plasmid sequences (annotated Genbank format) are also available in the supplemental file plasmids_sequences.zip.
Cell culture and transfection
Mouse 3T3 cells were maintained in DMEM (Sigma) supplemented with 10 % fetal calf serum (Sigma), penicillin (100 U/mL, Invitrogen), and streptomycin (100 µg/mL, Invitrogen) at 37 °C and 5 % CO2 atmosphere. Mouse embryonic stem cells were cultured in 2i-LIF media: DMEM supplemented with 15% fetal calf serum, 1x L-Glutamine (Invitrogen), 1x non-essential amino acids (Invitrogen), 50 µM β-Mercaptoethanol (Gibco), 1000 U/mL LIF (Millipore), 1 µM PD0325901, 3 µM CHIR99021, penicillin (100 U/mL), and streptomycin (100 µg/mL). For transfection, cells were plated on a 24-well plate, grown to 50 % density and transfected using TurboFect in vitro Transfection Reagent or Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s protocol. Cells were co-transfected with 50 ng per well of each FL and RL reporter plasmids and 250 ng per well of a dsRNA-expressing plasmid and, eventually, 250 ng per well of a plasmid expressing a tested factor. The total amount of transfected DNA was kept constant (600 ng/well) using pBluescript plasmid. Cells were collected for analysis 48 hours post-transfection.
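The DNA amounts per well follow simple bookkeeping, sketched below. The helper name and dictionary keys are hypothetical; the amounts (50 ng per reporter, 250 ng per dsRNA or tested-factor plasmid, 600 ng total topped up with pBluescript) are those stated above.

```python
def filler_ng(components_ng, total_ng=600):
    """Amount of filler plasmid (pBluescript) needed to reach the fixed total."""
    used = sum(components_ng.values())
    if used > total_ng:
        raise ValueError(f"components ({used} ng) exceed the total ({total_ng} ng)")
    return total_ng - used


# Reporters plus a dsRNA-expressing plasmid, without a tested factor.
basic_mix = {"FL": 50, "RL": 50, "dsRNA plasmid": 250}
# The same mix with a plasmid expressing a tested factor.
factor_mix = {**basic_mix, "tested factor": 250}

print(filler_ng(basic_mix))   # filler needed when no tested factor is included
print(filler_ng(factor_mix))  # no filler needed once the tested factor is added
```

Keeping the total constant means that wells with and without a tested factor receive the same DNA load.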
Luciferase assay
Dual luciferase activity was measured according to Hampf and Gossen (2006) with some modifications. Briefly, cells were washed with PBS and lysed in PPTB lysis buffer (0.2 % v/v Triton X-100 in 100 mM potassium phosphate buffer, pH 7.8). Aliquots of 3-5 µl were used for measurement in 96-well plates using a Modulus Microplate Multimode Reader (Turner Biosystems). First, firefly luciferase activity was measured by adding 50 µl substrate (20 mM Tricine, 1.07 mM (MgCO3)4⋅Mg(OH)2⋅5H2O, 2.67 mM MgSO4, 0.1 mM EDTA, 33.3 mM DTT, 0.27 mM Coenzyme A, 0.53 mM ATP, 0.47 mM D-Luciferin, pH 7.8) and the signal was integrated for 10 sec after a 2 sec delay. The signal was then quenched by adding 50 µl Renilla substrate (25 mM Na4PPi, 10 mM Na-Acetate, 15 mM EDTA, 500 mM Na2SO4, 500 mM NaCl, 1.3 mM NaN3, 4 µM Coelenterazine, pH 5.0) and Renilla luciferase activity was measured for 10 sec after a 2 sec delay.
Western blotting
3T3 cells were grown in 6-well plates. Before collection, cells were washed with PBS and lysed in lysis buffer (20 mM HEPES (pH 7.8), 100 mM NaCl, 1 mM EDTA (pH 8.0), 0.5 % IGEPAL-25 %, 1 mM fresh DTT, 0.5 mM PMSF, 1 mM NaF, 0.2 mM Na3VO4, supplemented with 2x protease inhibitor cocktail set (Millipore), 2x phosphatase inhibitor cocktail set (Millipore), and RiboLock RNase inhibitor (Thermo Scientific)). Proteins were separated on a 10 % polyacrylamide gel and transferred to a PVDF membrane (Millipore). Anti-HA (Roche, 11867431001, clone 3F10, rat, 1:2500), anti-PKR (Abcam #ab32052, 1:5000 dilution), and anti-tubulin (Sigma #T6074, 1:5000) primary antibodies and anti-rat HRP (1:50 000) secondary antibody were used for signal detection with SuperSignal West Femto or Pico Chemiluminescent Substrate (Pierce).
Pkr knock-out in 3T3 cells
For PKR knock-out cells, exons 2-5 (~ 7.5 kb region coding for dsRNA-binding domains) were deleted using a CRISPR approach. sgRNAs targeting intron 1 (5’-CCTTCTTTAACACTTGGCTTC & 5’-CCTGTGGTGGGTTGGAAACAC) and intron 5 (5’-GTGGAGTTGGTGGCCACGGGG & 5’-CCTGTGTACCAACAATGATCC) were co-transfected with Cas9-expressing and puromycin selection plasmids. After 48 h, cells were selected with puromycin (f.c. = 3 µg/mL) for 2 days, and individual clones were isolated and screened for the presence of the deletion using PCR (forward primer: 5’-GCCTTGTTTTGACCATAAATGCCG and reverse primer: 5’-GTGACAACGCTAGAGGATGTTCCG). Expression of PKR lacking dsRNA-binding domains was confirmed by qPCR and homozygous clones were used for further experiments.
RNA sequencing
Cells were plated on 6-well plates and grown to 50 % density. Cells were transfected with 2 μg/well of U6-MosIR, U6-Lin28aIR, CAG-EGFP-MosIR, or CAG-EGFP-Lin28IR plasmids, cultured for 48 hours, and washed with PBS; total RNA was then isolated using RNAzol (MRC) according to the manufacturer’s protocol. RNA quality was verified by Agilent 2100 Bioanalyzer. Libraries were either constructed from the small RNA (< 200 nt) fraction and sequenced on the SOLiD (version 4.0) platform (Seqomics, Szeged, Hungary), or constructed from total RNA using the NEXTflex Small RNA-Seq Kit v3 (Bioo Scientific) according to the manufacturer’s protocol and sequenced on the Illumina HiSeq2000 platform at the Genomics Core Facility at EMBL.
Bioinformatic analyses
Bioinformatic analysis of SOLiD data was performed as described previously (Flemr et al., 2013; Nejepinska et al., 2012b). Briefly, SOLiD raw .csfasta and .qual files were quality filtered and trimmed using cutadapt 1.16 (Martin, 2011):
cutadapt -e 0.1 -m 15 -c -z -a 'CGCCTTGGCCGTACAGCAG' -o ${FILE}.trim.fastq.gz $FILE.csfasta.gz $FILE.qual.gz
Trimmed reads were mapped in colorspace onto the indexed genome using SHRiMP 2.2.3 (David et al., 2011):
gmapper-cs ${FILE}.trim.fastq.gz --threads 10 -L $REF_GENOME_INDEX -o 99999 -E --local --strata > ${FILE}.sam
Illumina raw .fastq files were trimmed in two rounds using bbduk 37.95 (https://jgi.doe.gov/data-and-tools/bbtools/). First, the adapter was trimmed from the 3’ end of reads:
bbduk.sh in=${FILE}.fastq.gz out=${FILE}.atrim.fastq.gz literal=TGGAATTCTCGGGTGCCAAGG ktrim=r k=19 rcomp=t mink=10 hdist=1 minoverlap=8
Next, 4 bases were trimmed from both the 5’ and 3’ ends of adapter-trimmed reads:
bbduk.sh in=${FILE}.atrim.fastq.gz out=${FILE}.trim.fastq.gz forcetrimright2=4 forcetrimleft=4 minlength=15
Trimmed reads were mapped onto the indexed genome using STAR 2.5.3a (Dobin et al., 2013):
STAR --readFilesIn ${FILE}.trim.fastq.gz --genomeDir $REF_GENOME_INDEX --runThreadN 10 --genomeLoad LoadAndRemove --limitBAMsortRAM 20000000000 --readFilesCommand unpigz -c --outFileNamePrefix ${FILE}. --outSAMtype BAM SortedByCoordinate --outReadsUnmapped Fastx --outFilterMismatchNmax 2 --outFilterMismatchNoverLmax 1 --outFilterMismatchNoverReadLmax 1 --outFilterMatchNmin 16 --outFilterMatchNminOverLread 0 --outFilterScoreMinOverLread 0 --outFilterMultimapNmax 99999 --outFilterMultimapScoreRange 0 --alignIntronMax 1 --alignSJDBoverhangMin 999999999999
Both SOLiD and Illumina trimmed read files were mapped onto the mouse genome version mm10/GRCm38, with plasmid sequences (available in the supplemental file in annotated GenBank format) added to the genome prior to indexing.
External genomic annotation sets were used for the analysis. miRNA coordinates were downloaded from miRBase v22 (Kozomara and Griffiths-Jones, 2014). Exon coordinates were downloaded from the Ensembl database, release 91 (Aken et al., 2017). Coordinates of repeats were downloaded as the RepeatMasker (Smit et al., 2013-2015) track from the UCSC genome browser (Kuhn et al., 2013). The downstream analysis was done in the R software environment (https://www.R-project.org). Unless noted otherwise, all downstream analyses were done with 21-23 nt long reads perfectly matching the genome sequence.
Small RNA read clusters (Fig. 2C, 2D) were identified following the algorithm used in a previous study (Flemr et al., 2013) with a few changes. In short:
Reads were weighted to fractional counts of 1/n, where n represents the number of loci to which the read maps
Reads were then collapsed into a unified set of regions and their fractional counts were summed
Clusters with fewer than 3 reads per million (RPM) were discarded
Clusters within 50 bp distance of each other were joined
Only clusters appearing in all replicates of the same genotype (intersect) were considered in the final set. The union of coordinates of overlapping clusters was used to merge the clusters between the samples. Clusters were then annotated, and if a cluster overlapped more than one functional category, the following classification hierarchy was used: miRNA > transposable elements > mRNA (protein coding genes) > misc. RNA (other RNA annotated in Ensembl or RepeatMasker) > other (all remaining annotated or not annotated regions).
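The per-sample clustering steps and the classification hierarchy can be illustrated with a short Python sketch. It is not the original implementation (the downstream analysis was done in R). Function names, the half-open interval convention, the merging of directly adjacent reads, the per-chromosome grouping that ignores strand, and the library_size used for RPM scaling are all assumptions made for the illustration.

```python
from collections import defaultdict

# Classification priority taken from the text, highest first.
HIERARCHY = ["miRNA", "transposable element", "mRNA", "misc. RNA", "other"]


def merge(regions, max_gap=0):
    """Merge half-open (start, end, weight) regions at most max_gap bp apart."""
    merged = []
    for start, end, weight in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap:
            prev_start, prev_end, prev_weight = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_weight + weight)
        else:
            merged.append((start, end, weight))
    return merged


def call_clusters(alignments, library_size, min_rpm=3.0, join_gap=50):
    """alignments: iterable of (read_id, chrom, start, end), one per mapped locus."""
    alignments = list(alignments)
    n_loci = defaultdict(int)
    for read_id, _, _, _ in alignments:
        n_loci[read_id] += 1
    # Step 1: each alignment carries a fractional count of 1/n.
    by_chrom = defaultdict(list)
    for read_id, chrom, start, end in alignments:
        by_chrom[chrom].append((start, end, 1.0 / n_loci[read_id]))
    clusters = {}
    for chrom, regions in by_chrom.items():
        # Step 2: collapse reads into unified regions, summing fractional counts.
        collapsed = merge(regions)
        # Step 3: convert to RPM and discard regions below the threshold.
        scaled = [(s, e, w * 1e6 / library_size) for s, e, w in collapsed]
        kept = [region for region in scaled if region[2] >= min_rpm]
        # Step 4: join clusters lying within join_gap bp of each other.
        clusters[chrom] = merge(kept, max_gap=join_gap)
    return clusters


def classify(overlapping_categories):
    """Return the highest-priority category among those a cluster overlaps."""
    for category in HIERARCHY:
        if category in overlapping_categories:
            return category
    return "other"


# Toy data: six unique reads forming two nearby regions plus one multimapper.
toy = [("r1", "chr1", 100, 121), ("r2", "chr1", 105, 126), ("r3", "chr1", 110, 131),
       ("r4", "chr1", 160, 181), ("r5", "chr1", 160, 181), ("r6", "chr1", 160, 181),
       ("r7", "chr1", 1000, 1021), ("r7", "chr2", 50, 71)]
toy_clusters = call_clusters(toy, library_size=1_000_000)
print(toy_clusters)
```

In the toy data, the two chr1 regions each reach 3 RPM and lie 29 bp apart, so they are joined into one cluster, while the multimapping read contributes only 0.5 to each of its two loci and is discarded. The replicate intersection and the cross-sample union of coordinates are not covered by this sketch.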
To visualize phasing of siRNAs derived from expressed hairpins on radar plots (Fig. 5D), start coordinates of all 21-23 nt reads were first scaled relative to the start of the hairpin on the plasmid (so that the first nucleotide of the hairpin defines register 1). The proportion of reads in each of the 22 registers was calculated by assigning each read to a register as the modulo-22 of its scaled start coordinate and dividing the number of reads in each register by the total number of reads belonging to all hairpin registers. This phasing analysis was adapted from Maillard et al. (2013).
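The register calculation can be sketched in Python as follows. The function name and the 1-based register convention, which maps a remainder of 0 to register 1, are assumptions chosen so that the first hairpin nucleotide falls into register 1; the original analysis was done in R, and reads are assumed to be pre-filtered to 21-23 nt reads mapping to the hairpin.

```python
from collections import Counter


def phasing_proportions(read_starts, hairpin_start, period=22):
    """Return {register: proportion of reads} for registers 1..period.

    read_starts are start coordinates on the plasmid; hairpin_start is the
    coordinate of the first hairpin nucleotide, which defines register 1.
    """
    counts = Counter((start - hairpin_start) % period + 1 for start in read_starts)
    total = sum(counts.values())
    if total == 0:
        return {register: 0.0 for register in range(1, period + 1)}
    return {register: counts.get(register, 0) / total
            for register in range(1, period + 1)}


# Toy example: three reads in phase with the hairpin start, one shifted by 1 nt.
toy_phasing = phasing_proportions([101, 123, 145, 102], hairpin_start=101)
print(toy_phasing[1], toy_phasing[2])
```

A strongly phased population concentrates in a single register, which appears as a single spike on the radar plot, whereas unphased reads spread roughly evenly over all 22 registers.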
Declaration of Interests
The authors declare no competing interests.
Author Contributions
Conceptualization, R.M., P.S.; Methodology, T.D., M.V., R.M.; Investigation, T.D., M.V., R.M., F.H., J.P., E.S., M.F.; Data Curation, F.H., J.P.; Formal Analysis, T.D., M.V., R.M., F.H., J.P., E.S., M.F., P.S.; Supervision, R.M., P.S.; Writing – Original Draft, P.S.; Writing – Review and Editing, T.D., M.V., R.M., F.H., J.P., E.S., P.S.; Funding Acquisition, P.S.
Acknowledgments
We thank Vedran Franke for help with data analysis, Vladimir Benes and the EMBL sequencing facility for help with RNA-seq experiments, and Kristian Vlahovicek for providing hardware support for bioinformatics analysis. This work was funded by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 647403, D-FENS). Additional support was provided by the Czech Science Foundation (CSF) grant P305/12/G034 and by the Ministry of Education, Youth, and Sports (MEYS) project NPU1 LO1419. Additional computational resources for the PS lab were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085 under the programme “Projects of Large Research, Development, and Innovations Infrastructures”.