ABSTRACT
MicroRNAs (miRNAs) are known to modulate gene expression, but their activity at the tissue specific level remains largely uncharacterized. In order to study their contribution we developed novel tools to profile miRNA targets in the C. elegans intestine and body muscle tissues, and studied their composition and function. We validated many previously described interactions and identified hundreds of novel targets. Overall the miRNA targets obtained are known modulators of tissue function. The intestine tissue, being larger and more transcriptionally complex, contains significantly more miRNA targets and includes key metabolic enzymes. Within our datasets we detect an unexpected enrichment of RNA binding proteins targeted by miRNAs in both tissues, with a specific abundance of RNA splicing factors. We tested several miRNA-RNA splicing factor interactions in vivo, and found that miRNA-based regulation of specific RNA splicing factors influences alternative splicing in the intestine tissue. These results highlight an unexpected role for miRNAs in modulating tissue specific gene expression, where post-transcriptional regulation of RNA splicing factors influences tissue specific alternative splicing.
INTRODUCTION
Multicellular organisms have evolved complex forms of gene regulation achieved at different stages throughout development, and equally executed at the pre-, co-, and post-transcriptional levels. Alternative splicing, which leads to the production of different protein isoforms using single mRNA precursors, fine tunes these regulatory networks, and contributes to the acquisition of adult tissue functions and identity. This mechanism ensures that each tissue possesses the correct gene expression needed to thrive (Baralle and Giudice 2017). In humans, more than 95% of genes undergo alternative splicing (Pan et al. 2008; Wang et al. 2008), and many aberrant alternative splicing events are linked to human diseases.
While several tissue specific splicing factors are known to directly promote RNA splicing, most of the alternative splicing events are achieved through differential expression of particular classes of RNA binding proteins (RBPs), which in turn bind specific cis-acting elements located within exon/intron junctions in a combinatorial manner, promoting or inhibiting splicing. Serine Arginine (SR) proteins recognize exon splicing enhancers (ESEs) and are important in promoting constitutive and alternative pre-mRNA splicing, while heterogeneous nuclear ribonucleoproteins (hnRNPs) are a large class of nuclear RBPs that bind exon splicing silencers (ESSs) and mostly promote exon retention (Matlin et al. 2005).
The relative expression levels of members of these two classes of splicing factors vary between tissues, and this imbalance is believed to promote the outcome of tissue specific alternative splicing events (Caceres et al. 1994; Zhu et al. 2001).
Tissue identity is also achieved through post-transcriptional gene regulation events, mostly occurring through 3′ Untranslated Regions (3′UTRs), which are portions of genes located between the STOP codon and the polyA tail of mature eukaryotic mRNAs. 3′UTRs have been recently subjected to intense study, since they were found to be targeted by a variety of factors that recognize small regulatory elements present in these regions, dosing gene output at the post-transcriptional level (Matoulkova et al. 2012; Oikonomou et al. 2014; Mayr 2017). While these regulatory mechanisms are still poorly characterized, and the majority of functional elements are unknown, disorders in the 3′ end processing of mRNAs have been found to play key roles not only in the loss of tissue identity, but also in diverse developmental and metabolic processes and in the establishment of major diseases, including neurodegenerative diseases, diabetes, and cancer (Conne et al. 2000; Mayr and Bartel 2009; Delay et al. 2011; Rehfeld et al. 2013).
3′UTRs are frequently targeted by a class of repressive molecules named microRNAs (miRNAs). miRNAs are short non-coding RNAs, ~22nt in length, that are incorporated into a large protein complex named the microRNA-induced silencing complex (miRISC), where they guide the interaction between the miRISC and the target mRNA by base pairing, primarily within the 3′UTR (Bartel 2009). The final outcome of mRNA targeting by miRNAs can be context-dependent; however, mRNAs targeted by the miRISC are typically held in translational repression prior to degradation of the transcript (Ambros and Ruvkun 2018; Bartel 2018). The pairing does not require a perfect match between the sequences. For example, lin-4, the first miRNA discovered, can pair with seven short distinct degenerate elements located on its target lin-14 (Lee et al. 1993). Initial studies showed that although mismatches are common, the pairing requires a small conserved heptametrical motif located at position 2-7 at the 5′ end of the miRNA (seed region), perfectly complementary to its target mRNA (Ambros and Ruvkun 2018; Bartel 2018).
Later findings showed that while important, the seed region may also contain one or more mismatches while pairing with its target RNA, and that this element alone is not a sufficient predictor of miRNA targeting (Ha et al. 1996; Reinhart et al. 2000; Didiano and Hobert 2006; Grimson et al. 2007). Compensatory base pairing at the 3′ end of the seed region (nucleotides 10-13) can also play a role in target recognition (Shin et al. 2010; Chi et al. 2012), and has been implicated in conferring target specificity to miRNAs that share the same seed regions (Broughton et al. 2016; Wolter et al. 2017).
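The canonical seed rule described above can be illustrated with a short Python sketch. It is not the prediction method used by miRanda, TargetScan or PicTar; it only scans a 3′UTR for perfect complements of the seed, taken here as positions 2-7 as stated in the text. The function names, the example miRNA sequence and the toy 3′UTR are assumptions made for illustration, and mismatches and compensatory 3′ pairing are deliberately ignored.

```python
# Minimal sketch: find perfect seed-complementary sites in a 3'UTR.
# All names and sequences here are illustrative assumptions.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, start: int = 2, end: int = 7) -> list[int]:
    """Return 0-based UTR offsets that perfectly pair with the miRNA seed.

    `start` and `end` are 1-based, inclusive positions counted from the
    5' end of the miRNA (2-7 by default, as in the text).
    """
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)  # sequence expected on the target mRNA
    return [
        i
        for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    ]


# Example 22-nt miRNA-like sequence and a made-up 3'UTR fragment.
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAUACCUCGGCAUACCUCAA"
print(seed_sites(example_mirna, toy_utr))
```

Because seed pairing alone is not a sufficient predictor of targeting, a scan of this kind over-predicts sites and should be read only as a first-pass filter.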
miRNAs and their 3′UTR targets are frequently conserved and play a variety of roles in modulating fundamental biological processes across metazoans. Bioinformatic approaches, such as miRanda (Betel et al. 2008), TargetScan (Lewis et al. 2005) and PicTar (Lall et al. 2006), which use evolutionary conservation and thermodynamic principles to identify miRNA target sites, are the preferred tools for miRNA target identification, and based on these algorithms it was initially proposed that each miRNA controls hundreds of gene products (Chen and Rajewsky 2007). Recent high-throughput and wet bench approaches have validated and expanded these initial results, highlighting that miRNAs indeed target hundreds of genes, and regulate molecular pathways at multiple points in development and in disease (Selbach et al. 2008; Helwak et al. 2013; Wolter et al. 2014; Wolter et al. 2017).
In the past few years, several groups produced tissue-specific miRNA localization data in mouse, rat, and human tissues (Eisenberg et al. 2007; Landgraf et al. 2007) and in cancer (Jima et al. 2010). In C. elegans, a recent low-throughput study has identified hundreds of intestine and muscle specific miRNAs and their targets, which are mostly involved in the immune response to pathogens (Kudlow et al. 2012). These studies used microarray-based approaches, which unfortunately do not provide enough depth to fully understand miRNA function in a tissue specific manner. In addition, these studies identified only a subset of miRNA targets, which rely on the scaffolding proteins AIN-1 and AIN-2, later found to be only present at specific developmental stages (Kudlow et al. 2012; Jannot et al. 2016). Taken together, these studies unequivocally show that there are indeed distinct functional miRNA populations in tissues, which are in turn capable of reshaping transcriptomes and contributing to cell identity acquisition and maintenance. Since most miRNA targets are only predicted, it is still unclear how these events are initiated and maintained.
Our group has pioneered the use of the round nematode C. elegans to systematically study tissue specific gene expression (Blazie et al. 2015; Blazie et al. 2017). In the past years we developed a method to isolate and sequence high quality tissue specific mRNA from worms, and published several integrative analyses of gene expression in most of the C. elegans somatic tissues, including intestine and body muscles (Blazie et al. 2015; Blazie et al. 2017). In these studies, we found an abundance of several tissue specific SR and hnRNP proteins, which could explain the large number of tissue specific isoforms detected in these studies. For example, the RNA splicing factors asd-2 and sup-12 were previously shown to switch alternative pre-mRNA processing patterns of the unc-60 gene in C. elegans body muscle (Ohno et al. 2012), and hrp-2, an hnRNP gene, is known to induce different muscle and intestine alternative splicing isoforms in three widely expressed genes: unc-52, lin-10 and ret-1 (Kabat et al. 2009; Heintz et al. 2017). The human orthologue of hrp-2, HNRNPA1, has been shown to act in a dosage dependent manner to regulate the alternative splicing of the widely expressed gene PKM, demonstrating the importance of regulating the dosage of hnRNPs (Chen et al. 2012). Studies performed using human cell lines have revealed that miRNA-based regulation of splicing factor dosage can drive tissue development (Makeyev et al. 2007).
In order to better understand the tissue specific contribution of miRNA-based regulation to gene dosage, RBP function, and tissue identity, we performed RNA immunoprecipitation of the C. elegans Argonaute ortholog alg-1, isolated and sequenced the tissue specific targets of miRNAs in C. elegans, and used them to identify miRNA targets from two of its largest and most well characterized tissues, the body muscle and intestine.
We found that the number of genes regulated in each tissue correlates with its transcriptome size, as expected. However, a greater proportion of the transcriptome is regulated in the intestine than in the body muscle. In addition, a large number of the targets obtained possess RNA binding domains, suggesting an important role for miRNAs in regulating RNA biogenesis and turnover. We also detected a network of regulation by which miRNAs may contribute to the tissue specific alternative splicing of genes, by regulating the expression of splicing factors selectively in the intestine tissue.
RESULTS
A method for the identification of tissue specific miRNA targets
In order to study the contribution of miRNA activity in producing and maintaining tissue identity, we performed RNA immunoprecipitations of miRNA target genes in two of the largest, morphologically different, and most well characterized tissues in C. elegans: the intestine (McGhee) and body muscle (Gieseler et al.) (Fig. 1A). We took advantage of the ability of the Argonaute protein to bind miRNA target genes, and cloned alg-1, one of the worm orthologs of the human Argonaute 2 protein, downstream of the green fluorescent protein (GFP). The expression of this construct was then driven by the endogenous promoter (alg-1p), or restricted to intestine (ges-1p) or muscle (myo-3p) tissues using tissue specific (TS) promoters (Fig. 1B).
We produced transgenic strains for each construct (Fig. 1C) using single copy integration technology (MosSCI) (Frokjaer-Jensen et al. 2012; Frokjaer-Jensen et al. 2014) to minimize the expression mosaics produced by repetitive extrachromosomal arrays. The strains were validated for integration using genomic PCRs and Western blots (Supplemental Fig. S1).
We then examined the functionality of our cloned alg-1 in rescue experiments using the alg-1 -/- strain RF54(gk214). This strain has a decrease in fertility caused by the loss of functional alg-1 (Bukhari et al. 2012), which was fully rescued by our cloned alg-1 construct in a brood size assay (Supplemental Fig. S2), suggesting that our cloned alg-1 is functional and able to fully mimic endogenous alg-1.
We then used our strains to perform tissue specific RNA immunoprecipitations. Each tissue specific ALG-1 IP was performed in duplicate using biological replicates (total 5 sequencing runs). We obtained ~25M reads on average for each tissue, of which ~80% were successfully mapped to the C. elegans genome (WS250) (Supplemental Fig. S3). The complete list of genes detected in this study is shown in Supplemental Table S1.
Our study identified a total of 3,681 different protein coding genes specifically targeted by the miRISC using the endogenous alg-1 promoter or in body muscle and intestine tissues (Supplemental Table S1).
There are only 27 validated C. elegans miRNA-target interactions with strong evidence reported in the miRNA target repository miRTarBase v7, and our study confirmed 16 of these interactions (59%), which is a threefold enrichment when compared to a random dataset of similar size (p<0.05, chi square test) (Fig. 2A left panel).
When compared to genes present in the C. elegans intestine and body muscle transcriptomes (Blazie et al. 2017), 81% of the intestine and 56% of the body muscle targets identified in this study match to their respective tissues (Fig. 2A right panel).
A comparison between our hits and a previously published ALG-1 IP dataset in all tissues also supports our results (Supplemental Fig. S4) (Zisoulis et al. 2010).
In order to further validate the quality of our hits, we used GFP-based approaches to confirm the tissue localization of selected tissue-specific genes identified in our study, and found that, with the exception of one, all of these genes are expressed in the expected tissue (Supplemental Fig. S5).
Taken together our results suggest that our immunoprecipitation approach was successful and able to isolate bona fide intestine and muscle miRNA targets (Fig. 2).
ALG-1 targets in the intestine regulate key metabolic enzymes
The C. elegans intestine is composed of 20 cells that begin differentiation early in embryogenesis and derive from a single blastomere at the 8-cell stage (McGhee). As the primary role of the intestine is to facilitate the digestion and the absorption of nutrients, many of the most highly expressed genes in this tissue are digestive enzymes, ion transport channels and regulators of vesicle transport (McGhee).
In our intestinal ALG-1 pull-down we identified 3,089 protein coding genes targeted by miRNAs. 2,367 of these genes were uniquely targeted by miRNAs in this tissue (Fig. 2b). As expected, and consistent with the function of the intestine tissue, we find a number of enzymes involved in glucose metabolism, such as enol-1, an enolase, ipgm-1, a phosphoglycerate mutase, and 3 out of 4 glyceraldehyde-3-phosphate dehydrogenases (gpd-1, gpd-2 and gpd-4). The human orthologue of the C. elegans gene enol-1, ENO1, has been previously identified as a target of miR-22 in the context of human gastric cancer (Qian et al. 2017).
In addition, some of our top hits are the fatty acid desaturase enzymes fat-1, fat-2, fat-4 and fat-6, which are all involved with fatty acid metabolism, suggesting that these metabolic pathways are subjected to a high degree of regulation in the intestine tissue. All these genes contain seed elements in their 3′UTRs (Supplemental Table S1).
Moreover, we find 5 out of 6 vitellogenin genes (vit-1, vit-2, vit-3, vit-5 and vit-6) strongly targeted by miRNAs, with vit-2 and vit-6 being the most abundant transcripts in our immunoprecipitation (Supplemental Table S1). vit-2 was shown to be targeted by ALG-1 in a previous study (Kudlow et al. 2012), and both vit-2 and vit-6 possess miRanda (Betel et al. 2008; Betel et al. 2010) and/or PicTar (Lall et al. 2006) predicted binding sites (Supplemental Table S1). These vitellogenin genes produce yolk proteins, which are energy carrier molecules synthesized in the intestine and transported to the gonads and into the oocytes to act as an energy source for the developing embryos (DePina et al. 2011). Accordingly, we also find a number of RAB family proteins that are responsible for intracellular vesicular transport (rab-1, rab-6.1, rab-7, rab-8, rab-21, rab-35, rab-39).
Several transcription factors were also identified as miRNA targets in the intestine tissue. skn-1 is a bZip transcription factor that is initially required for the specification of cell identity in early embryogenesis, and then later plays a role in modulating insulin response in the intestine of adult worms (Blackwell et al. 2015). This gene has already been found to be targeted by miRNAs in past studies (Zisoulis et al. 2010; Kudlow et al. 2012) and contains many predicted miRNA binding sites and seed regions from both the miRanda (Betel et al. 2008; Betel et al. 2010) and PicTar (Lall et al. 2006) prediction software (Supplemental Table S1). A second transcription factor, pha-4, is expressed in the intestine, where it has an effect on dietary restriction mediated longevity (Smith-Vikos et al. 2014). pha-4 is a validated target of let-7 in the intestine tissue (Grosshans et al. 2005), and along with skn-1, is also targeted by miR-228 (Smith-Vikos et al. 2014). Additionally, pha-4 is targeted by miR-71 (Smith-Vikos et al. 2014).
We also find die-1, which has been associated with the attachment of the intestine to the pharynx and the rectum (Heid et al. 2001), and the chromatin remodeling factor lss-4 (let seven suppressor), which is able to prevent the lethal phenotype induced by knocking out the miRNA let-7 (Grosshans et al. 2005). These two genes were also validated by others as miRNA targets (Grosshans et al. 2005).
The intestine plays an important role in producing an innate immune response to pathogens. The genes atf-7, pmk-1 and sek-1 were all identified as targets of miRNAs in this tissue. These three genes act together to produce a transcriptional innate immune response where the transcription factor atf-7 is activated through phosphorylation by the kinases pmk-1 and sek-1. Consistent with our findings, the role of miRNAs in regulating the innate immune response through the intestine and these genes has been reported in multiple studies (Ding et al. 2008; Kudlow et al. 2012; Sun et al. 2016).
Muscle ALG-1 targets modulate locomotion and cellular architecture
C. elegans possesses 95 striated body wall muscle cells, which are essential for locomotion (Gieseler et al.). Its sarcomeres are composed of thick filaments containing myosin associated with an M-line, and thin filaments containing actin associated with the dense body. The pulling of actin filaments by myosin heads generates force that produces locomotion (Moerman and Williams).
Our ALG-1 pull-down identified 1,047 protein coding genes targeted by miRNAs in muscle tissue (Supplemental Table S1). Within this group, 348 genes were not present in our intestine tissue dataset, and are specifically restricted to the body muscle tissue (Fig. 2B). Consistent with muscle functions, we detected mup-2, which encodes the muscle contractile protein troponin T, myo-3, which encodes an isoform of the myosin heavy chain, dlc-1, which encodes dynein light chain 1 and F22B5.10, a poorly characterized gene involved in striated muscle myosin thick filament assembly. mup-2, myo-3 and dlc-1 were all found to be targeted by ALG-1 in previous studies (Zisoulis et al. 2010; Kudlow et al. 2012). Consistent with the function of this tissue, a GO term analysis of this dataset highlights an enrichment in genes involved in locomotion (Fig. 2c), suggesting a potential role for miRNAs in this biological process.
We also identified numerous actin gene isoforms (act-1, act-2, act-3 and act-4), which are required for maintenance of cellular architecture within the body wall muscle, unc-60, an actin-binding protein and a regulator of actin filament dynamics, and the Rho GTPase rho-1, which is required for regulation of actin filament-based processes including embryonic polarity, cell migration, cell shape changes, and muscle contraction (Fig. 2c). Small GTPases are a gene class known to be heavily targeted by miRNAs (Enright et al. 2003; Liu et al. 2012). The human ortholog of rho-1 is a known target of miR-31, miR-133, miR-155 and miR-185 (Liu et al. 2012).
Importantly, we also found several muscle-specific transcription factors including mxl-3, a basic helix-loop-helix transcription factor, and K08D12.3, an ortholog of the human gene ZNF9. These genes are known to regulate proper muscle formation and cell growth (Fig. 2c). mxl-3 is targeted by miR-34 in the context of stress response (Chen et al. 2015). Both genes have been detected in past ALG-1 immunoprecipitation studies (Zisoulis et al. 2010).
Our top hit in this tissue is the zinc finger CCCH-type antiviral gene pos-1, a maternally inherited gene necessary for proper fate specification of germ cells, intestine, pharynx, and hypodermis (Farley et al. 2008). pos-1 contains several predicted miRNA binding sites in its 3′UTR (Table S1), and based on our GFP reporter validation study is strongly expressed in the body muscle (Supplemental Fig. S4). We also find the KH domain containing protein gld-1, the homolog of the human gene QKI, which is targeted by miR-214 (Wu et al. 2017), miR-200c and miR-375 (Pillman et al. 2018).
miRNA targeting is more extensive in the intestine than in the body muscle tissue
By comparing the percentage of tissue specific miRNA targets identified in our study to the previously published intestine and body muscle transcriptomes (Blazie et al. 2015; Blazie et al. 2017), we found that the proportion of hits in the intestine is almost twice that obtained in the body muscle tissue (30.3% vs 18.2%) (Fig. 3a). The 3′UTRs of genes identified as miRNA targets in the intestine and the body muscle tissues are on average longer and have more predicted miRNA binding sites than those of the overall C. elegans transcriptome (Fig. 3b).
Taken together, our results indicate that despite the similarity in average 3′UTR length, the intestine tissue is more interconnected with miRNA regulatory networks, when compared with the body muscle tissue.
miRNA targets in the intestine and body muscle tissues are enriched for miR-355 and miR-85 binding sites
A bioinformatic analysis of the longest 3′UTR isoforms of the targeted genes showed there was no specific requirement for the seed regions in either tissue (Fig. 3c left panel).
However, the use of predictive software showed that, in addition to others, there is an intestine-specific bias for miR-355 targets (Fig. 3c, right panel, green mark). This miRNA is involved in insulin signaling and innate immunity (Zhi et al. 2017), which in C. elegans are both mediated through the intestine tissue.
In contrast, we observed an enrichment of targets for the poorly characterized miR-85 in the body muscle dataset (Fig. 3c, right panel, orange mark). These two miRNAs are uniquely expressed in the corresponding tissues (Martinez et al. 2008).
Intestine and body muscle miRNAs target RNA binding proteins
Surprisingly, we detected an enrichment of miRNA targets among genes containing RNA binding domains. Out of the ~887 defined C. elegans RNA binding proteins (RBPs) (Tamburino et al. 2013), our study identified almost half of them in both tissues (45%). RNA binding proteins are known to play an important role in producing tissue specific gene regulation and controlling gene expression at both the co- and post-transcriptional levels (Tamburino et al. 2013).
We found that out of the 599 known RBPs present in the intestine transcriptome (Blazie et al. 2015; Blazie et al. 2017), 63.6% (380) were also present in our intestine ALG-1 pull-down dataset, and are targeted by miRNAs (Fig. 4a). This is a notable enrichment when compared to the non-RBP genes found by Blazie et al. 2017, of which only 27.7% were identified in our study as miRNA targets.
A similar trend is also present in the body muscle tissue, with 53.5% of RBPs identified as miRNA targets (Fig. 4a left panel). Importantly, the larger pool of targeted RBPs were general factors (GF), such as translation factors, tRNA interacting proteins, ribosomal proteins, and ribonucleases (Fig. 4a right panel), suggesting extensive miRNA regulatory networks in place in this tissue. Zinc finger (ZF) domain containing proteins were the second largest group detected (Fig. 4a right panel). Zinc finger domains are small protein domains composed of an α-helix and β-sheet held together by a zinc ion (Font and Mackay 2010). Zinc fingers are typically DNA-binding proteins that can also bind to RNA (Lu et al. 2003).
In conclusion, we find an enrichment of RBPs as targets of miRNAs in our tissue specific ALG-1 pull-downs. The targeted RBPs span across a variety of subtypes, which may be a consequence of the varied biological functions of these two tissues.
miRNAs target RNA splicing factors
A further analysis revealed that the most abundant class of RBPs detected in our ALG-1 pull-down in intestine and body muscle datasets was composed of RNA splicing factors (Fig. 4b). The C. elegans transcriptome contains at least 78 known RNA splicing factors involved in both constitutive and alternative splicing (Tamburino et al. 2013). 64 RNA splicing factors (82%) have been previously assigned by our group in the intestine tissue (Blazie et al. 2015; Blazie et al. 2017) and presumably are responsible for tissue specific RNA splicing. 31 RNA splicing factors (40%) were also previously assigned by our group in the body muscle tissue (Blazie et al. 2015; Blazie et al. 2017).
Our tissue specific ALG-1 pull-down identified 37 RNA splicing factors as miRNA targets in the intestine tissue (~47%) (Fig. 4b). 34 of these RNA splicing factors were also previously identified by our group as being expressed in this tissue (Blazie et al. 2015; Blazie et al. 2017). This is notable, considering that in the intestine tissue we have now identified almost half of all RNA splicing factors present in the C. elegans intestine transcriptome as being targeted by miRNAs. In contrast, we have detected only 9 RNA splicing factors targeted by miRNAs in our body muscle tissue ALG-1 pull-down, of which 5 were previously assigned by our group in the body muscle transcriptome (Blazie et al. 2015; Blazie et al. 2017) (Fig. 4b).
The difference in RNA splicing factors targeted by miRNAs in these two tissues is substantial, with the intestine tissue containing approximately four times as many miRNA-targeted RNA splicing factors as the muscle tissue (37 vs. 9). This is in line with the multiple functions of the intestine tissue, which in C. elegans is not limited to digestion but is also involved in fertility, innate immune response and aging, while body muscle tissue function is limited to contraction.
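The proportions quoted in this section follow directly from the reported counts, and the arithmetic can be checked with a short Python sketch. The counts are taken from the text (78 known splicing factors; 64 and 31 assigned to the intestine and body muscle transcriptomes; 37 and 9 recovered as miRNA targets), while the variable and function names are illustrative assumptions.

```python
# Arithmetic check of the splicing-factor proportions reported in the text.
# Counts come from the text; names are illustrative assumptions.

TOTAL_SPLICING_FACTORS = 78

expressed = {"intestine": 64, "body_muscle": 31}  # assigned to each transcriptome
targeted = {"intestine": 37, "body_muscle": 9}    # found in each ALG-1 pull-down


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


for tissue in ("intestine", "body_muscle"):
    print(
        tissue,
        "expressed:", percent(expressed[tissue], TOTAL_SPLICING_FACTORS), "%",
        "targeted:", percent(targeted[tissue], TOTAL_SPLICING_FACTORS), "%",
    )

# Ratio of targeted splicing factors, intestine over body muscle.
fold_difference = round(targeted["intestine"] / targeted["body_muscle"], 1)
print("intestine / body muscle:", fold_difference)
```

On these counts the intestine contains roughly four times as many miRNA-targeted splicing factors as the body muscle.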
A comparison with human homologues of the targeted RNA splicing factors in C. elegans identified many different sub-types of RNA splicing factors including snRNPs, hnRNPs and SR proteins (Fig. 4b).
Taken together, our results suggest that miRNAs may play an extensive role in regulating both constitutive and alternative RNA splicing in the intestine.
Expression of the RNA splicing factors asd-2, hrp-2 and smu-2 is modulated through their 3′UTRs
In order to validate that RNA splicing factors found in our ALG-1 pull-down are targeted by miRNAs in the intestine, we used the pAPAreg dual fluorochrome vector we developed in a past study (Blazie et al. 2017) (Fig. 5a). This vector uses a single promoter to drive the transcription of a polycistronic pre-mRNA where the coding sequence of the mCherry fluorochrome is separated from the coding sequence of GFP by an SL2 trans-splicing element (SE) (Blazie et al. 2017). The test 3′UTR is cloned downstream of the GFP gene. Since the mCherry transcript is trans-spliced, it reports transcription activation. The GFP gene instead reports translational activity, since its expression is dictated by the downstream tested 3′UTR. If a given miRNA targets the test 3′UTR, the GFP intensity decreases when compared with an untargeted 3′UTR (ges-1). By comparing the ratio of the mCherry (indicating transcription) to the GFP (indicating translation) fluorochromes, we are able to define the occurrence of post-transcriptional silencing triggered by the tested 3′UTR (Fig. 5b) (Blazie et al. 2017).
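The logic of the dual fluorochrome readout can be sketched in a few lines of Python. This is an illustration of the ratio comparison only, not the image quantification procedure used with pAPAreg, which is not described here. The function names and the intensity values are hypothetical; each measurement is assumed to be a background-corrected (mCherry, GFP) intensity pair.

```python
# Minimal sketch of the mCherry:GFP comparison for a test 3'UTR versus
# the ges-1 control. Names and intensity values are hypothetical.


def mcherry_gfp_ratio(mcherry: float, gfp: float) -> float:
    """Return the mCherry:GFP ratio (transcription over translation readout)."""
    if gfp <= 0:
        raise ValueError("GFP intensity must be positive")
    return mcherry / gfp


def relative_silencing(test: tuple[float, float], control: tuple[float, float]) -> float:
    """Return the test ratio divided by the control ratio.

    Values above 1 mean GFP is reduced relative to mCherry when compared
    with the control 3'UTR, which is the pattern expected when the test
    3'UTR is post-transcriptionally repressed.
    """
    return mcherry_gfp_ratio(*test) / mcherry_gfp_ratio(*control)


# Hypothetical intensities: (mCherry, GFP)
control_ges1 = (100.0, 100.0)
test_utr = (100.0, 50.0)
print(relative_silencing(test_utr, control_ges1))
```

Normalizing to mCherry within each animal separates changes in promoter activity from changes imposed by the 3′UTR, which is the reason a single polycistronic transcript carries both fluorochromes.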
We selected three RNA splicing factors identified in our study in the intestine tissue (asd-2, hrp-2 and smu-2) (Table 1) and prepared transgenic strains to validate their expression and regulation (Fig. 5b).
We used the ges-1 3′UTR as a negative control for miRNA targeting, as it is known to be strongly transcribed and translated in the intestine with minimal regulation (Egan et al. 1995; Marshall and McGhee 2001), and it was not significantly abundant in our intestine ALG-1 pull-down (Table S1). The presence of the ges-1 3′UTR in the pAPAreg vector led to the expression of both mCherry and GFP fluorochromes, indicating robust transcription and translation of the construct as expected (Fig. 5b i., ii., and iii.).
We then cloned the asd-2, hrp-2 and smu-2 3′UTRs downstream of the GFP fluorochrome in our pAPAreg vector, prepared transgenic worms expressing these constructs, and studied the fluctuation of the expression level of the GFP fluorochrome in these transgenic strains.
The asd-2 and hrp-2 3′UTRs were both able to significantly lower GFP expression when compared to the control strain with the ges-1 3′UTR, while the mCherry signal was strong in all strains (Fig. 5b, compare iii. vs vi, ix). These results suggest that these two RNA binding proteins contain regulatory binding sites within their 3′UTRs able to repress their expression.
Interestingly, the smu-2 3′UTR led to an increase in the ratio of mCherry:GFP when compared to ges-1, implying that this gene may not be regulated in the same way as asd-2 and hrp-2 (Fig. 5b, compare iii. vs xii). It is notable, however, that smu-2 was far less enriched in our intestine ALG-1 pull-down than asd-2 or hrp-2.
Taken together, our data suggest that the 3′UTRs of two of the three tested splicing factors were significantly repressed and may harbor sites for miRNA targeting (Fig. 5b, Table 1).
miRNAs target intestine RNA splicing factors promoting tissue specific alternative splicing
We then tested changes to tissue specific alternative splicing in the intestine tissue caused by the hnRNP hrp-2, which was significantly abundant in our intestine tissue ALG-1 pull-down. Previous studies have shown that hrp-2 directs alternative splicing of the genes ret-1, lin-10 and unc-52 (Kabat et al. 2009; Heintz et al. 2017). ret-1 codes for the Reticulon protein, which regulates the structure of cellular endomembranes and is widely expressed across multiple tissues including neurons and intestine (Heintz et al. 2017; Torpe et al. 2017). ret-1 is alternatively spliced in a tissue and age dependent manner. In adult worms, exon-5 of ret-1 is skipped in neurons, hypodermis, and body muscle tissues, while the longer isoform containing exon-5 is expressed in the intestine (Heintz et al. 2017).
We used a biochemical approach to test the alternative splicing of this gene in the context of miRNA regulation. We reasoned that if ALG-1 targets the hrp-2 3′UTR in the intestine, lowering its expression, which in turn causes ret-1 exon-5 exclusion, we should be able to interfere with this process by overexpressing the hrp-2 3′UTR in this tissue and in turn test the role of the miRNA pathway in this process.
We first tested the ret-1 RNA isoform ratio in wt N2 worms. We extracted total RNA from N2 worms in triplicate and performed RT-PCR experiments using a primer pair flanking the ret-1 exon-5 (Fig. 6a). As expected, the ret-1 longer isoform was predominantly expressed in wt worms (Fig. 6a). We then investigated if the miRNA pathway has a role in regulating these splicing events, by testing changes in exon-5 isoform abundance in the alg-1 and alg-2 knockout strains (RF54 (alg-1(gk214) X) and WM54 (alg-2(ok304) II)). These strains lack miRNA-based gene regulation. Interestingly, we found that in both strains there was an increase in ret-1 exon-5 skipping (Fig. 6a), suggesting that the miRNA pathway plays a role in enabling ret-1 exon-5 exclusion.
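An RT-PCR with primers flanking an alternative exon yields two products, and their relative abundance is often summarized as the fraction of transcripts that include the exon. The Python sketch below shows that calculation under stated assumptions: the quantification method used for Fig. 6a is not described here, the function name is hypothetical, the band intensities are invented for illustration, and no correction for amplicon length is applied.

```python
# Minimal sketch: percent exon inclusion from two RT-PCR band intensities.
# The function name and all intensity values are hypothetical.


def percent_exon_inclusion(long_band: float, short_band: float) -> float:
    """Return the exon-included (long) isoform as a percentage of both bands."""
    total = long_band + short_band
    if total <= 0:
        raise ValueError("at least one band must have positive intensity")
    return 100 * long_band / total


# Hypothetical intensities: (exon-5 included band, exon-5 skipped band)
samples = {
    "strain_A": (80.0, 20.0),
    "strain_B": (50.0, 50.0),
}

for name, (long_band, short_band) in samples.items():
    print(name, percent_exon_inclusion(long_band, short_band))
```

A fall in this percentage between two strains corresponds to increased exon skipping.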
Previous reports suggested that the loss of function of hrp-2 induces ret-1 exon-5 skipping (Heintz et al. 2017). When we lowered hrp-2 expression using RNAi in N2 worms, we also detected an increase in exon-5 skipping (Fig. 6a), further highlighting the importance of hrp-2 in RNA alternative splicing.
We then extracted total RNA from transgenic worms expressing the hrp-2 3′UTR in the intestine. We reasoned that if miRNAs indeed regulate this process, the presence of exogenous hrp-2 3′UTR would titrate miRNAs away from the endogenous hrp-2 3′UTR, causing an increased expression of hrp-2 and leading to ret-1 exon-5 skipping. In agreement with this model, the overexpression of the hrp-2 3′UTR in the intestine tissue increased ret-1 exon-5 skipping (Fig. 6a).
We then performed similar studies using lin-10 and unc-52, other genes that are alternatively spliced by hrp-2 (Kabat et al. 2009) (Fig. 6b and Supplemental Fig. S6), and found that the alternative splicing of these genes is also modulated by miRNAs and follows the same trend observed for the ret-1 gene, albeit with less pronounced effects.
We then tested another alternative splicing event in the gene unc-60, which is controlled by a second RNA splicing factor, asd-2, that was also abundant in our intestine ALG-1 pull-down. unc-60 is expressed as two alternatively spliced isoforms in a tissue specific manner (Ohno et al. 2012) (Supplemental Fig. S7); unc-60a is expressed exclusively in the body muscle, while unc-60b is expressed in many other tissues including the intestine (Ohno et al. 2012). alg-2 knockout worms, deficient in the miRNA pathway, showed a significant shift in the expression of the two unc-60 isoforms (Fig. 6c), demonstrating the importance of the miRNA pathway in regulating alternative splicing of this gene. Importantly, the overexpression of the asd-2 3′UTR in the intestine tissue also led to changes in the unc-60 alternative splicing pattern, indicating that post-transcriptional regulation of asd-2 through its 3′UTR is important for alternative splicing of unc-60 in the intestine.
Taken together, these results show that the dosage of hrp-2 and asd-2 in the intestine is modulated by miRNAs, and in turn dictates ret-1, lin-10, unc-52, and unc-60 tissue specific alternative splicing events.
DISCUSSION
In this manuscript we have developed tools and techniques to identify tissue specific miRNA targets and applied them to uniquely define the genes targeted by miRNAs in the C. elegans intestine and body muscle tissues. We validated previous findings and mapped hundreds of novel tissue specific interactions (Fig. 2 and Table S1).
In order to perform these experiments, we have prepared worm strains expressing ALG-1 fused to GFP and expressed this cassette in the intestine and body muscle tissues using tissue specific promoters. We validated the ALG-1 expression (Supplemental Fig. S1), and the viability of our ALG-1 construct in in vivo studies (Supplemental Fig. S2). We have then performed the ALG-1 immunoprecipitations in duplicate, separated the miRNA complex from their targets, and sequenced the resultant RNA using Illumina sequencing (Fig. 1). The hits we obtained were of high quality, as we were able to map >80% of the sequencing reads to the C. elegans genome (Fig. 2 and Supplemental Fig. S3). We validated selected hits with expression localization studies in both tissues (Supplemental Fig. S5). Importantly, our ALG-1 pull-down results are in line with previous findings (Fig. 2A and Supplemental Fig. S4), and are significantly enriched with predicted miRNA targets (Fig. 3b-c).
We were surprised to observe that while 81% of the hits identified in the intestine match the intestine transcriptome (Blazie et al. 2015; Blazie et al. 2017), only 56% of the hits in the body muscle tissue dataset matched the body muscle transcriptome (Fig. 2A). This means that most of the hits in the intestine were previously mapped in this tissue, in contrast with our body muscle results, where 44% of the hits were novel. Perhaps these novel targets are genes strongly down-regulated in this tissue by miRNA targeting, which leads to deadenylation and mRNA degradation and makes them undetectable using PAB-1-based pull-down approaches (Blazie et al. 2015; Blazie et al. 2017).
Given that the body muscle transcriptome is significantly smaller than the intestine transcriptome, it may also be subjected to less regulation through miRNAs. However, if we normalize by the number of genes expressed in each tissue and study the proportion of the transcriptome targeted by miRNAs, we still find significantly more regulation in the intestine (Fig. 3A), suggesting that this tissue may indeed employ miRNA-based gene regulation to a greater extent.
We also found an unexpected disparity in complexity between the two datasets. The majority of the targeted genes are unique to the intestine tissue, which shares only a small fraction of its genes with the body muscle tissue (23% of the total intestine dataset) (Fig. 2B). Most of the overlap consists of housekeeping genes involved in transcription elongation, chromatin assembly, protein folding, etc., which are most likely regulated similarly in both tissues. This is in contrast with the genes uniquely targeted in each tissue, which instead define specific tissue functions and are related to cell identity (Fig. 2B-C).
Of note, this disparity demonstrates that there is minimal overlap between our two datasets, and that our ALG-1 pull-down is indeed tissue specific with marginal cross-contamination. Intriguingly, when we looked at the miRNA population predicted by miRanda to target the genes in our datasets (Fig. 3C Right Panel), we found a significant enrichment of targets of known tissue specific miRNAs in the correct tissue (Betel et al. 2010), which correlates with our tissue specific datasets (Fig. 3C). miR-85 was previously found in body muscle tissue (Martinez et al. 2008), while miR-355 was found expressed only in the intestine (Martinez et al. 2008) (Fig. 3C). This in turn suggests that there may be a tissue specific miRNA targeting bias in C. elegans, with uniquely expressed tissue specific miRNAs targeting unique populations of genes.
One of the most surprising findings of this study is that a large number of targets obtained with our tissue specific ALG-1 pull-down are RBPs: 64% of intestinal RBPs were found in our intestine ALG-1 pull-down, and 54% in our muscle ALG-1 pull-down. This result was unexpected given the small number of RNA binding proteins previously identified in the C. elegans genome (n = 887) (Tamburino et al. 2013), which amount to only 4% of the total C. elegans protein coding genes. However, previous studies have hinted at a strong regulatory network between miRNAs and RBPs, as the 3′UTRs of RBPs were found to contain on average more predicted miRNA binding sites than other gene classes (Tamburino et al. 2013). RNA binding domain containing proteins are involved in many biological processes, and their role is not limited to RNA biogenesis (Tamburino et al. 2013).
RBPs can bind single or double stranded RNAs, and associate with proteins to form ribonucleoprotein complexes (RNPs). RNPs are fundamental building blocks of the ribosome (Ban et al. 2014), and they form the telomerase enzyme (Podlevsky and Chen 2016), the RNase P enzyme (Guerrier-Takada et al. 1983; Evans et al. 2006), the RISC complex (Schirle and MacRae 2012), the hnRNPs (Geuens et al. 2016), and the small nuclear RNPs (snRNPs) (Will and Luhrmann 2011). Many of the protein components of these complexes were recovered in our intestine and body muscle ALG-1 pull-downs (Fig. 4A and Supplemental Table S1).
Importantly, longevity, fat metabolism, and development are all processes controlled by RBP-containing complexes (Lee and Schedl ; Masuda et al. 2009; Aryal et al. 2017), and in the context of miRNA regulation, the ability of miRNAs to control RBP abundance and function allows for increased control of fundamental core cellular processes. 147 RBPs are shared between our two datasets, with 234 RBPs uniquely detected in the intestine. Within this intestinal dataset we mapped a surprising number of RBPs involved in RNA splicing (Fig. 4B).
We performed a literature search for known RNA splicing factors in C. elegans; out of the 72 total proteins identified, 37 were detected at different levels of strength in our intestine ALG-1 pull-down. In contrast, we do not observe this level of complexity in the muscle tissue, with only 9 RNA splicing factors identified in this dataset (Fig. 4B), suggesting a lower isoform density in this tissue. Unfortunately, there are no available datasets to further study tissue-specific isoform abundances in the context of miRNA targeting and to corroborate our results.
asd-2 and smu-2 are well-known RNA splicing factors that induce exon retention in a dosage dependent manner (Spartz et al. 2004; Ohno et al. 2012), while hrp-2 abundance controls exon skipping (Kabat et al. 2009). Here we show that all three RNA splicing factors possess regulatory targets within their 3′UTRs (Fig. 5) that dominate over their respective promoters in controlling their dosage in the intestine tissue (Fig. 5).
In addition, the miRNA pathway is needed for asd-2 and hrp-2 to properly process selected exons (Fig. 6). Furthermore, the depletion of miRNA targets in their 3′UTRs using sponge approaches leads to defects in their alternative splicing pattern.
Studying alternative splicing in vivo using multicellular animals is challenging, but taken together our results propose a role for miRNAs in regulating alternative splicing in the intestine, where their presence in a tissue specific manner may alter the dosage balance of RNA splicing factors, leading to tissue specific alternative splicing (Fig. 7). miRNAs are known to alter gene expression dosage rather than induce complete loss of protein function (Wolter et al. 2017; Bartel 2018). On the other hand, many RNA splicing factors involved in constitutive and alternative splicing are ubiquitously expressed (Shin and Manley 2004), but are somehow able to induce tissue specific alternative splicing in a dosage dependent manner. In this context, it is feasible that miRNAs alter the dosage of RNA splicing factors, leading to tissue specific alternative splicing (Fig. 7).
MATERIALS AND METHODS
Preparing MosSCI vectors for generating GFP::ALG-1 strains
The strains used for the ALG-1 pull-down were prepared using a modified version of the previously published polyA-pull construct (Blazie et al. 2015; Blazie et al. 2017). We produced a second-position Entry Gateway vector containing the genomic sequence of alg-1 tagged at its N-terminus with the GFP fluorochrome. Briefly, we designed primers flanking the coding sequence of alg-1 and performed a Polymerase Chain Reaction (PCR) amplification to clone the alg-1 locus from genomic DNA extracted from N2 wt worms (primers 1 and 2 in Supplemental Table S2). The resulting PCR product was analysed on a 1% agarose gel, which displayed a single band of the expected size (~3,500 nucleotides). This band was then isolated using the QIAquick Gel Extraction Kit (QIAGEN, cat. 28704) according to the manufacturer's protocol. Upon recovery, we digested the purified PCR product with the restriction enzymes SacI and BamHI and cloned it into the modified polyA-pull construct (Blazie et al. 2015; Blazie et al. 2017), replacing the gene pab-1. The ligation reaction was performed using the NEB Quick Ligation Kit (cat. MS2200S) according to the manufacturer's protocol. We used the QuikChange II Site-Directed Mutagenesis Kit (Agilent, cat. 200523) to remove the unnecessary C-terminal 3xFLAG tag from the polyA-pull vector (primers 3 and 4 in Supplemental Table S2). We then cloned the previously described endogenous alg-1 promoter (Vasquez-Rifo et al. 2012) by designing primers to add Gateway BP cloning elements, and then performed PCR using N2 wt genomic DNA as a template (primers 5 and 6 in Supplemental Table S2). Using the resulting PCR product, we performed a Gateway BP cloning reaction into the pDONR P4P1R vector (Invitrogen) according to the manufacturer's protocol. To assemble the final injection clones, we performed several Gateway LR Clonase II plus reactions (Invitrogen, cat. 12538-013) using the Destination vector CFJ150 (Frokjaer-Jensen et al. 2012), the tissue specific or endogenous promoters (alg-1 for endogenous, ges-1 for intestine tissue and myo-3 for body muscle tissue), the gfp tagged alg-1 coding sequence, and the unc-54 3′UTR as previously published (Blazie et al. 2017).
Microinjections and screening of transgenics
To prepare single copy integrated transgenic strains we used the C. elegans strain EG6699 [ttTi5605 II; unc-119(ed3) III; oxEx1578] (Frokjaer-Jensen et al. 2012), which is designed for Mos1 mediated single copy integration (MosSCI) insertion, using standard injection techniques. These strains were synchronized by bleaching (Porta-de-la-Riva et al. 2012), then grown at 20°C for 3 days to produce young adult (YA) worms. YA worms were then picked and subjected to microinjection using a plasmid mix containing pCFJ601 (50ng/μl), pMA122 (10ng/μl), pGH8 (10ng/μl), pCFJ90 (2.5ng/μl), pCFJ104 (5ng/μl) and the transgene (22.5ng/μl) (Frokjaer-Jensen et al. 2008). Three injected worms were isolated and individually placed onto single small nematode growth media (NGM) plates (USA Scientific, cat 8609-0160) seeded with OP50-1, and were allowed to grow and produce progeny until the worms had exhausted their food supply. The plates were then screened for progeny that exhibited wild type movement and proper GFP expression, and single worms exhibiting both markers were picked and placed onto separate plates to lay eggs overnight. In order to select for single copy integrated worms, an additional screen was performed to select for worms that had lost expression of the mCherry fluorochrome (extrachromosomal injection markers).
Genotyping of transgenic strains
Single adult worms were isolated and allowed to lay eggs overnight, and then genotyped for single copy integration of the transgene using single worm PCR as previously described (Broughton et al. 2016) (primers 7-9 in Table S2). Progeny from worms that contained the single copy integrations were propagated and used for this study. A complete list of worm strains produced in this study is shown in Table S3.
Validating expression of the transgenic construct
To validate the expression of our transgenic construct, and to evaluate our ability to immunoprecipitate GFP tagged ALG-1, we performed an immunoprecipitation (as described below) followed by a western blot. For the western blot we used a primary anti-GFP antibody (Novus, NB600-303) (1:2000) and a fluorescent secondary antibody (LI-COR, 925-32211)(1:5000), followed by imaging using the ODYSSEY CLX system (LI-COR Biosciences, NE).
In vivo validation of GFP::ALG-1 functionality by brood size assay
In order to validate the in vivo functionality of our transgenic GFP tagged ALG-1, we used a genetic approach. It was previously shown that the alg-1 knockout strain RF54 [alg-1(gk214) X] has decreased fertility (Bukhari et al. 2012). We rescued this decrease in fertility by crossing RF54 into our strain MMA17 (Table S3), which expresses our GFP tagged transgenic ALG-1 driven by the endogenous alg-1 promoter. The resulting strain MMA20 [alg-1(gk214)X; alg-1p::gfp::alg-1::unc-54 II] only expresses our cloned alg-1 gene tagged with the GFP fluorochrome. We validated the genotype of MMA20 using single worm PCRs as previously described (Broughton et al. 2016) (primers 10 and 11 in Table S2 and Supplemental Fig. S2). The brood size assay was used to evaluate the ability of our transgenic GFP tagged ALG-1 to rescue the loss in fertility seen in the alg-1 knockout strain (RF54). The brood size assay was performed by first synchronizing the N2, RF54 and MMA20 strains to arrested L1 larvae, through bleaching followed by starvation in M9 solution. We then plated the L1 arrested worms on NGM plates seeded with OP50-1 and allowed the worms to develop to the adult stage for 48 hours, after which single worms were isolated onto small OP50-1 seeded plates. The adult worms were left to lay eggs overnight (16 hours), after which they were removed. The eggs were allowed to hatch and develop for 24 hrs, and the number of larvae on each plate was counted.
Worm Preparation and Crosslinking
For each C. elegans strain, 0.5ml of mixed stage worms were grown on five large 20 cm plates (USA Scientific, cat 8609-0215) and harvested by centrifugation at 400rcf for 3 minutes. The pellets were initially washed in 15ml of dH2O and spun down at 400rcf for 3 minutes, and then resuspended in 10ml of M9+0.1%Tween20 and cross-linked 3 times on ice, with energy setting: 3000 x 100 μJ/cm2 (3kJ/m2) (Stratalinker 2400, Stratagene) (Moore et al. 2014). After the crosslinking, the worms were recovered by centrifugation at 400 rcf for 3 minutes, and resuspended in two volumes (1ml) of lysis buffer (150mM NaCl, 25mM HEPES(NaOH) pH 7.4, 0.2mM DTT, 10% Glycerol, 25 units/ml of RNasin® Ribonuclease Inhibitor (Promega, cat N2611), 1% Triton X-100 and 1 tablet of protease inhibitor for every 10ml (Roche cOmplete ULTRA Tablets, Sigma, cat 5892791001)). The lysed samples were subjected to sonication using the following settings: Amplitude 40%; 5x with 10sec pulses; 50sec rest between pulses (Q55 Sonicator, Qsonica). After the sonication, the cell lysate was cleared of debris by centrifugation at 21,000rcf at 4°C for 15 min and the supernatants were then transferred to new tubes.
GFP-TRAP bead preparation and Immunoprecipitation
25μl of GFP-TRAP beads (Chromotek, gtma-10) (total binding capacity 7.5μg) per immunoprecipitation were resuspended by gently vortexing for 30 seconds, and washed three times with 500μl of cold Dilution/Wash buffer (10 mM Tris/Cl pH 7.5; 150 mM NaCl; 0.5 mM EDTA). The beads were then resuspended in 100μl of Dilution/Wash buffer per IP. 100μl of resuspended beads were then incubated with 0.5ml of lysate for 1hr on the rotisserie at 4°C. At the completion of the incubation step, the beads were collected using magnets. The unbound lysate was saved for PAGE analysis. The beads containing the immunoprecipitated ALG-1 associated with the target mRNAs were then washed three times in 200μl of Dilution/Wash buffer, and the RNA/protein complex was eluted using 200μl of Trizol (Invitrogen, cat 15596026) and incubated for 10min.
Trizol/Direct-zol RNA Purification
The RNA purification was performed using the RNA MiniPrep kit (Zymo Research, cat ZR2070) as per the manufacturer's protocol. All centrifugation steps were performed at 21,000g for 30 seconds. We added an equal volume of ethanol (95-100%) to each sample in Trizol and mixed thoroughly by vortexing (5 seconds, level 10). The samples were then centrifuged, the beads were collected using a magnet, and the supernatant was transferred into a Zymo-Spin IIC Column in a Collection Tube and centrifuged. The columns were then transferred into a new collection tube and the flow-through was discarded. 400 µl of RNA wash buffer was added to each column and centrifuged. In a separate RNase-free tube, we added 5 µl DNase I (6 U/µl) and 75 µl DNA Digestion Buffer, mixed and incubated at room temperature (20-30°C) for 15 minutes. 400 µl of Direct-zol RNA PreWash (Zymo Research, cat ZR2070) was added to each sample and centrifuged twice. The flow-through was discarded at each step. 700 µl of RNA wash buffer was then added to each column and centrifuged for 2 minutes to ensure complete removal of the wash buffer. The columns were then transferred into RNase-free tubes, and the RNAs were eluted with 30 µl of DNase/RNase-Free Water added directly to the column matrix and centrifuged.
cDNA library preparation and sequencing
Each cDNA library was prepared using a minimum of 500pg of immunoprecipitated RNA for each tissue. The cDNA synthesis was performed using the SPIA (Single Primer Isothermal Amplification) kit (IntegenX and NuGEN, San Carlos, CA) (Kurn et al. 2005) according to the manufacturer's protocol (Mardis and McCombie 2017). Briefly, the total RNA was reverse transcribed using the IntegenX (Pleasanton, CA) automated Apollo 324 robotic preparation system under previously optimized conditions (Blazie et al. 2015). The cDNA was then sheared to approximately 300 base pair fragments using the Covaris S220 system (Covaris, Woburn, MA). We used the Agilent 4200 TapeStation instrument (Agilent, Santa Clara, CA) to quantify the abundance of cDNAs and calculate the appropriate amount of cDNA necessary for library construction. Tissue-specific barcodes were then added to each cDNA library, and the finalized samples were pooled and sequenced using the HiSeq platform (Illumina, San Diego, CA) with a 1×75bp HiSeq run.
Data analysis
We obtained ~15M unique reads per sample (~130M reads total). The reads obtained from the sequencer were mapped to the C. elegans genome (WS250) using common alignment software, such as Bowtie 2 (Langmead et al. 2009) and BWA (Li and Durbin 2009) with default parameters, together with custom Perl scripts. Differential gene expression between our samples was studied using the Cufflinks suite (Trapnell et al. 2010; Trapnell et al. 2012). A summary of the results is shown in Supplemental Fig. S3. Mapped reads were further converted into BAM format and sorted using SAMtools run with generic parameters (Li et al. 2009), and used to calculate Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values as an estimate of the abundance of each gene in the immunoprecipitations. We used a median FPKM ≥ 1 across replicates as the threshold for identifying targeted genes.
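The thresholding step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on Cufflinks and custom Perl scripts); the function names, the example gene labels, and the reading of the threshold as "median across replicates" are assumptions made for illustration only.

```python
from statistics import median


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def targeted_genes(fpkm_by_gene, threshold=1.0):
    """Return genes whose median FPKM across replicates meets the threshold.

    fpkm_by_gene maps a gene label to a list of per-replicate FPKM values.
    """
    return sorted(
        gene for gene, values in fpkm_by_gene.items()
        if median(values) >= threshold
    )


# Hypothetical values for two replicates of one tissue (not real data).
example = {
    "geneA": [2.4, 1.8],  # median 2.1, kept
    "geneB": [0.2, 1.1],  # median 0.65, dropped
    "geneC": [1.0, 1.0],  # median 1.0, kept (threshold is inclusive)
}
hits = targeted_genes(example)
```

With two replicates the median equals the mean, so a gene detected above the threshold in only one replicate is kept only if the second replicate is close enough to compensate.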
Molecular cloning and assembly of expression constructs
The promoters of candidate genes were extracted from genomic DNA using genomic PCR and cloned into Gateway-compatible entry vectors. We designed Gateway-compatible primers (primers 12-19 in Supplemental Table S2) targeting 2,000bp upstream of a given transcription start site, or up to the closest gene. Using these DNA primers, we performed PCRs on C. elegans genomic DNA, amplified these regions, and analysed the PCR products by gel electrophoresis. Successful DNA amplicons were then recombined into the Gateway entry vector pDONR P4P1R using Gateway BP Clonase reactions (Invitrogen).
The reporter construct pAPAreg has been previously described in Blazie et al., 2017. The coding region of this construct was prepared by joining the coding sequence of the mCherry fluorochrome to the SL2 splicing element found between the gpd-2 and gpd-3 genes, and to the coding sequence of the GFP gene. The entire cassette was then PCR amplified with Gateway-compatible primers and cloned into pDONR P221 by performing Gateway BP Clonase reactions (Invitrogen).
The 3′UTRs of the genes in this study were cloned by anchoring the Gateway-compatible primers at the translation STOP codon of each tested gene and extending to the end of the longest annotated 3′UTR. We included 50 base pairs downstream of the annotated PAS site to include 3′end processing elements (primers 20-27 in Table S2). The PCR products were analysed using gel electrophoresis and used to perform Gateway BP Clonase reactions (Invitrogen, cat. 11789020) into pDONR P2RP3 as per the manufacturer's protocol. The unc-54 3′UTR used in this study was previously described in Blazie et al., 2017.
The injected constructs were assembled by performing Gateway LR reactions (Invitrogen) with each promoter, reporter, and 3′UTR construct, as per the manufacturer's protocol, into the MosSCI compatible destination vector CFJ150. We then microinjected each reporter construct (100ng/ul) with CFJ601 (100ng/ul) into MosSCI compatible C. elegans strains using standard microinjection techniques (Evans (ed.)).
Fluorescent imaging and analysis of nematodes
Transgenic strains were synchronized by bleaching and transferred to NGM plates seeded with OP50-1. Worms were grown at room temperature for 24 hours, and L1/L2 larvae were extracted and washed before imaging using a Leica DMI3000B microscope with a 40x magnification objective lens. The transgenic strains were then imaged (Leica DFC345FX mounted camera) with Gain = 1x, Gamma = 0.5, and 1 sec exposure. The GFP and mCherry fluorescence were quantified individually using the integrated density (ID) function of the ImageJ software (Schneider et al. 2012). Fluorescence ratios were calculated for each worm (n=10) by dividing the ID for mCherry by the ID for GFP, and the reported result for each strain is the average fluorescence ratio.
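The ratio calculation described above can be sketched in a few lines of Python. The function names and the input layout (one mCherry/GFP pair of integrated densities per worm, as exported from ImageJ) are assumptions for illustration and do not reproduce the authors' own analysis files.

```python
from statistics import mean


def fluorescence_ratio(mcherry_id, gfp_id):
    """Ratio of mCherry to GFP integrated density for a single worm."""
    if gfp_id <= 0:
        raise ValueError("GFP integrated density must be positive")
    return mcherry_id / gfp_id


def strain_average_ratio(measurements):
    """Average the per-worm mCherry/GFP ratios for one strain.

    measurements is a sequence of (mcherry_id, gfp_id) tuples, one per worm.
    """
    return mean(fluorescence_ratio(m, g) for m, g in measurements)


# Hypothetical integrated densities for two worms (not real data).
average = strain_average_ratio([(200.0, 100.0), (300.0, 100.0)])
```

The ratio is taken per worm before averaging, as described in the text, so worms with different overall brightness contribute equally to the strain value.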
Bioinformatic analysis of tissue specific miRNA targeting biases
The tissue specific miRNA studies were performed in two steps. First, we utilized custom-made Perl scripts to scan across the longest 3′UTR of each C. elegans protein coding gene (WS250) in our dataset, searching for perfect sequence complementarity to the seed regions of all C. elegans miRNAs present in the miRBase database (release 21) (Griffiths-Jones 2004; Griffiths-Jones et al. 2006; Griffiths-Jones et al. 2008; Kozomara and Griffiths-Jones 2011; Kozomara and Griffiths-Jones 2014). This result was then used to calculate the percentage of seed presence in the muscle and intestine datasets. To calculate the percentage of predicted targets, we extracted both the predicted target genes and the names of their targeting miRNAs from the miRanda database (Betel et al. 2008) and compared the results with our study. A complete list of miRNA predictions for each tissue profiled is shown in Table S1.
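As an illustration of the first step, the following Python sketch searches a 3′UTR for perfect complementarity to a miRNA seed. The original analysis used custom Perl scripts that are not reproduced here; the seed definition (nucleotides 2-8 of the mature miRNA), the function names, and the toy UTR sequence are assumptions.

```python
# Translation table for RNA complementarity (A-U, C-G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the 3'UTR sequence (5'->3') that pairs perfectly with the seed.

    The seed is taken as nucleotides start..end (1-based, inclusive) of the
    mature miRNA; this window is an assumption, not the authors' stated choice.
    """
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(utr, mirna):
    """Return 0-based start positions of every perfect seed match in the UTR."""
    utr = utr.upper().replace("T", "U")
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)  # allow overlapping matches
    return positions


let7 = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7 sequence
toy_utr = "aaactacctcaaa"        # hypothetical 3'UTR fragment (DNA alphabet)
matches = find_seed_matches(toy_utr, let7)
```

The percentage of seed presence in a dataset would then be the fraction of 3′UTRs for which at least one miRNA returns a non-empty list of positions.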
Comparison with other datasets
We extracted the WormBase IDs of genes in the intestine and body muscle transcriptomes previously published by our group (Blazie et al. 2017), and the most abundant miRNA targets (transcript names) identified in these tissues by Kudlow et al. (2012). We then translated the transcript names from Kudlow et al. (2012) into WormBase IDs using custom Perl scripts, and compared how many genes in each of these groups overlap with our ALG-1 pull-downs. The results are shown in Supplemental Fig. S4.
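Once both gene lists use the same identifiers, the comparison reduces to a set intersection. The sketch below shows one way this could be done in Python; the function name, the returned fields, and the example identifiers are hypothetical and do not come from the datasets in this study.

```python
def overlap_summary(pulldown_ids, reference_ids):
    """Summarize how many pull-down genes also appear in a reference dataset."""
    pulldown = set(pulldown_ids)
    reference = set(reference_ids)
    shared = pulldown & reference
    percent = 100.0 * len(shared) / len(pulldown) if pulldown else 0.0
    return {
        "shared": len(shared),
        "pulldown_only": len(pulldown - reference),
        "percent_of_pulldown": percent,
    }


# Hypothetical WormBase-style identifiers (not real results).
pulldown = ["WBGene00000001", "WBGene00000002", "WBGene00000003", "WBGene00000004"]
transcriptome = ["WBGene00000002", "WBGene00000003", "WBGene00000009"]
summary = overlap_summary(pulldown, transcriptome)
```

Using sets removes duplicate identifiers, which matters when several transcript names map to the same gene.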
RNA extraction for detection of intestine specific splicing variants
We extracted total RNA using the Direct-zol™ RNA MiniPrep Plus kit (Zymo Research, cat ZR2070) from (1) N2 wt worms, (2) the RF54 (alg-1(gk214) X) strain, (3) the WM53 (alg-2(ok304) II) strain, (4) the N2 strain subjected to RNAi for the hrp-2 gene as previously described (Ahringer (ed.)), and (5) transgenic worms overexpressing the hrp-2 3′UTR under control of an intestine promoter (ges-1p::pAPAreg::hrp-2). Each strain was synchronized and grown in M9 media to L1/dauer stage, then transferred to plates containing HT115. To perform RNAi to hrp-2, N2 worms were similarly synchronized, grown in M9 until L1/dauer stage, then transferred to agar plates containing HT115 with L4440 hrp-2 RNAi (Kamath and Ahringer 2003). RNA was extracted from each strain at the adult stage in triplicate. After the RNA extractions, the cDNA was synthesized from each sample with SuperScript III RT (Life Technologies, cat 18080093) according to the manufacturer's protocol. In short, 200ng of each RNA sample was incubated with 1 µL of 50mM poly dT anchor and 1 µL of 10mM dNTP mix, brought to a total volume of 14 µL with nuclease free H2O, and incubated for 5 minutes at 60°C, then iced for 1 minute. 4 µL of 5x first strand buffer, 1 µL of 0.1M DTT and 1 µL (200 units) of SuperScript III reverse transcriptase were added to each sample and incubated at 50°C for 60 minutes, then heat inactivated at 70°C for 15 minutes. 200ng of cDNA from each sample was used in a PCR consisting of 34 cycles using HiFi Taq Polymerase (Invitrogen, cat 11304011) according to the manufacturer's protocol. Primers used to test alternative splicing of ret-1, lin-10, unc-52, and unc-60 were designed to flank the alternatively spliced exons and were adapted from previous studies (Kabat et al. 2009; Ohno et al. 2012; Heintz et al. 2017) (Table S2 primers 28-36). We then acquired images of the PCR amplicons (5 µL) separated by agarose gel electrophoresis and studied the alternatively spliced isoforms using the ImageJ software package (Schneider et al. 2012). We measured the integrated density values for each band by defining regions of interest in the images, and compared the integrated density values by normalizing the smaller bands to the larger bands. Each strain was quantified in triplicate. The resulting isoform ratios are displayed in Supplemental Fig. S6-S7.
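The band quantification can be summarized with a short Python sketch. It assumes that "smaller" and "larger" refer to the shorter (exon-skipped) and longer (exon-included) amplicons, and that replicates are summarized by their mean and sample standard deviation; both choices, along with the function names and example values, are assumptions rather than the authors' documented procedure.

```python
from statistics import mean, stdev


def isoform_ratio(short_band_id, long_band_id):
    """Integrated density of the shorter band normalized to the longer band."""
    if long_band_id <= 0:
        raise ValueError("long band integrated density must be positive")
    return short_band_id / long_band_id


def summarize_replicates(replicates):
    """Return (mean, sample standard deviation) of the per-replicate ratios.

    replicates is a sequence of (short_band_id, long_band_id) tuples.
    """
    ratios = [isoform_ratio(short, long_) for short, long_ in replicates]
    return mean(ratios), stdev(ratios)


# Hypothetical integrated densities for three replicates (not real data).
avg_ratio, sd_ratio = summarize_replicates([(10, 100), (20, 100), (30, 100)])
```

Under this convention, a higher ratio in a mutant or transgenic strain relative to N2 would indicate increased exon skipping.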
Author Contribution
MM and KK designed the experiments. KK executed the experiments. AS executed a portion of the experiments. HS assisted with the experiments and the imaging of the C. elegans transgenic lines, and performed the analysis in Supplemental Fig. S4. MM and KK performed the bioinformatics analysis and uploaded the results to the UTRome.org database. MM and KK led the analysis and interpretation of the data, assembled the figures, and wrote the manuscript. All authors read and approved the final manuscript.
Data Access
Raw reads were submitted to the NCBI Sequence Read Archive (http://trace.ncbi.nlm.nih.gov/Traces/sra/). The results of our analyses are available in Excel format as Supplementary Table S1, and in our APA-centric website www.APAome.org.
Funding
This work was supported by the National Institutes of Health [grant number 1R01GM118796].
Conflict of Interest
The authors declare that they have no competing interests.
Acknowledgements
We thank Stephen Blazie for the cloning of the alg-1 coding sequence used as the backbone for the development of the vectors used in this manuscript. We also thank Heather Hrach and Megan McCaughan for their review of the manuscript. We thank Gabriel Richardson, for her diligent work in anticipating and maintaining the needs of the lab.