SUMMARY
In animals, piRNAs guide PIWI proteins to silence transposons and regulate gene expression. The mechanisms for making piRNAs have been proposed to differ among cell types, tissues, and animals. Our data instead suggest a single model that explains piRNA production in most animals. piRNAs initiate piRNA production by guiding PIWI proteins to slice precursor transcripts. Next, PIWI proteins direct the stepwise fragmentation of the sliced precursor transcripts, yielding tail-to-head strings of phased pre-piRNAs. Our analyses detect evidence for this piRNA biogenesis strategy across an evolutionarily broad range of animals including humans. Thus, PIWI proteins initiate and sustain piRNA biogenesis by the same mechanism in species whose last common ancestor predates the branching of most animal lineages. The unified model places PIWI-clade Argonautes at the center of piRNA biology and suggests that the ancestral animal, the Urmetazoan, used PIWI proteins both to generate piRNA guides and to execute piRNA function.
INTRODUCTION
Uniquely, animals produce PIWI-interacting RNAs (piRNAs), a special class of small RNAs that guides PIWI proteins to silence transposons and regulate gene expression (Girard et al., 2006; Aravin et al., 2006; Vagin et al., 2006; Saito et al., 2006; Grivna et al., 2006; Houwing et al., 2007; Batista et al., 2008; Das et al., 2008). piRNAs complementary to transposons ensure genomic stability in the male or female germline in animals as diverse as scorpions, honey bees, and mice; in many arthropods, piRNAs also protect somatic tissues from both transposons and RNA viruses (Siomi et al., 2011; Czech and Hannon, 2016; Morazzani et al., 2012; Miesen et al., 2015; Vodovar et al., 2012; Lewis et al., 2017). Remarkably, in silk moth oocytes, a single piRNA helps determine sex (Kiuchi et al., 2014). In the testes of mammals, distinct classes of piRNAs (1) direct DNA and histone methylation to transposon sequences during embryogenesis (Aravin et al., 2008; Kuramochi-Miyagawa et al., 2008; Pezic et al., 2014), (2) post-transcriptionally repress transposons later in spermatogenesis (Reuter et al., 2011; Di Giacomo et al., 2013; Inoue et al., 2017), and (3) ensure completion of meiosis and successful spermiogenesis (Aravin et al., 2006; Girard et al., 2006; Grivna et al., 2006; Vourekas et al., 2012; Watanabe et al., 2014; Gou et al., 2014; Goh et al., 2015; Zhang et al., 2015).
Studies of oogenesis in flies and spermatogenesis in mice suggest that germline piRNA biogenesis can be divided into primary and secondary pathways. Primary piRNAs (Figure 1, at left) are generated from long, single-stranded RNAs that are transcribed from genomic loci called piRNA clusters (Brennecke et al., 2007; Li et al., 2013). These long precursor transcripts are fragmented by endonucleolytic cleavage—hypothesized to be catalyzed by the mitochondrial protein Zucchini/PLD6—producing tail-to-head, phased precursor piRNAs (pre-piRNAs; Figure 1, at left; Ipsaro et al., 2012; Nishimasu et al., 2012; Mohn et al., 2015; Han et al., 2015; Ding et al., 2017). Each pre-piRNA begins with a 5′ monophosphate, a prerequisite for loading RNA into nearly all Argonaute proteins (Nykanen et al., 2001; Ma et al., 2005; Parker et al., 2005; Wang et al., 2009; Frank et al., 2010; Boland et al., 2011; Kawaoka et al., 2011; Elkayam et al., 2012; Schirle and MacRae, 2012; Schirle et al., 2014; Cora et al., 2014; Wang et al., 2014; Matsumoto et al., 2016). Once bound to a PIWI protein, the 3′ ends of pre-piRNAs are trimmed by the single-stranded-RNA exonuclease, Trimmer/PNLDC1, to a length characteristic of the receiving PIWI protein (Kawaoka et al., 2011; Tang et al., 2016; Izumi et al., 2016; Ding et al., 2017; Zhang et al., 2017; Nishimura et al., 2018). (Flies lack a Trimmer/PNLDC1 homolog, and instead piRNA 3′ ends are resected by the miRNA-trimming enzyme Nibbler; [Han et al., 2011; Liu et al., 2011; Feltzin et al., 2015; Hayashi et al., 2016].) Finally, the small RNA methyltransferase Hen1/HENMT1 adds a 2′-O-methyl moiety to the 3′ ends of the mature piRNAs (Kirino and Mourelatos, 2007b; Horwich et al., 2007; Saito et al., 2007; Kirino and Mourelatos, 2007a; Ohara et al., 2007; Lim et al., 2015).
In the secondary piRNA biogenesis pathway or “ping-pong” cycle (Brennecke et al., 2007; Gunawardane et al., 2007), piRNA-directed, PIWI-catalyzed slicing of a target transcript creates an RNA fragment bearing a 5′ monophosphate (Wang et al., 2014). This RNA fragment acts as a pre-piRNA precursor (pre-pre-piRNA), binds to a PIWI protein, and ultimately generates a new secondary piRNA with 10 nt complementarity to the piRNA that produced it.
According to the current model, distinct mechanisms produce piRNAs in different cell types and tissues and in different animals (Reuter et al., 2011; Beyret et al., 2012; Vourekas et al., 2012; Nishida et al., 2018). For example, fly oocytes and mouse embryonic male germ cells produce both primary and secondary piRNAs, and the secondary pathway counterintuitively initiates biogenesis of primary piRNAs: piRNA-directed, PIWI-catalyzed slicing of a long precursor transcript creates a secondary piRNA followed by a series of phased pre-piRNAs, which then mature into primary piRNAs (Mohn et al., 2015; Han et al., 2015; Senti et al., 2015; Wang et al., 2015; Yang et al., 2016). In contrast, the current model for mouse pachytene piRNA biogenesis proposes that “piRNA biogenesis in post-natal male germ cells differs strikingly from that in embryonic [germ] cells, because the majority of piRNAs are produced only by primary biogenesis after birth” (Nishimura et al., 2018; Figure 1, at left). What initiates primary piRNA production in the post-natal mammalian testis is unknown. Finally, recent work on cultured silk moth BmN4 cells reported that “Bombyx produces no phased piRNAs” (Nishida et al., 2018), implying that piRNAs in the silk moth are produced solely through the secondary pathway, i.e., by ping-pong amplification.
Here, we report that a single mechanism can explain piRNA biogenesis in mammals, insects, and likely all other animal germ cells (Figure 1, at right). We show that the secondary piRNA pathway (piRNA-guided slicing by a PIWI protein) initiates primary piRNA biogenesis in the post-natal mouse testis. Thus, the same fundamental strategy explains how piRNA production begins in the female germline in flies and the pre- and post-natal male germline in mammals. Moreover, both in mice and in flies, PIWI proteins directly participate in primary piRNA production by directing the stepwise fragmentation of long piRNA precursor transcripts: the site of endonucleolytic cleavage that simultaneously establishes the 3′ end of a pre-piRNA and the 5′ end of the next primary piRNA is determined by the PIWI protein bound to the precursor transcript’s 5′ end. Analysis of piRNA sequences detects evidence for this same piRNA biogenesis strategy across an evolutionarily broad range of animals. Thus, a common biogenesis architecture describes how PIWI proteins initiate and sustain production of their own piRNA guides in animals separated by almost a billion years of evolution (Hedges et al., 2015; Kumar et al., 2017). We hypothesize that the piRNA pathway in the ancestral animal consisted only of PIWI-clade Argonautes that both generated their own piRNA guides and executed piRNA function.
RESULTS
Phased Pre-piRNAs are a General Feature of Primary piRNA Biogenesis in Bilateral Animals
In the dipteran insect Drosophila melanogaster (fruit fly) and the mammal Mus musculus (house mouse), the primary piRNA biogenesis pathway produces tail-to-head strings of phased pre-piRNAs, which are further 3′-to-5′ trimmed to yield mature primary piRNAs (Mohn et al., 2015; Han et al., 2015; Ding et al., 2017). We asked whether this strategy for making piRNAs was broadly conserved, analyzing piRNA sequencing data from 33 non-model species spanning ~950 million years of evolution (Figure 2). In mice and flies, genetic mutation or RNAi depletion of papi/Tdrkh, nibbler, or Trimmer/Pnldc1 blocks piRNA maturation, allowing the detection of tail-to-head strings of untrimmed pre-piRNAs, the hallmark of the phased, primary piRNA pathway (Saxe et al., 2013; Mohn et al., 2015; Han et al., 2015; Hayashi et al., 2016; Ding et al., 2017). Such loss-of-function strategies are typically not available for non-model species. However, studies in flies have shown that phasing can be detected among mature piRNAs when the extent of pre-piRNA trimming is small, i.e., when untrimmed pre-piRNAs and trimmed mature piRNAs are similar in length (Mohn et al., 2015; Han et al., 2015).
We estimated the length of pre-piRNAs in each of the 33 species by calculating the most frequent distance between mature piRNA 5′ ends (Figure 2). Consistently, if the estimated length range of pre-piRNAs overlapped the length range of mature piRNAs, we observed a tail-to-head arrangement of piRNAs: the most frequent distance between piRNA 3′ and 5′ ends was 0 (Figure 2). Strings of phased pre-piRNAs were detectable in an evolutionarily broad range of animals (red in Figure 2; Z0 ≥ 1.96, i.e., p ≤ 0.05): the primate Macaca fascicularis (crab-eating macaque); the teleost fish Danio rerio (zebrafish); three dipteran insects, Drosophila virilis, Musca domestica (house fly), and Aedes aegypti (yellow fever mosquito); the coleopteran insect Tribolium castaneum (red flour beetle); and the hymenopteran insect Bombus terrestris (buff-tailed bumblebee). In other animals, pre-piRNA length was estimated to be significantly longer than that of mature piRNAs, suggesting that in these species, phased pre-piRNAs undergo extensive 3′-to-5′ trimming during maturation into primary piRNAs, making the detection of phasing infeasible in wild-type animals.
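The tail-to-head test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for Figure 2: the half-open `(start, end)` read coordinates, the function names, the 20-nt window, and the use of distances 1 to 20 as the background for the Z-score are all choices made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def three_to_five_histogram(reads, max_dist=20):
    """Count read pairs whose downstream 5' end lies d nt after an upstream 3' end.

    reads: list of (start, end) on one strand, 0-based, end exclusive,
    so d == 0 means the two reads sit tail-to-head.
    """
    starts = Counter(start for start, _ in reads)
    hist = [0] * (max_dist + 1)
    for _, end in reads:
        for d in range(max_dist + 1):
            hist[d] += starts.get(end + d, 0)
    return hist


def z0_score(hist):
    """Z-score of distance 0 against distances 1..max_dist (assumed background)."""
    background = hist[1:]
    sd = pstdev(background)
    if sd == 0:
        return 0.0  # score is undefined for a flat background; report 0
    return (hist[0] - mean(background)) / sd


# Toy data: four phased reads plus one unrelated read.
reads = [(0, 26), (26, 52), (52, 78), (78, 104), (10, 37)]
hist = three_to_five_histogram(reads)
print(hist[0], round(z0_score(hist), 1))
```

In this toy set, three of the four phased reads are followed immediately by another read, so distance 0 dominates the histogram and the score exceeds the 1.96 threshold quoted in the text.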
To test whether additional members of the group of 33 species used a phased primary biogenesis mechanism to produce piRNAs, we analyzed the 5′-to-5′ distances of mature piRNAs as a surrogate for detecting tail-to-head strings of pre-piRNAs (Figure S1). Such a strategy is expected to detect phasing when the distribution of pre-piRNA lengths is essentially unimodal. Species with phased piRNAs are predicted to display mature primary piRNAs whose 5′ ends are separated by multiples of the modal pre-piRNA length, yielding a repeating pattern of 5′-to-5′ piRNA distances (Figures 2 and S1). Autocorrelation analysis of the data allowed the quantitative detection of periodic peaks (evidence for phased pre-piRNA production) for ten additional animal species (orange in Figure 2; raw data in Figure S1): the arachnid Centruroides sculpturatus (bark scorpion); the centipede Strigamia maritima; the hemipteran insect Acyrthosiphon pisum (pea aphid); the coleopteran insect Nicrophorus vespilloides (burying beetle); four lepidopteran insects, Trichoplusia ni (cabbage looper), Bombyx mori (silk moth), Heliconius melpomene (postman butterfly), and Plutella xylostella (diamondback moth); the bird Gallus gallus (chicken); and the primate Homo sapiens (human).
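The idea behind the autocorrelation step can be illustrated with a short Python sketch. It assumes piRNA 5′ ends are given as integer coordinates on a single strand; the function names, the 150-nt window, and the lag range searched are hypothetical choices for the example, and the published analysis may differ in normalization and in how significance was assessed.

```python
from collections import Counter
from statistics import mean


def five_to_five_histogram(starts, max_dist):
    """Histogram of distances 1..max_dist between piRNA 5' ends (index 0 is distance 1)."""
    counts = Counter(starts)
    hist = [0] * (max_dist + 1)
    for pos, n in counts.items():
        for d in range(1, max_dist + 1):
            hist[d] += n * counts.get(pos + d, 0)
    return hist[1:]


def autocorrelation(signal, lag):
    """Autocorrelation of the histogram at the given lag, normalized by total variance."""
    m = mean(signal)
    var = sum((x - m) ** 2 for x in signal)
    if var == 0:
        return 0.0
    cov = sum((signal[i] - m) * (signal[i + lag] - m)
              for i in range(len(signal) - lag))
    return cov / var


# Toy data: 5' ends spaced every 30 nt, as if one pre-piRNA followed another.
starts = list(range(0, 600, 30))
signal = five_to_five_histogram(starts, max_dist=150)
best_lag = max(range(5, 60), key=lambda lag: autocorrelation(signal, lag))
print(best_lag)
```

A periodic histogram gives a strong autocorrelation at the lag equal to the repeat length, which is why the approach needs an essentially unimodal pre-piRNA length distribution.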
Our analysis of silk moth adult ovary piRNAs as well as earlier reporter transgene experiments in cultured silk moth BmN4 cells (Homolka et al., 2015) suggest that the primary piRNA pathway collaborates with the secondary pathway to produce piRNAs in B. mori. However, a recent study reported that BmN4 piRNAs are made solely by the secondary pathway (Nishida et al., 2018). To further test the idea that B. mori, like the three other lepidopteran species we analyzed, produces phased primary piRNAs, we reanalyzed published sequencing data from BmN4 cells in which the piRNA maturation enzyme Trimmer/PNLDC1 was depleted by RNAi (Izumi et al., 2016). These data further support our conclusions: the most frequent 3′-to-5′ distance for piRNAs in Trimmer/PNLDC1 depleted BmN4 cells was zero (Figure S2A).
Finally, we detected significant piRNA ping-pong—the hallmark of the secondary piRNA biogenesis pathway—in 32 of the 33 non-model animals analyzed. Together, our data suggest that the secondary and primary piRNA biogenesis pathways likely collaborated to make piRNAs in the last common ancestor of all bilateral animals, an evolutionary distance of ~800 million years (Hedges et al., 2015; Kumar et al., 2017).
M. musculus MILI and MIWI Participate in Phased Pre-piRNA Production
Our finding that phased primary piRNA production is a deeply conserved feature of piRNA biogenesis prompted us to reexamine how pachytene piRNAs are produced in the male germline of post-natal mice (M. musculus). The current model for piRNA biogenesis in post-natal male mouse germ cells presumes that mature piRNA length heterogeneity reflects different extents of PNLDC1-mediated trimming for MILI- and MIWI-bound pre-piRNAs (Figure 1, at left; Kawaoka et al., 2011; Izumi et al., 2016). The model predicts that (1) in the absence of the 3′-to-5′ trimming enzyme PNLDC1, piRNAs should be replaced by longer, untrimmed pre-piRNAs, and (2) untrimmed pre-piRNAs bound to MILI or MIWI should have similar length distributions (Figure 1, at left).
To test these predictions, we generated Pnldc1−/− mutant mice and sequenced small RNAs from post-natal spermatogonia, primary spermatocytes, secondary spermatocytes, and round spermatids purified by fluorescence-activated cell sorting (FACS). Consistent with previous studies using whole testis (Ding et al., 2017; Zhang et al., 2017; Nishimura et al., 2018), the purified cell types all contained 25–40 nt small RNAs rather than the normal complement of mature 25–31 nt piRNAs (Figure 3A). We note that the length of Pnldc1−/− small RNAs is shorter than the pre-piRNA length estimated from the most frequent 5′-to-5′ distance of mature piRNAs (28–50 nt; Han et al., 2015). Nonetheless, our analyses show that the majority of Pnldc1−/− small RNAs are bona fide pre-piRNAs. First, the probability of 3′-to-5′ distances for Pnldc1−/− small RNAs peaked sharply at 0, the distance expected for pre-piRNAs, which are produced tail-to-head, one after another (Figure 3A). Second, the genomic nucleotide immediately 3′ to Pnldc1−/− small RNAs was most often uridine (Figure 3A), consistent with both the 5′ U bias of primary piRNAs (Aravin et al., 2006; Girard et al., 2006) and with pre-piRNAs being produced tail-to-head.
Third, a pre-piRNA is expected to be followed tail-to-head by a long RNA not yet converted to a pre-piRNA, a pre-pre-piRNA. Such pre-pre-piRNAs would have 5′ monophosphorylated ends and be longer than pre-piRNAs. To identify pre-pre-piRNAs, we sequenced 5′ monophosphorylated, single-stranded RNA fragments ≥150 nt long from wild-type primary spermatocytes. As predicted for authentic pre-pre-piRNAs, the most likely distance from the 3′ ends of Pnldc1−/− small RNAs to the 5′ ends of the long, 5′ monophosphorylated RNAs was 0 (Figure S2B). We conclude that while most Pnldc1−/− small RNAs are bona fide pre-piRNAs, some likely correspond to pre-piRNAs trimmed by an exonucleolytic activity present in Pnldc1−/− mutant germ cells. To exclude these off-pathway pre-piRNA products, we confined our analyses here only to those Pnldc1−/− small RNAs that are followed tail-to-head by another small RNA. Such bona fide pre-piRNAs account for ~60% of all Pnldc1−/− small RNAs in primary spermatocytes.
The standard model for mouse post-natal piRNA biogenesis postulates that untrimmed pre-piRNAs bound to MILI or MIWI will have similar length distributions (Figure 1, at left). To test this prediction, we examined pre-piRNAs bound to MILI and MIWI in Pnldc1−/− primary spermatocytes, a cell type expressing both PIWI proteins (Figures 3B, S2C, and S2D; Table S1). Surprisingly, pre-piRNAs bound to MILI (mode = 31 nt) and MIWI (mode = 34 nt) had different length distributions (Figure 4A). How could a common machinery generate distinct lengths of pre-piRNAs for MILI and MIWI? One idea is that pre-piRNAs are sorted by length between the two PIWI proteins. A sorting model predicts that both short and long pre-piRNAs should be produced throughout spermatogenesis, regardless of the presence of MILI or MIWI. Our data do not support a sorting model: the length of pre-piRNAs estimated from the most frequent 5′-to-5′ distance of wild-type piRNAs increases from 35 nt in spermatogonia, where only MILI is present, to 38 nt in primary spermatocytes, where both MIWI and MILI are expressed (Figure S2E).
Another explanation for the difference in the modal length of MILI- and MIWI-bound pre-piRNAs is that PIWI proteins themselves position the endonuclease that converts pre-pre-piRNAs to pre-piRNAs. This model predicts that a MILI-bound pre-piRNA should be shorter than a MIWI-bound pre-piRNA when the two pre-piRNAs have the same 5′ end. To test this, we asked whether the genomic positions of the 5′ or 3′ ends of the pre-piRNAs bound to MILI differ from those of the pre-piRNAs bound to MIWI. For both piRNAs and pre-piRNAs, we calculated the probabilities of the 5′ ends of MILI-bound RNAs residing before, after, or coinciding with the 5′ ends of MIWI-bound RNAs. Similarly, we calculated the probabilities for MILI- or MIWI-bound piRNA and pre-piRNA 3′ ends. In wild-type primary spermatocytes, MILI- and MIWI-bound mature piRNAs were more likely to share 5′ ends than 3′ ends (0.42 for 5′ ends vs. 0.11 for 3′ ends; Figure 4B). Similarly, in Pnldc1−/−, MILI- and MIWI-bound pre-piRNAs were more likely to share 5′ ends than 3′ ends (0.34 for 5′ ends vs. 0.19 for 3′ ends; Figure 4B). Consistent with the idea that the length of a mature piRNA is defined by the footprint of its PIWI partner (Kawaoka et al., 2011; Izumi et al., 2016), the 3′ ends of MILI-bound piRNAs were more likely to be upstream of the 3′ ends of MIWI-bound piRNAs in wild-type primary spermatocytes (0.55 upstream vs. 0.34 downstream; Figure 4B). Surprisingly, in Pnldc1−/− mutants, the probability of the 3′ ends of MILI-bound pre-piRNAs lying upstream of the 3′ ends of MIWI-bound pre-piRNAs was also higher (0.49 upstream vs. 0.32 downstream; Figure 4B). We obtained similar results analyzing only pre-piRNAs from pachytene piRNA loci (i.e., excluding those pre-piRNAs bound to MILI before the onset of meiosis; data not shown). Thus, MILI-bound pre-piRNAs are likely to share 5′ ends with MIWI-bound pre-piRNAs, but those pre-piRNAs bound to MILI are generally shorter than MIWI-bound pre-piRNAs.
These analyses compared all combinations of 3′ ends of pre-piRNAs, including those that do not share a common 5′ end. To examine the 3′ ends of pre-piRNAs produced from a common pre-pre-piRNA, we grouped together all unambiguously mapping pre-piRNAs with the same 5′, 25-nt prefix and used their read abundance to identify the most frequent 3′ end for each group (Figure S2F). We then paired the groups of MILI- and MIWI-bound wild-type piRNAs or Pnldc1−/− mutant pre-piRNAs that had the same 5′, 25-nt prefix. Analysis of the paired groups of MILI- and MIWI-bound pre-piRNAs showed that the most frequent 3′ end of the MILI-bound pre-piRNA group was either the same (63% of paired groups; Figure 4C) or upstream of the most frequent 3′ end of the corresponding MIWI-bound pre-piRNA group (30% of paired groups; Figure 4C). Thus, when MILI binds the 5′ end of a pre-pre-piRNA, it is rare for the resulting pre-piRNA to be longer than the pre-piRNA generated when MIWI binds the same pre-pre-piRNA (7% of paired groups; Figure 4C). Confining the analysis to pachytene piRNA loci produced identical results (data not shown).
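The grouping-and-pairing procedure can be sketched as follows. This is a simplified illustration that assumes reads are supplied as `(sequence, count)` tuples and that a shared prefix implies a shared genomic 5′ end (true only for unambiguously mapping reads, as in the text); the function names are hypothetical, and the toy example uses a 4-nt prefix instead of 25 nt to stay readable.

```python
from collections import Counter, defaultdict


def modal_lengths(reads, prefix_len=25):
    """Group reads by 5' prefix and return the most abundant length per group.

    With a shared 5' end, the modal length identifies the most frequent 3' end.
    Ties are resolved arbitrarily.
    """
    by_prefix = defaultdict(Counter)
    for seq, count in reads:
        if len(seq) >= prefix_len:
            by_prefix[seq[:prefix_len]][len(seq)] += count
    return {p: c.most_common(1)[0][0] for p, c in by_prefix.items()}


def compare_groups(mili, miwi):
    """Classify each paired group by the position of the MILI 3' end relative to MIWI."""
    summary = Counter()
    for prefix in mili.keys() & miwi.keys():
        shift = miwi[prefix] - mili[prefix]
        if shift == 0:
            summary["same"] += 1
        elif shift > 0:
            summary["MILI upstream"] += 1
        else:
            summary["MILI downstream"] += 1
    return summary


# Toy reads (hypothetical sequences and counts).
mili_reads = [("ACGUAA", 5), ("ACGUAAU", 1), ("GGCAUU", 3)]
miwi_reads = [("ACGUAAUC", 4), ("ACGUAA", 1), ("GGCAUU", 2)]
mili = modal_lengths(mili_reads, prefix_len=4)
miwi = modal_lengths(miwi_reads, prefix_len=4)
summary = compare_groups(mili, miwi)
print(dict(summary))
```

Dividing each class by the number of paired groups gives percentages of the kind reported for Figure 4C.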
We took advantage of the 5′ U bias of pre-piRNAs (Figure S3A) and the fact that pre-piRNAs are produced tail-to-head to further test the hypothesis that MILI and MIWI position the endonucleolytic cleavage that generates the 3′ ends of pre-piRNAs. Inspection of individual pre-piRNAs from pachytene piRNA loci suggested that when MILI- and MIWI-bound pre-piRNAs share the same 5′ end but differ in their 3′ ends, both 3′ ends are followed by a uridine in the genomic sequence (Figure 5, at left). For MILI- and MIWI-bound pre-piRNAs with identical 5′ and 3′ ends, the U following their shared 3′ end is the only uridine present in that genomic neighborhood.
To quantify these observations, we sorted the paired groups of pre-piRNAs bound to MILI and MIWI into cohorts according to the number of nucleotides separating the 3′ ends of the most frequent MILI- and MIWI-bound pre-piRNAs (Figure S2F). Thus, cohort 0 contained paired groups of pre-piRNAs whose most frequent 3′ end was identical for MILI- and MIWI-bound pre-piRNAs. In cohort 1, the most frequent MILI-bound pre-piRNA 3′ end lay 1 nt upstream of the 3′ end of the pre-piRNA bound to MIWI (Figure S2F). For each cohort, we measured the uridine frequency at each position in the genomic neighborhood of the 3′ ends. Strikingly, whenever two separate peaks of high uridine frequency resided in a genomic neighborhood, the most frequent 3′ end of MILI-bound pre-piRNA groups lay immediately before the upstream uridine, while the most frequent 3′ end of MIWI-bound pre-piRNA groups lay before the downstream uridine (Figure 5, at right). When a single peak of uridine was surrounded by a uridine desert, the 3′ ends of the MILI- and MIWI-bound pre-piRNAs coincided at the nucleotide immediately before the single uridine peak (Figure 5). We obtained essentially indistinguishable results analyzing all genome-mapping pre-piRNAs (data not shown).
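A minimal sketch of the cohort analysis is shown below. It assumes each paired group is summarized as `(sequence, mili_end, miwi_end)`, where each end is the index of the first nucleotide after the most frequent pre-piRNA 3′ end; the record layout, the function name, and the 5-nt flank are assumptions for illustration only.

```python
from collections import defaultdict


def uridine_profiles(records, flank=5):
    """Per-cohort uridine frequency around the MILI-bound pre-piRNA 3' end.

    Cohort = miwi_end - mili_end. Offset 0 is the nucleotide immediately
    after the MILI 3' end. Genomic T and transcript U are both counted.
    """
    sums = defaultdict(lambda: [0] * (2 * flank + 1))
    sizes = defaultdict(int)
    for seq, mili_end, miwi_end in records:
        if mili_end - flank < 0 or mili_end + flank >= len(seq):
            continue  # skip records whose window runs off the sequence
        cohort = miwi_end - mili_end
        sizes[cohort] += 1
        for k in range(-flank, flank + 1):
            if seq[mili_end + k] in "UT":
                sums[cohort][k + flank] += 1
    return {c: [n / sizes[c] for n in sums[c]] for c in sizes}


# Toy neighborhoods: two with uridines 4 nt apart, one with a single uridine.
records = [
    ("AAAAAAUAAAUAAAAAA", 6, 10),
    ("CCCCCCUCCCUCCCCCC", 6, 10),
    ("GGGGGGUGGGGGGGGGG", 6, 6),
]
profiles = uridine_profiles(records)
print(profiles[4])
print(profiles[0])
```

In the toy data, cohort 4 shows two uridine peaks (offsets 0 and +4) and cohort 0 shows one, mirroring the pattern described for Figure 5.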
Together, these data suggest that MILI and MIWI directly participate in the production of pre-piRNAs by positioning the endonuclease generating pre-piRNA 3′ ends.
MILI and MIWI Bind a Pre-pre-piRNA 5′ End, then Position the Endonuclease
For the pachytene piRNA loci, 63% of paired groups of MILI- and MIWI-bound pre-piRNAs belonged to cohort 0 (Figure 4C). Even though a range of 3′ ends was present for these paired groups, the most frequent 3′ end for a MILI-bound pre-piRNA group was the same as the most frequent 3′ end for the corresponding MIWI-bound pre-piRNA group. In theory, this observation could reflect an underlying non-random distribution of uridines in piRNA precursor transcripts. Pachytene piRNA precursor transcripts are 27.4% U, which is expected for a near random uridine distribution. However, these transcripts could hypothetically contain many uridine-free regions, forcing the pre-pre-piRNA cleaving endonuclease to cut 5′ to the only uridine available in the genomic region. To test this idea, we used random sampling to estimate the probability of uridine surrounded by non-uridine stretches (VVVVUVVVV) and the probabilities of the two closest uridines separated by a given length of non-uridine sequence (UU, UVU, UVVU, UVVVU, etc.; Figure S3B). The data obtained from random sampling fit well to the geometric distribution, $P(x) = p(1 - p)^{x-1}$, where $p = 0.274$ is the uridine fraction (27.4%) and $x$ is the distance between consecutive uridines (Pearson’s ρ = 0.996; data not shown). Thus, uridines are spread randomly across the pachytene piRNA transcripts. Uridines were also found to be randomly distributed when the analysis was confined to the genomic neighborhoods around the paired groups of MILI- and MIWI-bound pre-piRNAs (data not shown). Therefore, the distribution of uridine in pachytene piRNA precursor transcripts cannot explain why, for 63% of paired groups of MILI- and MIWI-bound pre-piRNAs, their most frequent 3′ ends coincide (cohort 0 in Figure S3B). We conclude that the underlying mechanism of pre-piRNA production, not the sequence of pachytene piRNA precursor transcripts, determines the distribution of pre-piRNA 3′ ends.
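The comparison with a geometric distribution can be illustrated on synthetic data. The sketch below does not use the pachytene transcripts; it generates a random sequence with 27.4% U (an assumption standing in for the real data), tabulates the distance x between consecutive uridines (x = 1 for UU, x = 2 for UVU, and so on), and correlates the observed frequencies with p(1 − p)^(x − 1), taking p as a fraction. All function names are hypothetical.

```python
import math
import random


def gap_frequencies(sequence, max_gap=20):
    """Observed frequency of each distance 1..max_gap between consecutive uridines."""
    positions = [i for i, nt in enumerate(sequence) if nt == "U"]
    counts = [0] * (max_gap + 1)
    total = 0
    for a, b in zip(positions, positions[1:]):
        total += 1
        if b - a <= max_gap:
            counts[b - a] += 1
    return [c / total for c in counts[1:]]


def geometric(p, max_gap=20):
    """Geometric probabilities p * (1 - p)**(x - 1) for x = 1..max_gap."""
    return [p * (1 - p) ** (x - 1) for x in range(1, max_gap + 1)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)
p_u = 0.274  # uridine fraction reported for pachytene piRNA precursors
sequence = "".join("U" if rng.random() < p_u else "V" for _ in range(200_000))
r = pearson(gap_frequencies(sequence), geometric(p_u))
print(round(r, 3))
```

A correlation close to 1 is what a random placement of uridines predicts; a transcript rich in long uridine-free stretches would instead show an excess of large distances.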
If PIWI proteins position the endonuclease during piRNA precursor transcript fragmentation, they may do so by binding the 5′ end of the pre-pre-piRNA. In this case, much of the sequence of the prospective pre-piRNA would be occluded by the PIWI protein and inaccessible for cleavage. Only uridines located 3′ to the PIWI protein’s footprint could be recognized by the endonuclease. Based on pre-piRNA length data, the footprint of MILI is expected to be ~3 nt smaller than that of MIWI (Figures S2E and 4A), giving the endonuclease access to more upstream uridines when it is positioned by MILI rather than MIWI. In this view, the endonuclease is constrained by the PIWI protein to cleave at the nearest uridine not masked by the protein’s footprint. Thus, when only a single exposed uridine is present locally, the resulting MILI-bound and MIWI-bound pre-piRNAs share a common 3′ end (cohort 0 in Figure 5).
This model makes two predictions. Imagine a MIWI-bound pre-piRNA corresponding to the shortest length permitted by MIWI’s footprint (Figure S3C). By definition, when this pre-piRNA is a member of cohort 0, the corresponding MILI-bound pre-piRNA has the same 3′ end; i.e., the two pre-piRNAs are identical. The model predicts that no uridines will be found in the last three nucleotides of the pre-piRNA, but that there may be uridines ≥4 nt upstream of the pre-piRNA 3′ end. These uridines are inaccessible to the endonuclease, because they are concealed within the footprint of MILI. In other words, because of MILI’s footprint, cohort 0 is expected to incorporate all instances in which uridines are ≥4 nt upstream of the uridine used by the endonuclease positioned by MIWI (Figure S3C). To test this prediction, we used uridines randomly sampled in pachytene piRNA clusters as the 5′ ends of simulated pre-piRNAs. We set the 3′ ends of the simulated pre-piRNAs immediately before the first uridine occurring >31 nt (simulated MILI footprint) or >34 nt (simulated MIWI footprint) downstream of the simulated pre-piRNA 5′ end (Figure S3D). We then sorted the pairs of simulated MILI- and MIWI-bound pre-piRNAs sharing their 5′ ends into cohorts based on the number of nucleotides separating their 3′ ends. The result of the simulation fit well to the biological data (Pearson’s ρ = 0.94; Spearman’s ρ = 0.92; Figure S3D), further supporting the idea that MILI and MIWI differentially direct endonucleolytic cleavage after they bind the pre-pre-piRNA 5′ end.
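The footprint simulation lends itself to a compact sketch. The version below runs on a synthetic random sequence, not on pachytene piRNA clusters, and reads ">31 nt downstream" as an index difference greater than 31 between the cleavage-site uridine and the 5′ uridine; that interpretation, the function names, and the sampling scheme are assumptions for illustration.

```python
import random
from collections import Counter


def three_prime_end(seq, five_prime, footprint):
    """Index of the first U more than `footprint` nt downstream of the 5' end.

    The simulated pre-piRNA ends immediately before this uridine.
    Returns None if no such uridine exists.
    """
    for j in range(five_prime + footprint + 1, len(seq)):
        if seq[j] == "U":
            return j
    return None


def simulate_cohorts(seq, n_samples, rng, mili_footprint=31, miwi_footprint=34):
    """Sample uridines as 5' ends and tabulate the 3' end separation (cohort)."""
    u_positions = [i for i, nt in enumerate(seq) if nt == "U"]
    cohorts = Counter()
    for _ in range(n_samples):
        start = rng.choice(u_positions)
        mili_end = three_prime_end(seq, start, mili_footprint)
        miwi_end = three_prime_end(seq, start, miwi_footprint)
        if mili_end is None or miwi_end is None:
            continue  # ran off the end of the sequence
        cohorts[miwi_end - mili_end] += 1
    return cohorts


# Hand-built example: uridines at positions 0, 33, and 36.
toy = "U" + "A" * 32 + "U" + "AA" + "U" + "A" * 5
print(three_prime_end(toy, 0, 31), three_prime_end(toy, 0, 34))

# Random sequence with 27.4% U as a stand-in for precursor transcripts.
rng = random.Random(1)
seq = "".join("U" if rng.random() < 0.274 else "V" for _ in range(5000))
cohorts = simulate_cohorts(seq, 1000, rng)
print(sorted(cohorts.items())[:5])
```

Because the simulated MIWI footprint is the larger of the two, the simulated MIWI 3′ end can never lie upstream of the MILI 3′ end, so every cohort is non-negative. The rare biological cases in which the MILI-bound pre-piRNA is longer are therefore outside what this simple model produces.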
The model makes a second prediction: the MILI-bound pre-piRNAs present in cohorts 4 and greater (Figure 5) will be paired with atypically long MIWI-bound pre-piRNAs (Figure S3E). In fact, the median length of MIWI-bound pre-piRNAs in cohorts 4–9 was longer than that in cohorts 0–3 (35–38 nt vs. 32–34 nt; Figure S3F). Thus, the data support the idea that the footprints of MILI and MIWI restrict which uridines can be used to generate pre-piRNA 3′ ends.
We conclude that MILI and MIWI position the pre-pre-piRNA cleaving endonuclease when it establishes pre-piRNA 3′ ends. When PIWI proteins bind the 5′ ends of pre-pre-piRNAs to initiate this process, the distinct sizes of the MILI and MIWI footprints differentially restrict the uridines available to the endonuclease. Both MILI and MIWI direct the endonuclease to cleave 5′ to the nearest uridine, but MILI positions the endonuclease upstream of the site dictated by MIWI.
The Number of piRNAs per Cell Corresponds to the Number of PIWI proteins
The model proposed here also posits that a PIWI protein must bind the 5′ end of a piRNA precursor before the pre-piRNA can be liberated and trimmed into a mature piRNA. Thus, the number of PIWI proteins per cell is expected to be similar to the number of mature piRNAs.
To quantify the abundance of piRNAs in FACS-purified mouse male germ cells, we added known amounts of 18 synthetic RNA oligonucleotides to the total mouse RNA before preparing sequencing libraries. By comparison with the synthetic oligoribonucleotides, we estimated that the number of 24–33-nt RNAs (i.e., mature piRNAs) in primary spermatocytes (7.8 ± 0.6 × 10⁶ RNAs/cell), secondary spermatocytes (3.9 ± 0.1 × 10⁶ RNAs/cell), and round spermatids (2.5 ± 0.2 × 10⁶ RNAs/cell) corresponds to 5.7 ± 0.2–7.2 ± 0.6 µM (Figure S4A). Notably, the 100 most abundant piRNA species are present at ~2,400–19,000 molecules per cell, comparable to the most abundant miRNAs in these cells: ~2,500 molecules of let-7a; ~2,500 molecules of miR-449a; and ~23,500 molecules of miR-34c per cell.
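The arithmetic behind spike-in quantification and the conversion from molecules per cell to molar concentration can be sketched as follows. The read counts, spike-in amounts, cell number, and cell volume in the example are hypothetical placeholders, since none of them is given in the text; only the formulas are general.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_cell(rna_reads, spike_reads, spike_molecules, n_cells):
    """Scale read counts to molecules using the spike-in molecules-per-read ratio."""
    return rna_reads * (spike_molecules / spike_reads) / n_cells


def micromolar(molecules, cell_volume_liters):
    """Convert a per-cell molecule count to micromolar for a given cell volume."""
    return molecules / (AVOGADRO * cell_volume_liters) * 1e6


# Hypothetical library: 2,000,000 piRNA reads, 1,000 spike-in reads
# from 5,000,000 spike-in molecules, RNA prepared from 1,000 cells.
n = molecules_per_cell(2_000_000, 1_000, 5_000_000, 1_000)
# Hypothetical cell volume of 2 picoliters.
print(n, round(micromolar(n, 2e-12), 2))
```

The same conversion applies to protein copy numbers, which is what allows piRNA and PIWI protein abundances to be compared on a common molar scale.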
Next, we assessed the number of MILI and MIWI proteins in the FACS-purified germ cells by western blotting, using recombinant, SNAP-tagged PIWI proteins as concentration standards (Figures S4B and S4C). The total number of PIWI proteins per cell (5.5 ± 1.5 × 10⁶/cell in primary spermatocytes, 3.0 ± 0.4 × 10⁶/cell in secondary spermatocytes, 1.5 ± 0.5 × 10⁶/cell in round spermatids) correlated well with the total number of piRNAs at different stages (Pearson’s ρ = 0.99), and the total cellular concentration of PIWI proteins (4 ± 1–4.8 ± 1 µM; Figure S4A) was similar to that of piRNAs (5.7 ± 0.2–7.2 ± 0.6 µM; Figure S4A).
If virtually all piRNAs are bound to a PIWI protein, the length distribution of total piRNAs in cells should be explained by the combination of MILI- and MIWI-bound piRNA lengths. In primary spermatocytes, a cell type expressing both PIWI proteins, the simulated length profile of total piRNAs fit well to the biological data when the ratio of MILI to MIWI was based on our estimate of their absolute abundance (16 MILI molecules per 84 MIWI molecules; Pearson’s ρ = 0.99; Figure S4D).
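The mixing calculation can be sketched as a weighted sum of two length profiles. The MILI and MIWI profiles below are invented for illustration and are not the measured distributions; the "observed" profile is itself constructed as a 16:84 mixture so that the example shows how the correct ratio is recognized by its correlation.

```python
import math


def mix_profiles(mili, miwi, mili_fraction):
    """Weighted sum of two length profiles given as {length: fraction}."""
    lengths = sorted(set(mili) | set(miwi))
    return {n: mili_fraction * mili.get(n, 0.0)
               + (1 - mili_fraction) * miwi.get(n, 0.0)
            for n in lengths}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical length profiles (fractions sum to 1 within each protein).
mili = {25: 0.1, 26: 0.5, 27: 0.3, 28: 0.1}
miwi = {28: 0.1, 29: 0.2, 30: 0.6, 31: 0.1}

observed = mix_profiles(mili, miwi, 0.16)  # stand-in for total piRNA lengths
for fraction in (0.16, 0.50):
    simulated = mix_profiles(mili, miwi, fraction)
    r = pearson(list(simulated.values()), list(observed.values()))
    print(fraction, round(r, 3))
```

With real data, the observed profile would come from total small RNA sequencing and the two component profiles from MILI and MIWI immunoprecipitates.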
Taken together, these data support a model in which piRNA biogenesis requires a PIWI protein to bind the 5′ end of each piRNA precursor transcript to initiate pre-piRNA production.
D. melanogaster Piwi and Aub Also Participate in Phased Pre-piRNA Production
To test if PIWI-protein binding similarly positions the endonuclease that establishes the 3′ ends of pre-piRNAs in D. melanogaster, we analyzed data from fly ovaries. D. melanogaster makes three PIWI proteins: Piwi, Aubergine (Aub), and Argonaute3 (Ago3). Ago3 and Aub collaborate to generate secondary piRNAs via the ping-pong cycle, whereas Piwi and, to a lesser extent, Aub, bind primary piRNAs (Mohn et al., 2015; Han et al., 2015; Senti et al., 2015; Wang et al., 2015). The modal lengths of Piwi- and Aub-bound piRNAs differ by just one nucleotide (26 and 25 nt, respectively).
Fly piRNAs are less extensively trimmed than those in mice, so the sequences of mature piRNAs more readily reveal the mechanics of pre-piRNA biogenesis. Analysis of the 3′-to-5′ distance between mature piRNAs showed that Piwi- and Aub-bound piRNAs are often found tail-to-head in any combination of the two proteins (Figure S5A). In addition, the probability of common 5′ ends between Piwi- and Aub-bound piRNAs was higher than the probability of common 3′ ends (0.34 for 5′ vs. 0.20 for 3′ ends; Figure S5B). To analyze the 3′ ends of piRNAs produced from a common pre-pre-piRNA, we grouped Piwi- and Aub-bound piRNAs sharing a common 5′, 23-nt prefix. After pairing the Piwi and Aub groups, we found that the most frequent 3′ ends of Aub-bound piRNA groups often coincided (66% of paired groups; Figure S5C) or lay upstream (26% of paired groups; Figure S5C) of the most frequent 3′ ends of the corresponding Piwi-bound piRNA groups. For just 8% of paired groups, the most frequent 3′ ends of Aub-bound piRNAs were downstream of the most frequent 3′ ends of the corresponding Piwi-bound piRNAs.
Next, we divided the paired Piwi- and Aub-bound piRNA groups into cohorts according to the distance between the most abundant 3′ ends for Piwi- and Aub-bound piRNAs. When the most frequent 3′ ends of Piwi- and Aub-bound piRNA groups were identical, they lay immediately before a single uridine in a uridine-depleted genomic neighborhood (Figure 6A). Conversely, when there were two peaks of uridine, the most frequent 3′ ends of Aub-bound piRNA groups lay immediately before the upstream peak, while the most frequent 3′ ends of Piwi-bound piRNA groups were immediately before the downstream peak (Figure 6A). Confining the analyses to piRNAs arising from germline-specific transposons (Wang et al., 2015) produced essentially the same results (data not shown).
Like mice, flies require Papi for pre-piRNA trimming: without Papi, fly pre-piRNAs have a median length 0.35 nt longer than wild-type piRNAs (Han et al., 2015; Hayashi et al., 2016). However, flies lack a PNLDC1 homolog (Hayashi et al., 2016). Instead, the miRNA-trimming enzyme Nibbler (Han et al., 2011; Liu et al., 2011) trims fly pre-piRNAs (Feltzin et al., 2015; Wang et al., 2016; Hayashi et al., 2016). In papi−/− and nibbler−/− mutant ovaries, Piwi- and Aub-bound pre-piRNAs are phased (Figure S5D). As in mice, phasing occurs between Aub- and Piwi-bound, Aub- and Aub-bound, Piwi- and Piwi-bound, and Piwi- and Aub-bound piRNAs, supporting the idea that a single pre-pre-piRNA molecule can generate both Aub- and Piwi-bound pre-piRNAs. Piwi- and Aub-bound pre-piRNAs are more likely to share 5′ ends than 3′ ends (papi−/−: 0.23 for 5′ ends vs. 0.09 for 3′ ends; nibbler−/−: 0.29 for 5′ ends vs. 0.15 for 3′ ends). Moreover, the 3′ ends of Aub-bound pre-piRNAs are more likely to be found upstream than downstream of the 3′ ends of Piwi-bound pre-piRNAs (papi−/−: 0.51 upstream vs. 0.40 downstream; nibbler−/−: 0.49 upstream vs. 0.36 downstream).
Grouping and pairing Piwi- and Aub-bound pre-piRNAs with a common 5′, 23-nt prefix revealed that the most frequent 3′ end of Aub-bound pre-piRNA groups either coincided with the most frequent 3′ end of the corresponding Piwi-bound pre-piRNA group (papi−/−: 41% of paired groups; nibbler−/−: 63% of paired groups) or lay upstream of the most frequent 3′ end of the Piwi-bound pre-piRNA group (papi−/−: 43% of paired groups; nibbler−/−: 28% of paired groups). Only for a small minority of paired groups were Aub-bound pre-piRNAs longer than Piwi-bound pre-piRNAs (papi−/−: 16% of paired groups; nibbler−/−: 9% of paired groups). As in mice, when a fly genomic neighborhood contained a single uridine frequency peak, the most frequent 3′ ends of Piwi- and Aub-bound pre-piRNA groups were identical and mapped immediately before the peak (Figures 6B and 6C): i.e., the most frequent Piwi- and Aub-bound pre-piRNAs had the same sequence and genomic coordinates. In contrast, when two uridine frequency peaks were present, the most frequent 3′ ends of Aub-bound pre-piRNA groups preceded the upstream peak, while the most frequent 3′ ends of the corresponding Piwi-bound pre-piRNA groups lay before the downstream peak.
Collectively, these data suggest that, as for MILI and MIWI in mice, Aub and Piwi in flies position the endonuclease that simultaneously generates the pre-piRNA 3′ end and the 5′ end of the succeeding pre-pre-piRNA. Thus, mammalian and insect PIWI proteins directly participate in establishing the pattern of phased pre-piRNA biogenesis.
MILI and MIWI Slicing of Long piRNA Precursor Transcripts Initiates Pre-Pre-piRNA Biogenesis in Post-Natal Mice
Pre-piRNAs appear to be less stable than mature piRNAs in post-natal mice: primary spermatocytes, secondary spermatocytes, and round spermatids from Pnldc1−/− testes contain approximately 2–3 times fewer 24–45 nt RNAs than wild-type cells, whereas miRNA abundance is unchanged (Figure S4A). Although piRNA cluster transcript abundance, measured by RNA-seq and RT-qPCR of RNA from whole testes, has been reported to be unchanged in Pnldc1−/− mice (Zhang et al., 2017), our RNA-seq data from FACS-purified germ cells show that without PNLDC1, the steady-state levels of many piRNA cluster transcripts increase. Absolute transcript abundance increased both in secondary spermatocytes (median increase = 3.0×; FDR ≤ 0.1) and round spermatids (median increase = 4.5×; FDR ≤ 0.1) for the ten loci that produce the most pachytene piRNAs and which account for ~50% of all piRNAs in meiotic and post-meiotic wild-type cells. At the same time, the absolute abundance of piRNAs from these loci decreased both in secondary spermatocytes (median decrease = 2.5×; FDR ≤ 0.1) and round spermatids (median decrease = 3.7×; FDR ≤ 0.1).
A possible explanation for the increased steady-state levels of pachytene piRNA cluster transcripts and the decreased amount of pre-piRNAs in Pnldc1−/− mutants is that piRNAs themselves are required to process piRNA cluster transcripts into phased pre-piRNAs. Indeed, in the D. melanogaster female germline, biogenesis of primary phased piRNAs is initiated by a slicing event directed by an Ago3-bound secondary piRNA generated via the ping-pong amplification pathway (Mohn et al., 2015; Han et al., 2015; Senti et al., 2015; Wang et al., 2015). Thus, secondary piRNAs initiate phased primary piRNA production by cleaving a piRNA cluster transcript to generate a pre-pre-piRNA. Although MILI initiates phased piRNA production in the neo-natal mouse testis (Yang et al., 2016), such a mechanism is not thought to play a role in piRNA biogenesis in the post-natal germline of male mice. We reexamined this presumption.
To identify the pre-pre-piRNAs from which phased pre-piRNAs are generated in mouse primary spermatocytes, we selected from our data set of ≥150-nt, 5′ monophosphorylated RNA sequences from wild-type cells, those RNAs derived from the pachytene piRNA loci. Many of these shared 5′ ends with mature piRNAs, consistent with their corresponding to pre-pre-piRNAs (Figure S6A; 79% shared 5′ ends with MILI-bound mature piRNAs, 77% shared 5′ ends with MIWI-bound mature piRNAs). We separated the putative pre-pre-piRNAs—the long RNAs sharing their 5′ end with a mature piRNA—from those for which no piRNA with the same 5′ end could be found (control; Figure 7A). As expected for precursors of pre-piRNAs, most putative pre-pre-piRNAs began with uridine (65% of pre-pre-piRNAs sharing 5′ ends with MILI-bound piRNAs, 67% of those sharing 5′ ends with MIWI-bound piRNAs; Figure 7B). Unexpectedly, putative pre-pre-piRNAs also showed a significant enrichment for adenine at their tenth nucleotide (39% of pre-pre-piRNAs sharing 5′ ends with MIWI-bound piRNAs and 38% of those sharing 5′ ends with MILI-bound piRNAs; Figure 7B). Such a 10A bias is the hallmark of PIWI-protein catalyzed target slicing and reflects the intrinsic preference of some PIWI proteins for adenine at the t1 position of their target RNAs (Wang et al., 2014; Matsumoto et al., 2016).
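The first-nucleotide and tenth-nucleotide biases described above are simple positional frequencies. The following Python sketch shows one way to compute them; the function name and the convention that reads are written in the DNA alphabet (so uridine appears as "T") are assumptions of this example, not details taken from the analysis reported here.

```python
def position_frequency(sequences, position, nucleotide):
    """Fraction of sequences carrying `nucleotide` at 1-based `position`.

    Sequences shorter than `position` are skipped. Reads are assumed to be
    in the DNA alphabet, so a 5' uridine is written as "T".
    """
    eligible = [s for s in sequences if len(s) >= position]
    if not eligible:
        return 0.0
    hits = sum(1 for s in eligible if s[position - 1] == nucleotide)
    return hits / len(eligible)
```

For a list of putative pre-pre-piRNA sequences `seqs`, `position_frequency(seqs, 1, "T")` would give the 1U frequency and `position_frequency(seqs, 10, "A")` the 10A frequency.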
To test the idea that MILI- or MIWI-bound piRNAs direct slicing of piRNA cluster transcripts to generate pre-pre-piRNAs, we searched for MILI- and MIWI-bound piRNAs that could have guided production of the 5′ ends of the putative pre-pre-piRNAs. In contrast to siRNAs and miRNAs, for piRNAs the extent of base-pairing required between guide and target RNA to support PIWI-protein catalyzed target slicing is incompletely understood. Minimally, nucleotides 2 to 10 of the guide piRNA (g2–g10) are expected to pair with the target RNA (Reuter et al., 2011; Gou et al., 2014; Goh et al., 2015; Zhang et al., 2015). We observed a statistically significant overlap between nucleotides g2–g10 of MILI- and MIWI-bound piRNAs and putative pre-pre-piRNAs, but detected no such overlap with the control RNAs (Figures 7C and 7D). This result is unlikely to reflect canonical cis ping-pong, in which the piRNA and target overlap in the genome sequence: ≫99% of piRNAs contributing to the overlap do not map to the same genomic location as their targets. That is, the majority of piRNAs act in trans through partial complementarity to cleave targets transcribed from a piRNA-producing locus different from their own.
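The g2–g10 search described above can be illustrated with a short Python sketch. It relies on the standard geometry of PIWI slicing: the target is cut between the nucleotides paired to g10 and g11, so the first nine nucleotides of the 3′ cleavage product are the reverse complement of g2–g10, and its tenth nucleotide is the t1 position. The function names and the use of the DNA alphabet are assumptions of this example, and exact matching over g2–g10 is a simplification.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def index_guides(guides):
    """Index guide piRNAs by the 9-nt sequence a sliced product must start with."""
    index = defaultdict(list)
    for guide in guides:
        if len(guide) >= 10:
            # guide[1:10] is g2..g10; its reverse complement is t10..t2,
            # i.e. nucleotides 1-9 of the 3' cleavage product.
            index[reverse_complement(guide[1:10])].append(guide)
    return index


def candidate_guides(product, index):
    """Guides whose g2-g10 pair perfectly with the 5' end of `product`."""
    return index.get(product[:9], [])
```

Indexing the guides once keeps the lookup for each putative pre-pre-piRNA constant-time; a search that tolerates mismatches would need a different data structure.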
To account for variance in sample sizes and the different contributions of species with high and low read abundance, we repeated the analyses using either data subsampling or random resampling of the data with replacement (bootstrapping). These analyses confirmed the presence of the ping-pong signature for the putative pre-pre-piRNAs but not for the control RNAs (Figures S6B and S6C).
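Resampling with replacement, as mentioned above, can be sketched generically in Python. The example below computes a percentile bootstrap interval for a fraction, for instance the fraction of putative pre-pre-piRNAs that have a candidate guide. The function name, the boolean-flag input, the number of replicates, and the 95% interval are all assumptions chosen for illustration; they are not the parameters used for Figures S6B and S6C.

```python
import random


def bootstrap_fraction(flags, n_replicates=1000, seed=0):
    """Percentile 95% bootstrap interval for the fraction of True in `flags`.

    `flags` is a list of booleans, one per observation (hypothetical input).
    """
    if not flags:
        raise ValueError("flags must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(flags)
    # Each replicate draws n observations with replacement.
    fractions = sorted(
        sum(rng.choices(flags, k=n)) / n for _ in range(n_replicates)
    )
    low = fractions[int(0.025 * n_replicates)]
    high = fractions[int(0.975 * n_replicates) - 1]
    return low, high
```

Comparing the interval obtained for putative pre-pre-piRNAs with the one obtained for control RNAs is one way to judge whether a difference survives resampling.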
Transposon-Derived piRNAs Direct Pre-Pre-piRNA Biogenesis in Post-Natal Mice
Transposons are a possible source of piRNAs capable of targeting piRNA precursor transcripts, because the sequence of a transposon can potentially bear short but fruitful stretches of complementarity with other genomic copies of the same or a related transposon family in another piRNA locus. For the pachytene piRNA loci, we found that more than twice as many transposon-derived piRNAs contributed to the overlap with putative pre-pre-piRNAs than expected by chance: 34.2% observed vs. 16.9% expected, based on the fraction of transposon-mapping piRNAs for MILI-bound piRNAs, and 46.1% observed vs. 20.1% expected for MIWI-bound piRNAs. This finding is particularly striking given that the transcribed regions of the pachytene piRNA loci contain fewer transposon sequences than the genome as a whole (31.9% vs. 41.9%).
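The "more than twice" statement above follows directly from the reported percentages, as this short Python check shows. Only the four percentages come from the text; the function name is introduced for the example.

```python
def fold_enrichment(observed_pct, expected_pct):
    """Ratio of the observed to the expected percentage."""
    return observed_pct / expected_pct


# Percentages reported in the text for transposon-derived piRNAs.
mili = fold_enrichment(34.2, 16.9)  # MILI-bound piRNAs, about 2.0-fold
miwi = fold_enrichment(46.1, 20.1)  # MIWI-bound piRNAs, about 2.3-fold
```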
Collectively, these data show that PIWI slicing plays a central role in initiating phased primary piRNA biogenesis in animals as evolutionarily distant as flies and mice. They also suggest that transposon-derived piRNAs can function in the production of piRNAs not participating in transposon silencing.
PIWI Slicing Initiates Phased Primary piRNA Biogenesis in Most Animals
Mice and flies are separated by ~800 million years of evolution, and yet both species use secondary piRNAs to direct PIWI slicing to initiate the production of primary piRNAs. We explored whether the same strategy is employed for piRNA production in other animals. Because there are no available data sets of 5′ monophosphorylated pre-piRNAs or pre-pre-piRNAs from non-model organisms, we used an alternative approach. We assumed that piRNAs with a ping-pong partner on the opposite genomic strand were produced by PIWI slicing (putative secondary piRNAs) and that the remaining piRNAs were generated by phased primary piRNA production (putative primary piRNAs; Figure 7E). For each of 33 non-model animals, we grouped piRNAs as either putative secondary or primary piRNAs. If PIWI slicing initiates phased piRNA production, the most frequent position of the 5′ ends of putative primary piRNAs is predicted to lie at regular intervals downstream of the 5′ ends of putative secondary piRNAs.
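The classification and distance analysis outlined above can be sketched in Python as follows. The sketch assumes that piRNA 5′ ends are supplied as sets of 0-based genomic coordinates for each strand of one chromosome, that a ping-pong partner is a piRNA on the opposite strand whose 5′ end overlaps by exactly 10 nt, and that distances are scanned within a 100-nt window. These choices, and the function names, are assumptions of the example rather than the parameters used for Figure 7E.

```python
from collections import Counter

OVERLAP = 10  # ping-pong partners overlap by 10 nt at their 5' ends


def split_secondary_primary(plus_5p, minus_5p):
    """Split 5' end coordinates into putative secondary and primary sets.

    A plus-strand 5' end at p has a ping-pong partner when a minus-strand
    5' end lies at p + OVERLAP - 1 (and vice versa).
    """
    shift = OVERLAP - 1
    sec_plus = {p for p in plus_5p if p + shift in minus_5p}
    sec_minus = {m for m in minus_5p if m - shift in plus_5p}
    return sec_plus, plus_5p - sec_plus, sec_minus, minus_5p - sec_minus


def downstream_distances(plus_5p, minus_5p, window=100):
    """Count 5'-to-5' distances from each putative secondary piRNA to the
    putative primary piRNAs downstream of it on the same strand."""
    sec_plus, pri_plus, sec_minus, pri_minus = split_secondary_primary(
        plus_5p, minus_5p
    )
    distances = Counter()
    for d in range(1, window + 1):
        # Downstream means increasing coordinates on the plus strand
        # and decreasing coordinates on the minus strand.
        distances[d] += sum(1 for s in sec_plus if s + d in pri_plus)
        distances[d] += sum(1 for s in sec_minus if s - d in pri_minus)
    return +distances  # unary plus drops zero counts
```

Under phased production, the resulting histogram would be expected to show peaks at roughly regular intervals, on the order of one pre-piRNA length apart.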
We detected periodic peaks of putative primary piRNA 5′ ends downstream of the 5′ ends of putative ping-pong pairs in data from 12 species (Figure 7E): the primates Macaca fascicularis (crab-eating macaque) and Callithrix jacchus (white-tufted-ear marmoset); the teleost fish Danio rerio (zebrafish); two dipteran insects Musca domestica (house fly) and Aedes aegypti (yellow fever mosquito); three lepidopteran insects, Trichoplusia ni (cabbage looper), Heliconius melpomene (postman butterfly), and Plutella xylostella (diamondback moth); the coleopteran insect Nicrophorus vespilloides (burying beetle); the hemipteran insect Acyrthosiphon pisum (pea aphid); the flatworm Schmidtea mediterranea (freshwater planarian); and the cnidarian Hydra vulgaris (fresh-water polyp). Counting flies and mice, we detect secondary piRNA-initiated phased primary piRNA production in 14 species spanning four phyla—Cnidaria, Platyhelminthes, Arthropoda, and Chordata—including four vertebrates and eight insects. Together, these data suggest that (1) the secondary pathway initiates phased primary piRNA production in species representing the major animal phyla including non-bilaterian animals and (2) the last common ancestor of all animals produced piRNAs very much as animals make them today.
Origins of the 5′ U Bias of Primary piRNAs
The 5′ U bias of primary piRNAs is thought to arise from the specificity of the pre-pre-piRNA-cleaving endonuclease producing phased pre-piRNAs. However, the 5′ U bias could also reflect the preference of PIWI proteins to bind guide RNAs beginning with uridine (Kawaoka et al., 2011; Matsumoto et al., 2016). If PIWI proteins contribute to 5′ U bias of primary piRNAs, then selection for initial uridines should occur when MILI or MIWI binds the 5′ end of a pre-pre-piRNA. The frequency of 5′ U for pre-piRNAs bound to MILI or MIWI and sharing their 5′ ends with putative pre-pre-piRNAs is 91%, higher than the frequency of 5′ U for putative pre-pre-piRNAs (65% for those sharing 5′ ends with MILI-bound pre-piRNAs, 67% for those sharing 5′ ends with MIWI-bound pre-piRNAs). These data suggest that not all pre-pre-piRNAs are converted to pre-piRNAs with equal efficiency.
To determine whether first nucleotide identity influences the potential of pre-pre-piRNAs to produce pre-piRNAs, we asked if highly abundant pre-piRNAs were more likely to derive from pre-pre-piRNAs beginning with uridine. First, we sorted pre-piRNA species by their abundance into 10 equally sized bins. We then calculated the percent 5′ U for the putative pre-pre-piRNAs sharing their 5′ ends with the pre-piRNAs in each bin. For each bin, we also determined the ratio of pre-piRNA abundance to the abundance of the corresponding pre-pre-piRNAs. Consistent with the idea that the binding preference of PIWI proteins contributes to the 5′ uridine bias of piRNAs, putative pre-pre-piRNAs were more likely to produce pre-piRNAs when the pre-pre-piRNAs started with uridine (Figure S6D).
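The binning step described above can be sketched in Python. The example sorts pre-piRNA species by abundance, splits them into equally sized bins, and reports the fraction beginning with uridine in each bin. The input format (a list of sequence and abundance pairs, restricted to species that share a 5′ end with a putative pre-pre-piRNA), the function name, and the use of "T" for uridine are assumptions of the example; the abundance ratio mentioned in the text is not computed here.

```python
def five_prime_u_by_abundance_bin(pre_pirnas, n_bins=10):
    """Fraction of species starting with U ("T") per abundance bin.

    `pre_pirnas` is a list of (sequence, abundance) pairs (hypothetical
    input). Bins run from least to most abundant; an empty bin yields None.
    """
    ranked = sorted(pre_pirnas, key=lambda item: item[1])
    n = len(ranked)
    fractions = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin sizes within one species of each other.
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            fractions.append(None)
            continue
        with_u = sum(1 for seq, _ in chunk if seq.startswith("T"))
        fractions.append(with_u / len(chunk))
    return fractions
```

A rising fraction from the first to the last bin would be consistent with the trend described for Figure S6D.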
Notably, the frequency of 5′ U for mature piRNAs bound to MILI or MIWI and sharing their 5′ ends with putative pre-pre-piRNAs was the same as for untrimmed pre-piRNAs (91%). That no change in 5′ U bias is introduced at the trimming step agrees well with the current piRNA maturation model in which trimming of pre-piRNAs occurs after their loading into PIWI proteins.
DISCUSSION
The analyses presented here suggest a unified model for piRNA biogenesis in the germ cells of mammals, insects and probably most other animals (Figure 1, at right). In the male germline of mice and the female germline of flies, PIWI proteins initiate piRNA biogenesis: PIWI-catalyzed, piRNA-guided slicing of a long piRNA precursor transcript creates a pre-pre-piRNA, whose monophosphorylated 5′ end serves as an entry point for further pre-piRNA production (Mohn et al., 2015; Han et al., 2015; Senti et al., 2015; Wang et al., 2015; Yang et al., 2016).
PIWI proteins also control downstream primary piRNA biogenesis: a PIWI protein bound to the pre-pre-piRNA 5′ end positions an endonuclease—perhaps Zucchini/PLD6—to carry out stepwise fragmentation of the pre-pre-piRNA into phased pre-piRNAs. The essential role of 5′ phosphate recognition for loading guide RNAs into PIWI and other Argonaute proteins (Nykanen et al., 2001; Ma et al., 2005; Parker et al., 2005; Wang et al., 2009; Frank et al., 2010; Boland et al., 2011; Kawaoka et al., 2011; Elkayam et al., 2012; Schirle and MacRae, 2012; Schirle et al., 2014; Cora et al., 2014; Wang et al., 2014; Matsumoto et al., 2016) makes it unlikely that a PIWI protein can bind a pre-pre-piRNA downstream of its 5′ end. Binding of a PIWI protein to the newly generated pre-pre-piRNA 5′ end defines the 3′ end of the prospective pre-piRNA: the endonuclease cleaves 5′ to the first accessible uridine following the footprint of the PIWI protein. This cut simultaneously creates the 3′ end of the pre-piRNA and the 5′ end of the next pre-pre-piRNA. Our data suggest that the 5′ U bias of phased pre-piRNAs reflects the combined action of the endonuclease cleaving 5′ to uridines and the preference of PIWI proteins for pre-pre-piRNAs beginning with uridine.
Collectively, the proposed model explains how (1) PIWI-protein slicing initiates and (2) PIWI-protein binding controls the production of phased pre-piRNAs. The resulting phased pre-piRNAs, still bound to PIWI proteins, are then trimmed to their characteristic length and 2′-O-methylated, generating a mature, functional piRNA ready to guide the PIWI protein. In some animals trimming plays a minor role in piRNA maturation, allowing their tail-to-head arrangement to be detected in the wild type, i.e., without removing the pre-piRNA trimming enzyme (Figure 2; Mohn et al., 2015; Han et al., 2015; Senti et al., 2015; Wang et al., 2015; Hayashi et al., 2016). Contrary to the conclusions of the recently published work by Nishida et al. (Nishida et al., 2018), our analyses revealed the presence of phased pre-piRNAs in an evolutionarily broad range of animals including Lepidoptera. Our data also argue against the idea that only the transcriptional silencing machinery relies on phased pre-piRNA production (Nishida et al., 2018). First, two cytoplasmic PIWI proteins bind phased pre-piRNAs in post-natal mouse testis. Second, among those animals we examined, phased pre-piRNAs were detected in 10 of the 20 animals possessing just two PIWI genes (Figure 2).
Although the revised model explains the data presented here, the details of individual steps remain to be established. Assigning specific functions to all the proteins known to act in processing single-stranded piRNA precursor transcripts into mature piRNAs remains a formidable challenge. Our model no doubt underestimates the complexity of piRNA biogenesis, whose individual steps occur in at least two cellular compartments: perinuclear nuage and mitochondria. We also do not understand what determinants destine pachytene piRNA cluster transcripts in mice, which are generated by conventional RNA Pol II transcription from euchromatic loci, to become piRNAs. One possible explanation is that enrichment of pachytene piRNA cluster transcripts with piRNA target sites could render them more likely targets of PIWI slicing.
The data presented here establish the essential role of PIWI-clade Argonautes in the generation of their own guide piRNAs and suggest a possible evolutionary trajectory for the piRNA pathway (Figure S7): PIWI-clade Argonautes emerged first, generating their own guides via the ping-pong cycle, perhaps assisted by co-opting a 3′-to-5′ exonuclease to trim the guide piRNA to a length more fully protected by the PIWI protein (Nishida et al., 2018). Later in evolution, PIWI proteins came to be aided by an endonuclease specialized for producing phased pre-piRNAs, allowing the production of additional piRNAs from the otherwise discarded 3′ product generated by the production of secondary piRNAs.
The Nematoda represent a particularly bizarre branch in the evolution of the piRNA pathway, because most lineages of this phylum appear to have lost PIWI proteins and piRNAs altogether (Wang et al., 2011; Sarkies et al., 2015; Holz and Streit, 2017). Remarkably, the PIWI-protein-producing nematode Caenorhabditis elegans generates each 21U-RNA, as worm piRNAs are called, from a separate transcript. It is not known what mechanism generates the 5′ ends of 21U-RNAs, although the downstream steps of piRNA biogenesis are likely similar to those in animals of other phyla: loading of the 21U-RNA precursor into PRG-1, 3′-to-5′ trimming to establish mature piRNA length (Tang et al., 2016), and 2′-O-methylation of the piRNA 3′ end (Kamminga et al., 2012; Montgomery et al., 2012; Billi et al., 2012).
The most diverged components of the piRNA pathway act in piRNA precursor transcription, while downstream proteins are conserved (Grimson et al., 2008; Klattenhoff et al., 2009; Handler et al., 2011; Cecere et al., 2012; Li et al., 2013; Hayashi et al., 2016; Andersen et al., 2017). Such deep conservation suggests that the core of the piRNA-producing machinery—with the secondary pathway (ping-pong) initiating the production of phased primary pre-piRNAs—probably appeared prior to the divergence of most animal lineages. Thus, despite differences among animals in the sources and functions of piRNAs, piRNAs are produced by a common strategy that reflects a common descent of the underlying machinery in metazoan evolution.
ACCESSION NUMBERS
Sequencing data are available from the National Center for Biotechnology Information Sequence Read Archive using accession number PRJNA421205.
AUTHOR CONTRIBUTIONS
I.G., C.C., A.A., and P.D.Z. conceived and designed the experiments. I.G., C.C., A.A., and K.C. performed the experiments. I.G. analyzed the sequencing data. I.G., C.C., and P.D.Z. wrote the manuscript.
ACKNOWLEDGEMENTS
We thank S. Pechhold, T. Giehl, B. Gosselin, Y. Gu, and T. Krumpoch at UMass FACS Core for help sorting mouse germ cells; J. Gosselin and J. Gallant at UMass Transgenic Animal Modeling Core for help generating Pnldc1−/− mice; A. Boucher, C. Tipping, and G. Farley for technical assistance; Z. Weng, Y. Fu, and members of the Zamore laboratory for discussions and critical comments on the manuscript. This work was supported in part by National Institutes of Health grants GM65236 and P01HD078253 to P.D.Z.