ABSTRACT
Phased, secondary siRNAs (phasiRNAs) represent a class of small RNAs in plants generated via distinct biogenesis pathways, predominantly dependent on the activity of 22 nt miRNAs. Most 22 nt miRNAs are processed by DCL1 from miRNA precursors containing an asymmetric bulge, yielding a 22/21 nt miRNA/miRNA* duplex. Here we show that miR1510, a soybean miRNA capable of triggering phasiRNA production from numerous NB-LRRs, previously described as 21 nt in its mature form, primarily accumulates as a 22 nt isoform via monouridylation. We demonstrate that in Arabidopsis, this uridylation is performed by HESO1. Biochemical experiments showed that the 3’ terminus of miR1510 is only partially 2’-O-methylated, because of the terminal mispairing in the miR1510/miR1510* duplex that inhibits HEN1 activity in soybean. miR1510 emerged in the Phaseoleae ~41 to 42 MYA with a conserved precursor structure yielding a 22 nt monouridylated form, yet a variant in mung bean is processed directly in a 22 nt mature form. This analysis of miR1510 yields two observations: (1) plants can utilize post-processing modification to generate abundant 22 nt miRNA isoforms to more efficiently regulate target mRNA abundances; (2) comparative analysis demonstrates an example of selective optimization of precursor processing of a young plant miRNA.
INTRODUCTION
Plant miRNAs are capable of triggering phased, secondary siRNAs (phasiRNAs) from long noncoding RNAs (lncRNAs) or mRNAs (1). These phasiRNAs participate in both plant development and immunity. A number of 22 nt miRNAs, such as miR482/2118, miR1507, miR2109, miR5300, and miR6019, trigger phasiRNAs from the nucleotide-binding leucine-rich repeat (NB-LRR) gene family, which constitutes the majority of plant disease resistance (R) genes (12–15). NB-LRR-derived phasiRNAs have been confirmed to reinforce the efficacy of these 22 nt miRNA triggers in NB-LRR suppression. For instance, 22 nt miR9863 targets Mla transcripts, triggering phasiRNAs that, in concert with miR9863, repress Mla in barley (16). Consistent with this study on disease resistance, another report demonstrated that more widespread and efficient silencing was observed for a 22 nt artificial miRNA (amiRNA), relative to a 21 nt version, because of the generation of phasiRNAs (17). Consequently, it is postulated that NB-LRR-derived phasiRNAs act as an essential layer to fine-tune R gene expression (18). miR1510 is a notable miRNA in this same class; it is a legume-specific miRNA that is the predominant trigger of phasiRNAs from NB-LRRs in soybean, yielding abundant phasiRNAs from its targets (19). Yet, given its annotated length of 21 nt, it has been unclear why this miRNA has such substantial activity as a trigger of phasiRNAs.
Plant small RNAs, including both miRNAs and siRNAs, are extensively subject to 3’ terminal 2’-O-methylation by the methyltransferase HUA ENHANCER 1 (HEN1) (20, 21), which protects small RNAs from 3’ uridylation and subsequent degradation (22). In a hen1 mutant background, and thus in the absence of 3’ methylation protection, miRNAs tend to be truncated from the 3’ end prior to uridylation mediated by different nucleotidyltransferases, such as HEN1 SUPPRESSOR 1 (HESO1) and UTP:RNA URIDYLYLTRANSFERASE 1 (URT1) (23–27). Consequently, miRNA abundances are generally reduced in the hen1 mutant, resulting in pleiotropic developmental defects (28), while different miRNAs display distinct patterns of truncation and tailing (23, 24). One unusual gain-of-function in the hen1 mutant background was observed: miR171a triggers phasiRNA production from target transcripts, because the typically 21 nt mature miRNA is abundantly tailed to 22 nt by URT1 in the absence of 2’-O-methylation (24, 25). This observation supports the idea that the 22 nt length of miRNAs is important for phasiRNA production.
Here we show that in wildtype soybean, 21 nt miR1510 is partially methylated and subsequently uridylated to 22 nt by HESO1, likely bestowing on miR1510 the ability to trigger phasiRNA production from target transcripts. We found that the mismatch adjacent to the 2 nt 3’ overhang in the miR1510/miR1510* duplex inhibits HEN1 activity in vitro, resulting in its 3’ monouridylation by HESO1. Interestingly, the position of the mismatch is conserved across the Phaseoleae tribe of legume species, and high levels of uridylated miR1510 in 22 nt form were also observed in other Phaseoleae species, including common bean and pigeon pea. Therefore, we propose that the Phaseoleae have evolved to employ this mechanism to generate a 22 nt miRNA and its consequential phasiRNAs to fine-tune R gene expression.
RESULTS
In soybean, 21 nt miR1510 is predominantly uridylated to 22 nt
miR1510 targets transcripts of over 100 NB-LRR genes in soybean, far more than any other miRNA, triggering abundant phasiRNAs from their transcripts (19, 29). The mature miRNA is generated from two MIR1510 loci in the soybean genome, copies that likely originated from the genome duplication during soybean evolution (30) (Fig. 1A). Based on analysis of the precursors, miR1510 is processed into a 21 nt mature miRNA (Fig. S1), likely by DICER-LIKE1 (DCL1) as in other species; yet, based on numerous previous studies, a length of 22 nt is typically required for phasiRNA biogenesis. We therefore investigated why this 21 nt miRNA is capable of triggering phasiRNAs. Unexpectedly, a search of miR1510 reads in small RNA sequencing data showed that the most abundant form of miR1510 is a 22 nt isoform (Fig. 1A). This 22 nt miR1510 does not map to the soybean genome, because of the additional 22nd nucleotide (the 3’ end), a uridine (Fig. 1A), perhaps explaining why it was previously overlooked. We assessed whether this isoform could have been generated from a precursor missing (i.e. in a gap) in the current soybean genome assembly. To do so, we examined RNA-seq data that would include pri-miRNA transcripts. However, we found no RNA-seq reads containing the 22 nt isoform of miR1510, suggesting that it is not generated from the genome; instead, the ‘U’ at the 22nd position is more likely the result of uridylation, possibly by a member of the nucleotidyltransferase family.
We next examined how broadly the monouridylated form of miR1510 exists in different tissues of soybean. We employed a previously described method for analysis of truncation and tailing of miRNAs (31) and analyzed published data that comprise an atlas of soybean small RNAs (29). We observed that, as in leaf tissue, miR1510 is uridylated to 22 nt in other tissues, including nodule, flower, and anther, although the degree of uridylation varies (Fig. 1B). This result indicated that the 22 nt form of miR1510 accumulates abundantly in a variety of tissues of wildtype soybean, consistent with its role as a dominant trigger of phasiRNAs from NB-LRR targets. In contrast, other miRNAs in wildtype soybean, such as miR172, miR396, miR398 and miR482, have barely measurable levels of truncated or tailed forms (Fig. S2), suggesting that miR1510 is unique among soybean miRNAs in its tailing.
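The logic of a truncation and tailing analysis can be illustrated with a minimal Python sketch. This is not the published method (31) itself, only one plausible reading of the idea: each read anchored at the mature miRNA 5’ end is split into the longest prefix that matches the precursor and a non-templated 3’ tail, and truncation is the number of annotated nucleotides missing from the matched prefix. The function names (classify_read, summarize), the sequences, and the read counts are all hypothetical toy values, not actual miR1510 data.

```python
from collections import Counter


def classify_read(read, precursor, start, mature_len):
    """Split a read anchored at the mature 5' end into (truncation, tail).

    `start` is the 0-based position of the mature miRNA in `precursor`;
    `mature_len` is the annotated mature length (e.g. 21).
    """
    # Longest prefix of the read that matches the precursor from `start`.
    matched = 0
    while (matched < len(read)
           and start + matched < len(precursor)
           and read[matched] == precursor[start + matched]):
        matched += 1
    # Annotated nucleotides missing from the templated part of the read.
    truncation = max(mature_len - matched, 0)
    # Whatever follows the templated prefix is treated as a non-templated tail.
    tail = read[matched:]
    return truncation, tail


def summarize(read_counts, precursor, start, mature_len):
    """Sum read counts per (truncation, tail) class."""
    table = Counter()
    for read, count in read_counts.items():
        table[classify_read(read, precursor, start, mature_len)] += count
    return table


# Toy, made-up sequences and counts (assumption: NOT real miR1510 data).
MATURE = "ACGUACGUACGUACGUACGUA"          # 21 nt hypothetical mature miRNA
PRECURSOR = "GGG" + MATURE + "CCC"         # hypothetical precursor fragment
reads = {
    MATURE: 10,                  # intact 21 nt form
    MATURE + "U": 25,            # 21 nt + one non-templated U (22 nt)
    MATURE[:20] + "UU": 1,       # truncated by 1 nt, tailed with UU
}
table = summarize(reads, PRECURSOR, start=3, mature_len=21)
```

One limitation of this greedy prefix matching is that a tail nucleotide that happens to match the next precursor position is counted as templated; a monouridylated isoform is only detectable this way when the precursor does not itself encode a U at that position.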
Non-species-specific uridylation of miR1510 in plants
Considering that miR1510 was the only miRNA for which we observed significant uridylation in soybean, we hypothesized that this miRNA may have attributes, such as a specific precursor structure, that facilitate the uridylation, and thus perhaps it would be uridylated in other plant species. To test this, we transformed both MIR1510a and MIR1510b into Arabidopsis to make stable transgenic lines, and also transiently expressed both precursors in Nicotiana benthamiana. Small RNA gel blotting showed that MIR1510a generated abundant 22 nt miRNAs when expressed in Arabidopsis, as in soybean (Fig. S3A). However, mature miR1510 was not detected from MIR1510b transformants, likely because it was not processed by DCL1 in Arabidopsis, as RT-PCR experiments verified the expression of MIR1510b in Arabidopsis (Fig. S3B). We speculate that the processing of MIR1510b by Arabidopsis DCL1 is inhibited by two mismatches at the sites that, if cut, would release the miR1510b/miR1510b* duplex (Fig. 1A), possibly reflecting the divergence of DCL1 among species. In the N. benthamiana transient expression assays, as in soybean and Arabidopsis, a considerable amount of 22 nt miR1510 was also detected, suggesting that miR1510 is monouridylated in both Arabidopsis and tobacco.
To rule out the possibility that the 22 nt isoform of miR1510 was generated via imprecise processing by DCL1, we sequenced the small RNAs from the Arabidopsis leaf sample transformed with MIR1510a. Sequencing data showed that 22 nt miR1510 was indeed generated by uridylation, and its abundance was >2-fold higher than the 21 nt form (Fig. S3C). Collectively, our data showed that the uridylation of miR1510 is non-species-specific, and we inferred that its precursors encode features that trigger monouridylation of the mature miRNA.
miR1510 is uridylated by HESO1, but not URT1
Two nucleotidyltransferases in Arabidopsis, HESO1 and URT1, have a demonstrated ability to uridylate the 3’ terminus of miRNAs (23, 25–27). These two nucleotidyltransferases show different preferences in miRNA substrates in vitro, and their substrate preference is largely determined by the 3’ terminal nucleotide of the miRNA (25). We therefore explored whether the 22nd ‘U’ of miR1510 is added by HESO1, URT1, or possibly some other nucleotidyltransferase yet to be characterized. To investigate this, MIR1510a was transformed separately into heso1-1 and urt1-1 homozygous mutant backgrounds, and we sequenced the small RNAs from the leaf tissue of the screened T1 plants. Sequencing data showed that the 22 nt form of miR1510 was substantially decreased in a heso1-1, but not a urt1-1, mutant background (Fig. 2). This genetic analysis showed that HESO1 is responsible for the monouridylation of miR1510 in Arabidopsis in vivo, and that its orthologs potentially perform this role in soybean and tobacco.
miR1510 is partially methylated in soybean
Plant miRNAs are extensively methylated at the 3’ terminus by HEN1 to avoid uridylation and subsequent miRNA turnover (22). The uridylation of miR1510 suggests that this miRNA is possibly only partially methylated at the 3’ terminus. To test this, β-elimination was employed to assess the methylation status of miR1510 in soybean. In parallel with the oxidative NaIO4 treatment for miR1510, we tested miR166 using total RNA samples from the Arabidopsis hen1-8 loss-of-function mutants, in which miRNAs are predominantly unprotected by 2’-O-methylation at the 3’ terminus (22). The treated samples showed a shifted band running ~2 nt faster than the untreated miR166 band in the hen1-8 background (Fig. S4), confirming that the β-elimination treatment was successful. For soybean miR1510, we found that an additional 19 nt band, although weak, appeared after the treatment, as revealed by small RNA gel blotting (Fig. 3). This result suggested that the 21 nt isoform of miR1510 is partially methylated. Unexpectedly, β-elimination did not generate an increased intensity of the 20 nt band for miR1510 in treated samples (Fig. 3), indicating that the 22 nt monouridylated miR1510 variant is protected by 2’-O-methylation, presumably by the methyltransferase activity of HEN1. This result also suggests that the methylation after uridylation occurs on AGO-bound miRNAs, because of observations that uridylated miRNAs are bound by Argonaute proteins (24, 25, 32). Previous studies in Drosophila melanogaster showed that its HEN1 homolog, DmHen1, methylates PIWI-bound piRNAs and Ago2-bound siRNAs (33, 34), implying that HEN1 may methylate AGO-bound small RNAs in plants.
miR1510 is a young miRNA, specific to the Phaseoleae tribe of legumes
The unique features of miR1510 led us to investigate the evolution of this miRNA in plants. Using MIR1510a/b precursor sequences from soybean to search the currently available sequenced genomes of legume species, we found that only a few legume species contain precursor sequences, including wild soybean (Glycine soja), common bean (Phaseolus vulgaris), pigeon pea (Cajanus cajan), and mung bean (Vigna radiata). A further check of these species revealed that they all belong to the Phaseoleae tribe of the Papilionoideae subfamily, which diverged ~41 to 42 million years ago (MYA) (Fig. 4A) (35). In miRBase (release 21), we observed that records exist for miR1510a/b in Medicago truncatula, and therefore we compared their precursor sequences with those in soybean. Sequence alignment showed that the miR1510 precursor sequences and even the mature miRNAs in Medicago and soybean differ substantially, indicating that miR1510 in Medicago has a different origin and is essentially a distinct miRNA (Fig. S5). Compared with other NB-LRR-targeting miRNAs such as the ancient miR482/2118 superfamily (36, 37), miR1510 is an evolutionarily young miRNA, specific to the Phaseoleae tribe of legumes.
Abundant 22 nt form of miR1510 in Phaseoleae species
The conservation of miR1510 in Phaseoleae species raised the question of whether this miRNA is commonly 22 nt because of uridylation. We examined this both by checking published small RNA sequencing data for common bean and by preparing small RNA libraries from the leaf tissue of species without published datasets, including pigeon pea and mung bean. Truncation and tailing analysis in common bean indicated that a considerable proportion of miR1510 is uridylated to 22 nt (Fig. 4B). In contrast, pigeon pea has a larger fraction (~50%) of monouridylated miR1510 compared with common bean (Fig. 4B). The variation in the levels of 22 nt miR1510 in diverse tissues and species possibly results from differences in levels of miR1510 and HESO1 across tissues, or from altered activities of HEN1 in these species.
The processing of miR1510 in mung bean seems different from that in other Phaseoleae species. Small RNA sequencing data showed that the mature miR1510 in mung bean contains a 2 to 3 nucleotide shift compared with other species. Interestingly, in mung bean, miR1510 is likely processed directly into 22 nt by DCL1, because the 21 nt miR1510 read was absent in the small RNA data for mung bean. In addition, the 22 nt forms of both miR1510 and miR1510* form a duplex with a 3’ single nucleotide overhang, which is atypical for plant miRNAs (Fig. S6). Therefore, miR1510 in mung bean is 22 nt in length, which derives from direct processing by DCL1, without uridylation. Taken together, the 22 nt forms of miR1510 are universally abundant in Phaseoleae species and predominantly generated by uridylation, although miR1510 is likely directly processed into 22 nt by DCL1 in mung bean.
A previous study integrating small RNA and PARE data reported that miR1510 triggers phasiRNAs from 20 NB-LRRs in soybean (29). We reexamined the data and checked how the 22 nt miR1510 pairs with these targets. Among the 20 reported genes, 11 passed the default filter of psRNATarget (38), which assesses base pairing between miRNAs and target mRNAs. Eight out of the 11 targets of 22 nt miR1510 showed 3’ terminal pairing, including both A:U and G:U wobble (Table S1). This result is consistent with previous reports that the terminal pairing of a 22 nt miRNA is important for phasiRNA production (15, 39), suggesting that 22 nt miR1510 might be under selection for the capacity to trigger phasiRNAs.
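The 3’ terminal pairing check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the psRNATarget algorithm: the miRNA and the target site are both written 5’ to 3’ and are assumed to be aligned without gaps, so that the miRNA 3’ terminal nucleotide sits opposite the first nucleotide of the target site. The function name terminal_pairing and the example sequences are hypothetical.

```python
# Canonical and wobble pairs, written as (miRNA nucleotide, target nucleotide).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def terminal_pairing(mirna, target_site):
    """Classify the pairing at the miRNA 3' terminus.

    Both sequences are RNA, written 5'->3'. Because the duplex is
    antiparallel, the last miRNA nucleotide faces the first nucleotide
    of the target site (assumption: ungapped alignment).
    """
    pair = (mirna[-1], target_site[0])
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


# Toy sequences (assumption: not real miR1510 or NB-LRR sequences).
print(terminal_pairing("ACGU", "ACGU"))  # 3' U opposite A
print(terminal_pairing("ACGU", "GCGU"))  # 3' U opposite G
print(terminal_pairing("ACGU", "CCGU"))  # 3' U opposite C
```

For a miRNA ending in a 3’ uridine, such as the monouridylated 22 nt isoform, this classification returns a Watson-Crick pair when the target carries an A at the facing position and a wobble pair when it carries a G, matching the two pairing types counted in Table S1.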
Terminal mispairing in the miR1510/miR1510* duplex inhibits HEN1 methyltransferase activity in vitro
Knowing that 21 nt miR1510 is incompletely methylated at its 3’ terminus, we speculated that the activity of HEN1 might be inhibited, resulting in uridylation. Previous studies showed that HEN1 recognizes miRNA/miRNA* duplexes and deposits a methyl group at the 3’ terminus (20, 21). Thus, we reasoned that miR1510/miR1510* structure might hamper the activity of HEN1. In examining the secondary structure of the miRNA precursors, we found that all secondary structures predicted for miR1510 precursors contain a mismatch at the 5’ terminal nucleotide of miR1510* (Fig. S7). This mismatch would introduce a terminal mispairing in the miR1510/miR1510* duplex after DCL1 processing that could potentially interfere with HEN1 activity.
To further test if the terminal mispairing of the miR1510/miR1510* duplex inhibits HEN1 activity, we conducted in vitro assays. We annealed synthetic RNA oligonucleotides of miR1510 (oligo #1) and miR1510* (oligo #2); one variant included a mutated miR1510* (oligo #3) to form a terminal pairing structure with miR1510 (Fig. 5A). These two different types (“mismatch” and “match”) of duplexes were then incubated with purified recombinant GST-HEN1 protein (Fig. S8A) in the presence of S-adenosyl methionine (SAM). We digested the RNA into single nucleotides using nuclease P1 followed by dephosphorylation with alkaline phosphatase, and then analyzed the ratios of 2’-O-methylated cytidine (Cm) to guanosine (G) by liquid chromatography with tandem mass spectrometry (LC-MS/MS). We found that the HEN1 activity was considerably lower (~50%) for the “mismatch” form of the duplex (Fig. 5B), which is the natural soybean miR1510/miR1510* duplex structure, than the mutated “match” form, indicating that the terminal mismatch can indeed inhibit HEN1 methyltransferase activity. In addition, we observed that HEN1 activity is extremely low but detectable for single-stranded RNA oligos (Fig. S8B), consistent with prior work (20).
Because both miR1510 and miR1510* possess cytidines at their 3’ termini, we could not distinguish from which strand the Cm originated. We therefore investigated whether HEN1 methyltransferase activity was (a) equally inhibited at both strands by the terminal mismatch, possibly because of weakened HEN1 binding, or (b) reduced exclusively at the miR1510 strand. This experiment utilized oligo #4, which incorporated a terminal adenosine instead of cytidine (Fig. 5C). LC-MS/MS results showed similar ratios of Cm/G for both “mismatch” and “match” forms of duplexes at all examined time points (Fig. 5D, left panel), demonstrating that the methylation of the miR1510* strand was not disrupted by the mismatch on the other terminus. In contrast, the Am/G ratio was consistently much lower (~10%) in the “mismatch” form (Fig. 5D, right panel), demonstrating that the natural terminal mispairing largely reduced the catalytic rate of HEN1 on the miR1510 strand, consistent with the gel blot result showing that 21 nt miR1510 is partially methylated (Fig. 3). Moreover, our experiments demonstrated that the terminal mismatch specifically inhibits HEN1 methyltransferase activity at the corresponding duplex overhang, instead of weakening HEN1 binding to the RNA duplex to inhibit methylation at both strands.
DISCUSSION
Soybean is a paleopolyploid that experienced two rounds of genome duplication about 59 and 13 million years ago (30), resulting in over 500 NB-LRRs in the genome (19). miR1510 targets NB-LRRs in soybean, and with >100 predicted targets is the major miRNA targeting this family, triggering the production of phasiRNAs from transcripts of many genes (19). The 21 nt isoform of miR1510 was believed to be the trigger of phasiRNAs in soybean; however, this is inconsistent with previous studies showing that the 22 nt length of miRNAs is required for phasiRNA production (10, 11). We found that soybean contains high levels of a 22 nt isoform of miR1510, a length that is able to trigger phasiRNA production. The 22 nt isoform was largely missed in previous analyses, probably because the additional 22nd nucleotide, a uridine, prevents it from mapping to the soybean genome (Fig. 1).
We investigated the biogenesis of the 22 nt form of miR1510, and found that it results from monouridylation. β-elimination showed that the 21 nt, processed form of miR1510 is partially methylated, making it susceptible to 3’ modification and subsequent uridylation (Fig. 3). Genetic analysis by transforming MIR1510a into both heso1 and urt1 mutant backgrounds revealed that HESO1, but not URT1, is responsible for the monouridylation of miR1510 in vivo. We found that miR1510 is generally uridylated among Phaseoleae species, and that the secondary structures of miR1510 precursors in different species generate a terminal mispairing in the miR1510/miR1510* duplex that inhibits HEN1 methyltransferase activity, resulting in HESO1-mediated uridylation. Indeed, HEN1 methyltransferase activity assays in vitro showed that the innate terminal mispairing of miR1510/miR1510* predominantly inhibits HEN1 methyltransferase activity at the miR1510 3’ terminus, but not at that of miR1510*. However, the methyltransferase activity was partially (~10%) maintained for miR1510, consistent with the endogenous methylation levels of 21 nt miR1510 in soybean (Fig. 3). We have integrated these observations in a model in which the precursors of miR1510 in Phaseoleae species contain this mispairing at a conserved position, yielding a 3’ terminal mispairing in miR1510/miR1510* duplexes after DCL1 processing (Fig. 6). The terminal mispairing in a duplex inhibits HEN1 methyltransferase activity for miR1510, but not the miR1510* strand. The methylation, Argonaute loading, and uridylation steps are tightly coordinated. miR1510 molecules lacking 2’-O-methylation, or those undergoing delayed methylation by HEN1, are loaded into AGOs and thereby uridylated by HESO1, converting miR1510 to 22 nt, as prior work showed that HESO1 uridylates AGO-bound miRNAs in vitro (25).
The AGO-bound 22 nt miR1510 resulting from uridylation might undergo methylation by HEN1, maintaining its stability, because we observed that the 22 nt isoform of miR1510 is fully methylated (Fig. 3). Therefore, the pathway that we demonstrate here is a coordinated process, coupling miRNA biogenesis and modifications.
Systematic characterization of MIR1510 genes in sequenced plant genomes showed that miR1510 is lineage-specific to the Phaseoleae tribe of legumes, which diverged ~41 to 42 MYA among the Papilionoideae subfamily. Thus, miR1510 is a relatively young miRNA in evolutionary terms, and variation in its processing/maturation steps may reflect a process of ‘optimization’. While most tested Phaseoleae species, including soybean, common bean, and pigeon pea, display the same mechanism to generate the abundant 22 nt isoform of miR1510, the biogenesis of miR1510 in mung bean seems distinct. Sequencing data showed that miR1510 in mung bean (“vra-miR1510”) is processed directly to 22 nt due to a shift in cleavage sites, although the mispairing at the same position was still retained in mung bean (Fig. S6). The direct processing of vra-miR1510 to 22 nt is supported by the absence of detectable levels of its 21 nt isoform, which is found at considerable levels as the un-uridylated form in other Phaseoleae species we examined. The recent evolutionary origin of miR1510 may explain its plasticity in DCL1 processing observed across Phaseoleae species, specifically the loss of a requirement for uridylation in mung bean to produce the 22 nt isoform.
miRNAs are important regulators of plant NB-LRR disease resistance genes. Evolutionary analysis demonstrates that MIRNA genes are occasionally generated from their target genes as a result of small-scale genome rearrangements such as duplications forming inverted repeats (44). Expanded counts of NB-LRR in plant genomes may be balanced by the emergence of miRNAs that target them (37), perhaps to minimize fitness costs and to avoid autoimmune responses (1). These miRNAs are often 22 nt, a length endowed with the property of triggering phasiRNAs, thereby reinforcing the efficacy of silencing via the secondary siRNAs which may have additional targets and mobility, potentially functioning to maintain a basal level of NB-LRR gene expression (16, 17). The 22 nt isoform of miR1510 found in the Phaseoleae is special in terms of its biogenesis, requiring monouridylation to achieve its length. Considering that this pathway involves multiple steps and that a considerable abundance of the 21 nt isoform remains detectable (i.e. the pathway is inefficient), an evolutionary advance would be the direct processing by DCL1 of MIR1510 precursors into a 22 nt isoform – exactly as observed in mung bean. Therefore, we believe that via comparative genomic analysis, we have captured evidence of selection optimizing the processing of a plant miRNA precursor.
EXPERIMENTAL PROCEDURES
Plant materials, vector construction and transformation, β-elimination, RT-PCR, small RNA sequencing and bioinformatics analysis, HEN1 purification and in vitro methyltransferase activity assays, LC-MS/MS, and primers and probes used in this study are described in SI Materials and Methods.
AUTHOR CONTRIBUTIONS
B.C.M. and X.C. conceived the experiments. Q.F., Y.Y., L.L., P.B., and Q.D. performed experiments. Q.F. and Y.Z. conducted bioinformatics analyses. Q.F. and B.C.M. wrote the manuscript, with contributions from all authors.
ACKNOWLEDGEMENTS
We thank Mayumi Nakano, S. Deepthi Ramachandruni, and Parth Patel for assistance with data handling and data visualization. We thank Dr. Scott Jackson for sharing the mung bean seeds. This work was supported by the US National Science Foundation award #1257869 from the Division of Integrative Organismal Systems (IOS) in the Meyers lab. Research in the Chen lab is supported by grants CA-R-BPS-5084-H and 2010-04209 from USDA-NIFA.