Abstract
Until recently, Transposable Elements (TEs) were regarded as selfish genetic components capable only of copying themselves to increase the odds of being inherited. Nonetheless, TEs were initially proposed as positive control elements acting in synergy with the host. It is now well established that TE movement within the host genome is an important evolutionary mechanism capable of producing diverse chromosome rearrangements and thereby increasing adaptive fitness. As insights into TE function have accumulated, manipulating transposition has emerged as an attractive way to tune host functions, although the lack of appropriate genome engineering tools has hindered it. Fortunately, the emergence of genome editing technologies based on programmable nucleases, and especially the arrival of the multipurpose RNA-guided Cas9 endonuclease system, has made it possible to reconsider this challenge. For this purpose, a particular type of transposon known as Miniature Inverted-repeat Transposable Elements (MITEs) displays a series of characteristics that are attractive for designing functional drives. Here, recent insights into MITEs and the versatile RNA-guided CRISPR/Cas9 genome engineering system are reviewed to outline an effective strategy for deploying the potential of TEs to control host transcriptional activity.
Introduction
Over the last decades, diverse genome engineering strategies that manipulate Transposable Elements (TEs) have been devised to achieve site-specific integration of foreign DNA into host genomes. To this end, the naturally occurring mechanism of transposition was exploited to develop vectors that allow efficient gene transfer and integration into genomic insertion sites1-3. Moreover, zinc-finger (ZF) and transcription activator-like effector (TALE) programmable nucleases, which can be redesigned to target specific genome sequences, have also been employed to develop tools for targeted genome integration. ZF- and TALE-based constructs have important applications in genome engineering, reverse genetics and targeted transgene integration strategies4,5. More recently, an RNA-guided system based on the prokaryotic Type II clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated Cas9 endonuclease (CRISPR/Cas9) immune system has emerged as a more efficient, easier to use and lower cost genome editing tool able to produce desired genomic modifications6-8 (Figure 1).
The CRISPR/Cas9-mediated genome editing system uses a synthetic single guide RNA (sgRNA) to direct Cas9 to produce targeted DNA Double Strand Breaks (DSBs) that are subsequently repaired by endogenous DNA repair mechanisms9,10. Specifically, the DSBs are repaired by either the non-homologous end-joining (NHEJ) or the homology-directed repair (HDR) pathway. Depending on which repair mechanism is activated, site-specific modifications involving gene disruption, gene replacement or nucleotide substitution can be efficiently generated. NHEJ is prone to produce indel mutations at target sites, whereas supplying DNA templates that harbor the desired sequences makes it possible to generate specific genome modifications through HDR-mediated repair11. Alternative genome engineering strategies based on nuclease-dead Cas9 (dCas9) and Cas9 nickases fused with modular transcriptional domains (activators and/or repressors), chromatin remodelers and fluorophores also enable efficient transcriptional control, site-specific chromatin modification and visualization of loci, respectively12. The RNA-guided CRISPR/Cas9 system has dramatically improved our ability to edit genomes both in vitro and in vivo in many organisms, and it is increasingly employed in biotechnology and therapeutics.
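The sgRNA-directed targeting described above can be illustrated with a minimal sketch that lists candidate Cas9 target sites in a DNA sequence. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nt target; the PAM, the 20-nt length and the function names are assumptions introduced for illustration and are not taken from the works cited here. Off-target scoring is deliberately left out.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_targets(seq: str, protospacer_len: int = 20):
    """List candidate target sites as (strand, start, protospacer, pam).

    Assumption: an SpCas9-style NGG PAM directly 3' of the protospacer.
    For the '-' strand, `start` is the offset on the reverse-complemented
    sequence, not on the input sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    # Toy sequence: 20 nt followed by a TGG PAM.
    for hit in find_cas9_targets("A" * 20 + "TGG"):
        print(hit)
```

In practice, each candidate protospacer would still need to be checked for specificity against the whole genome before being used as an sgRNA.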
RNA-guided Cas9 engineering system for design of homing endonuclease gene drives
Diverse endogenous genome elements, including homing endonuclease genes (HEGs) and Transposable Elements (TEs), are able to exploit the host machinery to increase the odds that they will be inherited. Such elements have been referred to as “gene drives”, that is, parasitic genomic elements able to spread in the host genome by copying themselves into target sequences [13-15]. From the beginning, it was believed that these elements might be used to design genome engineering strategies that co-opt endogenous host molecular mechanisms. With this aim, Burt13 suggested a versatile and promising procedure based on homing endonuclease gene (HEG) drives that would address a wide range of ecological and biotechnological issues (Figure 2); however, several technical constraints hindered its effective design. Even though Burt's proposal was not implemented at the time, his revolutionary idea anticipated the development of outstanding future applications. Fortunately, the emergence of the adaptable CRISPR/Cas9 genome editing system has overcome these methodological restrictions, and gene drives have already been successfully engineered: Cas9-based drives in yeast (Saccharomyces cerevisiae) biased inheritance in their own favor16 and exhibited remarkably high transmission rates (99%). Even more importantly, strategies based on CRISPR/Cas9-mediated gene drives have been shown to have the potential to alter the evolution of natural populations17, which illustrates the outstanding power of these emerging technologies and underlines the imperative need for public debate before any actual use18,19.
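To illustrate how strongly a biased transmission rate such as the 99% quoted above can affect inheritance, the following minimal sketch iterates a deterministic one-locus model. It is a simplified illustration under explicit assumptions (random mating, an infinite population, no fitness cost and no resistance alleles), not the model used in the cited studies; the function names are hypothetical. With drive allele frequency q, homozygous carriers (q^2) always transmit the drive and heterozygotes (2q(1-q)) transmit it with probability d, so the next-generation frequency is q' = q^2 + 2q(1-q)d.

```python
def next_drive_frequency(q: float, transmission: float = 0.99) -> float:
    """One generation of the recursion q' = q^2 + 2*q*(1-q)*d.

    Assumptions: random mating, infinite population, no fitness cost,
    no resistance alleles. `transmission` (d) is the probability that a
    heterozygote passes the drive allele to its offspring.
    """
    return q * q + 2.0 * q * (1.0 - q) * transmission


def simulate_drive(q0: float, transmission: float = 0.99, generations: int = 10):
    """Return the drive allele frequency for generation 0..generations."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_drive_frequency(freqs[-1], transmission))
    return freqs


if __name__ == "__main__":
    for generation, freq in enumerate(simulate_drive(0.01, generations=20)):
        print(generation, round(freq, 4))
```

With Mendelian transmission (d = 0.5) the recursion reduces to q' = q and the frequency does not change, whereas with d = 0.99 a drive allele starting at 1% exceeds 99% within 20 generations under these idealized assumptions.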
The CRISPR system is associated with the functioning of Transposable Elements
Interestingly, several parallels between the CRISPR/Cas9 system and the eukaryotic RNA interference (RNAi) mechanism have been identified20-22. In nature, CRISPR loci contain arrays of Direct Repeats (DRs) associated with spacer sequences that are likely derived from bacteriophages or plasmids (see Figure 1a). The CRISPR DRs act as Cas9-mediated cleavage sites, whereas invasion-acquired spacers mediate immune responses to reinvasion of the host, thereby mimicking eukaryotic posttranscriptional gene silencing (PTGS) RNAi mediated by small RNAs (sRNAs). Like CRISPR DRs, TEs may also be flanked by cleavage sites recognized by particular enzymes (transposases) that mediate their transposition. Essentially, there are two types of TE flanking sequences, Long Terminal Repeats (LTRs) and Terminal Inverted Repeats (TIRs), which allow TEs to be classified into Class I and Class II, respectively. Alternatively, TEs are categorized according to their transposition intermediates into RNA-mediated (Class I) and DNA-mediated (Class II) elements, which are transposed through “copy-paste” and “cut-paste” mechanisms, respectively (for review see Casacuberta and Santiago23). Remarkably, flanking CRISPR DRs derived from insertions of particular Class II TEs known as Miniature Inverted-repeat Transposable Elements (MITEs) have been discovered24. MITEs are non-autonomous DNA (Class II) TEs usually ~100-800 bp long, flanked by TIRs and adjacent to Target Site Duplications (TSDs) (Figure 3). Resembling the functioning of CRISPR systems during bacteriophage infection, MITEs can also generate sRNAs that mediate the RNAi silencing pathway in response to stress and environmental (hormone) signals25. Taken together, these observations suggest that both the evolution and the functioning of natural CRISPR systems are related to the behavior of TEs towards their hosts.
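The structural features of MITEs listed above (a length of roughly 100-800 bp, terminal inverted repeats, and short duplicated sites flanking the insertion) can be expressed as a simple check. The sketch below is a rough illustration only: the TIR length, the 2-bp duplication length, the mismatch tolerance and all function names are assumptions chosen for the example, and real MITE annotation tools apply considerably more elaborate criteria.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def looks_like_mite(left_flank: str, element: str, right_flank: str,
                    tir_len: int = 10, dup_len: int = 2,
                    min_len: int = 100, max_len: int = 800,
                    max_tir_mismatches: int = 0) -> bool:
    """Heuristic test for MITE-like structure (all thresholds are assumptions).

    Checks that (1) the element length is within [min_len, max_len],
    (2) the first tir_len bases match the reverse complement of the last
    tir_len bases, and (3) the dup_len bases directly flanking the
    element on each side are identical (a duplicated target site).
    """
    element = element.upper()
    if not (min_len <= len(element) <= max_len):
        return False
    if len(left_flank) < dup_len or len(right_flank) < dup_len:
        return False
    # Terminal inverted repeats: the 5' end should mirror the 3' end.
    five_prime = element[:tir_len]
    three_prime = reverse_complement(element[-tir_len:])
    mismatches = sum(a != b for a, b in zip(five_prime, three_prime))
    if mismatches > max_tir_mismatches:
        return False
    # Duplicated target site: same short sequence on both sides.
    return left_flank[-dup_len:].upper() == right_flank[:dup_len].upper()


if __name__ == "__main__":
    tir = "GGCCAGTCAC"  # hypothetical TIR, for illustration only
    candidate = tir + "AT" * 50 + reverse_complement(tir)
    print(looks_like_mite("CCGTA", candidate, "TAGGC"))
```

A candidate that passes such a check would still need to be confirmed, for example by showing that the same element occurs in multiple copies across the genome.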
Indeed, MITE-derived CRISPR DRs indicate that site-specific TE insertions have contributed to the evolution of CRISPR arrays24. Diverse types of TEs exhibit preferred insertion targets, and insertion hot spots have been characterized for many TE families26-28. MITEs, in particular, often exhibit preferential transposition into AT dinucleotide and ATT trinucleotide genome signatures29,30.
From an evolutionary point of view, the transposition mechanism by which TEs are scattered throughout host genomes challenged the mistaken concept of the genome as a fixed, immutable entity. It is now well established that transposition can be an important adaptive mechanism capable of providing both heritable and non-heritable variability, a source of critical phenotypic plasticity. Under particular environmental conditions, active transposases mediate transposition, which may positively affect gene expression in the host and thereby generate more adaptable phenotypes. For example, Class I retrotransposon sequences are activated during human neuronal differentiation, and the consequent amplification triggers chromosome rearrangements capable of conferring somatic plasticity31,32. In maize, Qüesta and colleagues33 suggested that UV-B radiation induces transposition via modulation of chromatin structure, thereby generating genome variation. Moreover, in Arabidopsis, it has been shown that heat stress conditions induce the transposition of the ONSEN (Class I) copia-like retrotransposon, and that TE accumulation was enhanced in mutants deficient in plant small interfering RNA (siRNA)34,35, which evidenced the critical role of RNAi pathways during stress-mediated transposition. Indeed, TE mobilization and epigenetic gene regulation mechanisms are deeply interconnected, and it is now well established that transposition may result in epigenetic modifications associated with more adaptable, stress-tolerant phenotypes36-38.
Control of host transcription by induced TE drives
Targeted insertional mutagenesis has emerged as an important strategy for deciphering gene function by inducing large-scale mutations in loci of interest. In cancer modeling, Sleeping Beauty DNA TE-based systems have been used to induce somatic-specific mutagenesis and thus to identify essential genes involved in tumorigenesis39. In plants, a transposon-tagging tool for genome-wide analysis based on the rice mPing MITE was used in transgenic soybean40, and this system showed preferential insertion both near genes and into AT-rich sequences. Hancock and colleagues40 observed increased transpositional activity during specific developmental stages (cotyledon vs. globular stage), which suggests that insights into the developmental regulation pathways involved might be used to control transposition. Both results indicate how genome fluidity can be efficiently manipulated via TE-based systems. In this sense, new molecular breeding strategies based on the induced activation of Class I retrotransposons have already been suggested41. As mentioned before, TEs represent an important source of phenotypic plasticity, which is particularly interesting given the potential to direct transposition through the application of external stimuli. We could, for example, engineer TE-based strategies to induce transcriptional control of target genes (see Figure 3b). For this purpose, the CRISPR/Cas9 genome editing system would allow HDR-mediated insertion preferentially upstream of the target genes whose transcriptional regulation is desired. Alternatively, RNA-guided genome engineering constructs based on Cas9 fused with transposase enzymes might likewise be employed to direct the insertion of target sequences. Interestingly, targeted transposition using ZF/TALE-piggyBac transposase complexes has already been reported42-44.
In conclusion, RNA-guided Cas9 genome editing tools can be used to direct the mechanism of transposition, while inducible (environment/stress) conditions may allow simultaneous TE activation. Together, these make it possible to design targeted and switchable genome engineering strategies to control transcription.
To understand how TE-based strategies could be used for effective control of transcription, insight into the particular type of TE involved is essential. In this work, MITEs are described, since these Class II DNA transposons exhibit a series of characteristics that make them especially suitable for designing TE-based transcription drives. (1) MITEs are abundant repeat elements in eukaryotic genomes and they play critical roles during genome evolution45,46. For example, MITEs are the most common type of TE in the rice genome47. (2) Insertion of MITEs can both upregulate and downregulate the expression of nearby genes48. (3) The high number of MITEs, as well as the high level of sequence conservation among MITE subfamilies, suggests that they have been amplified from a few elements, which is characteristic of Class I TEs23. (4) Several active MITEs have been discovered, including, among others, mPing30 and mGing49 in rice, Stowaway50 in potato and AhMITE151 in peanut. (5) MITE amplification can be efficiently induced under appropriate conditions49. (6) MITEs exhibit targeted integration and are preferentially inserted close to genes45,52. For instance, the Tourist-like and Stowaway-like plant MITE superfamilies exhibit preferential insertion into particular TAA and TA target sequences, respectively47. (7) Since they do not encode the enzymes required for their own transposition, MITEs are usually shorter than Class I and autonomous DNA TEs, and are therefore easier to manipulate when designing efficient TE-based genome engineering strategies.
In silico analyses show that MITE sequences are involved in RNA-mediated gene regulation53, either through hairpin-like miRNA precursors (pre-miRNA)54 or through siRNA biogenesis55. In addition, MITE insertions can also mediate gene silencing by epigenetic mechanisms such as repressive DNA and histone methylation. MITE-mediated gene upregulation has also been reported, and gene activation through regulatory motifs or epigenetic-dependent mechanisms has been proposed36,56-59. In crops, these elements can modulate the transcriptional activity of essential genes25,38,58,60,61 and, therefore, MITE insertions may represent an important source of genetic variation to be considered for the improvement of agronomic traits (e.g., water stress tolerance, yield, disease resistance, quality). Genome engineering technologies such as transgenesis and mutagenesis have also been widely adopted as useful tools for crop improvement. By comparison, MITE-based transcription drives might offer an important advantage, because this strategy alone would provide inducible, non-heritable control of transcriptional activity (both up and down) that is applicable to different cells, tissues and organs.
Concluding remarks
Most eukaryotic genomes are littered with TEs, and it is now well established that such elements have played critical roles in the evolution of host genomes. Although their mode of action resembles that of selfish parasitic elements, TE dispersal may also be beneficial for their hosts. Recent insights into the biology of TEs revive the original proposal that environment-induced TE dispersal might be an important adaptive mechanism62, and we now know that it is a critical regulatory mechanism for distributing both gene motifs and the diverse types of genome sequences involved in the establishment of epigenetic profiles. Indeed, the amplification of TEs induced under certain conditions might lead to concerted control of non-linked genes involved in the same gene regulatory pathways, and might even generate de novo regulatory networks63. The currently active mPing elements are a clear example of how TEs can act synergistically with hosts to modulate their functioning. The rice mPing family is present at high copy number in the genome and exhibits preferential insertion into AT-rich, gene-rich regions, avoiding exons and favoring promoter regions. These MITEs are able to either upregulate or downregulate the expression of genes according to the location of the insertions, although more than 80% of mPing insertions did not exhibit detectable effects on the expression of nearby genes64. Interestingly, Naito and colleagues65 proposed that active mPing elements should be benign to hosts, so that their amplification might result in the selective control of gene expression.
On the other hand, it has been shown that transgenic mPing-based transposon tagging systems can also remain active in the soybean genome40. The functional characterization of rice mPing elements will therefore be useful for designing effective genome engineering strategies involving not only endogenous but also exogenous control of host functions.
Transposon-based vectors are effective tools for achieving targeted mutagenesis (gene disruption/replacement). Moreover, the emergence of the versatile CRISPR/Cas9 genome editing system has made it possible to overcome long-standing technical limitations and to take the next step towards the design of more efficient TE-based targeting systems. Preferential integration induced through the application of particular stress/environmental treatments represents a key milestone for maximizing the versatility of such systems, since it allows the natural potential of TEs to control the host to be exploited at will. Further advances in CRISPR research will surely allow the development of new CRISPR/Cas9-based technologies that overcome the challenges associated with the design of these strategies, as well as those posed by the RNA-guided system itself (off-target effects, targeting specificity, etc.). Even though there is still much to learn and achieve, it should be noted that MITEs present a series of desirable characteristics worth considering in the engineering design of environment-induced TE-based drives. In particular, success will depend on the ability to induce effective activation of TEs. In this respect, mPing elements could be diamonds in the rough, since it has been shown that amplification of this MITE family can be efficiently triggered by different treatments such as cell culture30,66, hybridization67 or pressurization68. As with HEG-mediated gene drive strategies, issues related to cutting specificity, copying efficiency and stability of TE drives should also be considered, and CRISPR/Cas9 genome engineering tools are well suited to addressing these methodological challenges. Where cell-specific regulation of gene expression is the goal, TE-based drives might represent an efficient and non-heritable method of transcriptional control.
It should be noted that engineering strategies based on transcription drives would make it possible to address important ecological and biotechnological issues. In crops, the climatic fluctuations associated with global warming require the design of molecular breeding strategies tailored to optimize plant development under such scenarios of change37. In this regard, it has already been proposed that TE amplification may offer a way of generating functional genetic diversity in the face of ever-changing environments64. Thus, induced activation of TE-based gene drives might be a useful approach to control the transcriptional activity of target genes whose expression has been positively correlated with the development of quantitative (complex) traits such as biomass, seed composition, yield or disease resistance. In addition, it would be feasible to engineer ecological TE-based drives in crops which, once activated under conditions of excessive precipitation, might upregulate the activity of target evapotranspiration genes, thereby providing a biologically versatile and useful strategy to drain surplus water from floodplain areas. On the other hand, since TEs have been found to be associated with diverse host genome mechanisms including, among others, chromatin rearrangements, telomere maintenance and gene duplication, similar strategies could eventually be applied to exploit such functions. Given the impetus to develop these applications, transposon-mediated drives will likely be designed as our knowledge of TE function becomes sufficient to support them. In conclusion, even though TE-based transcription drives and related upcoming technologies present important challenges to be faced and overcome, the emergence of effective genome engineering strategies that exploit such systems is only a matter of time.