Abstract
Phosphorus is an essential nutrient for all plants, but is one of the least mobile, and consequently least available, in the soil. Plants have evolved a series of metabolic and developmental adaptations to increase the acquisition of phosphorus and to maximize the efficiency of use within the plant. In Arabidopsis (Arabidopsis thaliana), the PHO1 protein regulates and facilitates the distribution of phosphorus within the plant. To investigate the role of PHO1 in maize (Zea mays), we searched the B73 reference genome for homologous sequences and identified four genes that we designated ZmPho1;1, ZmPho1;2a, ZmPho1;2b and ZmPho1;3. ZmPho1;2a and ZmPho1;2b are the most similar to AtPho1, and represent candidate co-orthologs that we hypothesize to have been retained following a whole genome duplication. Tissue- and phosphate-specific differences in the accumulation of ZmPho1;2a and ZmPho1;2b transcripts suggest functional divergence. The presence of phosphate-regulated anti-sense transcripts derived from both ZmPho1;2a and ZmPho1;2b suggests the possibility of regulatory crosstalk between paralogs. To fully characterize functional divergence between ZmPho1;2a and ZmPho1;2b, we conducted Ds transposon mutagenesis and describe here the generation of novel insertion alleles.
Introduction
Phosphorus (P) is an essential nutrient for all plants and a limitation on productivity in many agricultural systems [1]. Current levels of agricultural phosphorus inputs are recognized to be both unsustainable and environmentally undesirable [2]. Rational strategies to improve P efficiency in agricultural systems demand a greater understanding of P relations in crop plants, both in terms of P uptake from the soil and P translocation and use within the plant.
The protein PHO1 has been characterized from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) to play a key role both in the export of inorganic P (Pi) into the xylem apoplast for translocation [3] and in the modulation of long-distance signals underlying the P-deficiency response [4]. The Arabidopsis Atpho1 mutant exhibits severe Pi deficiency in the shoots but normal Pi content in the roots, displaying associated symptoms of phosphate deficiency, including reduced growth rate, thinner stalks, smaller leaves, very few secondary inflorescences, delayed flowering and elevated levels of anthocyanin accumulation [3]. The ortholog of AtPHO1 in rice has been designated OsPho1;2 [5]. Disruption of OsPHO1;2 results in a phenotype very similar to that of the Atpho1 mutant [5], suggesting that the two genes are functionally equivalent. Indeed, expression of OsPHO1;2 in the Atpho1 background will partially complement the mutant phenotype [6]. A feature distinguishing the rice OsPHO1;2 gene from its Arabidopsis ortholog is the production of a cis-Natural Antisense Transcript (cis-NAT) [5] [7]. Cis-NATs are defined as RNA transcripts generated from the same genomic loci as their associated sense transcript, but on the opposite DNA strand, thus generating a transcript containing a region with perfect complementarity to the sense transcript [8]. Accumulation of cis-NATOsPHO1;2 increases under P limitation, promoting translation of the OsPHO1;2 sense RNA by enhancing the efficiency of polysome loading [9].
The PHO1 protein contains two domains: the N-terminal hydrophilic SPX domain (named for the yeast proteins Syg1 and Pho81, and the human Xpr1) and the C-terminal hydrophobic EXS domain (named for the yeast proteins ERD1 and Syg1 and the mammalian Xpr1) [10]. The SPX domain is subdivided into three well-conserved sub-domains, separated from each other by regions of low conservation. SPX domain containing proteins are key players in a number of processes involved in P homeostasis, such as fine tuning of Pi transport and signaling by physical interactions with other proteins [7]. Following the SPX domain there are a series of putative membrane-spanning α-helices that extend into the C-terminal EXS domain [10]. In AtPHO1, the EXS domain is crucial for protein localization to the Golgi/trans-Golgi network and for Pi export activity. In addition, the AtPHO1 EXS domain is involved in the modulation of long-distance root-to-shoot signaling under P limitation [4].
Despite the importance of maize as a staple crop and the dependence of maize production on large-scale input of phosphate fertilizers, the molecular components of maize P uptake and translocation remain poorly characterized [11]. Although it has been possible to identify maize sequences homologous to known P-related genes from other species, functional assignment has been based largely on patterns of transcript accumulation. With the development of accessible public-sector resources, it is now feasible to conduct reverse genetic analyses in maize. Here, we aim to extend the molecular characterization of maize P response by generating mutant alleles of maize Pho1 genes using endogenous Activator/Dissociation (Ac/Ds) transposable elements. The Ac/Ds system consists of autonomous Ac elements that encode a transposase (TPase) and non-autonomous Ds elements that are typically derived from Ac elements by mutations within the TPase gene. Lacking TPase, Ds elements are stable, unless mobilized by TPase supplied in trans by an Ac. Ac/Ds elements move via a cut-and-paste mechanism [12], with a preference for transposition to linked sites [13] that makes the system ideal for local mutagenesis [14]. To exploit the system for reverse genetics, Ac and Ds elements have been distributed throughout the genome and placed on the maize physical map, providing potential “launch pads” for mutagenesis of nearby genes [15] [16].
In this study, we identify four maize Pho1-like genes in the maize (var. B73) genome, including two (ZmPho1;2a and ZmPho1;2b) that we consider co-orthologs of AtPHO1. We provide experimentally determined gene structures of ZmPho1;2a and ZmPho1;2b and characterize transcript accumulation in the roots and shoots of seedlings grown in P-replete or P-limiting conditions. We present novel insertional alleles of ZmPho1;2a and ZmPho1;2b generated using the Ac/Ds transposon system.
Materials and Methods
Identification of maize Pho1 genes
The AtPHO1 cDNA sequence (GenBank ID: AF474076.1) was used to search the maize working geneset peptide database (www.maizesequence.org) in a BLASTX search performed under the default parameters. Identified maize sequences were in turn used to reciprocally search Arabidopsis thaliana (www.phytozome.net). Four sequences were identified with a high level of similarity to AtPHO1: GRMZM5G891944 (chr 3:28,919,073-28,923,871); GRMZM2G466545 (chr 4:171,946,555-171,952,268); GRMZM2G058444 (chr 5:215,110,603-215,115,635) and GRMZM2G064657 (chr 6:122,577,593-122,582,074). The putative protein sequences were confirmed to contain canonical PHO1 domain structure by PFam analysis (pfam.sanger.ac.uk) and NCBI conserved domains search (www.ncbi.nlm.nih.gov).
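The reciprocal search described above can be sketched as a simple filter: a maize hit is retained only if its own best Arabidopsis hit falls within the PHO1 family. The sketch below is illustrative only. The gene model identifiers are those reported here, but every bit score and the two HYPOTHETICAL identifiers are invented placeholders, and the original searches were run through the cited web resources rather than with this code.

```python
# Illustrative reciprocal-search filter; every bit score is a made-up placeholder.
from typing import Dict, List, Set, Tuple

Hit = Tuple[str, float]  # (subject identifier, bit score)


def best_hit(hits: List[Hit]) -> str:
    """Return the identifier of the highest-scoring hit."""
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_candidates(forward: List[Hit],
                          reverse: Dict[str, List[Hit]],
                          accepted: Set[str]) -> List[str]:
    """Keep forward hits whose best reverse hit belongs to `accepted`."""
    kept = []
    for subject, _score in forward:
        back_hits = reverse.get(subject, [])
        if back_hits and best_hit(back_hits) in accepted:
            kept.append(subject)
    return kept


# Forward search: AtPHO1 query against maize gene models (placeholder scores).
FORWARD_HITS = [
    ("GRMZM2G466545", 900.0),
    ("GRMZM2G058444", 890.0),
    ("GRMZM5G891944", 600.0),
    ("GRMZM2G064657", 590.0),
    ("HYPOTHETICAL_MAIZE_GENE", 80.0),  # invented weak hit, for illustration
]

# Reverse search: each maize hit against Arabidopsis (placeholder scores).
REVERSE_HITS = {
    "GRMZM2G466545": [("AtPHO1", 900.0), ("AtPHO1;H1", 650.0)],
    "GRMZM2G058444": [("AtPHO1", 890.0), ("AtPHO1;H1", 640.0)],
    "GRMZM5G891944": [("AtPHO1;H1", 700.0), ("AtPHO1", 600.0)],
    "GRMZM2G064657": [("AtPHO1;H1", 690.0), ("AtPHO1", 590.0)],
    "HYPOTHETICAL_MAIZE_GENE": [("HYPOTHETICAL_AT_GENE", 300.0), ("AtPHO1", 80.0)],
}

PHO1_FAMILY = {"AtPHO1", "AtPHO1;H1"}
CANDIDATES = reciprocal_candidates(FORWARD_HITS, REVERSE_HITS, PHO1_FAMILY)
print(CANDIDATES)
```

With these placeholder values the invented weak hit is discarded because its best reverse hit lies outside the PHO1 family, leaving the four gene models listed above.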
Amplification of full length Pho1;2 cDNAs
Total RNA was extracted using Trizol-chloroform from the roots of 10-day-old B73 seedlings grown under phosphate limiting conditions (sand substrate; fertilized with 0mM PO4). 1.5μg of RNA was used to synthesize cDNA with oligo(dT) primer and SuperScript III Reverse Transcriptase (Invitrogen, Guanajuato, Mexico) in a reaction volume of 20μl. PCR amplification of full length Pho1;2 cDNAs was performed using Platinum Taq DNA Polymerase High Fidelity (Invitrogen) under the following cycling conditions: initial incubation at 95°C for 3min, followed by 35 cycles of 94°C for 30sec, 61°C for 30sec and 68°C for 30sec. Primers used are shown in Table 1.
Analysis of Pho1;2 transcript accumulation
Total RNA was extracted using Trizol-chloroform from the roots of 10-day-old B73 seedlings grown under P replete (sand substrate; fertilized with 1mM PO4) or P limiting (sand substrate; fertilized with 0mM PO4) conditions. cDNA was synthesized as described above. PCR amplification was performed under the following cycling conditions: initial incubation at 95°C for 5min, followed by 32 cycles of 95°C for 30sec, 63°C for 30sec and 72°C for 1min. Primers are shown in Table 1. Maize ubiquitin (GRMZM2G419891) was used as a control. Products were analyzed on 1.5% agarose gels.
Transposon mutagenesis
The strategy for Ac/Ds mutagenesis was as previously described [15] [16]. Genetic stocks were maintained in the T43 background, a color-converted W22 stock carrying r1-sc∷m3, a Ds6-like insertion in the r1 locus that controls anthocyanin accumulation in aleurone and scutellar tissues (Alleman and Kermicle 1993). The frequency of purple spotting in the aleurone as a result of somatic reversion of r1-sc∷m3 was used to monitor Ac activity (McClintock 1951). Donor Ac and Ds stocks were selected from existing collections [15] [16]: the element Bti31094∷Ac is placed on the B73 physical map 650.8Kb from ZmPho1;2a; the element I.S06.1616∷Ds is inserted in Intron 13 of ZmPho1;2b, and was subsequently designated ZmPho1;2b-m1∷Ds. To generate a testcross population for mutagenesis of ZmPho1;2a, 207 individuals homozygous for Bti31094∷Ac were crossed as females by T43, and rare finely spotted progeny kernels were selected for screening. To remobilize the Ds element I.S06.1616∷Ds within ZmPho1;2b-1, homozygous ZmPho1;2b-1∷Ds individuals carrying the unlinked stable transposase source Ac-immobilized (Ac-im; Conrad 2005) were used as males to pollinate T43, and coarsely spotted progeny kernels were selected for screening.
To identify novel Ac/Ds insertions into pho1;2a and pho1;2b, selected kernels were germinated in the greenhouse and DNA was isolated from pools of 18 seedlings. The candidate gene space was explored by PCR performed with a range of gene specific primers used in combination with outward-facing Ac/Ds primers. All primers are listed in Table 2. PCR reactions contained 400ng DNA and 0.25μM of each primer. Reactions for pho1;2a were performed using Platinum Taq DNA Polymerase High Fidelity (Invitrogen, NY, USA) under the following cycling conditions: denaturation at 94°C for 4 min; 30 cycles of 94°C for 30 sec, 60°C for 30 sec, 72°C for 3 min 30 sec; final extension at 72°C for 10 min. Reactions for pho1;2b were performed using Kapa Taq DNA polymerase (Kapa Biosystems, Guanajuato, Mexico) under the following cycling conditions: denaturation at 95°C for 5 min; 35 cycles of 95°C for 30 sec, 58°C for 30 sec, 72°C for 3 min 30 sec; final extension at 72°C for 5 min. Positive pools were re-analyzed as individuals, following the same cycling conditions. PCR reactions were analyzed on 1.5% agarose gels. Products from positive individuals were purified using QIAquick PCR Purification Kit (Qiagen), ligated into pGEM T-easy vector (Promega) and sequenced.
Footprint alleles were generated by using individuals homozygous for pho1;2a-m1∷Ac as males in testcrosses to T43. Rare non-spotted progeny kernels were selected. Putative Ac excision was confirmed by PCR across the insertion site using the primers shown in Table 2, with amplification by Kapa Taq DNA polymerase (Kapa Biosystems, Guanajuato, Mexico) under the following cycling conditions: denaturation at 95°C for 5 min; 35 cycles of 95°C for 30 sec, 58°C for 30 sec, 72°C for 3 min 30 sec; final extension at 72°C for 5 min. PCR products from each individual were purified using QIAquick PCR Purification Kit (Qiagen), ligated into pGEM T-easy vector (Promega, Guanajuato, Mexico) and sequenced.
Results
The maize genome contains four PHO1 homologs
To identify maize Pho1 genes, we searched the B73 reference genome (B73 reference genome v3; www.maizeGDB.org) to identify gene models whose putative protein products exhibit a high degree of similarity to the Arabidopsis protein AtPHO1 (Table 3). We identified four such maize gene models, and, on the basis of similarity to previously annotated rice genes (Secco, 2010), designated these ZmPho1;1 (GRMZM5G891944), ZmPho1;2a (GRMZM2G466545), ZmPho1;2b (GRMZM2G058444) and ZmPho1;3 (GRMZM2G064657). To investigate orthology among Arabidopsis and grass PHO1 genes, we identified additional sequences from sorghum (Sorghum bicolor) and canola (Brassica rapa), and performed a multiple sequence alignment of the putative protein sequences to generate a distance tree (Fig. 1).
In our analysis, we included from Arabidopsis only the proteins AtPHO1 and AtPHO1;H1, leaving aside a large clade of divergent, functionally distinct PHO1 proteins that are specific to dicotyledonous plants (Stefanovic, 2007). We used a PHO1 protein from the moss Physcomitrella patens to root the tree. Our analysis supported the previously reported divergence of PHO1 and PHO1;H1 clades, dating from before divergence of monocotyledonous and dicotyledonous plants (Secco, 2010; Stefanovic, 2007). Within the PHO1;H1 clade, we observed a duplication event specific to the grasses in our analysis. As a result, the three grass species each contain two co-orthologs of AtPHO1;H1, encoded by the genes annotated Pho1;1 and Pho1;3. We observed also an expansion of the PHO1;H1 clade in canola, although this expansion is lineage specific, and there is no suggestion that Pho1;H1 was not a single gene at the base of this clade. The PHO1 clade itself contains the products of single-copy Pho1/Pho1;2 sequences in all species in our analysis, with the exception of a lineage-specific duplication in maize. As a consequence, the duplicated maize genes ZmPho1;2a and ZmPho1;2b are both considered to be orthologous to AtPho1.
ZmPho1;2a and ZmPho1;2b show features of syntenic paralogs retained following genome duplication
The high degree of sequence similarity between ZmPho1;2a and ZmPho1;2b suggests that they result from a recent gene duplication event. This interpretation is supported by the observation that Pho1;2 is a single copy sequence in both rice and sorghum. It has been hypothesized that the last whole genome duplication (WGD) event in maize occurred between 5 and 12 million-years-ago, sometime after divergence from the sorghum lineage, as the result of polyploidization [17]. Following WGD, the maize genome has returned to a diploid organization, and the majority of duplicate sequences have been lost through a process known as fractionation [17]. In certain cases, however, both paralogs have been retained. To assess whether ZmPho1;2a and ZmPho1;2b might represent a pair of paralogs retained following WGD, we investigated synteny between the maize and sorghum genomic regions in which the Pho1;2 sequences are present. ZmPho1;2a and ZmPho1;2b are located on chromosomes (Chr) 4 and 5, respectively (Table 3), and the homologous gene SbPho1;2 on sorghum Chr4. The regions carrying the two maize Pho1;2 genes have been assigned previously to distinct pre-tetraploid ancestral genomes (Chr4:168,085,162‥179,711,318 to sub-genome 2; Chr5:208,925,180‥217,680,842 to sub-genome 1; [17]). Furthermore, both maize regions exhibit synteny with the region of sorghum Chr4 that carries SbPho1;2. The genomic region surrounding the Pho1;2 genes exhibits a high frequency of candidate retained paralog pairs (Fig. 2A), providing ample evidence of micro-synteny between the regions containing ZmPho1;2a and ZmPho1;2b. In all cases, sorghum and maize Pho1;2 genes are adjacent to a putative WD40 protein encoding gene, present on the opposite strand and partially overlapping the annotated 3’ UTR region of the Pho1;2 sequence (Fig. 2B), a feature not observed in the other Pho1 paralogs.
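The sub-genome assignment above amounts to an interval-containment check, which can be sketched as follows. The gene coordinates are those listed in Materials and Methods and the block coordinates are those quoted above. The sketch assumes that both sets of coordinates refer to the same assembly version, which is not stated explicitly in the text.

```python
# Interval-containment check for the two maize Pho1;2 genes.
# Assumption: gene and block coordinates share one assembly version.
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


GENES = {
    "ZmPho1;2a": Interval("4", 171_946_555, 171_952_268),
    "ZmPho1;2b": Interval("5", 215_110_603, 215_115_635),
}

SUBGENOME_BLOCKS = {
    "sub-genome 2": Interval("4", 168_085_162, 179_711_318),
    "sub-genome 1": Interval("5", 208_925_180, 217_680_842),
}


def assign_blocks(genes: Dict[str, Interval],
                  blocks: Dict[str, Interval]) -> Dict[str, List[str]]:
    """Map each gene to the names of all blocks that contain it."""
    return {
        name: [label for label, block in blocks.items() if contains(block, gene)]
        for name, gene in genes.items()
    }


ASSIGNMENT = assign_blocks(GENES, SUBGENOME_BLOCKS)
print(ASSIGNMENT)
```

Under that assumption, each gene falls inside exactly one block, and the two genes fall in different sub-genomes, as expected for a pair of paralogs retained after WGD.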
Transcripts encoded by ZmPho1;2a and ZmPho1;2b exhibit divergent patterns of expression
To determine if both ZmPho1;2a and ZmPho1;2b are potentially functional, we used RT-PCR to amplify gene-specific fragments of the annotated 3’ UTR regions of the two genes from cDNA prepared from roots or leaves of 10-day-old seedlings (B73) grown in sand, watered with either complete Hoagland solution (+P) or a modified phosphate-free Hoagland solution (-P) (Fig. 3A). Transcripts of ZmPho1;2a were detected in roots but not leaves, with an indication of greater accumulation under -P. In contrast, transcripts of ZmPho1;2b showed constitutive accumulation in both roots and leaves, indicating that the two maize Pho1;2 genes, while potentially both functional, have diverged at the level of transcript accumulation. We examined also the accumulation of SbPho1;2 transcripts in sorghum (Tx623) seedlings under the same growth conditions. Transcripts of SbPho1;2 accumulated in a pattern similar to that observed for ZmPho1;2b, suggesting that constitutive expression was the ancestral state. Subsequently, we designed additional gene-specific PCR primers to the 5’ UTR regions of ZmPho1;2a and ZmPho1;2b and used these to amplify the complete coding-sequence from cDNA and to confirm the gene-model structure.
ZmPho1;2a and ZmPho1;2b are associated with phosphate-regulated cis-natural anti-sense transcripts
Although we observed differential transcript accumulation between the maize Pho1;2 paralogs, it has been reported that OsPho1;2 is largely regulated at the post-transcriptional level by an associated cis-natural anti-sense transcript (cis-NATOsPho1;2), the accumulation of which responds more strongly to P availability than the corresponding sense transcript [5] [9]. The cis-NATOsPho1;2 transcript has been shown to act as a translational enhancer, and has been proposed to act by direct interaction with the sense transcript [9]. The rice cis-NATOsPho1;2 initiates in Intron 4 of OsPho1;2 and extends into the 5’ UTR region [9]. A putative anti-sense sequence has been annotated in the maize reference genome to initiate in a homologous position in ZmPho1;2a (maizeGDB.com; Jabnoune 2013), although, on the basis of cDNA evidence, the transcript is considerably shorter than cis-NATOsPho1;2 and extends only to Intron 2 of ZmPho1;2a (Fig. 2B). No paralogous sequence has been annotated in association with ZmPho1;2b.
To confirm the presence of cis-NAT transcripts associated with ZmPho1;2a and explore the possibility that a paralogous transcript might be generated from ZmPho1;2b, we designed gene-specific primers to the introns flanking the homologous Exons 4 and 3 of ZmPho1;2a and ZmPho1;2b, respectively, and attempted to amplify products from cDNA prepared from seedling root and leaves as described above. Products of the predicted size were amplified using both ZmPho1;2a and ZmPho1;2b primer sets, consistent with the accumulation of cis-NATs (Fig. 3B). No products were amplified from no-RT control samples (data not shown). Putative cis-NAT products were sequenced and confirmed to originate from the ZmPho1;2a and ZmPho1;2b genes. We had not observed evidence of accumulation of alternatively or partially spliced transcripts in our amplification of the full length coding sequence.
The accumulation of cis-NATZmPho1;2a was observed to be induced under low P in both roots and leaves. The transcript cis-NATZmPho1;2b was observed to accumulate in roots but not leaves, with no response to P availability, providing further evidence of functional divergence between the paralogs. Interestingly, using the approach we employed in maize, we found no evidence of an equivalent cis-NAT associated with SbPho1;2 (Fig. 3B), although additional experiments will be required to rule out the possibility that antisense transcripts are produced from other regions of the sorghum gene.
Transposon mutagenesis of ZmPho1;2a and ZmPho1;2b
To investigate functional divergence between ZmPho1;2a and ZmPho1;2b, we initiated a program to mutagenize both loci using the endogenous Activator/Dissociation (Ac/Ds) transposon system. Autonomous Ac elements and their non-autonomous Dissociation (Ds) derivatives are members of the hAT (hobo-Activator-Tam3) family of DNA transposons, moving via a cut-and-paste mechanism. Both Ac and Ds show a strong preference for reinsertion into locations genetically linked to the donor site. This enrichment for local transposition can be exploited for mutagenesis by identification and remobilization of a donor element closely linked to a gene of interest. Once established, it becomes possible to generate multiple alleles from a single test-cross population.
To mutagenize ZmPho1;2a, we recovered 1082 novel transposition events from the Ac element bti31094∷Ac, located 650.8kb upstream of the target (Fig. 4). A PCR-based strategy was designed to screen for reinsertion of Ac into ZmPho1;2a in which the gene was divided into three overlapping fragments. Allowing for both possible orientations of Ac insertion, we performed a total of 12 reactions to cover the gene space, screening first pools of 18 seedlings, and subsequently the individuals constituting positive pools. Putative insertions were re-amplified using DNA extracted from a second seedling leaf to discount somatic events. Using this strategy, we recovered a novel germinal Ac insertion in Exon 6 of ZmPho1;2a (zmpho1;2a-m1∷Ac). Left and right flanking border fragments were amplified and sequenced, confirming the exact location of the element and identifying an 8bp target site duplication (AGCCCAGG) consistent with Ac insertion.
To mutagenize ZmPho1;2b, we remobilized I.S06.1616∷Ds, a Ds element identified to be inserted in intron 13 of the target gene, and that we designated Zmpho1;2b-s1∷Ds. Plants homozygous for the Zmpho1;2b-s1∷Ds allele did not present any observable phenotype and RT-PCR analysis of transcript accumulation indicated such plants to accumulate correctly-spliced transcript to normal levels (data not shown). To derive novel functionally significant insertions, individuals homozygous for Zmpho1;2b-s1∷Ds carrying the unlinked stable transposase source Ac-immobilized were crossed as males to T43 (Fig. 5). Test-cross progeny were screened using a strategy similar to that employed in the mutagenesis of ZmPho1;2a. Two novel Ds insertions were identified, one in the promoter region, 591bp upstream of the ATG (zmpho1;2b-s3∷Ds) and the second in intron 5 (zmpho1;2b-s4∷Ds). Sequencing of the region flanking each novel insertion identified the expected 8bp target site duplication.
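The scale of the pooled screen can be estimated with a short calculation. The sketch below assumes that the 12 reactions arise as three fragments, two gene-specific primers per fragment and two outward-facing Ac end primers; this is one decomposition consistent with the text, not a documented primer scheme, and the primer labels are hypothetical. It also assumes that all 1082 events were screened in pools of 18.

```python
# Back-of-envelope size of the pooled PCR screen.
# Assumption: 12 reactions = 3 fragments x 2 gene primers x 2 Ac end primers.
import math
from itertools import product

FRAGMENTS = ("fragment_1", "fragment_2", "fragment_3")  # hypothetical labels
GENE_PRIMERS = ("gene_forward", "gene_reverse")         # hypothetical labels
AC_PRIMERS = ("Ac_5prime_out", "Ac_3prime_out")         # hypothetical labels

# Every (fragment, gene primer, Ac primer) combination is one reaction.
REACTIONS = list(product(FRAGMENTS, GENE_PRIMERS, AC_PRIMERS))


def pool_count(n_individuals: int, pool_size: int) -> int:
    """Number of pools needed; the last pool may be only partly filled."""
    return math.ceil(n_individuals / pool_size)


N_EVENTS = 1082
POOL_SIZE = 18

N_POOLS = pool_count(N_EVENTS, POOL_SIZE)
FIRST_ROUND_REACTIONS = N_POOLS * len(REACTIONS)
UNPOOLED_REACTIONS = N_EVENTS * len(REACTIONS)

print(len(REACTIONS), N_POOLS, FIRST_ROUND_REACTIONS, UNPOOLED_REACTIONS)
```

Under these assumptions, pooling reduces the first round of screening from 12,984 reactions to 732, with further reactions needed only for the individuals of positive pools.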
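A target site duplication can be recognized computationally as the host sequence shared by the two junctions: the bases immediately preceding the element on one side reappear immediately after it on the other. The following is a minimal sketch of that comparison. The 8bp duplication AGCCCAGG is the one reported for zmpho1;2a-m1∷Ac, whereas the surrounding flanking bases are invented for illustration.

```python
# Minimal target site duplication (TSD) finder for a sequenced insertion.
# Only AGCCCAGG comes from the text; the other flanking bases are invented.


def target_site_duplication(left_flank: str, right_flank: str,
                            max_len: int = 12) -> str:
    """Return the longest suffix of `left_flank` that is a prefix of `right_flank`.

    `left_flank` is host sequence ending at the element's left border and
    `right_flank` is host sequence starting at its right border.
    """
    left, right = left_flank.upper(), right_flank.upper()
    for k in range(min(max_len, len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return left[-k:]
    return ""


LEFT_FLANK = "TTGACCTA" + "AGCCCAGG"   # invented host bases + reported TSD
RIGHT_FLANK = "AGCCCAGG" + "CTTGAACG"  # reported TSD + invented host bases

TSD = target_site_duplication(LEFT_FLANK, RIGHT_FLANK)
print(TSD, len(TSD))
```

Because host bases adjacent to the duplication can match by chance, a result longer than 8bp would need to be checked against the uninterrupted wild-type sequence.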
Derivation of stable derivatives of ZmPho1;2a
To generate stable “footprint” alleles by Ac excision from ZmPho1;2a-m1∷Ac, a ZmPho1;2a-m1∷Ac homozygous individual was crossed as male to T43, and the resulting progeny screened to identify rare colorless kernels, indicating loss of Ac function (Figure 7). Colorless kernels were germinated, DNA extracted from seedlings, and PCR used to amplify the genomic region spanning the ZmPho1;2a-m1∷Ac insertion site. Products of a size consistent with Ac excision were cloned and sequenced. Two footprint alleles were identified, one with an 8bp insertion (GCCCAGCT) (ZmPho1;2a’m1.1) and the second with a 5bp insertion (GCCCA) (ZmPho1;2a’m1.2). For confirmation, and to facilitate future genetic experiments, the region spanning ZmPho1;2a’m1.1 was re-amplified and digested with the enzyme BseYI that specifically recognizes a target site generated by the 8bp duplication 8. As a result of non-triplet duplication, both ZmPho1;2a’m1.1 and ZmPho1;2a’m1.2 alleles disrupt the DNA reading frame and are predicted to result in a premature termination of translation.
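The two properties used to characterize the footprint alleles, disruption of the reading frame and presence of a diagnostic restriction site, can be checked directly from the inserted sequences. The sketch below assumes that BseYI recognizes the sequence CCCAGC; that recognition sequence is taken from general supplier information and is not stated in the text. Only the inserted bases are examined, so sites spanning the junction with flanking genomic sequence are not considered.

```python
# Frame and restriction-site check for the two footprint insertions.
# Assumption: BseYI recognition sequence is CCCAGC (not given in the text).

BSEYI_SITE = "CCCAGC"

FOOTPRINT_INSERTIONS = {
    "m1.1": "GCCCAGCT",  # 8bp insertion reported in the text
    "m1.2": "GCCCA",     # 5bp insertion reported in the text
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shifts_frame(inserted: str) -> bool:
    """True if the insertion length is not a multiple of three."""
    return len(inserted) % 3 != 0


def has_site(seq: str, site: str) -> bool:
    """True if `site` occurs on either strand of `seq`."""
    seq = seq.upper()
    return site in seq or reverse_complement(site) in seq


SUMMARY = {
    allele: {
        "frameshift": shifts_frame(inserted),
        "bseyi_site": has_site(inserted, BSEYI_SITE),
    }
    for allele, inserted in FOOTPRINT_INSERTIONS.items()
}
print(SUMMARY)
```

Under the stated assumption, both insertions shift the frame, and only the 8bp insertion contains the recognition sequence within the inserted bases themselves.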
Discussion
Maize is the most widely grown cereal in the world. Much of this cultivated area is P limited. And yet, the molecular basis of P uptake and translocation in maize remains poorly characterized (reviewed in [11]). In this study, we have described the maize Pho1 gene family and generated novel insertional mutant alleles of pho1 using the endogenous maize Ac/Ds transposon system. The genetic material described will open the way to functional analysis of P homeostasis in maize.
The maize Pho1 family consists of four genes, corresponding to the three-gene (PHO1;1, PHO1;2, PHO1;3) structure reported previously in rice (Oryza sativa) [5], with the elaboration of the duplication of PHO1;2. The sorghum Pho1 family was also found to consist of three genes. The restricted PHO1 family present in these cereals is in contrast to the larger 11-member family of Arabidopsis [10]. Specifically, the cereals lack a large clade of PHO1-related sequences present in Arabidopsis that has been implicated in a range of biological functions [10,18,19]. Indeed, in studies aimed at complementing the Atpho1 mutant by expression of other Arabidopsis PHO1 family members, only AtPHO1;H1 could rescue the mutant [18]. Similar experiments expressing rice PHO1 genes in the null Atpho1 background produced plants that, despite low shoot Pi indicative of low Pi transfer activity, maintained normal growth and suppression of the Pi-deficiency response. This phenotype was similar to that of PHO1 underexpressing lines, suggesting a role for PHO1 in the way the available Pi is used and partitioned [6]. Recently, the EXS domain of AtPHO1 has been implicated in Pi export activity [4].
Although phylogenetic analysis and experimental data from Arabidopsis and rice suggest all four maize Pho1 to be directly involved in P homeostasis, further work in heterologous systems, and ultimately the analysis of the mutants described here, will be required to determine functional equivalence across species and the biological role in maize.
The lineage leading to maize experienced a tetraploidy event sometime after it split with the sorghum lineage, 5-12 million years ago [17]. Considering contemporary sorghum as an unduplicated outgroup, and taking the structure of the rice Pho1 family into account, suggests that immediately following this tetraploid event, maize would have carried six Pho1 genes, represented by three pairs of syntenic paralogs (homeologs). Subsequently, the maize genome has returned to a diploid state through a process of reorganization that has been coupled with extensive fractionation - the loss of one of a pair of syntenic paralogs [20]. Large scale gene loss following whole genome duplication appears to be a general trend, observed across taxa and across timescales [21]. Gene loss is presumed to be buffered by the presence of a functionally equivalent paralog. Where both paralogs of a syntenic pair are retained, it may indicate either functional constraint or simply incomplete fractionation. In the former case, such constraint would imply either functional divergence or a selective advantage of increased dosage. In maize, 3228 pairs of syntenic paralogs are retained, representing ∼20% of the total complement of ∼32,000 genes, or closer to ∼10% of the unduplicated gene set [17] [22]. While gene loss is the more likely outcome following genome duplication, it is difficult to determine the balance of selective gene-by-gene reduction and the largely random loss of larger sections of DNA. Similarly, where a pair of syntenic paralogs are retained, as is the case with Pho1;2, it may indicate selection directly on the gene pair or a genomic context that insulates the gene pair from larger scale DNA loss events. It is noticeable that a number of syntenic paralog pairs have been retained close to the Pho1;2 locus, potentially “hitchhiking” on direct selection to maintain one or more of the adjacent pairs.
In the case of the pair GRMZM2G164854/GRMZM5G853379, the two paralogs overlap Pho1;2 sequence on the opposite DNA strand. Consequently, selection to maintain either the Pho1;2 or GRMZM2G164854/GRMZM5G853379 paralog pair might protect also the adjacent genes from silencing or deletion.
Analysis of ZmPho1;2a and ZmPho1;2b transcript accumulation clearly demonstrated regulatory divergence between the two paralogs, with ZmPho1;2b transcripts presenting a pattern similar to that of SbPho1;2, the presumed unduplicated state. Characterization of the leaf transcriptome has estimated 13% of retained syntenic paralogs to undergo regulatory neofunctionalization [22], placing the Pho1;2 pair among just 2-3% of the total maize gene set. Characterization of putative cis-NAT transcripts offered further evidence of regulatory divergence between maize Pho1;2 paralogs. Accumulation of cis-NATZmPho1;2a was induced by P limitation, in a manner similar to that observed for cis-NATOsPho1;2, while cis-NATZmPho1;2b accumulation mirrored that of the ZmPho1;2b sense transcript. Given that we failed to detect Pho1;2 associated anti-sense transcripts in sorghum using the techniques applied, we might infer the unduplicated state from the more distantly related rice. Interestingly, on such a basis, our data are consistent with regulatory neofunctionalization, acting on the one hand on ZmPho1;2a sense, and on the other on ZmPho1;2b anti-sense, transcript accumulation. Although characterized cis-NATs act on the activity of adjacent protein coding genes, the translational enhancer function postulated for Pho1;2 NATs may allow for trans action between ZmPho1;2 paralogs given the degree of sequence similarity. Indeed, one intriguing hypothesis, suggested by our transcript accumulation data, is that subfunctionalization at the maize Pho1;2 loci has resulted in the primary production of sense transcripts from Pho1;2b. Characterization of the insertional alleles described here will be central to determining the functional role of ZmPho1;2a and ZmPho1;2b and associated sense and anti-sense transcripts.
We are continuing to mobilize Ac and Ds elements at the maize Pho1;2 loci, taking full advantage of the capacity of the system to generate allelic series, impacting variously sense and anti-sense transcripts. Such material will be invaluable in the fine-scale evaluation of regulatory crosstalk and functional redundancy between ZmPho1;2 paralogs and, ultimately, the biological role of PHO1 proteins in maize.
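The proportions quoted in the discussion of retained paralogs follow from simple arithmetic on the figures given in the text (3228 retained pairs, a total of about 32,000 genes, and an estimated 13% of retained paralogs showing regulatory neofunctionalization). The sketch below reproduces them under one reading of "unduplicated gene set", namely the total gene count with each retained pair counted once; that reading is an assumption, not a definition taken from the cited studies.

```python
# Arithmetic behind the proportions of retained syntenic paralogs.
# Assumption: "unduplicated gene set" = total genes with each pair counted once.

RETAINED_PAIRS = 3228
TOTAL_GENES = 32_000            # approximate figure from the text
NEOFUNCTIONALIZED = 0.13        # fraction of retained paralogs, from the text

genes_in_pairs = 2 * RETAINED_PAIRS
fraction_of_total = genes_in_pairs / TOTAL_GENES

unduplicated_set = TOTAL_GENES - RETAINED_PAIRS
fraction_of_unduplicated = RETAINED_PAIRS / unduplicated_set

# Share of all maize genes that are retained paralogs with diverged regulation.
neofunctionalized_share = NEOFUNCTIONALIZED * fraction_of_total

print(f"genes in retained pairs: {genes_in_pairs}")
print(f"fraction of total gene set: {fraction_of_total:.3f}")
print(f"fraction of unduplicated set: {fraction_of_unduplicated:.3f}")
print(f"neofunctionalized share of all genes: {neofunctionalized_share:.3f}")
```

On this reading the retained pairs account for roughly one fifth of all genes and about one in nine unduplicated loci, and the neofunctionalized share falls within the 2-3% range stated above.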
Acknowledgments
This work was funded by the Mexican National Council of Science and Technology (CONACYT) grant CB2012-151947.
Footnotes
* rsawers{at}langebio.cinvestav.mx