Abstract
Forward genetics determines the function of genes underlying trait variation by identifying the change in DNA responsible for changes in phenotype. Detecting phenotypically-relevant variation outside protein coding sequences and distinguishing this from neutral variants is not trivial; partly because the mechanisms by which DNA polymorphisms in the intergenic regions affect gene regulation are poorly understood. Here we utilized a dominant genetic marker with a convenient phenotype to investigate the effect of cis and trans-acting regulatory variation. We performed a forward genetic screen for natural variation that suppress or enhance the semi-dominant mutant allele Oy1-N1989, encoding the magnesium chelatase subunit I of maize. This mutant permits rapid phenotyping of leaf color as a reporter for chlorophyll accumulation, and mapping of natural variation in maize affecting chlorophyll metabolism. We identified a single modifier locus segregating between B73 and Mo17 that was linked to the reporter gene itself, which we call very oil yellow1. Based on the variation in OY1 transcript abundance and genome-wide association data, vey1 is predicted to consist of multiple cis-acting regulatory sequence polymorphisms encoded at the wild-type oy1 alleles. The vey1 allele appears to be a common polymorphism in the maize germplasm that alters the expression level of a key gene in chlorophyll biosynthesis. These vey1 alleles have no discernable impact on leaf chlorophyll in the absence of the Oy1-N1989 reporter. Thus, use of a mutant as a simple and efficient reporter for magnesium chelatase activity resulted in the detection of expression-level polymorphisms not readily visible in the laboratory.
Introduction
Genetic variation in the coding or regulatory sequences causes differences between genotypes and is fundamental to crop improvement (Springer and Stupar 2007). Gene function discovery via mutant analyses focuses on linking phenotype alterations to gene variants. As a result, forward genetics has been of great value to understand biological systems but is predominantly useful for determining functions for genes with alleles that cause large phenotypic impacts. Natural variants, including those that encode alleles of relevance to adaptation and fitness of the organism, are often of small effect (Fisher 1930; Orr 1998, 2005). Mutant alleles with conditional impacts on phenotypes, through genetic interactions or modifiers, as well as alleles of small individual effect are difficult to study. Further, even if we identify loci that have not been previously associated with a biological process, it can be difficult to validate and associate such natural variants with physiological and biochemical mechanisms.
A forward genetics approach that uses a mutant phenotype as a reporter for genetic interactions can be used to detect modifiers in natural populations and expose cryptic variation affecting traits of interest (Johal et al. 2008). This Mutant-Assisted Gene Identification and Characterization (MAGIC) approach is particularly efficient in species where outcrossing is easy, such as maize (Zea mays). It can be applied to any mutant with any quantifiable phenotype to expand our understanding of the process disrupted by mutation. This approach is convenient when a dominant mutant allele is used as a reporter because natural variants that encode enhancers or suppressors of a given mutant phenotype can be detected in F1 crosses. Thus, it enables easy detection of dominant enhancers and suppressors. Any germplasm collection, diversity panel, or line-cross population that can serve as the variable parents in these crosses, can be utilized to detect and map loci that alter mutant phenotype expression. Natural variants discovered this way have an experimental link (genetic modifiers) of the processes affected by the mutant reporter allele (induced variation). Thus, this approach speeds the assignment of mechanism(s) to natural variation in the germplasm. Indeed, one can consider this as a screen for epistatic or contingent gene action. MAGIC screens have identified loci from maize involved in the hypersensitive response (Chintamanani et al. 2010; Penning, Johal, and McMullen 2004; Olukolu et al. 2013, 2014, 2016) and plant development (Buescher et al. 2014), among others. In each case, in the absence of a mutant allele, no phenotype was previously associated to the modifier loci demonstrating the remarkable efficiency of this genetic screen to detect epistatic interactions between mutant and modifier alleles. Thus, this approach is a powerful way to both characterize cryptic variation in genomes and construct genetic pathways affecting phenotypes of interest.
The easy visual scoring and simplicity of quantifying chlorophyll make chlorophyll biosynthetic mutants an excellent reporter for MAGIC screens. Chlorophyll is a major component of central metabolism in plants, which can produce phototoxic intermediates during both synthesis (Hu et al. 1998; Huang et al. 2009) and breakdown (Gray et al. 1997; Gray et al. 2002; Mach et al. 2001; Yang et al. 2004), and its levels are carefully regulated in plants (Meskauskiene et al. 2001). Enzymatic conversion of protoporphyrin IX into magnesium protoporphyrin IX by Magnesium Chelatase (MgChl) is the first committed step in chlorophyll biosynthesis (Wettstein et al. 1995). MgChl is a hetero-oligomeric enzyme consisting of three subunits (I, D, and H) that are conserved from prokaryotes to plants. The MgChl subunit I is encoded by the oil yellow1 (oy1; GRMZM2G419806) gene in maize and encodes the AAA+-type ATPase subunit that energizes the complex (Fodje et al. 2001). Weak loss-of-function alleles of oy1 result in recessive yellow-green plants while complete loss-of-function alleles result in a recessive yellow seedling-lethal phenotype (Sawers et al. 2006). The semi-dominant Oy1-N1989 allele carries a leucine (L) to phenylalanine (F) change at amino acid position 176 (L176F) that hinders the formation of a functional complex between the OY1 protein and other subunits of MgChl. Heterozygous plants carrying one Oy1-N1989 allele and one wild-type allele are oil-yellow, but homozygous Oy1-N1989 plants lack any MgChl activity, resulting in a recessive yellow seedling-lethal phenotype with no chlorophyll accumulation (Sawers et al. 2006). Consistent with the conservation of this protein complex, the orthologous L->F mutation (encoded by Oy1-N1989) was identified in barley (L161F) and also results in a recessive yellow seedling-lethal phenotype with no detectable chlorophyll in homozygous condition and a pale-green phenotype as a heterozygote (Hansson et al. 1999). The biochemical basis of this semi-dominant mutant allele was studied by creating a mutant MgChl subunit I in Rhodobacter (BCHI), with the orthologous amino acid change at position 111 of wild-type BCHI (Hansson et al. 2002). The L111F mutation converted BCHI into a competitive inhibitor of MgChl that reduced enzyme activity by 4-fold when mixed 1:1 with wild-type BCHI (Hansson et al. 2002; Lundqvist et al. 2013). This conserved leucine residue is between the ATP-binding fold of MgChl subunit I created by the Walker A and B motifs (Hansson et al. 2002; Sawers et al. 2006; Lundqvist et al. 2013), and its substitution with phenylalanine was deleterious to dephosphorylation activity. The ATPase activity of wild-type MgChl complex is directly proportional to the magnesium chelation reaction. However, complexes assembled from MgChl subunit I with the L->F change exhibit reduced ATPase activity and no ability to chelate Mg2+ ions into protoporphyrin IX (Hansson et al. 1999, 2002; Sawers et al. 2006). The absence of MgChl activity displayed by the mutant BCHI carrying L111F substitution (BCHIL111F) demonstrates that this amino acid change results in a dominant-negative subunit which disrupts the coupling of ATPase activity to magnesium chelation (Hansson et al. 2002; Lundqvist et al. 2013).
We screened maize germplasm for cryptic variation using the Oy1-N1989 mutant as a dominant reporter for chlorophyll biosynthesis. We hypothesized that alteration in the quantity of biochemically active MgChl complex should read out as a change in chlorophyll content and plant color in heterozygous Oy1-N1989 mutants. For instance, in a population of heterozygous Oy1-N1989/oy1 F1 plants, an amino acid change in the wild-type OY1 protein that alters the dissociation constant (kD) of OY1 for the other protein subunits of MgChl should contribute to variance in chlorophyll biogenesis. Similarly, a cis-acting expression QTL (eQTL) that increases expression of the wild-type oy1 allele should result in assembly of more active MgChl complexes and increase chlorophyll content in F1 mutant plants. Thus, chlorophyll content of Oy1-N1989 mutants should be modulated by the stoichiometry of both the wild-type and mutant OY1 proteins present in the MgChl complex in heterozygous Oy1-N1989/oy1 plants either by protein structure or abundance changes.
We introgressed the maize Oy1-N1989 mutant allele into the B73 inbred background and maintained it as a heterozygote (Oy1-N1989/oy1:B73). While crossing this mutant to multiple backgrounds, we detected genetic variation in the Oy1-N1989/oy1 mutant phenotype expression between the maize inbred lines B73 and Mo17. The phenotype of the Oy1-N1989 mutant heterozygotes was suppressed in the B73 background. However, F1 hybrids with Mo17 dramatically enhanced it. In an attempt to determine the genetic basis of this modification, we carried out genetic mapping experiments using five F1 populations. In each of these mapping experiments, the Oy1-N1989/oy1:B73 parents were crossed with (1) the intermated B73 x Mo17 recombinant inbred lines (IBM), (2) B73 x Mo17 doubled haploid lines (Syn10), (3) Mo17 x B73 F1 hybrids, (4) B73 x Mo17 near-isogenic lines (BM-NILs), and (5) a genome-wide association mapping panel of diverse maize inbred lines (MDL). In each case, we identified a quantitative trait locus (QTL) of large effect on chromosome 10 linked to the oy1 locus itself. In each of the B73 x Mo17 line-cross populations, the B73 wild-type allele at oy1 suppressed the Oy1-N1989/oy1 mutant phenotype. This QTL, which we call very oil yellow1 (vey1), was not associated with changes in protein sequence at oy1 and did not correlate with the chlorophyll content of the wild-type mapping parents. However, a cis-acting eQTL causing higher expression of the B73 allele of oy1 in the IBM lines was consistent with the observed phenotypic variation in the mutant siblings. Consistently, the allele-specific expression at oy1 in the mutant heterozygotes was biased towards the Oy1-N1989 allele in the hybrids comprising wild-type oy1 allele from Mo17. The inheritance of the traits, proposed allele expression bias at oy1 due to a putative cis-acting regulatory element, implications of the discovered cryptic variation, and the utility of this study in general are discussed.
Materials and Methods
Plant materials
The Oy1-N1989 mutant allele was acquired from the Maize Genetics Cooperation Stock Center (University of Illinois, Urbana-Champaign, IL). The original allele of the Oy1-N1989 mutation was isolated from a r1 c1 colorless synthetic stock of mixed parentage (G. Neuffer, personal communication). This allele was introgressed into B73 for eight generations by repeated backcrossing of B73 ear-parents with Oy1-N1989/oy1 pollen-parents and is maintained as a heterozygote (Oy1-N1989/oy1:B73) by crossing to wild-type siblings.
Line-cross QTL mapping: For these experiments, Oy1-N1989/oy1:B73 plants were crossed as pollen-parents to 216 Intermated B73 x Mo17 recombinant inbred lines (IBM; Lee et al, 2002) and 251 Syn10 doubled haploid lines (Syn10; Hussain et al, 2007). The QTL validation was done using F1 progenies derived from the cross of 35 B73-Mo17 Near-Isogenic lines (BM-NILs) ears with pollen from Oy1-N1989/oy1:B73. These BM-NILs consisted of 22 B73-like NILs and 13 Mo17-like NILs with introgression of the reciprocal parental genome (B73 or Mo17) and were developed by three repeated backcrosses into recurrent parent followed by four to six generations of self-pollination (Eichten et al. 2011).
Genome-wide association (GWA) mapping: For this experiment, Oy1-N1989/oy1:B73 plants were crossed to 343 inbred lines that included 249 inbreds from maize association panel (Flint-Garcia et al. 2005), and 94 inbred lines that included 82 Expired Plant Variety Protections (ExPVP) lines from the Ames panel (Romay et al. 2013). Pollen from Oy1-N1989/oy1:B73 plants were used for these crosses except for the popcorn lines in the maize association panel, where Oy1-N1989/oy1:B73 plants were used as an ear-parent to avoid the crossing barrier affected by gametophytic factor GA1-S (first described by Correns in 1902; Mangelsdorf and Jones 1926; Lauter et al. 2017). This panel of 343 inbred lines is referred to as maize diversity lines (MDL). The full list of IBM, Syn10, BM-NILs, and MDL used to develop F1 hybrid populations are provided in Tables S1-S4.
Field trials
All field experiments were performed at the Purdue Agronomy Center for Research and Education (ACRE) in West Lafayette, Indiana. All F1 populations described below were planted as a single plot of 12-16 plants that segregated for both mutant and wild-type siblings. Plots were sown in a 3.84 m long row with the inter-row spacing of 0.79 meters and an alley space of 0.79 meters. No irrigation was applied during the entire crop season as rainfalls were uniformly distributed for satisfactory plant growth. Conventional fertilizer, pest and weed control practices for growing field maize in Indiana were followed. Progenies of Oy1-N1989/oy1:B73 pollen-parents crossed with B73 and Mo17 were planted as parental checks in each block of every experiment. The testcross F1 populations with IBM were evaluated in a single replication in the summer of 2013 with each range treated as a block. In 2016, the testcross F1 populations with Syn10 lines were evaluated as two replications in a randomized complete block design (RCBD) with each range divided into two blocks. The testcross F1 progenies with BM-NILs and parents (B73 and Mo17) were planted in a RCBD with five replications in 2016. In the same year, F1 populations with MDL were also evaluated with three replications planted in a RCBD. Each replication of MDL F1 population was divided into ten blocks of the same size, and parental checks were randomized within each block.
Phenotyping and data collection
Maize seedlings used for destructive chlorophyll quantification were grown under greenhouse conditions using mogul base high-pressure sodium lamps (1000 Watts) as the supplemental light source for L:D cycle of 16:8 and temperature around 28°C (day-time) and 20°C (night-time). Destructive chlorophyll measurements were performed on the fresh weight basis in 80% acetone solution using a UV-VIS spectroscopic method described in Lichtenthaler & Buschmann, 2001.
For the field-grown experiments, mutant siblings in the suppressing genetic backgrounds were tagged at knee height stage with plastic tags so that they can be easily distinguished from the wild-type siblings for trait measurements at later developmental stages in the season. All the F1 families segregated for the mutant (Oy1-N1989/oy1) and wild-type (oy1/oy1) siblings in approximately 1:1 fashion. For each F1 family, two to four plants of each phenotypic class were picked at random for trait measurements. Non-destructive chlorophyll content in the maize leaves was approximated using a chlorophyll content meter model CCM-200 plus (Opti-Sciences, Inc., Hudson, NH) and the measurements were expressed as chlorophyll content index (CCM). Measurements were taken on the leaf lamina of the top fully expanded leaf at two time points. First CCM measurements were taken at 25-30 days after sowing (expressed as CCMI) and the second at 45-50 days after sowing (expressed as CCMII). For each trait, measurements were performed on both mutant (reported with a prefix MT) and wild-type (reported with a prefix WT) siblings. Besides using primary trait measurements of CCMI and CCMII on mutant and wild-type siblings, indirect CCM measurements were also calculated and expressed as ratios (MT/WT) and differences (WT-MT) of CCMI and CCMII. Phenotypic data of all the CCM traits in the F1 populations with both bi-parental populations, BM-NILs and MDL are provided in Tables S1-S4.
Public/open-access genotypic and gene expression data
Public marker data for the IBM was obtained from www.maizegdb.org (Sen et al. 2010). A total of 2,178 retrieved markers were reduced to 2,156 after the removal of markers with the duplicate assessment. Approximately 13.3% of the marker data were missing. The reads per kilobase of transcript per million mapped reads (RPKM) for the oy1 locus were obtained from a public repository of the National Science Foundation grant (GEPR: Genomic Analyses of shoot meristem function in maize; NSF DBI-0820610; https://ftp.maizegdb.org/MaizeGDB/FTP/shoot_apical_meristem_data_scanlon_lab/). These data consist of normalized read counts (expressed as RPKM) of the maize genes from the transcriptome of shoot apex of 14 days old IBM seedlings.
Marker data for Syn10 lines was obtained from Liu et al. 2015. The Syn10 lines were genotyped at 6611 positions (B73 RefGenv2) with SNP markers covering all ten chromosomes of maize. The entire set of markers were used for linkage analysis as there was no missing data. All B73-Mo17 NILs in both the B73 and Mo17 recurrent backgrounds that had introgression of the critical region from the opposite parent were selected for QTL validation. Genotyping data of the BM-NILs to choose informative lines and perform QTL validation was obtained from Eichten et al. 2011.
Genotypes for the MDL used in this study to perform GWA were obtained from third generation maize haplotypes (HapMap3) described in Bukowski et al. 2018. We identified 305 common inbred lines that were part of HapMap3. Briefly, HapMap3 consists of over 83 million variant sites across ten chromosomes of maize that are anchored to B73 version 3 assembly. After obtaining the genotypic data of 305 common inbred lines, variant sites were filtered for ≤10 percent missing data and minor allele frequency of ≥0.05 using VCFtools (Danecek et al. 2011). The filtered SNP dataset was used for GWA analyses. A summary of variant sites before and after the filtering procedure are listed in Table S5. A reduced set of genotypes for the same set of 305 accessions were obtained from the maize inbred lines described in Romay et al. 2013. These genotypes consist of 681 257 SNPs (physical positions from B73 RefGenv2) obtained using a GBS protocol (Elshire et al. 2011) covering all ten chromosomes of maize. This marker dataset was filtered for ≤10 percent missing data and minor allele frequency of ≥0.05 using TASSEL (Bradbury et al. 2007), reducing the marker number to 150 920 SNPs. This genotypic dataset was solely used to compute principal components (PC) and a kinship matrix to control for population structure and familial relatedness in a unified mixed linear model, respectively (Yu et al. 2006). GWA analyses using the reduced marker dataset (Romay et al. 2013) on the full set of MDL found similar results to the one by the larger marker dataset (HapMap3) on the reduced set of MDL (data not shown). To test for cis-eQTL at the oy1 locus in the maize diversity lines used for GWA mapping, normalized count of OY1 expression derived from the germinating seedling shoots was obtained from http://www.cyverse.org (Kremling et al. 2018).
Statistical analyses
QTL mapping: Line-cross phenotypes and markers were used to detect and localize QTL using the R/QTL package version 1.40-8 (Broman et al. 2003). Trait means were used for the QTL analyses. Single interval mapping (SIM) was used for all traits, although composite interval mapping (CIM) was carried out with remarkably similar results (data not shown). Statistical thresholds were computed by 1000 permutations (Churchill and Doerge 1994) of each trait in both bi-parental F1 populations.
Genome-wide association study (GWAS): Preliminary data analysis was done using JMP 13.0 (SAS Institute Inc. 2016). Statistical corrections on the raw phenotypic data were performed by determining the most significant terms in the model using analysis of variance (Fisher 1921). Genotype and replication were used as a random effect in a linear mixed-effects model built in the lme4 package (Bates et al. 2015) implemented in R (R core team, 2014) to calculate the best linear unbiased predictor (BLUP) for each trait. Broad-sense heritability (line mean basis) were calculated using BLUP values, using the method described by Lin and Allaire 1977. BLUP estimates for each trait were used to perform GWAS. GWAS was done using a compressed mixed linear model implemented in R package GAPIT (Lipka et al. 2012; Zhang et al. 2010). HapMap3 SNPs were used to calculate genotype to phenotype associations. As explained before, kinship and population structure estimates were obtained for the same population using the second subset of 150 920 SNPs to correct for spurious associations. The Bonferroni correction and false discovery rate (FDR) adjustments were used to compute a statistical threshold for the positive association to further control for false positive assessment of associations (Holm 1979; Benjamini and Hochberg 1995).
Molecular Analyses
Genotyping: The recombinants in selected Syn10 lines and the BC1F1 population were detected using three PCR-based markers. Two markers detecting insertion polymorphisms flanking the oy1 locus and one dCAPS marker at oy1 locus were designed for this purpose. Genotyping at insertion-deletion (indel) marker ftcl1 (flanking an indel polymorphism in intron 4 of ftcl1; GRMZM2G001904) was performed with forward primer 5’- GCAGAGCTGGAATATGGAATGC-3’ and reverse primer 5’-GATGACCTGAGTAGGGGTGC -3’. Genotyping at indel marker gfa2 (flanking an indel polymorphism in the intron of gfa2; GRMZM2G118316) was performed with forward primer 5’- ACGGCTCCAAAAGTCGTGTA -3’ and reverse primer 5’-ATGGATGGGGTCAGGAAAGC -3’. A polymorphic SNP in the second intron at oy1 was used to design a dCAPS forward primer 5’- CGCCCCCGTTCTCCAATCCTGC -3’ and a gene-specific reverse primer 5’- GACCTCGGGGCCCATGACCT -3’ (http://helix.wustl.edu/dcaps/dcaps.html; Neff et al. 2002). The PCR products from polymerization reactions with the dCAPS oligonucleotide at oy1 were digested by Pst I restriction endonuclease (New England Biolabs, MA, USA) and resolved on 3.5% agarose gel.
Allele-specific expression analyses: Allelic bias at transcriptional level was quantified using the third leaf of maize seedlings at the V3 developmental stage. Total RNA was extracted using a modified Phenol/Lithium chloride protocol (Eggermont et al. 1996). The modification involved grinding of plant tissue in pestle and mortar into a fine powder using liquid nitrogen, instead of quartz sand and glass beads in the original protocol. Total RNA was subjected to DNase I treatment using Invitrogen Turbo DNA-free kit (Catalog#AM1907, Life Technologies, Carlsbad, CA) and 1 µg of DNase treated RNA from each sample was converted to cDNA using oligo dT primers and a recombinant M-MLV reverse transcriptase provided in iScriptTM Select cDNA synthesis kit (catalog#170-8896, Bio-Rad, Hercules, CA) according to the manufacturer’s recommendations. Besides the cDNA samples, genomic DNA samples were also prepared as a control to test the sensitivity of the assay. Genomic DNA controls included a 1:1 (F1 hybrid), 1:2, and 2:1 mixture of B73 and Mo17. PCR was conducted using gene-specific forward primer 5’- CAACGTCATCGACCCCAAGA -3’ and reverse primer 5’- GGTTACCAGAGCCGATGAGG -3’ for 30 cycles (94° for 30s, 56° for 30s, 72 ° for 30s and final extension for 2 minutes) to amplify the OY1 gene product. These primers flank SNP252 (C->T), which is the causative mutation of Oy1-N1989, and SNP317 (C->T) which is polymorphic between B73 and Mo17 but monomorphic between the B73 and Oy1-N1989 genetic backgrounds. Corresponding PCR products were used to generate sequencing libraries using transposon-mediated library construction with the Illumina Nextera® DNA library preparation kit, and sequence data were generated on a MiSeq instrument (Illumina, San Diego, CA) at the Purdue Genomics Core Facility. The SNP variation and read counts were decoded from the sequenced PCR amplicons by alignment of the quality controlled reads to the oy1 reference allele from B73 using the BBMap (Bushnell 2014) and the GATK packages (DePristo et al. 2011). Additional analysis was performed using IGV (Robinson et al. 2011) to manually quality-check the alignments and SNP calls. Read counts at polymorphic sites obtained from GATK was used to calculate allele-specific expression. Genomic DNA control samples showed bias in the read counts in a dosage-dependent manner. DNA from F1 hybrids between B73 and Mo17 resulted in 1:1 reads at oy1 demonstrating no bias in the assay to quantify expression.
OY1 sequencing: The coding sequence of the oy1 locus from B73, Mo17, W22, and Oy1-N1989 homozygous seedlings was obtained from the genomic DNA using PCR amplification. For the rest of the maize inbred lines, the oy1 locus was amplified from cDNA synthesized from total RNA derived from the shoot tissue of 14 days old maize seedlings. PCR amplification of oy1 locus from genomic DNA was performed using four primer pairs: (a) forward primer Oy1-FP1 5’- GCAAGCATGTTGGGCACAGCG -3’ and reverse primer Oy1-RP12 5’- GGGCGGCGGGATTGGAGAAC -3’, (b) forward primer Oy1-FP5 5’- GGTGGAGAGGGAGGGTATCT -3’ and reverse primer Oy1-RP6 5’- GGACCGAGGAAATACTTCCG -3’, (c) forward primer Oy1-F8 5’- ATGCCCCTTCTTCCTCTCCT -3’ and reverse primer Oy1-R8 5’- CGCCTTCTCGATGTCAATGG -3’, (d) forward primer Oy1-F9 5’- GGCACCATTGACATCGAGAA -3’ and reverse primer Oy1-R9 5’- GCTGTCCCTTCCTTTCAACG -3’. PCR amplification of OY1 transcripts from cDNA was performed using all primer pairs except Oy1-FP1/RP12. The PCR products from these samples were sequenced either using Sanger or Illumina sequencing. For Sanger sequencing, amplified PCR products were cleaned using homemade filters with sterile cotton plug and Bio-Gel® P-30 gel (Bio-Rad, Hercules, CA) to remove unused dNTPs and primers. Cleaned PCR products were used to perform a cycle reaction using Big Dye version 3.1 chemistry (Applied Biosystems, Waltham, MA) and run on ABI 3730XL sequencer by Purdue genomics core facility. Read with high-quality base pairs from Sanger sequencing were aligned using ClustalW (Thompson et al. 1994). Illumina sequencing was performed as described above except in this case paired-end reads were aligned to the B73 reference of oy1 gene using bwa version 0.7.12 (Li and Durbin 2009) and variant calling was done using Samtools (Li et al. 2009).
Results
Mo17 encodes an enhancer of the semi-dominant mutant allele Oy1-N1989 of maize
The Oy1-N1989 allele was recovered from a nitrosoguanidine mutant population in mixed genetic background. The molecular nature of the mutation is a single non-synonymous base pair change (Sawers et al. 2006). Heterozygous Oy1-N1989 plants have the eponymous oil-yellow color but are reasonably vigorous and produce both ears and tassels. During introgression of the semi-dominant Oy1-N1989 allele into B73 and Mo17 inbred backgrounds, we observed a dramatic suppression of the mutant phenotype in F1 crosses of the mutant stock (obtained from Maize COOP) to the B73 background. In contrast, crosses to Mo17 enhanced the mutant phenotype. The difference in phenotype expression was stable and persisted in both genetic backgrounds through all six backcross generations observed to date. To further explore and quantify this suppression, B73, Mo17, as well as Oy1-N1989/oy1:B73, crossed with each of these inbred lines were grown to the V3 stage in the greenhouse. To improve upon our visual assessment of leaf color and provide quantitation, optical absorbance was measured using a Chlorophyll Content Meter-200 plus (CCM; Opti-Sciences, Inc), a hand-held LED-based instrument. CCM is predicted to strongly correlate with chlorophyll and carotenoid contents. To confirm that our rapid phenotyping with the CCM would accurately assess chlorophyll levels, we measured these pigments using a UV-VIS spectrophotometer following destructive sampling of the same leaves used for CCM measurements. The non-destructive CCM measurements and destructive pigment quantification using UV-VIS protocol displayed a strong positive correlation with an R2 value of 0.94 for chla, chlb, and total chlorophyll (Figure S1). Given this high correlation of maize leaf greenness between the rapid measurement using CCM-200 plus instrument and absolute pigment contents quantified using UV-VIS spectrophotometer (destructive sampling), we performed all chlorophyll measurements of Oy1-N1989 enhancement discussed in later results using CCM values.
In the greenhouse grown seedlings, we observed no visible chlorophyll accumulation in the Oy1-N1989 homozygotes using CCM or spectrophotometric method. In wild-type plants, CCM measurements were slightly higher in B73 than Mo17, but the spectrophotometric method did not identify any significant difference in the amount of chlorophyll a (chla), chlorophyll b (chlb), total chlorophyll, or carotenoids between these two genotypes (Table S6). We detected a mild parent-of-origin-effect for both CCM and absolute amounts of chla, chlb, total chlorophyll, and carotenoids in the wild-type siblings of our F1 crosses. These plants had slightly higher chlorophyll accumulation when B73 was used as the pollen parent (Table S6). However, no such effect of parent-of-origin was observed for the mutant heterozygotes (Oy1-N1989/oy1) and reciprocal hybrid combinations of crosses between Oy1-N1989/oy1:B73 and Mo17 were indistinguishable. Further, both CCM and absolute chlorophyll contents were higher in the Oy1-N1989/oy1:B73 plants compared to the mutants in B73 x Mo17 hybrid background. Thus, there was a strong effect of genetics on chlorophyll pigment variation in mutants, that went opposite to predictions for hybrid vigor.
Heterozygous maize plants encoding the Oy1-N1989 allele display reduced chlorophyll pigment abundance compared to the wild-type siblings, resulting in a yellow-green whole plant phenotype due to reduced MgChl and ATPase activity (Sawers et al. 2006). We tested the progenies from the crosses of Oy1-N1989/oy1:B73 with B73 and Mo17 inbred lines in the field. Consistent with our previous observation, B73 inbred background resulted in substantially greener mutant heterozygotes (Oy1-N1989/oy1) than mutant siblings in the Mo17 x B73 F1 hybrid background (Figure 1, Table S7). The increased severity of Oy1-N1989 heterozygotes in the Mo17 genetic background was observed even after six backcrosses (Figure S2). No suppressed mutant plants were observed during any generation of backcrossing into Mo17 (data not shown). Thus, these results demonstrate the profound negative impact of the Oy1-N1989 allele on chlorophyll pigment accumulation and the dramatic differential suppression response of this allele by B73 and Mo17 genetic backgrounds.
vey1 maps to a single locus that co-segregates with the oy1 allele of Mo17 in DH, RIL, BC1F1 and NIL families derived from B73 and Mo17
To identify the genetic basis of the suppression of heterozygotes with the Oy1-N1989 allele in B73, we performed a series of crosses to four mapping populations. In each case, we crossed a B73 line into which we have introgressed Oy1-N1989 allele in heterozygous condition (Oy1-N1989/oy1:B73) as a pollen-parent to a population of recombinant lines as ear-parent (Figure 2). We chose two public populations to map all modifiers altering the severity of the Oy1-N1989 phenotypes. The IBM is a publicly available RIL population that has been extensively used by the maize community for a variety of mapping experiments (Lee et al. 2002). A second intermated B73 x Mo17 population Syn10 is derived from ten rounds of intermating followed by fixing alleles using double haploid process (Hussain et al. 2007). IBM and Syn10 differ in the number of rounds of intermating, and therefore vary in the number of recombinants captured and genetic resolution of trait localization. Each F1 progeny of the testcross with Oy1-N1989/oy1:B73 pollen-parent segregated approximately 1:1 for wild-type (oy1/oy1) and mutant heterozygote (Oy1-N1989/oy1) in the hybrid genetic background with B73. In the mutant heterozygote siblings of both (IBM and Syn10) F1 populations, chlorophyll approximation using CCM measured at an early (CCMI) and late (CCMII) developmental stages displayed bimodal trait distributions (Figures 3a and 3b; Figures S3 and S4), and there was no overlap between the wild-type and mutant CCM distribution (Figures S5a and S5b). Moreover, mutant siblings alone in these F1 populations with IBM and Syn10 displayed bimodality for CCM measurements, suggesting segregation of a single, effectively Mendelian, suppressor locus (Figures 3a and 3b). We name this suppressor locus very oil yellow1 (vey1). CCM values collected at two different times (CCMI and CCMII) in the wild-type F1 siblings show positive correlations in both F1 populations (Tables S8 and S9; Figures S3 and S4). Similar trend was also observed for the CCMI and CCMII in the mutant F1 siblings. However, CCM measurements in the wild-type and mutant displayed statistically insignificant correlations with each other. This suggests that higher chlorophyll accumulation in the mutant siblings is not predicted by the amount of chlorophyll accumulation in the wild-type siblings. To control for variation in CCM observed due to the genetic potential of each line that was independent of the Oy1-N1989 modification, we divided the mutant CCM trait values by the congenic wild-type sibling CCM values to derive ratio for both time points. We also calculated differences between congenic wild-type and mutant CCM values. Each of the direct measurements, as well as the ratio-metric and difference values, were used as phenotypes to detect and localize QTL.
QTL mapping was carried out on all traits using single interval mapping by the EM algorithm as implemented in R/qtl (Broman et al. 2003). Summary of the peak positions of all QTLs passing a permutation computed threshold are presented in Table S10 for the IBM F1 populations and Table S11 for the Syn10 F1 populations. All mutant CCM traits, all mutant to wild-type CCM ratios, and all differences between mutant and wild-type CCM measurements identified vey1 as a QTL of large effect on chromosome 10. A plot of the log10 of odds (LOD) score and permutation calculated threshold for CCMII from the mutant siblings in the Syn10 F1 population is plotted in Figures 3c and 3d. Other mutant-related traits in Syn10 and IBM F1 populations produced similar plots (data not shown). The magnitude of the effects of vey1 genetic position on all of the traits in the mutant siblings, particularly the effects on CCMI and CCMII, were consistent with the prediction of a single Mendelian locus controlling the majority of the phenotypic variance in these traits based on trait distributions (Figure 3a and 3b). Only one additional minor effect QTL was identified that influenced the wild-type CCMI. This QTL was only found in the IBM F1 population (Table S10). No QTL affecting the chlorophyll accumulation of wild-type siblings were detected at the position of vey1, or anywhere else in the genome. Interestingly, vey1 mapped to the same genetic position as oy1 locus itself (Figures 3e and 3f). These analyses indicate that the vey1 QTL encodes a single locus with an effect contingent upon the allelic status at oy1 responsible for the suppression of Oy1-N1989 in these populations.
The identification of the vey1 modifier as a single Mendelian locus of large effect in the presence of the Oy1-N1989 allele suggested that we could take a similar approach to localization as for a previous metabolic QTL (Li et al. 2014). Thus, we classified each F1 mutant hybrid in Syn10 F1 population as high or low CCMII by using the bimodal distribution to assign lines into phenotypic categories. We then compared the marker genotypes at each marker under vey1 with the phenotypic categories. Table S12 shows the mutant trait values, marker genotypes, and phenotypic categories for the nine F1 Syn10 lines with recombinants within the vey1 region (between the flanking markers 10.90.5 and 10.95.5). Because of the high penetrance of the CCM trait, we interpret a discordance between the marker genotype and F1 mutant phenotype for high and low CCMII categorization as recombination between vey1 and that marker. A summary of the recombinant data across the vey1 flanking markers and additional markers with the physical position of each marker annotated by the number of instances of discordance between the markers and the phenotypic class is presented in Figures 3e and 3f. Genotypes at marker 10.93 perfectly predicted CCMII trait expression in Syn10 F1 mutant siblings. Recombinants separated the trait outcome from the marker genotype in one Syn10 line at marker 10.94.5 and two Syn10 lines at 10.90.5, indicating that the QTL resides in ∼227kb interval between markers 10.94.5 and 10.90.5 (Figure 3e). This physical position includes the oy1 locus itself, suggesting that vey1 may be encoded by the Mo17 allele of oy1 which enhances the impact of the Oy1-N1989 allele. In the three Syn10 lines that contained recombination within this critical region, the genotype at oy1 matched mutant CCM trait values perfectly. To improve the resolution of vey1 localization we developed additional markers in the region. The genotypes at polymorphic indel markers were determined for the Syn10 recombinants. One marker was encoded by the ftcl1 locus, two genes towards the telomere from oy1 and the second, was encoded by gfa2, one gene towards the centromere. No recombinants were detected between an indel marker in the proximal end of the gfa2 gene and vey1, although marker 10.94.5 from the Syn10 population is at the distal end of the gfa2 gene and did show one recombinant (Figure 3e). Three recombinants were identified that separated the ftcl1 marker from the vey1 QTL. As a result, vey1 could be encoded by the genomic region between ftcl1 and gfa2. This region includes oy1, ereb28 (GRMZM2G544539), the small regions of gfa2 (from marker 10.94.5), and ftcl1 proximate to oy1.
A similar fine mapping approach was adopted for the data from the F1 crosses to the IBM. Mutant CCMII readings were used to bin IBM F1 population into suppressed and enhanced categories. The number of cases of discordance between each marker genotype in the vey1 region and the F1 mutant phenotypic class is summarized in Figure 3f. For all IBM with unambiguous genotypes, the Oy1-N1989 suppression or enhancement phenotype was correctly predicted by marker genotypes at isu085b. A single recombinant between phenotype and genotype was identified for the flanking marker phi059 which is ∼372 kb from the oy1 locus (towards the telomere). Twenty-seven recombinants between the phenotype and genotype were noted for the marker umc2069 which is ∼3.14 Mb from oy1 (towards the centromere). This analysis identified an ∼3.51 Mbp window in IBM containing vey1 on chromosome 10 providing a confirmation of the Syn10 results with no additional resolution. We attempted to generate additional recombinants by generating a population of ∼1100 BC1F1 plants from (B73 x Mo17) x Oy1-N1989/oy1:B73 crosses. All the mutant siblings from this population (n=576) were separated into suppressed and enhanced phenotype categories by CCM readings and genotyped at ftcl1, oy1, and gfa2. In a sample of 576 mutants, we identified three recombinants between vey1 and ftcl1 indel marker, whereas, no recombinant at oy1 and gfa2 were detected (Figure 3e). Thus, we could not further narrow down this QTL interval.
To validate the vey1 locus, we made crosses between Oy1-N1989/oy1 in the B73 background and a series of BM-NILs that contained the vey1 QTL region introgressed into a homogeneous background of either B73 or Mo17. These BM-NILs also displayed the bimodal effect observed in the QTL experiment, but now without additional segregating B73 and Mo17 alleles. This bimodality was still visible in crosses of Oy1-N1989 mutant to NILs that used B73 as the recurrent parent to test the vey1 QTL in an otherwise inbred B73 background demonstrating that a hybrid background was not necessary for vey1 expression (Figure S6 and Table S3). We also observed an increase in the mutant CCM in F1 hybrids of Oy1-N1989/oy1:B73 with NILs that used Mo17 as a recurrent parent but had B73 introgression at vey1 locus. Thus, our NIL data confirm the expectation of the QTL mapping and validates the existence of the vey1 modifier. The recombinants in the BM-NILs were sufficient to identify an ∼3.01 Mb region that must contain the vey1 QTL, but this did not further narrow the region from the four gene window which included the oy1 locus itself, identified in the Syn10 F1 population.
Thus, the formal list of candidate genes for the vey1 QTL is the oy1 gene itself and the three most closely linked loci. Locus ftcl1 is annotated as a 5-formyl tetrahydrofolate cyclo-ligase1, which is involved in folate metabolism. The ortholog of Zmftcl1 (∼62% protein identity) from Arabidopsis thaliana has been shown to be localized in the chloroplast and T-DNA insertion knockouts are embryo lethal (Pribat et al. 2011). The maize gene gfa2 is uncharacterized, but mutation of the Arabidopsis ortholog caused defects in megagametogenesis including failures of polar nuclear fusion in the female gametophyte and synergid cell-death at fertilization (Christensen et al. 2002; Christensen, Subramanian, and Drews 1998). The third linked gene, ereb28 (Apetela2-Ethylene Responsive Element Binding Protein-transcription factor 28) exhibits poor homology to other plant species but has a highly conserved AP2/EREB domain. This gene has a very low expression level and is localized only to the root tissue of maize (http://www.maizegdb.org/gene_center/gene?id=GRMZM2G544539#rnaseq).
Controlling for the vey1 QTL neither detected additional epistatic interactions with vey1 nor Oy1-N1989 phenotype expression
The non-normality of some of the trait distributions and apparent thresholds prompted us to explore additional QTL models. No additional QTL were recovered by implementing two-part threshold models (Broman et al. 2003) for any of the traits (data not shown). This observation is consistent with a single major QTL affecting the non-normality in the phenotypic data and normal distribution of residuals remaining after single marker regressions (data not shown). Similarly, two-way scans of the genome also failed to detect any statistically significant genetic interactions. It is worth noting that in both the IBM and the Syn10, the region encoding vey1 exhibited substantial segregation distortion with the B73:Mo17 alleles present at 120:72 in the IBM and 175:76 in the Syn10 population. This uneven sample size will reduce the power to detect epistasis with vey1 but would not limit the detection of additional unlinked epistatic modifiers of Oy1-N1989.
We used the top marker at vey1 as a covariate to control for the contribution of this allele to phenotypic variation and performed a one-dimensional scan of the genome (Broman et al. 2003). In our previous naïve one-dimensional scans, the large effect of vey1 partitioned into the error term and might reduce our power to detect additional unlinked QTL(s). By adding a marker linked to vey1 as a covariate, this term will capture the variance explained by vey1 and should improve detection of additional QTL(s) of presumably smaller effect. Use of vey1 linked marker as a covariate, in both IBM and Syn10 F1 populations, did not identify any additional QTL for any trait (data not shown). Thus, modification of the Oy1-N1989 phenotype by vey1 was inherited as a single QTL, acting alone.
GWAS for chlorophyll content in maize diversity lines and Oy1-N1989/oy1 F1 genotypes identifies vey1
We undertook GWA mapping of Oy1-N1989 severity to search for additional loci and potentially identify recombinants at vey1. A population of 343 lines including members of the maize association panel (Flint-Garcia et al. 2005) and the Ames panel including ExPVPs (Romay et al. 2013) were crossed to Oy1-N1989/oy1:B73. This procedure generated MDL x Oy1-N1989/oy1:B73 F1 populations segregating 1:1 for mutant and wild-type siblings in hybrid genetic background. There was total separation between mutant and wild-type siblings in the MDL F1 populations for the CCMI and CCMII traits (Figure S5c). Additionally, dramatically enhanced mutant F1 families, similar to the F1 progeny of Mo17 x Oy1-N1989/oy1:B73, were present within the MDL F1 populations (Figure S7). Pairwise correlations of the CCM trait measurements at two-time points (CCMI and CCMII) in the wild-type siblings displayed statistically significant positive relationship (Table S13). CCM traits were much more strongly correlated in the mutant F1 siblings, similar to the B73 x Mo17 F1 populations. However, weak positive correlations were also observed between mutant and wild-type CCM measurements in the MDL F1 populations. Broad-sense heritability estimates were also calculated in the MDL F1 population. The chlorophyll estimates of the leaves (CCMI and CCMII) showed very high heritability for the mutant and ratio traits, whereas wild-type siblings had much lower repeatability (Table S14).
GWA was performed using the HapMap3 SNP data set (Bukowski et al. 2018). Although 343 inbred maize lines were crossed to Oy1-N1989/oy1:B73, only 305 inbred lines were genotyped as part of HapMap3 and were subsequently used for GWAS. The variation in mutant plant’s CCMI, CCMII, and their ratios identified a single locus that passed a multiple test correction (see Methods) on chromosome 10 at the site of the oy1 gene. Just like the detection of the vey1 QTL in the B73 x Mo17 RIL, no other loci modifying these traits were identified in the GWA analysis. No statistically significant peak was detected for the wild-type CCM traits. The Manhattan plot showing the negative log10 of the p-values from GWA tests for all the SNPs for MT_CCMII trait is graphed in Figure 4a. A closer view of the SNPs within the region encoding the vey1 locus on chromosome 10 are plotted in Figure 4c. A summary of the GWAS results for mutant CCM and ratio traits is presented in Table S15. The top association for the mutant CCM traits was a SNP at position 9161643 on chromosome 10 located just 3’ of the oy1 locus (a gene on the reverse strand). This SNP displays high allelic frequency in our population (f=0.49). Thus, it appears that the oy1 locus can be responsible for the suppression of the Oy1-N1989 mutant allele in the diverse panel of maize inbred lines analyzed in this experiment. Analysis of the LD between S10_9161643 and the other variants in this region identified no other variants with r2 greater than 0.5 despite the relatively strong associations between many SNPs and the CCM traits (Figure 4e). LD was substantially higher for SNPs encoded towards the telomere from oy1 than towards the centromere, with a strong discontinuity of LD at the 3’-end of oy1. Given the relatively low p-values calculated for multiple SNPs in the area, this raises the possibility that multiple alleles could contribute to the suppression of the Oy1-N1989 phenotype. To test for multiple genetic effects at this locus, the genotypes at SNP S10_9161643 were used as a covariate, and the genetic associations were recalculated. If SNPs segregate independently of S10_9161643 and contribute to Oy1-N1989 suppression, the p-values of association test statistics for such variants should decrease in this analysis (become more significant). On the contrary, those SNPs that have relatively low p-values due to linkage with S10_9161643 should become less significant in the covariate model. When these analyses were done, low-frequency variants at the 5’ end of the oy1 locus were identified as the most significant SNPs and passed a chromosome-wide multiple test correction (Figure 4b and 4d). This result suggests that there are multiple alleles capable of modifying the Oy1-N1989 mutant phenotype in the MDL panel. The top SNP on chromosome 10 in the covariate model of MLM was detected at position 9179932 and the allele associated with Oy1-N1989 suppression is a relatively rare variant (f=0.08). It remains formally possible that the SNP S10_9179932 is not causative and merely in LD with a causative polymorphism, and the second locus is fortuitously present in recombinant haplotypes. Analysis of LD of S10_9179932 with other SNPs in ∼500 kb window detected multiple SNPs of low allelic frequency that were in high LD (r2 ∼0.85) towards the 5’ end of oy1 (Figure 4f). Consistent with the strong discontinuity of LD at 3’ end of oy1 with S10_9161643, we observed discontinuity of LD with S10_9179932 at 5’ end of the oy1 locus. The LD analyses suggest that S10_9161643 and S10_9179932 are not in LD with each other and can act independently. These two SNPs account for ∼27 percent of the mutant CCMII variation in the MDL x Oy1-N1989/oy1:B73 F1 population (Table S15).
Given that the MLM model using S10_9161643 as a covariate detected S10_9179932 as the most significant association, we tested the phenotypic outcome of the four possible haplotypes at these two SNPs in the MDL F1 population. We observed that the four haplotypes at these two SNPs varied only for mutant CCM traits, with haplotypes AG and CA being the most favorable (highest CCM mean) and least favorable (lowest CCM mean), respectively (Table S16). Alleles at these two SNPs affected CCMI and CCMII in the mutant plants, consistent with additive inheritance for two polymorphisms. The additive impacts of the SNPs make mechanistic predictions about the suppression of Oy1-N1989. This additivity is consistent with independent alleles acting in cis at oy1 to modify the Oy1-N1989 mutant phenotype. Consistent with the strong enhancement caused by crossing Mo17 to Oy1-N1989/oy1:B73, the Mo17 oy1 locus encodes the most severe, and relatively rare, CA allele combination, and B73 encodes the most suppressing AG allele combination (Table S16). Thus, the line-cross mapping was performed with inbred lines that carry the most phenotypically extreme allele combinations of these two SNPs in the vicinity of the oy1 locus.
Oy1-N1989 is a semi-dominant chlorophyll mutant and enhanced by reduced function at oy1
If alleles of oy1 encode the suppression of Oy1-N1989, then the phenotype of heterozygous Oy1-N1989 should be strongly responsive to other mutant alleles at oy1. Maize seedlings that are heterozygotes between Oy1-N1989 and hypomorphic oy1 alleles are more severe than isogenic Oy1-N1989/oy1 siblings (Sawers et al. 2006). To confirm that the reduced oy1 function could determine the differential sensitivity to Oy1-N1989, we crossed dominant and recessive mutant alleles of oy1 to each other. The recessive weak hypomorphic allele oy1-yg was obtained from the Maize COOP in the unknown genetic background. The homozygous oy1-yg plants were crossed as a pollen-parent with both B73 and Mo17 to develop F1 material that would segregate the mutation. The F1 plants were then crossed to Oy1-N1989/oy1:B73 as well as backcrossed to the oy1-yg homozygotes in the original mixed background. These crosses allowed us to recover plants that had Oy1-N1989 in combination with the wild-type oy1B73, wild-type oy1Mo17, and mutant oy1-yg alleles. Chlorophyll contents were determined using CCM at 21 and 40 days after sowing in the field. The Oy1-N1989 allele was substantially enhanced when combined with the oy1-yg allele, demonstrating that reduced function of the oy1 allele in Mo17 could be the genetic basis of vey1 QTL (Figures 5a and 5b). A summary of these data is presented in Table S17. This result is similar to the one described by Sawers et al. 2006, where a reduction in chlorophyll content was observed when Oy1-N1989 mutant allele was combined with a recessive allele of oy1 (chlI-MTM1). We also noticed a similar drop in chlorophyll accumulation in the oy1-yg homozygotes as oppose to oy1-yg heterozygotes with wild-type oy1 allele from both B73 and Mo17 in the BC1F1 progenies (Figure 5c and Table S17). However, we did not observe any significant difference in the oy1-yg heterozygotes with B73 and Mo17 wild-type oy1 allele. Selfed progeny from Oy1-N1989 heterozygotes segregated for yellow-seedling lethal Oy1-N1989 homozygotes with no detectable chlorophyll by either CCM or spectrophotometer quantification (Table S6). Therefore, consistent with previous work (Hansson et al. 2002; Sawers et al. 2006), the Oy1-N1989 is a dominant-negative neomorphic mutant allele with no evident MgChl activity under the tested conditions. Based on these genetic data, any QTL resulting in decreased expression of oy1 or an increased proportion of mutant to wild-type gene product in the Oy1-N1989/oy1 heterozygotes can increase the severity of the mutant phenotype.
No coding sequence difference in OY1 accounts for vey1 inheritance
Our reliance on SNP variation leaves us open to the problem that linked, but the unknown non-SNP variation can be responsible for vey1. Given that reduced oy1 activity enhanced the phenotype of Oy1-N1989, we sequenced the oy1 locus from Mo17 and B73 to determine if coding sequence differences could encode the vey1 modifier. The only non-synonymous changes that distinguish these two alleles is at the site of the previously reported in-frame 6 bp insertion (Sawers et al. 2006), which adds alanine (A) and threonine (T) amino acid residues to the OY1 protein. PCR amplification of oy1 locus in 18 maize inbred lines, as well as the Oy1-N1989 allele, was performed. Sequencing of the amplification products confirmed the absence of the 6 bp insertion in Oy1-N1989 allele reported by Sawers et al. 2006. In addition, multiple inbred lines including B73, CML103, and CML322 also carried this 6 bp in-frame deletion. A polymorphism within the 6 bp insertion was also found that resulted in an alternative in-frame insertion encoding an alanine and serine (S) codon in Mo17 and five other inbred lines. Thus, three alleles at this site were found to be a common variant in OY1 gene product. These allelic states of oy1 did not explain the phenotypic severity of CCM trait value in the F1 mutant siblings (Figure 6). The allelic state of oy1 at this polymorphic site in 18 maize inbred lines and the average CCM trait values in the wild-type and mutant siblings of their respective F1 progenies with Oy1-N1989/oy1:B73 are summarized in Table S18. Five inbred lines, including Mo17, resulted in dramatic enhancement of the CCMI and ratio of CCMI phenotypes of F1 plants crossed to Oy1-N1989/oy1:B73. These enhanced genotypes encoded all three possible alleles at oy1. In addition, the suppressing inbred lines also encoded all three possible alleles. Besides this 6 bp indel, three inbred lines had few more variants in OY1 protein. An enhancing inbred line CML322 had two missense mutations that lead to amino acid change at position 321 (D->E), and 374 (S->I). A suppressing inbred line NC358 had one amino acid change at position 336 (D->G) and the enhancing inbred Tzi8 had a 15 bp in-frame deletion leading to the removal of five amino acids (VMGPE) in the third exon of the coding sequence. Even considering the additional alleles at oy1 found in few maize inbred lines, these results suggest that the only coding sequence polymorphism at oy1 between B73 and Mo17 could not be genetic basis of vey1. This result leaves the two additive top SNPs in cis with oy1 as the most likely cause of cis-acting regulatory variation.
Expression level polymorphism at oy1 co-segregates with suppression of Oy1-N1989
Measurements of mRNA accumulation from oy1 in the IBM was available in a previously published study (Li et al. 2013, 2018). The normalized transcript abundance (expressed as RPKM) of OY1 from the shoot apex of 14 days old maize seedlings from IBM were obtained from MaizeGDB (Sen et al. 2010). Out of 105 IBM lines that were assessed for expression level, 74 were among those tested for chlorophyll accumulation in Oy1-N1989 F1 hybrid populations. Using the genetic marker isu085b that is linked to oy1 locus, we determined that a cis-acting eQTL controlled the accumulation of OY1 transcripts in IBM shoot apex (Figure 7). A summary of these data is presented in Table S19. This cis-acting eQTL conditioned greater expression of the B73 allele and explained 19 percent of the variation (p < 0.0001) in OY1 transcript abundance in the IBM. Given the enhancement of Oy1-N1989 by the oy1-yg allele, a lower expression of the wild-type oy1 allele from Mo17 is expected to enhance the phenotype of Oy1-N1989 (Figure 5). In addition, OY1 RPKM values obtained from the shoot apex of IBM were able to predict the CCM trait values in the mutant but not the wild-type siblings in IBM F1 population with Oy1-N1989/oy1:B73 (Figure 7). This result suggests that inbred lines with increased MgChl subunit I transcripts available for protein production and MgChl complex assembly could overcome chlorophyll accumulation defects caused by the Oy1-N1989/oy1 genotype in IBM F1 hybrids. Consistent with this, full linear regression model that included both the isu085b marker (cis-eQTL) genotypes and the residual variation in RPKM at OY1 did a better job in predicting CCMI and CCMII in the IBM mutant F1 hybrids than the isu085b marker by itself. If the cis-eQTL at oy1, which results in differential accumulation of OY1 transcripts in the IBM inbred lines can affect allele-specific expression in the F1 hybrids, it could explain the better performance of the IBM mutant F1 hybrids with the B73 allele at vey1.
A previous study of allele-specific expression in the F1 hybrid maize seedlings identified expression bias at oy1 towards B73 in the hybrid combinations of B73 inbred line with PH207 and Mo17 but not Oh43 (Waters et al. 2017). We used two SNP positions, SNP_252 and SNP_317, to explore the allele-specific expression of OY1 in our materials. SNP_252 is the causative polymorphism for the Oy1-N1989 missense allele while SNP_317 is polymorphic between B73 and Mo17, but monomorphic between Oy1-N1989 and B73. As the original allele of the Oy1-N1989 mutation was isolated from a r1 c1 colorless synthetic stock of mixed parentage (G. Neuffer, personal communication), this raises the possibility that the same cis-acting regulatory variation that lowered expression of OY1 from PH207 and Mo17 when combined with the B73 allele might also be present in the oy1 allele that was the progenitor of Oy1-N1989. We tested this possibility by using the SNPs that distinguish B73, Mo17, and the Oy1-N1989 alleles to measure allele-specific expression in each of the hybrids. Consistent with the previous data (Waters et al. 2017), we observed biased expression at oy1 towards the B73 allele in the B73 x Mo17 F1 wild-type hybrids (Table 1). Extended data from this experiment is provided in Table S20. In the B73 isogenic crosses, transcripts from the Oy1-N1989 and B73 wild-type alleles accumulated to equal levels in the heterozygotes, indicating that the suppressed phenotype of the mutants in B73 background was not due to a lowered expression of Oy1-N1989 relative to the wild-type allele. Remarkably, mutant siblings from the reciprocal crosses between Oy1-N1989/oy1:B73 and Mo17 resulted in greater expression from the Oy1-N1989 allele than the wild-type oy1 allele of Mo17. Allele-specific bias at oy1 was significantly higher towards the Oy1-N1989 allele in the Oy1-N1989 mutant heterozygotes in the B73 x Mo17 hybrid background compared to B73 isogenic material. Thus, in mutant hybrids, overexpression of Oy1-N1989 relative to the wild-type oy1 allele in Mo17 could account for increased phenotypic severity.
If vey1 is encoded by an eQTL, then PH207 should encode an enhancing allele and Oh43 should encode a suppressing allele of vey1. We tested this genetically by producing F1 progenies in crosses of PH207 and Oh43 by Oy1-N1989/oy1:B73 pollen. Oh43 was evaluated in our initial screening and also as part of the MDL panel used for GWAS. In both experiments, the F1 hybrids between Oh43 and Oy1-N1989/oy1:B73 suppressed the mutant phenotype, suggesting that Oh43 is a suppressing inbred line (CCM values in Tables S4 and S18). B73 x PH207 F1 hybrids were missing from our previous datasets. We crossed PH207 ears with pollen from Oy1-N1989/oy1:B73 plants. The F1 hybrids from this cross were analyzed in the greenhouse at seedlings stage along with F1 hybrids of B73 x Oy1-N1989/oy1:B73 and Mo17 x Oy1-N1989/oy1:B73 F1 progenies as controls. PH207 was an enhancing inbred genotype as mutant heterozygotes in a PH207 x B73 F1 genetic background accumulated less chlorophyll than mutants in the B73 isogenic background (Figure S8 and Table S21).
We further leveraged the normalized expression data of OY1 in the emerging shoot tissue of the maize diversity lines (Kremling et al. 2018) and used the top two additive SNPs (S10_9161643 and S10_9179932) at vey1 from GWAS to test if these cis-variants of oy1 affect its expression. When tested, plants carrying alleles A and G at marker S10_9161643 and S10_9179932, respectively, showed highest OY1 abundance in the emerging shoots of diverse maize inbred lines, whereas, plants with alleles C and A at S10_9161643 and S10_9179932, respectively, showed lowest OY1 count (Table S22). Alleles that suppress Oy1-N1989 linked to either SNP were associated with the greater abundance of OY1 transcripts. This observation is consistent with the hypothesis that increased OY1 abundance can overcome the negative effect of the Oy1-N1989 allele. Consistent with the additive suppression of leaf greenness in Oy1-N1989 mutants by the alleles at S10_9161643 and S10_9179932 discussed previously (Table S16), these alleles were also additive for their impacts on OY1 transcript abundance (Table S22). Thus, it is likely that multiple phenotypically affective polymorphisms linked to these top unlinked SNPs underlie cis-acting regulatory variation at oy1.
The effect sizes of the gene expression changes observed in the IBM, diverse maize inbred lines, and allele-specific expression in hybrids are quite modest, resulting in ∼10% of differences in oy1 accumulation. If these changes in wild-type OY1 transcript accumulation are responsible for suppression of Oy1-N1989, then the severity of the mutant phenotype (as indicated by CCM) in the MDL F1 population should correlate with OY1 expression level. As expected, we observed a statistically significant positive correlation between the mutant derived CCM traits and OY1 counts (Table S23). Wild-type CCM in the MDL F1 population did not show any significant correlation with OY1 abundance in the emerging shoot tissue of maize inbred lines. These correlations are in agreement with the lack of any QTL at this locus controlling wild-type chlorophyll levels, and the epistatic relationship between vey1 and Oy1-N1989.
Discussion
The semi-dominant mutant allele Oy1-N1989 encodes a dominant-negative allele at the oy1 locus, which compromises MgChl enzyme activity (Sawers et al. 2006). In a heterozygous condition, the strength of the negative effect of this allele on the MgChl enzyme complex depends on the wild-type oy1 allele. B73 and Mo17 show differential suppression response in mutant heterozygotes resulting in suppressed and severe mutant phenotype, respectively. Thus, the Oy1-N1989 mutant allele can sensitize maize plants to variation in MgChl and expose a phenotypic consequence for genetic variants that are otherwise invisible. Similar methodology has been adopted previously in maize to gain the genetic understanding of various traits (Chintamanani et al. 2010; Olukolu et al. 2013, 2014; Buescher et al. 2014). Employing a mutant allele as a reporter to screen for effects of natural variants is a simple and efficient technique to detect standing variation in a specific biological process. Although vey1 polymorphisms are phenotypically consequential in the presence of Oy1-N1989 allele, no QTL was detected in the absence of Oy1-N1989 allele. These cryptic genetic variants, or contingent QTL, result from epistasis of Oy1-N1989 and permits the discovery and re-classification of DNA sequence variants that might otherwise be hypothesized to be neutral or non-functional in plant adaptation. Thus, genetic screens based upon semi-dominant mutant alleles as reporters offer a cost-effective and robust approach to map QTL(s) for metabolic pathways of interest by leveraging the publicly available genetic resources such as bi-parental mapping populations and maize diversity lines.
Alleles with modest fitness consequences may not be visible to researchers working with population sizes even in the thousands, such as in GWAS. By contrast, evolutionarily-relevant segregating variation may have minimal phenotypic effects. One of the possible uses of MAGIC is in boosting the relative contribution of natural variants to phenotypic variance in the trait of interest. This can both uncover cryptic variation affecting a biochemical pathway of interest and assist in improving the mapping resolution of a detected QTL. Not all of the cryptic variation observed by these mutant-contingent QTL approaches need be fitness-affecting, and interpreting mutant-conditioned phenotypes as non-neutral variation would be a mistake. It is, of course, possible that neutral variants may result in increased severity of a mutant phenotype due to changes not physiologically relevant for all alleles of that reporter gene present in a species. Nevertheless, it can identify new pathway member and inform us about the allelic variation in the species and pathway topology via gene discovery.
In the current study, use of bi-parental mapping populations derived from the same inbred lines but developed using different intercross and inbreeding schemes provided the opportunity to compare the effect of additional rounds of random interbreeding in the development of mapping population on the genetic resolution. The comparative fine mapping of vey1 in the Syn10 population, that employed ten rounds of random mating, provided far better localization of the vey1 QTL than the IBM populations that was derived from four rounds of random mating. This observation demonstrates the benefits of increased recombination during random intermating of early generations in QTL localization. Based on these results, as future RILs are generated, we recommend increased intermating in the early generations followed by DH induction rather than relying on further recombination during the self-pollination cycles of RIL development.
As expected, GWAS provided a fine-scale genetic resolution and corroborated the mapping of vey1. The distance between the best marker and the oy1 gene was substantially less in the GWA experiments. GWAS identified two SNPs, one in the 5’ and another in the 3’ intergenic DNA, proximate to oy1 that represent candidate quantitative trait nucleotides (QTN). The architecture of this region included four haplotypes with every combination of alleles at these two SNPs. The signal detection by multiple unlinked SNPs in GWAS may indicate a complex set of phenotypically affective alleles at the oy1 locus. Alternatively, it could very well be an artifact of missing the causative polymorphism within our genetic data resulting in strong associations with markers that are tightly linked to the causative variation but unlinked or in repulsion to each other. For example, indel variation is not captured by the approaches used in the GWA analysis. Consistent with two QTN, rather than fortuitous linkage of tag SNPs with a single causative polymorphism, alleles at the two SNPs additively influenced chlorophyll contents in the mutant siblings. Future work to identify the nucleotide changes responsible for the differences between B73, Mo17, and other inbred lines that encode vey1 will be required to test these possibilities definitively. Together, this study illustrates the complementary nature of line-cross QTL mapping and GWAS to explore the genetics of the trait under investigation.
We used Oy1-N1989 together with a non-destructive, inexpensive, and rapid phenotyping method to measure leaf chlorophyll. Rapid and robust phenotyping is critical and contributed to the strong correlation between the absolute trait measurement and the estimated phenotypes like CCM. Previous studies have highlighted the importance of benchmarking indirect measurements or proxies for traits of interest. For instance, near infrared reflectance spectroscopy (NIRS) that estimates major and total carotenoids in maize kernels could replace sensitive, accurate, cumbersome, expensive, slow, and destructive measurements by HPLC (Berardo et al. 2004). Comparisons of HPLC and NIRS measurements resulted in a correlation of 0.85 for total carotenoids (Berardo et al. 2004). A subjective visual score for yellow color in maize kernels was not adequate, yielding a correlation of 0.12 with HPLC measurements of total carotenoids (Harjes et al. 2008). Thus, using mutant alleles as reporters to develop methodologies that rely on non-invasive multispectral or hyperspectral data as proxies for specific biochemical compounds will accelerate studies on gene function and allele discovery. This approach can enable genetic studies that are currently deemed unfeasible due to the arduous task of phenotyping large populations for traits only visible in the laboratory.
How could a 10% change in wild-type OY1 expression affect chlorophyll biosynthesis in the Oy1-N1989/oy1 mutant heterozygotes?
Magnesium chelatase (MgChl) is formed by a trimer of dimers of MgChl subunit I interacting with the other subunits of MgChl complex. A previous study found that addition of mutant or wild-type BCHI protein to pre-assembled MgChl complexes resulted in altered reaction rates due to differences in subunit turnover, which occurred on a minutes time-scale (Lundqvist et al. 2013). This subunit turnover and reformation of the complex dynamically exchange mutant and wild-type BCHI subunits over time. Therefore, any net increase in the amount of wild-type OY1 in the reaction pool, for instance, due to higher transcription of wild-type oy1 allele will allow a higher rate of magnesium chelatase activity and result in more chlorophyll biosynthesis. The observation of stronger affinity and greater dissociation rate of BCHIL111F subunits (orthologous to the L176F change encoded by Oy1-N1989) for the wild-type subunits (Hansson et al. 2002) suggests that exchange of BCHI monomeric units in the magnesium chelatase complex might also differ based on the structure of BCHI protein (Lundqvist et al. 2013). In the AAA+ protein family, the ATP-binding site is located at the interface of two neighboring subunits in the oligomeric complex (Vale 2000). Since a dimer of functional MgChl subunit I proteins are required for the complex to carry out MgChl activity (Lundqvist et al. 2013), approximately 1 in 3 dimers of assembled MgChl subunit I will be active in a 1:1 mixture of wild-type and BCHIL111F. Indeed, complexes made from reaction mixtures with equal proportions of wild-type and BCHIL111F subunits resulted in ∼26% of the enzyme activity of an equivalent all-wildtype mix (Lundqvist et al. 2013). Therefore, we expect that decreasing expression of the wild-type oy1 subunit by 10% and creating a 0.9:1.1 mixture of wild-type OY1 and mutant OY1-N1989 protein, respectively, would result in ∼21% activity compared to the activity of all-wildtype mixture. Likewise, increasing the wild-type oy1 expression by 10% would result in ∼30% activity of MgChl compared to the all-wildtype mixture. This dosage-sensitivity is a general feature of protein complexes (Birchler and Veitia 2012; Veitia 2003; Grossniklaus, Madhusudhan, and Nanjundiah 1996; Birchler and Newton 1981), and the semi-dominant nature of Oy1-N1989 is dosage sensitive. Taking these observations and proposed models on the dynamics of molecular interaction between the wild-type and mutant BCHI protein subunits (especially BCHIL111F) into account, it is formally possible that a small change in the expression of wild-type OY1 can have a significant impact on the magnesium chelatase activity of heterozygous Oy1-N1989/oy1 plants. The increase in magnesium chelatase activity due to the even small relative increase in the proportion of wild-type OY1 transcripts over the mutant OY1-N1989 transcripts will read out as a proportional, presumably non-linear, increase in chlorophyll accumulation.
What is vey1?
Variation at the vey1 locus appears to be the result of allelic diversity linked to the oy1 locus. The only remaining possibilities are regulatory polymorphisms within the cis-acting control regions of oy1. Previous studies have utilized reciprocal test-crosses to loss-of-function alleles in multiple genetic backgrounds to provide single-locus tests of additive QTL alleles in an otherwise identical hybrid background (Dilkes et al. 2008). Protein-null alleles of oy1 isolated directly from the B73 and Mo17 backgrounds could be used to carry out a similar test. The intergenic genomic region in maize is spanned by transposable elements and can be highly divergent between different inbred lines due to large insertions/deletions polymorphisms (SanMiguel and Bennetzen 1998). Consistent with this, inbreds B73 and Mo17 are polymorphic at the region between oy1 and gfa2. These two maize inbred lines share ∼12 kb of sequence interspersed with numerous large insertions and deletions that add ∼139 kb of DNA sequence to Mo17 as compared to B73 (data not shown). However, we did not find any conserved non-coding sequence (CNS) in this region (data not shown). It is conceivable that recombinants at oy1 between B73 and Mo17 themselves could be identified and used to test the effects of upstream or downstream regulatory sequences. The recombinant haplotypes encoding all four possible alleles at the top two SNPs identified in the GWAS indicate that Mo17 may be a strong enhancer due to more than one causative polymorphism.
In the absence of such recombinants, we can only consider what mechanisms might be consistent with the observed suppression of the chlorophyll biosynthesis defects such as cis-acting effects on OY1 transcript abundance, and allele-specific gene expression. Genotype at vey1 in wild-type plants accounted for only 19 percent of the variation in OY1 abundance in the shoot apices of the IBM. This QTL accounted for more than 80 percent of the variation in chlorophyll content in mutants. If transcript accumulation is insensitive to the presence of Oy1-N1989 allele, then the remaining ∼80 percent of the variation in OY1 abundance observed as not vey1-dependent in the wild-type IBM might be expected to account for a greater change in phenotype. However, trans-acting effects or environmental effects that would equally affect both alleles at the oy1 locus would be expected to have a lesser impact on suppression of the deleterious Oy1-N1989 allele. The effect of cis-acting eQTL on oy1 expression would eventually result in a greater or lesser proportion of wild-type OY1 protein accumulation. These allele-specific differences will alter the mutant to wild-type subunit stoichiometry which should have a greater impact on chlorophyll content than changes in absolute RPKM, due to the competitive inhibition of MgChl complex activity by Oy1-N1989 (Sawers et al. 2006; Lundqvist et al. 2013), as detailed above.
The reciprocal crosses between Oy1-N1989/oy1:B73 and Mo17 did not result in different phenotypes. Thus, vey1 does not exhibit imprinting. Allele-specific expression at oy1 locus in the Oy1-N1989 heterozygous mutants demonstrated the existence of functional cis-acting regulatory polymorphism between Oy1-N1989 and both wild-type oy1 alleles in B73 and Mo17. In addition, oy1 is affected by a cis-eQTL in the IBM and MDL. Together with our other data, the allele-specific expression at OY1 that was visible when we reanalyzed the data from Waters et al. 2017, we propose that vey1 is encoded by cis-acting regulatory DNA sequence variation (Figure S9). Ultimately, transgenic testing of oy1 cis-acting regulatory polymorphisms identified from assembled maize genomes is required to determine the causative variant(s) encoding vey1. Sequence comparisons outside the protein coding sequence of the gene can be quite challenging, especially in maize, as it exhibits limited to poor sequence conservation between different inbred lines (SanMiguel et al. 1996). Thus, distinguishing phenotypically affective polymorphisms from the neutral variants is not trivial. As a result, biochemical and in vitro studies are the best tool for functional validation of these polymorphisms (Wray et al. 2003). Similar experiments have been done to characterize the role of DNA sequence polymorphisms in cis on the expression of downstream genes in case of flowering locus T in Arabidopsis and teosinte branched1 in maize (Adrian et al. 2010; Studer et al. 2011).
Relevance to research on transcriptional regulation
Since their discovery, the role of regulatory elements in gene function has been recognized as vital to our understanding of biological systems (McClintock 1950, 1956a, 1956b, 1961; Peterson 1953; Jacob and Monod 1961). Gene regulation and gene product dosage are at the forefront of evolutionary theories about sources of novelty and diversification (Ohno 1972; King and Wilson 1975). Transcriptional regulation of a gene can be as important as the protein coding sequence (Wray et al. 2003). For instance, complete knock-down of expression of a gene by a regulatory polymorphism will have the same phenotypic consequence as the non-sense mediated decay of a transcript harboring an early stop codon (Willing et al. 1996). But we do not have a set of rules, analogous to codon tables, for functional polymorphisms outside the coding sequence of a gene. Detecting expression variation and tying it to phenotypic consequence, especially in the absence of CNS, remains a challenge to this day.
Several eQTL studies ranging from unicellular to multicellular eukaryotic organisms have found abundant cis and trans-acting genomic regions that affect gene expression (Brem et al. 2002; West et al. 2007; Li et al. 2013, 2018). The proportion of cis-acting eQTLs from the total eQTLs detected in various studies including yeast, mouse, rat, humans, eucalyptus, and maize range from 19-92% (Gibson and Weir 2005; Li et al. 2013, 2018). Expression polymorphisms are a potential source of variation in some phenotypic traits (Gibson and Weir 2005), and multiple studies detected expression polymorphisms co-segregating with phenotypic variation, including contributions to species domestication (Clark et al. 2006; Salvi et al. 2007; Schwartz et al. 2009; Lemmon et al. 2014). But a majority of the cis-eQTLs only exhibit a moderate difference in the gene expression. Detecting such variants and linking them to visible phenotypes may require detailed study using approaches focused on the specific biological process affected by the gene product. We do not yet have a standardized experimental tool for these purposes. As such, we cannot simultaneously identify and characterize the phenotypic impact of most cis-acting eQTLs.
The vey1 polymorphism detected in the current study co-segregates with a cis-eQTL at oy1 in IBM (Li et al. 2013, 2018) and diverse maize inbred lines (Kremling et al. 2018). Using Oy1-N1989 allele to expose the consequences of these cis-eQTLs allowed chlorophyll approximation using CCM to substitute for expensive, cumbersome, highly sensitive gene expression assays or metabolite measurements. The high heritability of this alternative and direct phenotype, allowed us to scan large populations at a rapid rate to identify the genomic regions underlying the cis-acting regulatory elements and study allelic diversity in the natural population of maize. We propose that the approach we have taken is not likely to be unique to oy1. Therefore, we propose that all semi-dominant mutant alleles can be used as reporters to not only detect novel cis-acting gene regulatory elements but also functionally validate previously-detected cis-eQTL(s) from genome-wide eQTL studies.
Supplemental files
1. S1_IBM_F1_Rqtl_input: CSV file with average CCM values and genotypic data of IBM x Oy1-N1989/oy1:B73 F1 population formatted for R/qtl.
2. S2_Syn10_F1_Rqtl_input: CSV file with BLUP value of CCM traits and genotypic data of Syn10 x Oy1-N1989/oy1:B73 F1 population formatted for R/qtl.
Acknowledgements
Authors acknowledge the help of the staff at Purdue ACRE farm for assistance with planting and crop management of all the field experiments described in this study. R.S.K was supported by USAID Feed the Future CGIAR-US Initiative grant “Heat stress resilient maize for South Asia through a public-private partnership” to G.S.J.. This work was supported by NSF grant 1444503 awarded to G.S.J. and B.P.D. We especially thank the Muehlbauer lab for making their data open to the public, and the USDA for the MaizeGDB website, which allowed us to reanalyze the eQTL data.