Abstract
Photosynthetic organisms provide food and energy for nearly all life on Earth, yet half of their protein-coding genes remain uncharacterized1,2. Characterization of these genes could be greatly accelerated by new genetic resources for unicellular organisms that complement the use of multicellular plants by enabling higher-throughput studies. Here, we generated a genome-wide, indexed library of mapped insertion mutants for the flagship unicellular alga Chlamydomonas reinhardtii (Chlamydomonas hereafter). The 62,389 mutants in the library, covering 83% of nuclear, protein-coding genes, are available to the community. Each mutant contains unique DNA barcodes, allowing the collection to be screened as a pool. We leveraged this feature to perform a genome-wide survey of genes required for photosynthesis, which identified 303 candidate genes. Characterization of one of these genes, the conserved predicted phosphatase CPL3, showed it is important for accumulation of multiple photosynthetic protein complexes. Strikingly, 21 of the 43 highest-confidence genes are novel, opening new opportunities for advances in our understanding of this biogeochemically fundamental process. This library is the first genome-wide mapped mutant resource in any unicellular photosynthetic organism, and will accelerate the characterization of thousands of genes in algae, plants and animals.
Among unicellular photosynthetic organisms, the green alga Chlamydomonas has long been employed for genetic studies of eukaryotic photosynthesis because of its rare ability to grow in the absence of photosynthetic function3. In addition, it has made extensive contributions to our basic understanding of light signaling, stress acclimation, and metabolism of carbohydrates, lipids, and pigments (Fig. 1a)4-6. Moreover, Chlamydomonas retained many genes from the plant-animal common ancestor, which allowed it to reveal fundamental aspects of the structure and function of cilia and basal bodies7,8. Like Saccharomyces cerevisiae, Chlamydomonas can grow as a haploid, facilitating genetic studies. However, until now, the value of Chlamydomonas has been limited by the lack of mutants in most of its nuclear genes.
In the present study, we sought to generate a genome-wide collection of Chlamydomonas mutants with known gene disruptions to provide mutants in genes of interest for the scientific community, and then to leverage this collection to reveal genes with roles in photosynthesis. To reach the necessary scale, we chose to use random insertional mutagenesis and built on advances in insertion mapping and mutant propagation from our pilot study9. To enable mapping of insertion sites and screening pools of mutants on a much larger scale, we developed new tools leveraging unique DNA barcodes in each transforming cassette.
We generated mutants by transforming haploid cells with DNA cassettes that randomly insert into the genome and inactivate the genes they insert into. We maintained the mutants as indexed colony arrays on agar media containing acetate as a carbon and energy source to allow recovery of mutants with defects in photosynthesis. Each DNA cassette contained two unique barcodes, one on each side of the cassette (Supplementary Fig. 1a-d). For each mutant, the barcode and genomic flanking sequences on each side of the cassette were initially unknown (Supplementary Fig. 1e). We determined the sequence of the barcode(s) in each mutant colony by combinatorial pooling and deep sequencing (Supplementary Fig. 1f). We then mapped each insertion by pooling all mutants and amplifying all flanking sequences together with their corresponding barcodes followed by deep sequencing (Supplementary Fig. 1g). The combination of these datasets revealed the insertion site(s) in each mutant. This procedure yielded 62,389 mutants on 245 plates, with a total of 74,923 insertions that were largely randomly distributed over the chromosomes (Fig. 1, b and c, and Supplementary Table 5).
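The idea behind identifying barcodes by combinatorial pooling can be sketched as follows. This is a minimal illustration and not the pooling design used for the library: it assumes each colony is added to a unique subset of pools given by the binary representation of its index, and that a barcode is assigned to the colony whose pool subset exactly matches the pools in which the barcode was sequenced. The function names and the binary scheme are hypothetical; a practical design would also need to tolerate sequencing errors and missing reads.

```python
from collections import defaultdict


def pool_signature(colony_index, n_pools):
    """Pools that a colony is added to: the set bits of (index + 1).

    Assumption for illustration only: a plain binary code with no
    error correction. Using index + 1 avoids the empty signature.
    """
    code = colony_index + 1
    return frozenset(p for p in range(n_pools) if (code >> p) & 1)


def decode_barcodes(pool_reads, n_colonies, n_pools):
    """Assign each barcode to a colony from the pools it was seen in.

    pool_reads maps a pool id (0..n_pools-1) to the set of barcodes
    sequenced from that pool. Barcodes whose observed pool set matches
    no colony signature are left unassigned.
    """
    if n_colonies >= 2 ** n_pools:
        raise ValueError("not enough pools to give every colony a unique signature")
    signature_to_colony = {
        pool_signature(i, n_pools): i for i in range(n_colonies)
    }
    observed = defaultdict(set)
    for pool, barcodes in pool_reads.items():
        for barcode in barcodes:
            observed[barcode].add(pool)
    assignments = {}
    for barcode, pools in observed.items():
        colony = signature_to_colony.get(frozenset(pools))
        if colony is not None:
            assignments[barcode] = colony
    return assignments


# Toy example: 5 colonies, 3 pools, 3 barcodes with made-up sequences.
toy_reads = {0: {"AAA", "CCC", "GGG"}, 1: {"CCC"}, 2: {"GGG"}}
toy_assignments = decode_barcodes(toy_reads, n_colonies=5, n_pools=3)
```

With this scheme the number of pools grows only logarithmically with the number of colonies, which is what makes pooled sequencing practical at the scale of tens of thousands of mutants.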
This library provides mutants for ~83% of all nuclear genes (Fig. 2a-d). Approximately 69% of genes are represented by an insertion in a 5’ UTR, an exon or an intron – regions most likely to cause an altered phenotype when disrupted. Many gene sets of interest to the research community are well represented, including genes encoding proteins phylogenetically associated with the plant lineage (GreenCut2)1, proteins that localize to the chloroplast10, or those associated with the structure and function of flagella or basal bodies11,12 (Fig. 2b). Mutants in this collection are available through the website https://www.chlamylibrary.org/. Over 1,800 mutants have already been distributed to over 200 laboratories worldwide in the first 18 months of pre-publication distribution (Fig. 2e). These mutants are facilitating genetic investigation of a broad range of processes, ranging from photosynthesis and metabolism to cilia structure and function (Fig. 2f).
To identify genes required for photosynthesis, we screened our library for mutants deficient in photosynthetic growth. Rather than phenotyping each strain individually, we pooled the entire library into one culture and leveraged the unique barcodes present in each strain to track its abundance after growth under different conditions. This feature enables genome-wide screens with speed and depth unprecedented in photosynthetic eukaryotes. We grew a pool of mutants photosynthetically in light in minimal Tris-Phosphate (TP) medium with CO2 as the sole source of carbon, and heterotrophically in the dark in Tris-Acetate-Phosphate (TAP) medium, where acetate provides fixed carbon and energy3 (Fig. 3a). To quantify mutant growth under each condition, we amplified and deep sequenced the barcodes from the final cell populations. We then compared the ability of each mutant to grow under photosynthetic and heterotrophic conditions by comparing the read counts of each barcode from each condition (Supplementary Table 10; Methods). Mutant phenotypes were highly reproducible (Fig. 3b and Supplementary Fig. 5, a and b). We identified 3,109 mutants deficient in photosynthetic growth (Fig. 3c and Methods).
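The comparison of barcode read counts between conditions can be illustrated with a short sketch. It assumes that counts are normalized to the total reads in each sample and that a mutant is called deficient when its photosynthetic-to-heterotrophic ratio falls below a cutoff. The cutoff of 0.1 and the minimum of 50 heterotrophic reads are placeholder values chosen for illustration, not the thresholds used in the screen (see Methods for those), and the function names are hypothetical.

```python
def normalized_ratios(photo_counts, hetero_counts, min_hetero_reads=50):
    """Ratio of normalized barcode abundance, photosynthetic / heterotrophic.

    Barcodes with fewer than min_hetero_reads reads in the heterotrophic
    sample are skipped, because their ratios would be too noisy.
    """
    total_photo = sum(photo_counts.values())
    total_hetero = sum(hetero_counts.values())
    if total_photo == 0 or total_hetero == 0:
        raise ValueError("both samples need at least one read")
    ratios = {}
    for barcode, hetero in hetero_counts.items():
        if hetero < min_hetero_reads:
            continue
        photo = photo_counts.get(barcode, 0)
        ratios[barcode] = (photo / total_photo) / (hetero / total_hetero)
    return ratios


def deficient_barcodes(ratios, threshold=0.1):
    """Barcodes whose ratio is below the (illustrative) cutoff, sorted."""
    return sorted(bc for bc, ratio in ratios.items() if ratio < threshold)


# Toy read counts for four barcodes; "D" is too rare to be scored.
photo = {"A": 500, "B": 5, "C": 495}
hetero = {"A": 400, "B": 400, "C": 200, "D": 10}
toy_ratios = normalized_ratios(photo, hetero)
toy_hits = deficient_barcodes(toy_ratios)
```

Normalizing by total reads keeps differences in sequencing depth between the two samples from being mistaken for growth differences.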
To identify genes with roles in photosynthesis, we developed a statistical analysis framework that leverages the presence of multiple alleles for many genes. This framework allows us to overcome several sources of false positives that have been difficult to identify with previous methods, including cases where the phenotype is not caused by the mapped disruption. For each gene, we counted the number of mutant alleles with and without a phenotype, and evaluated the likelihood of obtaining these numbers by chance given the total number of mutants in the library that exhibit the phenotype (Supplementary Table 11; Methods).
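One way to formalize this per-gene calculation is a one-sided hypergeometric tail probability, which is the quantity underlying a one-sided Fisher's exact test. The sketch below is written under that assumption and should not be read as the exact procedure used here (see Methods). The function name is hypothetical, and the example treats a gene with four alleles, three of which show the phenotype, as a made-up case; only the library-wide totals (3,109 deficient mutants among 62,389) are taken from the text, and the actual analysis may have used a filtered subset of mutants.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total number of mutants considered
    K: number of those mutants that show the phenotype
    n: number of alleles (mutants) of the gene of interest
    k: number of those alleles that show the phenotype
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts are inconsistent")
    upper = min(n, K)
    # Sum the probabilities of observing k, k+1, ..., upper phenotype alleles.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


# Hypothetical gene: 4 alleles in the library, 3 with the phenotype.
p_example = enrichment_p_value(k=3, n=4, K=3109, N=62389)
```

Because the probability is computed per gene from allele counts, a gene supported by several independent alleles scores far better than a gene with a single deficient mutant, which is how multiple alleles help separate true hits from mutants whose phenotype is unrelated to the mapped insertion.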
We identified 303 candidate photosynthesis genes based on our statistical analysis above. These genes are enriched for membership in a diurnally regulated photosynthesis-related transcriptional cluster13 (P<10⁻¹¹), are enriched for upregulation upon dark-to-light transitions14 (P<0.003), and encode proteins enriched for predicted chloroplast localization (P<10⁻⁸). As expected15, the candidate genes also encode a disproportionate number of GreenCut2 proteins (P<10⁻⁸), which are conserved among photosynthetic organisms but absent from non-photosynthetic organisms1: 32 GreenCut2 proteins are encoded by the 303 candidate genes (11%), compared to ~3% in the entire genome.
Photosynthesis occurs in two stages: the light reactions and carbon fixation. The light reactions convert solar energy into chemical energy, and require coordinated action of Photosystem II (PSII), Cytochrome b6f, Photosystem I (PSI), ATP synthase complexes, a plastocyanin or cytochrome c6 metalloprotein, as well as small molecule cofactors16. PSII and PSI are each assisted by peripheral light-harvesting complexes (LHCs) known as LHCII and LHCI, respectively. Carbon fixation is performed by enzymes in the Calvin-Benson-Bassham cycle, including the CO2-fixing enzyme Rubisco. In addition, most eukaryotic algae have a mechanism to concentrate CO2 around Rubisco to enhance its activity17.
Sixty-five of the genes we identified encode proteins that were previously shown to play a role in photosynthesis or chloroplast function in Chlamydomonas or vascular plants (Fig. 3f). These include three PSII-LHCII subunits (PSBP1, PSBP2, and PSB27) and seven PSII-LHCII biogenesis factors (CGL54, CPLD10, HCF136, LPA1, MBB1, TBC2, and Cre02.g105650), two cytochrome b6f complex subunits (PETC and PETM) and six cytochrome b6f biogenesis factors (CCB2, CCS5, CPLD43, CPLD49, MCD1, and MCG1), five PSI-LHCI subunits (LHCA3, LHCA7, PSAD, PSAE, and PSAL) and nine PSI-LHCI biogenesis factors (CGL71, CPLD46, OPR120, RAA1, RAA2, RAA3, RAT2, Cre01.g045902, and Cre09.g389615), one protein required for ATP synthase function (PHT3), plastocyanin (PCY1) and two plastocyanin biogenesis factors (CTP2 and PCC1), 12 proteins involved in the metabolism of photosynthesis cofactors or signaling molecules (CHLD, CTH1, CYP745A1, DVR1, HMOX1, HPD2, MTF1, PLAP6, UROD3, Cre08.g358538, Cre13.g581850, and Cre16.g659050), three Calvin-Benson-Bassham Cycle enzymes (FBP1, PRK1, and SEBP1), two Rubisco biogenesis factors (MRL1 and RMT2), three proteins involved in the algal carbon concentrating mechanism (CAH3, CAS1, and LCIB), as well as proteins that play a role in photorespiration (GSF1), CO2 regulation of photosynthesis (Cre02.g146851), chloroplast morphogenesis (Cre14.g616600), chloroplast protein import (SDR17), and chloroplast DNA, RNA, and protein metabolism (DEG9, MSH1, MSRA1, TSM2, and Cre01.g010864) (Fig. 3h and Supplementary Table 12). We caution that not all genes previously demonstrated to be required for photosynthetic growth are detectable by this approach, especially the ones with paralogous genes in the genome, such as RBCS1 and RBCS2 that encode the small subunit of Rubisco18. Nonetheless, the large number of known factors recovered in our screen is a testament to the power of this approach.
In addition to recovering these 65 genes with known roles in photosynthesis, our analysis revealed 238 candidate genes with no previously reported role in photosynthesis (Methods). These 238 genes represent a rich set of targets to better understand photosynthesis. Because our screen likely yielded some false positives, we divided all genes into “higher-confidence” (P<0.0011; FDR<0.27) and “lower-confidence” genes based on the number of alleles that supported each gene’s involvement in photosynthesis (Fig. 3d-f; Tables 1 and 2; Methods). The 21 higher-confidence genes with no previously reported role in photosynthesis are enriched in chloroplast localization (9/21, P<0.011; Fig. 3g) and transcriptional upregulation during the dark-to-light transition (5/21, P<0.005), similar to the known photosynthesis genes. Thus, these 21 higher-confidence genes are particularly high-priority targets for the field to pursue.
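For readers unfamiliar with false discovery rate (FDR) thresholds, the sketch below shows how per-gene P values can be converted to FDR-adjusted values with the Benjamini-Hochberg procedure. The text above does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg is an assumption made for illustration, and the P values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order.

    Each adjusted value is the smallest FDR at which that test would
    be called significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented P values for four hypothetical genes.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A gene would then be placed in a confidence tier by comparing its adjusted value with the chosen FDR cutoff.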
Functional annotations for 15 of the 21 higher-confidence genes suggest that these genes could play roles in regulation of photosynthesis, photosynthetic metabolism, and biosynthesis of the photosynthetic machinery. Seven of the genes likely play roles in regulation of photosynthesis: GEF1 encodes a voltage-gated channel, Cre01.g008550 and Cre02.g111550 encode putative protein kinases, CPL3 encodes a predicted protein phosphatase, TRX21 contains a thioredoxin domain, Cre12.g542569 encodes a putative glutamate receptor, and Cre13.g586750 contains a predicted nuclear importin domain. Six of the genes are likely involved in photosynthetic metabolism: the Arabidopsis homolog of Cre10.g448950 modulates sucrose and starch accumulation19, Cre11.g467712 contains a starch-binding domain, Cre02.g073900 encodes a putative carotenoid dioxygenase, VTE5 encodes a putative phosphatidate cytidylyltransferase, Cre10.g429650 encodes a putative alpha/beta hydrolase, and Cre50.g761497 contains a magnesium transporter domain. Finally, two of the genes are likely to play roles in the biogenesis and function of photosynthesis machinery: EIF2 has a translation initiation factor domain, and CDJ2 has a chloroplast DnaJ domain. Future characterization of these genes by the community is likely to yield fundamental insights into our understanding of photosynthesis.
As an illustration of the value of genes identified in this screen, we sought to explore the specific function of one of the novel higher-confidence hits, CPL3 (Conserved in Plant Lineage 3, Cre03.g185200), which encodes a putative protein phosphatase (Fig. 4a and Supplementary Fig. 6e). Many proteins in the photosynthetic apparatus are phosphorylated, but the role and regulation of these phosphorylations are poorly understood20. In our screen, three mutants in CPL3 exhibited a deficiency in photosynthetic growth (Fig. 3c and Supplementary Table 13). We chose to examine one allele (LMJ.RY0402.153647, referred to hereafter as cpl3; Fig. 4a and Supplementary Fig. 6a) for phenotypic confirmation, genetic complementation, and further studies.
Consistent with the pooled growth data, cpl3 showed a severe defect in photosynthetic growth on agar, which was rescued under heterotrophic conditions (Fig. 4b). We confirmed that the CPL3 gene is disrupted in the cpl3 mutant and found that complementation with a wild-type copy of the CPL3 gene rescues the phenotype, demonstrating that the mutation in CPL3 is the cause of the growth defect of the mutant (Supplementary Note and Supplementary Fig. 6a-d).
We then examined the photosynthetic performance, morphology of the chloroplast, and the composition of photosynthetic pigments and proteins in cpl3. Photosynthetic electron transport rate was decreased under all light intensities, suggesting a defect in the photosynthetic machinery (Fig. 4c). The chloroplast morphology of cpl3 appeared similar to the wild type based on chlorophyll fluorescence microscopy (Supplementary Fig. 7a). However, we observed a lower chlorophyll a/b ratio in cpl3 than in the wild type (Supplementary Fig. 7b), which suggests a defect in the accumulation or composition of the protein-pigment complexes involved in the light reactions21. Using whole-cell proteomics, we found that cpl3 was deficient in accumulation of all detectable subunits of the chloroplast ATP synthase (ATPC, ATPD, ATPG, AtpA, AtpB, AtpE, AtpF), some subunits of PSII (D1, D2, CP43, CP47, PsbE, PsbH), and some subunits of PSI (PsaA and PsaB) (FDR<0.31 for each subunit, Fig. 4d, Fig. 4f, and Supplementary Table 14). We confirmed these findings by western blots on CP43, PsaA, and ATPC (Fig. 4e). Our results indicate that CPL3 is required for normal accumulation of thylakoid protein complexes (PSII, PSI, and ATP synthase) involved in the light reactions of photosynthesis.
Our finding that 21/43 of the higher-confidence photosynthesis hit genes were uncharacterized suggests that nearly half of the genes required for photosynthesis remain to be characterized. This finding is remarkable, considering that genetic studies on photosynthesis extend back to the 1950s22. Our validation of CPL3’s role in photosynthesis illustrates the value of the uncharacterized hit genes identified in this study as a rich set of candidates for the community to pursue.
More broadly, it is our hope that the mutant resource presented here will serve as a powerful complement to newly developed gene editing techniques23-28, and that together these tools will help the research community generate fundamental insights in a wide range of fields, from organelle biogenesis and function to organism-environment interactions.
Author contributions
X.L. developed the method for generating barcoded cassettes; R.Y. and S.R.B. optimized the mutant generation protocol; R.Y., N.I., and X.L. generated the library; J.M.R., N.I., A.G., and R.Y. maintained, consolidated, and cryopreserved the library; X.L. developed the barcode sequencing method; N.I., X.L., R.Y., and W.P. performed combinatorial pooling and super-pool barcode sequencing; X.L. performed LEAP-Seq; W.P. developed the mutant mapping data analysis pipeline and performed data analyses of barcode sequencing and LEAP-Seq; W.P. analyzed insertion coverage and hot/cold spots; R.Z. and J.M.R. performed insertion verification PCRs and Southern blots; F.F., R.E.J., and J.V.-B. developed the library screening protocol; F.F., J.V.-B., and X.L. performed the photosynthesis mutant screen and barcode sequencing; R.E.J. and W.P. developed screen data analysis methods and implemented them for the photosynthesis screen; X.L. and T.M.W. annotated the hits from the photosynthesis screen; X.L., J.M.R., and S.R. performed growth analysis, molecular characterizations, and complementation of cpl3; S.S. and T.M.W. performed physiological characterizations of cpl3; M.T.M. and S.S. performed western blots on the photosynthetic protein complexes; M.T.M. performed microscopy on cpl3; X.L., W.P., and T.S. performed proteomic analyses; M.L. and P.A.L. maintained, cryopreserved, and distributed mutants at the Chlamydomonas Resource Center; X.L., W.P., A.R.G., and M.C.J. wrote the manuscript with input from all authors; M.C.J. and A.R.G. conceived and guided the research and obtained funding.
Competing interests
The authors declare no competing interests.
Acknowledgments
We thank Olivier Vallon for helpful discussions; Matthew Cahn and Garret Huntress for developing and improving the CLiP website; Xuhuai Ji at the Stanford Functional Genomics Facility and Ziming Weng at the Stanford Center for Genomics and Personalized Medicine for deep sequencing services; Alan Itakura for help in library pooling; Shriya Ghosh, Kyssia Mendoza, and Matthew LaVoie for technical assistance; Kathryn Barton, Winslow Briggs, and Zhi-Yong Wang for providing lab space; Joseph Ecker, Liz Freeman Rosenzweig, and Moshe Kafri for constructive suggestions on the manuscript; and the Princeton Mass Spectrometry Facility for proteomics services. This project was supported by a grant from the National Science Foundation (MCB-1146621) awarded to M.C.J. and A.R.G., grants from the National Institutes of Health (DP2-GM-119137) and the Simons Foundation and HHMI (55108535) awarded to M.C.J., a German Academic Exchange Service (DAAD) research fellowship to F.F., Simons Foundation fellowships of the Life Sciences Research Foundation to R.E.J. and J.V.-B., EMBO long-term fellowships (ALTF 1450-2014 and ALTF 563-2013) to J.V.-B. and S.R., and a Swiss National Science Foundation Advanced PostDoc Mobility Fellowship (P2GEP3_148531) to S.R.
References
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.