Abstract
Microbial production of fuels and chemicals from lignocellulosic biomass provides promising bio-renewable alternatives to conventional petroleum-based products. However, the heterogeneous sugar composition of lignocellulosic biomass hinders efficient microbial conversion due to carbon catabolite repression. The most abundant sugar monomers in lignocellulosic biomass materials are glucose and xylose. While industrial Escherichia coli strains efficiently utilize glucose, their ability to utilize xylose is often repressed in the presence of glucose. Here we independently evolved three E. coli strains from the same ancestor to achieve high efficiency for xylose fermentation. Each evolved strain has a point mutation in a transcriptional activator for xylose catabolic operons, either CRP or XylR, and these mutations are demonstrated to enhance xylose fermentation by allelic replacements. The identified XylR variants (R121C and P363S) have a higher affinity for their DNA binding sites, leading to xylose catabolic activation independent of catabolite repression control. Upon introducing these amino acid substitutions into the E. coli D-lactate producer TG114, 94 % of a glucose-xylose mixture (50 g L−1 each) was utilized in mineral salt media, leading to a 50 % increase in product titer after 96 h of fermentation. The two amino acid substitutions in XylR enhance xylose utilization and release glucose-induced repression in different E. coli hosts, including wild-type, suggesting their potential for wide application in industrial E. coli biocatalysts.
Introduction
Microbial biocatalysts such as Escherichia coli and Saccharomyces cerevisiae have been developed to convert sugars to an array of value-added chemicals, ranging from simple fermentation products to complex terpenoids like artemisinic acid [1]. Lignocellulosic biomass represents a promising renewable feedstock that can support large-scale microbial production processes for fuels and specialty chemicals without interfering with human food supply [2, 3]. Lignocellulose is a complex matrix present in plant cell walls and is mainly composed of polysaccharides and phenolic polymers [2]. D-glucose (the sole monomer of cellulose) and D-xylose (the predominant sugar in hemicellulose) are major sugars found in typical lignocellulosic materials [2, 3]. Although the sugar content in lignocellulosic materials (e.g. agricultural wastes such as corn stover) is higher than 50 % of their dry weight, the heterogeneous nature of lignocellulosic sugars inhibits efficient microbial catabolism and thus decreases production [2, 3]. Industrial microbes such as S. cerevisiae and Zymomonas mobilis do not natively metabolize xylose and a foreign xylose catabolic pathway must be integrated into these hosts for xylose utilization [4, 5]. Even for bacteria like E. coli that natively contain the xylose catabolic pathway, xylose utilization rates and growth rates on xylose are low [6]. More importantly, utilization of xylose is repressed in the presence of glucose due to a global regulatory mechanism called carbon catabolite repression (CCR), a common phenomenon observed in bacteria and fungi, which results in abundant amounts of xylose unused when cells ferment a glucose-xylose mixture [7, 8].
As one of the classic global regulatory mechanisms, CCR is well characterized in E. coli [7, 8]. The global transcriptional regulator CRP (cAMP receptor protein) plays a central role in modulating transcriptional activation of catabolic operons for secondary sugars such as xylose, arabinose and galactose [7, 8]. The phosphoenolpyruvate:sugar phosphotransferase system (PTS) and the membrane-bound enzyme adenylate cyclase (AC; catalyzing the conversion of ATP to cAMP) are also involved in glucose-induced repression of xylose utilization in E. coli [7, 8]. The phosphorylation state of EIIAGlc, a PTS component encoded by crr in E. coli, plays a pivotal role in regulating AC activity and cAMP levels according to glucose concentrations [9]. When glucose concentrations are high, the phosphate from EIIAGlc is drained towards the sugars and the dephosphorylated EIIAGlc is not able to activate AC, which results in low levels of cAMP [10]. Without cAMP, CRP cannot activate the transcription of xylose catabolic operons. In contrast, at low glucose concentrations copious amounts of phosphorylated EIIAGlc exist and are able to activate AC and promote cAMP synthesis. If xylose is present, CRP activated by cAMP and the xylose-specific activator XylR (activated when bound by xylose) together co-activate the xylose catabolic operons, xylAB and xylFGH (Fig. 1A) [11]. After xylose is imported by XylFGH, a xylose-specific ATP-binding cassette transporter, it is converted to xylulose through a reversible one-step reaction catalyzed by xylose isomerase, XylA. Xylulose is then converted to xylulose-5-phosphate by the xylulokinase, XylB, for further degradation via the pentose phosphate pathway and glycolysis (Fig. 1A) [12].
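The regulatory logic described above can be summarized as a toy Boolean model. The sketch below is only an illustration of the wild-type circuit as stated in the text; the function and variable names are hypothetical, and real regulation is graded rather than on/off.

```python
def xyl_operons_active(glucose_high: bool, xylose_present: bool) -> bool:
    """Toy Boolean model of wild-type xylAB / xylFGH co-activation.

    Simplifying assumption: every step is treated as all-or-nothing.
    """
    # High glucose drains phosphate from EIIAGlc towards the sugar.
    eiia_glc_phosphorylated = not glucose_high
    # Only phosphorylated EIIAGlc activates adenylate cyclase (AC),
    # so cAMP is made only when glucose is low.
    camp_available = eiia_glc_phosphorylated
    # CRP needs cAMP; XylR needs bound xylose.
    crp_active = camp_available
    xylr_active = xylose_present
    # Both activators are required for co-activation.
    return crp_active and xylr_active


# Enumerate the four sugar conditions.
for glucose_high in (False, True):
    for xylose_present in (False, True):
        state = xyl_operons_active(glucose_high, xylose_present)
        print(f"glucose_high={glucose_high}, xylose={xylose_present} -> active={state}")
```

In this simplified picture the operons are on only when glucose is low and xylose is present, which is the glucose-induced repression that the remainder of the paper aims to relieve.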
Releasing CCR by genetic engineering allows microbial biocatalysts to simultaneously utilize glucose and other secondary sugars derived from lignocellulosic biomass and leads to a more efficient fermentation process [7]. Common strategies to engineer sugar co-utilization in E. coli include the inactivation of PTS components, such as PtsG and PtsI [13, 14, 15], and mutagenesis of CRP [16, 17]. However, both approaches have caveats and only limited success has been achieved, especially for conditions more relevant to industrial practice such as high sugar concentrations and low-cost media. Inactivation of PTS components also impairs glucose uptake and thus extra efforts are needed to compensate for this defect [13, 18]. Theoretically, a cAMP-independent CRP variant should be able to activate the catabolic operons of secondary sugars even in the presence of glucose. However, these cAMP-independent CRP mutants often cannot activate the target operons at the same efficiency as wild-type CRP bound with cAMP [16]. In addition, as an important global regulator, CRP directly regulates the transcription of more than 400 genes, and CRP mutants often have slow growth phenotypes, potentially due to unpredictable changes in the expression of other important genes [19]. Here, we evolved E. coli for enhanced xylose fermentation and identified the convergent genetic basis that increases xylose utilization. By characterizing the transcriptional activation mechanism of xylose catabolic genes, we discovered a simple and effective genetic approach to release CCR in E. coli. We identified single nucleotide mutations in xylR that increase xylose utilization up to 4-fold in different E. coli strains when fermented in a glucose-xylose mixture (50 g L−1 of each sugar). This discovery has the potential to enable different E. coli biocatalysts to simultaneously convert major sugars from lignocellulosic biomass into value-added chemicals.
Results
Identification of primary genetic changes of E. coli adaptation for improved xylose fermentation to D-lactate
We hypothesized that characterization of repeated evolutionary trajectories would reveal the convergent causative mutations that improve xylose fermentation. A previously engineered D-lactate producing strain XW043 [20] was independently evolved three times in mineral salt media containing 100 g L−1 xylose (Fig. 1B). The bacterial population was maintained at the exponential or early stationary phase in a fermentation vessel by serially transferring cultures into new media as previously described [20, 21]. In all three evolutionary trajectories, a rapid adaptation occurred that simultaneously increased xylose catabolism, lactate titer and cell growth (Fig. 1B, S1). Since lactate production is the only fermentation pathway supporting cell growth in XW043 under oxygen-limiting experimental conditions [20], increased xylose catabolism would lead to higher cell growth. At the end of three evolution experiments, there was an approximately 5-fold increase of lactate titers at 48 h compared to the ancestor strain XW043, accompanied with an increase in yield from 0.6 to 0.8–0.9 g lactate per gram xylose (Fig. 1B). To understand the genetic changes responsible for the increased xylose catabolism, we sequenced the genomes of the ancestor XW043 and three representative evolved clones (one from each evolved population), designated as strains TL1, CM2, and LP2 using Illumina paired-end sequencing. Each clone was sequenced twice with an average 24-fold coverage per library. By applying a comprehensive analysis pipeline (Details in SI Methods), overall 5 point mutations, a 1-bp deletion, 15 duplications and 7 deletions in mostly uncharacterized proteins were found in three evolved isolates compared to XW043 (Table S2). Additionally, we detected 60 breakpoints as an indicator for structural rearrangements occurring in all three evolved clones (Table S2). 
Of all the mutations detected, three point mutations, one per clone, occurred at the transcriptional co-activators CRP and XylR that are critical for xylose catabolism (Fig. 1A), suggesting a potential result of convergent evolution to relieve the predominant metabolic constraints in the ancestor strain. The three independently isolated point mutations result in amino acid changes in CRP (G141D in LP2) and XylR (R121C in CM2 and P363S in TL1). In addition, an IS10 insertion was identified in focA reading frame for all three evolved clones at different positions (Fig. 1C, S4C), thereby suggesting an independent origin for each insertion and serving as a strong indicator of the potential benefit of focA inactivation for xylose to lactate fermentative production.
Characterization of physiological effects of convergent mutations
We hypothesized that the identified convergent mutations (focA inactivation and missense mutations in crp and xylR) are primary genetic changes that improved xylose catabolism and lactate fermentation. Genetic mechanisms of evolved phenotypes can be potentially explained by allelic replacement, in which the wild-type copy of the ancestor is replaced by a mutant copy in the descendant, or vice versa [22, 23]. Introduction of a single point mutation crp (G141D) in XW043 background doubled both xylose utilization and D-lactate titer after 96 h fermentation, and growth rate and final biomass were also increased (Fig. 2A, S2). A 32 kb chromosomal region containing xylR and other xylose catabolic genes is duplicated in the ancestor XW043 and all evolved strains as indicated by genome sequencing (Fig. S3) probably because the precursor strain of XW043 was adapted for using hemicellulose hydrolysates as the carbon source [20]. Only one of the two xylR copies was mutated (R121C in CM2 and P363S in TL1), thereby conferring a quasi-heterozygous genotype (Fig. S3). We were not able to replace xylR with its mutant alleles in XW043 using λ-red recombinase-mediated homologous recombination presumably because the second copy of wild-type xylR tended to replace the selection marker due to a favored intrachromosomal recombination. Instead, using λ-red recombinase-mediated intrachromosomal recombination approach (Details in SI Methods), we replaced the xylR mutations with a wild-type copy in both CM2 and TL1 strains. This chromosomal modification to restore the wild-type allele decreased xylose utilization rates of the initial 24 h by 40 % and 65 % compared to CM2 and TL1, respectively (Fig. 2B, C). Accordingly, the modified strains had decreased lactate production and cell growth in both CM2 and TL1 backgrounds (Fig. 2B, C, S2). 
However, restoring the xylR mutations in CM2 or TL1 back to wild-type did not produce a complete ancestral phenotype in terms of fermentative growth and xylose utilization (Fig. 2B, C), suggesting the presence of other beneficial mutations important for enhanced xylose utilization.
Insertion of IS10 into focA is another convergent event for all three evolved isolates, suggesting a potential benefit for xylose to lactate conversion (Fig. 1). FocA is a bidirectional formate transporter that regulates intracellular formate levels [24]. Since IS10 elements are repetitive in the genome, we were not able to reconstruct the identical IS10 insertion in the ancestor background. However, an in-frame deletion of focA in the ancestor only showed a marginal positive effect on xylose fermentation (Fig. S4A), thereby indicating that a more complex mechanism than focA inactivation alone causes the enhanced xylose utilization. The IS10 insertions in focA had negative polar effects on transcription of the downstream gene pflB as shown by quantitative reverse-transcription PCR (qRT-PCR) in all evolved clones (Fig. S4B). The pflB gene encodes pyruvate formate-lyase, which competes against the lactate production pathway for the common substrate pyruvate with a lower Km value than lactate dehydrogenase (Fig. 1A) [24, 25]. Decreased expression of pflB may help redirect carbon flow to lactate dehydrogenase. To test the epistatic interaction between the identified convergent mutations, we restored wild-type focA and xylR in the evolved background TL1 to reconstruct an ancestor phenotype (Fig. 2C). Interestingly, this strain (CS4: TL1 xylR wt/wt, focA wt) performed even worse than XW043, indicating that both the xylR mutation and the focA IS10 insertion are primary causative mutations with a potential synergistic effect and that there are possibly one or more further mutations which epistatically interact with the described mutations.
To test if the identified xylR mutations have a universal effect and if the heterozygous xylR copies are required to improve xylose utilization, we substituted wild-type xylR with its mutant version (R121C, P363S, or both) in wild-type E. coli W (ATCC9637) (Fig. 2D). All mutants increased xylose utilization and cellular fermentative growth to different degrees (Fig. 2D). Xylose utilization rates were increased 2.4 and 4.3-fold compared to wild-type strain within the initial 24 h for xylR R121C and P363S substitutions, respectively (Fig. 2D). We next combined both xylR mutations in a wild-type strain and observed a slightly additive effect (Fig. 2D). At the end of 96 h fermentation, 40 g L−1 xylose remained unused in the broth for wild-type strain while only 10 g L−1 for xylR P363S substitution or xylR R121C and P363S substitution (Fig. 2D).
XylR variants show enhanced activity in vitro
CRP G141D was previously identified as a mutant with an altered allosteric mechanism and position 141 is involved in the interfacial interactions between subunits and domains [16, 26]. Potentially due to the CRP G141D mutation in the evolved strain LP2, the transcription of xylose catabolic genes was upregulated by more than 10-fold as measured by qRT-PCR (Fig. S5). Unlike CRP and its mutant variants, which are well characterized, much less is known about XylR. We examined the effects that xylR mutations potentially had on the transcriptional levels for its responsive genes which are organized as two operons xylAB and xylFGHR (Fig. 3A). The xylAB and xylFGH transcripts were increased at least 20-fold in CM2 with xylR (wt/R121C) and 10-fold in TL1 with xylR (wt/P363S) compared to the ancestor strain XW043 (Fig. 3B). The xylR transcript alone shows only between 3.1- and 3.6-fold upregulation suggesting that the xylFGHR transcript is partially degraded at the 3’ end or read-through of the RNA polymerase is prevented by clashing with proteins bound at the xylR specific promoter. Since the crp transcript level is not significantly changed (Fig. 3B), we concluded that the XylR mutations are the primary reason for upregulation of the xyl operons.
To further study the molecular mechanism causing enhanced transcriptional activation by XylR variants, we conducted electrophoretic mobility shift assays using three purified N-terminal His-tagged XylR variants (wild-type, R121C and P363S) and DNA fragments containing the binding sites IA and IF located in the intergenic region between xylAB and xylFGH (Fig. 3A) [27]. The dissociation constant KD for each site was determined to measure their binding affinity (Fig. 3C, D, S6A, B). Although the gel based assay allows for only a rough approximation, our data for the wild-type XylR is consistent with a previous report showing a KD of 33±0.8 nM for IA and 25±0.6 nM for IF, respectively (Fig. 3D) [27]. The evolved XylRs have a significantly lower KD ranging from 3- to 14-fold depending on which binding site and variant (Fig. 3D, E), suggesting that a higher binding affinity leads to a more stable transcription initiation complex and a subsequently higher transcriptional rate. Moreover, wild-type XylR could not bind operator sequences in the absence of xylose since the majority of DNA fragments remained unbound in the gel (Fig. S6C). In contrast, both XylR variants bound DNA even without xylose (Fig. S6C), suggesting that these XylR variants have xylose-independent activities. A two-fold increase in binding of wild-type XylR to its responsive site was observed when 100 μM xylose was added, but similar enhancement was much less significant for both XylR variants (Fig. S6C).
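As an illustration of how a dissociation constant can be estimated from gel-shift titration data, the sketch below fits a single-site binding isotherm, fraction bound = [P] / (KD + [P]), by a least-squares grid search. This is a minimal sketch under stated assumptions and not necessarily the procedure used here (see SI Methods): the single-site model, the function names and the titration points are all hypothetical, and the data are synthetic values generated from a known KD of 30 nM.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Single-site binding isotherm (assumes DNA is far below KD)."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, bound, lo=0.1, hi=1000.0, steps=4000):
    """Return the KD (nM) on a log-spaced grid that minimizes squared error."""
    best_kd, best_sse = lo, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)  # log-spaced candidate KD
        sse = sum(
            (fraction_bound(p, kd) - f) ** 2 for p, f in zip(protein_nM, bound)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic titration (hypothetical concentrations), generated with KD = 30 nM.
protein = [5, 10, 20, 40, 80, 160, 320]
synthetic = [fraction_bound(p, 30.0) for p in protein]

kd_est = fit_kd(protein, synthetic)
print(f"estimated KD = {kd_est:.1f} nM")
```

With real band intensities the fit would carry noise, which is one reason a gel-based estimate is only a rough approximation.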
Amino acid substitutions in XylR release CCR
The higher binding affinity of the evolved XylR and the concomitant xylose independence led to the hypothesis that these amino acid substitutions in XylR are able to release CCR. Wild-type E. coli W was used as the test strain for different genetic modifications to release CCR. Batch fermentations were conducted using a mineral salt medium AM1 containing 100 g L−1 of a glucose-xylose mixture (50 g L−1 each). The wild-type strain consumed all glucose but only 16 % of the xylose after 96 h of fermentation (Fig. 4A, S7). In comparison, the XylR variants R121C or P363S released CCR, and glucose and xylose were consumed simultaneously, eventually leading to 61–69 % of the xylose utilized (Fig. 4A, S7). LN6, the strain with both R121C and P363S substitutions, utilized 87 g L−1 of the total sugars compared to 57 g L−1 for wild-type (Fig. 4A). The xylose utilization rate was increased 4-fold compared to wild-type (81 % of the xylose used after 96 h fermentation) and only 3 g L−1 glucose remained unused in the fermentation broth (Fig. 4A, S7). CRP G141D alone or together with xylR mutations did not enhance sugar co-utilization (Fig. 4A). Known mutations that relieve CCR, such as ΔmgsA [28] and crp* (I112L, T127I and A144T) [29, 30], were also investigated in the same wild-type background. CRP* only increased xylose utilization to 37 % with impaired cell growth, while the mgsA deletion had no benefit in the wild-type background (Fig. 4A, S7).
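The total sugar consumption figures above follow from the residual glucose and the fraction of xylose used. The short calculation below is a sketch of that arithmetic using only values quoted in the text; the function name is hypothetical, and the results agree with the reported totals to within about 1 g L−1, a gap that presumably reflects rounding of the reported percentages.

```python
GLUCOSE_START = 50.0  # g/L, initial glucose (from the text)
XYLOSE_START = 50.0   # g/L, initial xylose (from the text)


def total_sugar_used(glucose_left: float, xylose_fraction_used: float) -> float:
    """Total sugar consumed (g/L) from residual glucose and xylose fraction used."""
    glucose_used = GLUCOSE_START - glucose_left
    xylose_used = XYLOSE_START * xylose_fraction_used
    return glucose_used + xylose_used


# Wild-type E. coli W: all glucose consumed, 16 % of the xylose used.
wild_type = total_sugar_used(glucose_left=0.0, xylose_fraction_used=0.16)
# LN6 (R121C + P363S): 3 g/L glucose left, 81 % of the xylose used.
ln6 = total_sugar_used(glucose_left=3.0, xylose_fraction_used=0.81)

print(f"wild-type: {wild_type:.1f} g/L, LN6: {ln6:.1f} g/L")
```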
To further demonstrate the application of our discovery in converting sugar mixtures into renewable chemicals, we substituted wild-type xylR with xylR (R121C and P363S) in a previously engineered E. coli D-lactate producer, TG114, which efficiently converts glucose into D-lactate [21]. Fermenting TG114 in a glucose-xylose mixture, 50 g L−1 of each sugar, showed that only 34 % of the xylose was used due to CCR (Fig. 4B). Introducing the R121C and P363S substitutions in xylR enabled the modified strain to use 88 % of the xylose while all glucose was consumed within 96 h. Consequently, D-lactate production increased 1.5-fold compared to TG114, with a final titer of 86 g L−1 and a yield of 0.91 g g−1 sugars.
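The reported yield can be checked against the sugar consumption and titer given above. The following sketch uses only numbers from the text; the variable names are hypothetical, and the parent-strain titer is merely the value implied by the 1.5-fold increase, not a separately reported measurement.

```python
GLUCOSE_USED = 50.0           # g/L, all glucose consumed (from the text)
XYLOSE_START = 50.0           # g/L, initial xylose (from the text)
XYLOSE_FRACTION_USED = 0.88   # 88 % of the xylose used
LACTATE_TITER = 86.0          # g/L, final D-lactate titer
FOLD_INCREASE = 1.5           # relative to the parent strain TG114

# Total sugar consumed by the modified strain.
sugar_used = GLUCOSE_USED + XYLOSE_START * XYLOSE_FRACTION_USED

# Mass yield in g lactate per g sugar consumed.
lactate_yield = LACTATE_TITER / sugar_used

# Parent-strain titer implied by the fold change (an inference, not a reported value).
implied_parent_titer = LACTATE_TITER / FOLD_INCREASE

print(f"sugar used: {sugar_used:.0f} g/L")
print(f"yield: {lactate_yield:.2f} g/g")
print(f"implied TG114 titer: {implied_parent_titer:.0f} g/L")
```

The sugar total corresponds to the 94 % utilization of the 100 g L−1 mixture quoted in the Abstract.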
Discussion
We characterized the primary genetic changes of adaptation for xylose utilization in an E. coli D-lactate producer and found a novel way to effectively release CCR in E. coli. By deleting competing fermentation pathways in the ancestor strain, the possible evolutionary trajectories were strictly constrained to xylose-lactate conversion and convergent mutation patterns were observed in three parallel evolved populations. In contrast to the convergent mutations in crp and xylR identified in this work, previous research using similar experimental evolutionary approaches showed that beneficial mutations for enhanced xylose fermentations in E. coli biocatalysts occurred at sugar transporter genes, such as galP and gatC [31, 32]. This suggests that there are potentially multiple evolutionary solutions to the same problem and bacterial genotypic backgrounds may predetermine evolutionary trajectories. Current high-throughput next-generation sequencing capacity has significantly outpaced reverse engineering processes to characterize the causative mutations. One time-consuming bottleneck is to distinguish the causative mutations from mutational noise and sequencing errors [22]. Here, two strategies were employed to effectively identify most critical mutations for the adapted phenotype. First, characterization of multiple independent evolutionary trajectories from the same ancestor revealed the convergent mutations important for adaptation to experimental conditions as demonstrated in this work and other laboratory evolution research [33]. Second, as sequencing technical errors were excluded by sequencing each genome twice with high coverage, linkage disequilibrium is probably the only source of noise which we further reduced by focusing on only an individual evolved clone instead of the whole bacterial population.
We identified a G141D substitution in CRP that enhances transcription of xyl operons in E. coli. This mutation was previously reported to alter allosteric regulation [16]. Position 141 is part of a hinge important for the intramolecular transduction of the activation signal [34]. CRP G141D did not release CCR (Fig. 4A), suggesting that this variant remains responsive to cAMP, which is consistent with other previously reported mutations at this position [26]. We hypothesize that a polar Asp residue at position 141 orients the DNA binding domain for a better interaction with operator sequences, resulting in increased expression compared to wild-type. Interestingly, an intrinsically active CRP homolog of the plant pathogen Xanthomonas campestris has an Asp residue at the position corresponding to G141 of E. coli CRP. A substitution of this Asp residue with the nonpolar residue Ala reduces the binding affinity to its promoter by approximately 12-fold [35]. In contrast to CRP G141D, the identified XylR variants not only enhance xylose utilization but also release CCR, leading to an efficient co-utilization of glucose-xylose mixtures (Fig. 4). The AraC-type transcription factor XylR has two helix-turn-helix (HTH) motifs and its N-terminal ligand binding domain contains a unique periplasmic-binding protein fold that is structurally related to LacI/GalR transcription factors [27]. Upon xylose binding, XylR undergoes a conformational change orienting the two HTH motifs of the DNA binding domain in an active conformation [27]. In one instance, a Pro residue of XylR was substituted with Ser (P363S) at the site that connects the two HTH motifs, likely causing a significant reorientation and tighter binding to the DNA (Fig. 3E). The second variant is an R121C substitution which is in close proximity neither to the DNA nor to the ligand binding domain (Fig. 3E).
The region is also not known to participate in signal communication from the N-terminal ligand binding to the C-terminal DNA binding domain [27]. However, the effect on the binding affinity is similarly increased in both variants. A possible mode of action might be an interaction of residue 121 with Thr185 which is directly connected to a helix motif that promotes dimerization of XylR. Cys is less bulky than Arg which might stabilize dimerization due to a reduced steric hindrance.
CCR in E. coli can be dramatically released simply by amino acid substitution in XylR as demonstrated here (Fig. 4). The likely mechanism is that the enhanced binding of XylR variants to the operator sequences between xylAB and xylFGH operons enables independence from both CRP and xylose, thereby resulting in the observed sugar co-utilization. Multiple genetic engineering approaches have previously been developed to release CCR in E. coli. Inactivation of mgsA significantly enhanced sugar co-utilization in an ethanologenic E. coli [28]. However, glucose-induced repression is still severe in two tested strains with inactivated mgsA, LN24B (E. coli W ΔmgsA) and TG114 (mgsA is deleted to prevent product impurity) (Fig. 4), suggesting its limited application in a broad range of E. coli catalysts. Several cAMP-independent CRP mutants were discovered to release CCR to some extent [16, 36]. However, cell growth is often stunted by crp mutations and sugar co-utilization is not efficient (Fig. 4A, S7). Disruption of the PTS system (deletion of ptsG or ptsI) is an effective approach to release CCR, but glucose uptake is impaired and needs to be re-engineered for glucose utilization, which involves extra genetic modifications or adaptation [7, 13, 18]. In this study, by simply substituting two residues in XylR of an established D-lactate producer, the modified strain is able to simultaneously ferment 50 g L−1 glucose and 43 g L−1 xylose and produce 86 g L−1 lactate within 96 h in a mineral salt medium (Fig. 4B), which shows advantages compared to previously engineered strains for lactate fermentative production using sugar mixtures in terms of the xylose utilization rate, lactate yield and titer under similar batch fermentation conditions [14, 37]. This genetic approach may be effective for other E. coli biocatalysts since these XylR variants have a similar positive effect in a wild-type strain (Fig. 4A).
The ratio of glucose to xylose in many lignocellulosic hydrolysates is usually higher than that used in this study (1:1 by weight) [2], so the positive effect of XylR substitution on co-utilization of sugars derived from lignocellulosic hydrolysates is expected to be even greater due to the relatively lower proportion of xylose. Moreover, these XylR variants may also reduce arabinose-induced repression, which is caused by competitive binding of AraC (activated by L-arabinose) to the regulatory regions of xyl operons [38]. Enhanced binding of these XylR variants to the operator sequences will potentially compete against AraC and achieve co-utilization of arabinose and xylose.
Materials and Methods
A detailed description of the materials and methods can be found in SI Methods. The strains, plasmids, and primers used are summarized in Table S1. Chromosomal modification was conducted using two-step λ-red recombinase-mediated homologous recombination as previously described [39]. For batch fermentations and adaptive evolution experiments, E. coli was grown at 37 °C in a pH-controlled fermentation vessel using AM1 mineral salt media containing a defined carbon source [20, 21]. Genomes were sequenced using a MiSeq Illumina sequencer generating 600 nt paired-end reads and all sequencing reads were deposited in NCBI (SRA accession: SRP083931).
Acknowledgments
We thank Scott Bingham, David Winter, and Kael Dai for technical assistance. This work was supported by Arizona State University and NIH grant R01-HG007178.