Abstract
Protein folding abnormalities are associated with the pathology of many diseases. This is surprising given the plethora of cellular machinery dedicated to aiding protein folding. It is thought that the cellular response to proteotoxicity is generally sufficient but may be compromised during pathological conditions. We asked whether, under physiological conditions, cells have the ability to reprogram transcriptional outputs in accordance with proteostasis demands. We used S. cerevisiae to understand the response of cells when challenged with different proteostasis impairments, by removing one protein quality control (PQC) gene from the system at a time. Using 14 PQC deletions, we investigated the transcriptional response and found that the mutants were unable to upregulate pathways that could complement the function of the missing PQC gene. To our surprise, cells inherently have a limited scope of response that is not optimally tuned, with transcriptomic responses being decorrelated with respect to the sign of their epistasis. We conclude that this non-optimality in the proteotoxic response may limit the cellular ability to reroute proteins through alternate and productive machineries, resulting in pathological states. We posit that epistasis-guided synthetic biology approaches may be helpful in realizing the true potential of the cellular chaperone machinery.
Introduction
The nascent polypeptide chain in the cell is attended to by a complex network of proteins which assist with co-translational and post-translational folding. Further down the assembly line, other proteins assist in translocation, cellular localization and eventually degradation (1). The collective group of chaperones, co-chaperones and degradation proteins that maintain these processes will henceforth be referred to as Protein Quality Control (PQC) genes; they are integral nodes in the Proteostasis Network (PN) (2-5). PQC genes are known to form large interconnected genetic and physical interaction networks (6-8). They participate in multiple protein complexes and are backed up by redundant machinery to ensure stable proteostasis during conditions of increased protein folding load in cells (9-11). The involvement of multiple chaperones in similar processes naturally leads to the question of whether the saturation or depletion of the substrate of a chaperone leads to rerouting of fluxes towards maintaining homeostasis. However, it is unknown whether the deletion of a single chaperone leads to an efficient redistribution of substrates through alternate routes, and whether there exists an adaptation response in the cell that alters the capacity of other available folding channels.
Given the redundancies present in the network of PQC genes and the large number of connected components, it is expected that the network easily adapts to altered demands. This is counter-intuitive when one considers the preponderance of protein-aggregation associated phenotypes and diseases. It is already known that misfolded proteins cause toxicity in multiple model organisms including yeast and that the collapse of proteostasis is associated strongly with progressive aging (12-14). We are therefore forced to question our original assumption of a highly connected, robust network being capable of adapting efficiently to diverse misfolding stresses. Is the failure of proteostasis a consequence of limitations on re-routing that prevent an efficient response to alterations?
Physiological processes like aging are known to decrease the ability of organisms to mount an efficient stress response (15-17). The PQC genes are controlled by stress response pathways, so any decrease in the response pathway would lead to a proteostasis collapse. This has been shown elegantly in different model systems (15, 18, 19). Altering the stress response amplitude does have an effect on the folding capacity of a cell (20). However, it is not known if the adaptability of the network is sufficiently robust in a physiological setting featuring an optimal stress response.
S. cerevisiae is an ideal system for addressing these questions for quite a few reasons. First, a comprehensive genome-wide genetic interaction map is available which elucidates the connections of the genetically interacting modules between the chaperones. Second, the physiological growth in exponential phase is not reported to be complicated by deregulation of stress response pathways, allowing the investigation of the effects of proteostatic disabilities in conditions with a fully responsive PQC network. Finally, the physical interaction data of yeast chaperones is available as part of protein complex data as well as the physical interaction database. Using this model, we investigated how S. cerevisiae cells re-adapt to deletion (or depletion) of single PQC genes. Using a small library of 14 PQC gene deletions we obtained the transcriptomic adaptation of the cells to these deletions, when different regions of the PN are perturbed.
We obtain a new network of the chaperones based on cellular sensory mechanisms, with insignificant overlap with the genetic interaction map. Remarkably, it appears that even a healthy, exponentially growing cell has limited ability to respond to specific proteostasis disabilities in an optimal manner; the redundant machineries are not specifically upregulated. The cellular response, on the other hand, tackles the disability primarily by overexpressing the physically interacting partners but not the PN complexes. This suggests that the cellular response is tuned to handle more of the substrate proteins to maintain cellular activity by increasing the degradation capacity in order to limit toxicity. Thus, it appears that the chaperone network may have limited capacity to adapt to changing requirements of proteostasis, thereby acting as an Achilles’ heel in many pathological conditions.
Results
PN perturbations do not lead to canonical heat shock response
Chaperones have been shown to be regulated by HSF-1 (21), the primary transcription factor for cytosolic proteotoxic stress; this would imply that protein folding stress is sensed in a universal manner irrespective of the type of perturbation in proteostasis. Molecules to be deleted were chosen to represent the diverse hubs that govern proteostasis in S. cerevisiae (Figure 1A, left panel). The chaperone Ssa1 was chosen as the most canonical protein folding chaperone of the Hsp70 family in the cytosol. Ssa3 was chosen in addition to Ssa1 to represent the inducible Hsp70 system in the cytosol. Hsp82 and its co-chaperone Sti1 were chosen to represent the cytosolic Hsp90 system. Hsp104 was picked as a disaggregase and San1 as part of protein degradation. The other classes of chaperones were primarily involved in ribosome-associated quality control (Rqc2, Rkr1), ribosome-associated protein folding of nascent chains (Ssb1, Egd1) and ribosome assembly (Sse1, Ssb1, Jjj1) (22). Deletion of Mcx1 and a hypomorphic allele of Ssc1 (DAMP allele) were used to obtain information pertaining to the cellular response to the loss of mitochondrial chaperones. The abundance of these molecules varies (Figure 1A, right panel) under basal conditions in a wildtype strain, providing an insight into how much the cell depends upon the particular molecule in question. Transcriptome profiling was carried out for the 14 PQC deletions using ultra-deep sequencing in replicates. Growth conditions were kept standard to mimic growth in rich conditions with minimal load on protein folding due to metabolic requirements. The Heat Shock Response (HSR), as measured by the amount of synthetic HSE-induced GFP (23) in these chaperone deletions, shows that under basal conditions only the hsp104, sti1 and sse1 deletions cause a significant HSR (Figure S1A), although upon heat shock at 37°C all the selected PQC deletions (and depletion) show a significant increase in HSE-induced GFP.
We did not find a transcriptomic enrichment of the canonical HSR genes in our transcriptome analysis (Figure S1B); the different deletion strains indicated that chaperone deletions in general do not induce a canonical response in S. cerevisiae (Table S1, S2, S3). While this is understandable for the inducible chaperone systems, it is interesting to note that deletion of the most abundant chaperone system, Ssa1, did not induce the expression of a canonical heat shock responsive protein, Ssa4. However, sse1Δ did show upregulation of three of the chaperones (Hsp104, Hsp78 and Hsp42) that increase during heat shock. The degree of response may be linked to the severity of the defects that arise due to each of these deletions. Indeed, sse1Δ and jjj1Δ exhibit a drastic growth defect at 30°C, although other deletions did not show any significant difference from WT (Figure 1B). However, jjj1Δ did not exhibit a strong transcriptional upregulation of canonical heat shock response genes (Figure S1B, Table S3). Interestingly, none of the strains show a significant increase in overall chaperone levels. Ssa4, a chaperone which is known to be induced upon cytosolic stress, is not upregulated in any of the strains except sse1Δ (Figure 1C). This is in line with the fact that HSR genes are not upregulated in these PQC deletions. However, we do see the Msn2/4 target Hsp12 significantly upregulated in many of the deletions, suggesting that activation of either the Msn2/4 pathway or HSF-1-induced genes occurs to restore homeostasis (Figure 1C). This indicates that the severity of the defect is not the primary determinant of the amplitude or nature of the response to proteostasis perturbations. The cellular response to a PQC deletion depends on the type of functional node perturbed, showing that there is branch specificity in the response depending on which PQC gene is absent.
Perturbation of proteostasis primarily reroutes cellular response through protein degradation and altered metabolism
Using a standard GO-based approach, we attempted to obtain the primary routes of reprogramming that ensure cellular homeostasis in the different PQC deletion strains (Figure 2A). We observe an upregulation of amino acid metabolism pathways in the egd1Δ, jjj1Δ and ssc1 DAMP strains, indicating that the response to the loss of the chaperones is primarily mediated by metabolic changes that may be trying to stabilize proteostasis. Since transcriptome data analysis relies more on reducing type-I errors, the analysis typically rejects a large number of true positive alterations. This is inherent to systems with large noise, which is typical of transcriptome data at low expression levels. To overcome this problem and look at alterations in a pathway-specific manner, we chose multiple GO classes that may have potential links with chaperone perturbations. We obtained the fold-change values of these sets in the different strains with respect to WT cells and then queried whether the distribution of the fold-changes of these subsets was significantly different from that of the whole transcriptome (Mann-Whitney test with correction for FDR) (Figure 2B, Table S4). This analysis reveals that rqc2Δ and sse1Δ are the most perturbative deletions in all categories queried among all the deletion strains. In most of the cases, proteasome and ubiquitin upregulation is the primary response to PQC deletions; although expected, this was non-obvious from the conventional GO analysis. Protein degradation seems to be one of the primary routes to retain homeostasis in the face of different types of proteostasis-network perturbations. Of note, many of these cytosolic PQC deletions show an upregulation of mitochondrial and ER chaperones. Ribosome-related and translation-associated categories show significant alteration in rqc2Δ, and this could point towards the importance of Rqc2 in general translation and ribosome biogenesis.
Although a recent report (23) has shown that Rqc2 is essential in mounting a functional HSR using an HSE-GFP based reporter system, we find that HSR is among the most upregulated categories in rqc2Δ. This could be due to the fact that we have taken the top 40 genes upregulated upon 37°C heat stress as obtained from a previous microarray dataset (24) (Figure S2). Two of the strains (hsp104Δ and rqc2Δ), however, show an increase in expression of genes in the polyphosphate category, indicating the complementarity of this pathway in assisting intracellular protein folding. Of interest, sterols are implicated in aiding thermotolerance (25), and we find that biosynthesis of sterols is upregulated in rqc2Δ, ssa1Δ and egd1Δ. It is striking that metabolic genes were strongly altered in expression in many of the PQC deletion strains and that this was the pathway that differed significantly in most of the strains. This was surprising, and it hints that one of the outcomes or responses of PQC perturbation is routed through metabolic rewiring. Overall, each of the PQC perturbations leads to specific transcriptional outcomes, with different modules being altered to address protein folding anomalies that may arise in these strains.
Cellular response to PQC deletions defines similarities in functionality of the genes
While cells respond to each of the chaperone deletions in a specific manner, there are commonalities in the response. The similarity in the response is obvious once the correlations between the transcriptome profiles (fold-change values of a chaperone deletion with respect to WT) are plotted. To quantify the similarities between the deletions, we obtained the pair-wise correlation coefficients between the transcriptome profiles of different chaperone deletion strains (Figure 3A). The strength of the correlations indicates the similarity between the transcriptomic responses of two chaperone deletions. The pairwise correlations above a threshold were used to generate a network between the different chaperones (Figure 3B). This provides a bird’s-eye view of the proteostasis network based on the similarity of cellular response to each of these PQC deletions. This would primarily indicate the level of similarity in the proteostatic perturbations due to depletion of each of these PQCs. Deletion of genes that function through similar pathways, or aid in the folding of a similar group of proteins, should ideally elicit similar transcriptomic responses. Similarities of response were apparent among multiple chaperones. As seen by the positional clustering in the network view as well as from the hierarchical clustering of the correlation matrix (Figure 3C), the network is a well-knit one with strong correlations between the chaperones. This information was unobtainable from the conventional differential expression analysis of only the significantly altered transcripts. Hierarchical clustering identified four main clusters in the data. First, two cytosolic (Jjj1, Ssa1) and two mitochondrial (Mcx1, Ssc1) PQC nodes formed a cluster (Cluster 1). Second, a more heterogeneous group consisting of chaperones along with two E3 ubiquitin ligases (Sti1, Hsp82, Ssb1, Rkr1, and San1) formed Cluster 2. Third, Egd1 and Rqc2, both of which are associated with ribosomes and are involved in quality control of nascent polypeptides, formed Cluster 3. Interestingly, Rkr1 (Ltn1), known to be associated with Rqc2, is not part of this cluster. Fourth, cytosolic chaperones that aid folding and refolding of aggregates (Hsp104, Ssa3 and Sse1) clustered together (Cluster 4). Co-clustering of San1 and Rkr1 along with other chaperones highlights the importance of these two proteins in proteostasis maintenance in general. The connection of Rkr1 with San1, another E3 ubiquitin ligase, is the strongest among all the pairwise comparisons in this map. Interestingly, this connection is stronger than the connection of Rkr1 with its physical interacting partner Rqc2, with which it forms the ribosome quality control complex. This network view reveals that Rkr1 may have a more general role, like San1, in protein quality control that is not limited to quality control of nascent polypeptides through the RQC complex.
The network view also exhibited some unexpected correlations, for example, in Cluster 2, a stronger correlation between Sse1 and Ssa3 than between Sse1 and Ssa1. Sse1 is better connected to Hsp104/Ssb1 than to its known interacting partner, Ssa1. This reveals that either the nucleotide exchange activity of Sse1 has cryptic complexities that are yet to be discovered or there are additional undiscovered functions of Sse1.
Comparison of cross-compartment proteostasis perturbations similarly reveals a surprising network view. Ssc1 is slightly better connected to San1 than to Mcx1, although Ssc1 and Mcx1 are known to manage protein quality control in the mitochondrial matrix. Interestingly, Mcx1 shows strong correlation with Rqc2 and Ssa1 along with Ssc1. This shows that there are multiple sensory mechanisms that connect mitochondrial proteostasis with cytosolic protein quality control machineries.
The connections and clustering agree with an intuitive view of the network, supporting the authenticity of the connections and the approach. Additionally, this view offers rich details of connections that are non-intuitive. Overall, this view of the PN, based on the cells’ response to the depletion of different components of the PN, reveals interesting interactions between the modules, highlighting cryptic connections that exist through cellular sensory mechanisms.
Cellular response to PN perturbation is not guided by genetic interactions
Since the commonality of response was evident from the network view, we investigated the genetic principles that guide the response to specific PQC deletions. Genetic interactions as obtained through fitness epistasis scores (ε) (26) define functional relationships between pathways; we asked if cellular response is commensurate with epistatic interactions. When we overlay the response network with the epistatic scores between the chaperones, we obtain an unexpected dissimilarity between the epistasis scores and response correlations (Figure 3B). From a very simplistic view of epistasis, genes in the same pathways are expected to have positive epistatic scores. These pairs, since they are on the same pathways, should also show similar cellular response upon their deletion. However, the positively epistatic pairs do not show a stronger correlation than the negatively epistatic pairs. Strikingly, although Sse1 and Sti1 are strongly exacerbating pairs in terms of epistasis (negative epistatic interaction), they have one of the weakest links in the map. Additionally, many of the non-epistatic pairs (for example San1 and Rkr1) show the strongest connections in the response map. This again underlines the fact that there are connections through sensory pathways that are hidden from the fitness based epistasis scores.
Specifically, fitness-dependent high-throughput studies have guided a large number of investigations in functional genomics. The various resultant genetic interaction maps have guided our understanding of the inter-dependency of different pathways. Given this, we wondered whether the cellular response can recapitulate the genetic interaction network architecture in the context of chaperone deletions. To obtain a more global picture of the correlation between epistasis scores and cellular response, we checked if the transcriptional response to a PQC gene deletion is guided by the epistatic interactions defined by the chaperone (Figure 4A). Typically, genes with strong negative interactions would be the ones that operate in parallel through a redundant pathway. We expect the negatively interacting genes to be upregulated to maintain proteostasis when a chaperone is depleted. However, we observed no apparent correlation between the interaction score and the transcriptome regulation. Upregulated genes do not show any preference for genetically interacting partners and are centered around zero epistasis scores. This suggests, even while accounting for the experimental error in the estimation of epistasis values, that the cells may not have evolved towards optimality, where the response is always tuned to upregulate the modules that interact negatively with the deleted gene. This is supported by the fact that the strongest negative interactors do not show any alteration in expression. As a more relaxed check for optimality, we asked if there is overexpression of interactors that lead to synthetic lethality (Figure 4B). Except for Ssb1, none of the chaperone deletions exhibited any statistical significance. For the majority of the cases, there is no overexpression of the genes whose deletion is synthetically lethal with the chaperone deletion in question.
Paralogs in many cases are known to function in redundant pathways. We checked the expression levels of the existing paralogs of the different PQC genes (Table S1). On a case-by-case basis, Sse2, the homolog of Sse1 that is essential in the sse1Δ deletion strain, is overexpressed upon deleting sse1. Hsc82, the paralog of Hsp82, was overexpressed in the hsp82Δ deletion strain but was also overexpressed in most other strains, suggesting that the change is non-specific. The other four paralogs of the PQC genes did not show any alteration in expression levels. This suggests that deletion of a PQC node does not canonically upregulate the negative genetic interactors, the synthetic lethal partners or the paralogs.
A recent report (27) allowed us to investigate whether the transcriptional response to specific deletions is generally uncoupled from epistatic interaction scores or whether this feature is exclusive to the PQC genes. None of the gene deletions, except three, show any significant correlation between the transcriptional alterations of the expressed genes and the epistasis of these with the gene that is deleted (Table S5). All three that show significant correlation exhibit negative correlation between epistasis values and expression change; negatively epistatic genes are upregulated while the positive ones are down. This is exactly what we would expect if optimality were built into the response network. Thus, barring these three exceptions, all the other deletions behave like the PQC deletions; genes that are negatively epistatic to the deleted genes do not show an upregulation. This underlines that the response to any perturbation, not limited to the PN, is not guided by the optimality predicted by the genetic interaction network.
In order to understand if there is any specific regulation guided by the functionality of the PQC genes, we asked if the expression of the protein complexes is altered when their constituent PQC gene is deleted. Protein complexes are generally known to be co-regulated (28, 29). Hence, an alteration in the expression level of a PQC protein-containing complex when that particular PQC gene is deleted indicates the ability of cells to compute and respond to depletion of that particular gene. In none of the PQC gene deletions do we find evidence of upregulation of the complex when a constituent PQC gene is deleted, indicating that cells do not measure and respond to changes in levels of functional PQC complexes (Figure 4C and Figure S3). A special case of complex formation is that of the co-chaperones of different PQC proteins. For the known co-chaperones, we do not find a significant upregulation except for the case of Ssa1 (Figure 4D). Remarkably, though, the physical interacting partners (30, 31) of the PQC proteins in many cases show significant upregulation when the interacting PQC protein is knocked out (Figure 4E). This indicates that there is a signalling specificity built into the network, highlighting the cellular capability to compute a specific response to each deletion, given a limited number of transcription factors.
Taken together, it is apparent that cells have the capacity to compute and respond to many of the specific deletions, but this response is guided more by protein-level interactions than by genetic, and hence functional, redundancies. This implies that cells have most probably evolved to respond with an evolutionary quick-fix (upregulation of proteolysis) that helps the organism retain fitness, rather than with the most optimal response that renders the system robust to perturbations. Hence, cells will be unable to readapt towards the most efficient state when chaperones are depleted during pathological conditions.
Cryptic optimal pathways can be activated to alleviate PQC deletion associated fitness defects
We asked if the cellular growth defect due to deletion of a PQC gene may be relieved if there is an optimal response. Two PQC gene deletions (jjj1Δ and sse1Δ) showed a significant growth defect at physiological temperatures (Figure 1B). We chose jjj1Δ as a case study, as the redundancy between Sse1 and its functional homolog Sse2 has been well studied (32). Serendipitously, we found that jjj1Δ grows as well as the WT strain at 37°C, while it has a strong growth defect at 30°C (Figure 5A). This suggests that there are additional cellular responses that can take care of defects arising due to jjj1Δ but are not part of the basal response to the deletion. We checked for the negative genetic interactors of jjj1Δ and found that one of the transcription factors responsible for the oxidative stress response is among the top ten (Figure 5B) (26, 33). This led us to hypothesize that a specific branch of the oxidative stress response works in parallel with Jjj1, so that upregulation of this pathway should lead to abrogation of the fitness defect of jjj1Δ. Indeed, the fitness defect of jjj1Δ with respect to WT is completely abrogated in the presence of oxidative stress (Figure 5C). Together, growth in the presence of oxidative stress or at high temperature demonstrates that cellular systems do have the capability to take care of jjj1Δ, but this capability is not switched on as a default pathway upon depletion of Jjj1 function. Thus, it strongly supports our conjecture that the cellular response to proteostasis disabilities is not tuned for optimal response, even when appropriate backup pathways exist.
Discussion
Compromised proteostasis due to depletion of chaperones is expected to be rampant in many physiological and pathological conditions, primarily in cases where PQC genes are overburdened due to excessive expression of substrate proteins, or when there is a gene regulation blockage that prevents the expression of these genes. The literature is replete with assumptions that large reorganization of chaperone complexes in these cases would reroute fluxes of substrates to retain homeostasis (4, 9, 34, 35). We show that the capacity of alternate channels is not specifically altered when one of the folding channels is blocked. This undermines the importance of flux-rerouting in maintaining proteostasis. This, along with the absence of cellular capacity to overexpress the depleted complexes, implies that the cellular system will be unable to adapt specifically to chaperone depletions. The evolutionary solution seems to be a quick-fix; it upregulates the proteolysis pathway to recycle misfolded or aggregated proteins in most of the cases. Interestingly, although the responses seem non-optimal, most of the PQC gene deletions do not show a phenotype under standard growth conditions or when grown at heat shock temperatures. This, however, is not evidence that these PQC genes are non-functional; the cells do show a clear response to all these deletions. It is nonetheless possible that the quick-fix solution of upregulating proteolysis and altering metabolism is able to cater to the immediate requirements of the cell.
This inability to respond specifically is crucial in conditions with proteotoxic aggregates like the amyloid aggregates of polyglutamine repeats. It has been suggested that chaperone depletion is a major cause for toxicity of these aggregates; it was unclear why the cellular stress response machinery is unable to reinstate homeostasis by increasing the level of the depleted chaperones. Our work suggests that yeast cells lack the inherent capacity to respond specifically and optimally to chaperone deletions, raising the possibility that cells do not manage to compute and respond specifically during pathological conditions. Redundant pathways do exist in specific cases, but they are not linked to cellular sensory mechanisms. Linking the sensory mechanisms to the pathways, or upregulating the redundant pathways by preconditioning cells to an altogether different stress, may form a suitable avenue to cope with depletion of essential chaperones.
One major caveat of the work is the assumption that cells would respond similarly to depletion and deletion of chaperones (36). In spite of this limitation, it is important to note that cellular rewiring, at least in this lower eukaryotic model, is independent of the underlying genetic network.
Moreover, the non-optimal response observed for yeast chaperone network leaves open the scope of reengineering yeast, using current technologies, to respond optimally to proteostasis imbalance. Investigation of this sort will reveal if this would be beneficial for the different proteotoxicity associated phenotypes, or for commercial application of yeast in protein production. Summarily, this study unravels a cryptic limitation in the yeast chaperone network, a proposal that needs further investigation in higher organisms.
Materials and Methods
Media and growth conditions
Rich medium for culturing yeast (YPD medium) contained 1% (w/v) yeast extract, 2% (w/v) peptone and 2% (w/v) dextrose.
Yeast strain BY4741 (leu2Δ0 ura3Δ0 met15Δ0 his3Δ1) and deletion strains in the background of BY4741 (Saccharomyces Genome Deletion Project, http://www-sequence.stanford.edu/group/yeast_deletion_project/deletions3.html) (Open Biosystems, GE Healthcare) were grown at 30°C, 200 rpm.
Cells were inoculated in YPD at an OD600 of 0.2 from an overnight primary culture and grown to mid-log phase before harvesting for RNA isolation.
RNA preparation
Cells were lysed using acid-washed glass beads and total RNA was extracted by the TRIzol (Invitrogen) method. It was further purified using Qiagen columns (RNeasy mini kit). RNA concentration and integrity were determined using NanoDrop and agarose gel electrophoresis.
Library preparation
Using the TruSeq RNA sample prep kit v2 (stranded mRNA LT kit), the library was prepared according to the manufacturer’s instructions (www.illumina.com). 700 ng of RNA from each sample was used to generate the library. Adaptor-ligated fragments were purified using AMPure XP beads (Agencourt). Adaptor-ligated RNAseq libraries were amplified (12-14 cycles), purified and measured with a Qubit instrument (Invitrogen). The average fragment size of the libraries was determined with a BioAnalyzer DNA1000 LabChip (Agilent Technologies). Diluted libraries were multiplex-sequenced and run on a HiSeq 2000 Illumina platform in HiSeq Flow Cell v3 (Illumina Inc., USA) using TruSeq SBS Kit v3 (Illumina Inc., USA) for cluster generation as per the manufacturer’s protocol.
Mapping sequence reads
Only reads with a phred quality score equal to or higher than 30 were taken for analysis. Trimmomatic (v0.43) (37) was used to trim read sequences. Reads were then aligned to the transcriptome of S. cerevisiae strain S288c as available from ENSEMBL using Kallisto (v0.36) software (38). The reads on average had over 80% alignment with the reference. Gene expression levels were estimated as TPM (Transcripts per million) values. The estimated counts were combined into a matrix and analyzed with EBSeq v1.10.0 (39). Differential expression tests were run using EBTest with 20 iterations of the EM algorithm. After this, the list of differentially expressed genes, the log2 fold change of all the genes in the transcriptome and the posterior probabilities of being differentially expressed were obtained using GetDEResults with method “robust”, FDR method “soft”, FDR = 0.05, Threshold FC = 0.7, Threshold FC Ratio = 0.3.
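As a reminder of what a TPM value represents, the conversion from estimated counts to TPM can be sketched as follows. This is an illustrative sketch only and not the code used in this study (Kallisto reports TPM directly); the function name `tpm` and the toy counts and effective lengths are hypothetical.

```python
def tpm(est_counts, effective_lengths):
    """Convert estimated counts to transcripts per million (TPM).

    Each count is first divided by the transcript's effective length,
    and the resulting rates are rescaled so that they sum to one million.
    """
    rates = [c / l for c, l in zip(est_counts, effective_lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


# Hypothetical toy data: three transcripts (not from the study).
toy_counts = [100.0, 200.0, 300.0]
toy_lengths = [1000.0, 1000.0, 3000.0]
toy_tpm = tpm(toy_counts, toy_lengths)
```

Because TPM values always sum to one million within a sample, they describe relative abundance within that sample; the differential expression tests described above operate on the estimated counts instead.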
Construction of the Co-regulation Network and hierarchically clustered corrgrams
To examine the similarity of the transcriptomic responses to chaperone deletions, we computed the pair-wise correlation coefficient of the log2 fold changes of all genes (complete transcriptome) across all strains. Here, the correlation coefficient indicates the extent of similarity between the transcriptomic responses of mutants to the loss of a chaperone (1 being most similar and 0 being least similar). The correlation matrix was converted into a network with nodes representing the mutants and edges weighted by the correlation between the mutants. A force-directed layout (implemented through Cytoscape (40)) was then applied to the network, which clustered strongly connected nodes towards the centre and repelled weakly connected nodes to the periphery of the network. The nodes of the network show the structure of the protein, if available; structures were taken from the STRING database (41). A complementary representation involves hierarchical clustering of a heatmap of the correlation values.
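The conversion of fold-change profiles into a weighted network can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: it assumes the Pearson coefficient (the type of correlation is not specified above), and the mutant names and fold-change values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(profiles):
    """Return one weighted edge (mutant_a, mutant_b, r) per pair of mutants.

    profiles maps a mutant name to its list of log2 fold changes, with genes
    in the same order for every mutant.
    """
    names = sorted(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            edges.append((a, b, pearson(profiles[a], profiles[b])))
    return edges


# Hypothetical log2 fold changes for three mutants over four genes.
profiles = {
    "mutA": [1.0, 2.0, 3.0, 4.0],
    "mutB": [2.0, 4.0, 6.0, 8.0],
    "mutC": [4.0, 3.0, 2.0, 1.0],
}
edges = correlation_edges(profiles)
```

An edge list of this form can be imported into a network tool such as Cytoscape, where the third column acts as the edge weight for a force-directed layout.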
Protein Complexes and Pathways, expression shift Analysis
To detect shifts in expression for pathways, we employed a GO pathway/functional category-based analysis strategy that tests for an overall shift in the expression levels of a group of genes belonging to a pathway (obtained from YeastMine) against the rest of the genes in the transcriptome. Refer to Table S4 for details of the genes used for each category. For the category Heat Shock Response, we took the first 45 genes upregulated upon a 60-minute heat shock treatment at 37°C, identified by analysing previously published microarray data (24). The non-parametric Mann-Whitney test was used to avoid issues with over-dispersion in the data and to test the significance of the median shift in expression of genes in a particular pathway in comparison to the whole transcriptome. The Benjamini-Hochberg correction was applied when testing in a combinatorial fashion across pathways to correct for multiple testing.
Besides GO pathways, we followed the same strategy to examine the expression status of physically interacting partners, co-chaperone genes, and synthetic lethal partners. The list of physical interacting partners for each protein was obtained from BIOGRID. All statistical tests and analyses were performed with R version 3.3.0 (42); scatterplot matrices were made using the lattice package (43) and heatmaps using the corrplot package (44).
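The testing strategy described in this section (gene set against the rest of the transcriptome, followed by multiple-testing correction) can be sketched as below. The analysis itself was performed in R; this Python version is only an illustration under stated assumptions. The Mann-Whitney P-value here uses the normal approximation with tie correction, which may differ from the exact test for small gene sets, and the gene names, gene sets and fold changes are hypothetical.

```python
import math


def mann_whitney_u(group, rest):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for `group`, approximate P-value).
    """
    n1, n2 = len(group), len(rest)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in group] + [(v, 1) for v in rest])
    rank_sum, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum - n1 * (n1 + 1) / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical log2 fold changes and gene sets (not from the study).
fold_changes = {"g1": 1.2, "g2": 0.9, "g3": 1.5, "g4": -0.1,
                "g5": 0.0, "g6": 0.2, "g7": -0.3, "g8": 0.1}
gene_sets = {"setA": {"g1", "g2", "g3"}, "setB": {"g4", "g5"}}

names = sorted(gene_sets)
raw_p = []
for name in names:
    members = gene_sets[name]
    inside = [fc for g, fc in fold_changes.items() if g in members]
    outside = [fc for g, fc in fold_changes.items() if g not in members]
    raw_p.append(mann_whitney_u(inside, outside)[1])
adjusted_p = dict(zip(names, benjamini_hochberg(raw_p)))
```

The same loop applies unchanged when the gene sets are lists of physical interactors, co-chaperones or synthetic lethal partners rather than GO categories.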
GO Functional Enrichment
Differentially expressed genes for each sample were classified into up-regulated and down-regulated lists and these lists were checked for GO enrichment of categories using WebGestalt (45) at an FDR cut-off of 0.05.
Transcription-Epistasis correlation
We downloaded the transcriptome fold changes provided in the supplementary files of a recent report by Kemmeren et al. (27), mapped the deletions in the columns of that data set to deletions in the epistasis score matrix (26), and extracted their epistasis profiles. Of the 700 mutants, 511 mapped to the epistasis score matrix. For each deletion, we computed the correlation coefficient, with significance indicated by P-values, between the transcriptome fold changes of genes and the corresponding epistasis scores within the epistasis profile.
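The mapping and per-deletion correlation can be sketched as follows. This is an illustrative Python sketch rather than the code used in the study: it assumes the Pearson coefficient, represents both data sets as nested dictionaries, and uses hypothetical deletion and gene names. It returns only the coefficient and the number of shared genes; a P-value would come from a standard significance test on the coefficient, which is omitted here.

```python
import math


def pearson(x, y):
    """Pearson correlation, or None if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def expression_epistasis_correlation(expression, epistasis, min_genes=3):
    """Correlate fold changes with epistasis scores, one value per deletion.

    expression: deletion -> {gene: log2 fold change in that deletion}
    epistasis:  deletion -> {gene: epistasis score with that deletion}
    Deletions absent from the epistasis data are skipped, and only genes
    present in both profiles of a deletion are compared.
    """
    results = {}
    for deletion, fold_changes in expression.items():
        profile = epistasis.get(deletion)
        if profile is None:
            continue  # deletion does not map to the epistasis matrix
        shared = sorted(set(fold_changes) & set(profile))
        if len(shared) < min_genes:
            continue
        r = pearson([fold_changes[g] for g in shared],
                    [profile[g] for g in shared])
        if r is not None:
            results[deletion] = (r, len(shared))
    return results


# Hypothetical toy data (not from the study).
expression = {
    "del1": {"a": 1.0, "b": 2.0, "c": 3.0, "d": 0.5},
    "del2": {"a": 0.3, "b": -0.2, "c": 0.1, "d": 0.0},
}
epistasis = {
    "del1": {"a": -0.1, "b": -0.2, "c": -0.3, "e": 0.4},
}
results = expression_epistasis_correlation(expression, epistasis)
```

In this toy example `del2` is dropped because it has no epistasis profile, mirroring the subset of mutants that did not map to the epistasis score matrix.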
Acknowledgement
We are grateful to Dr. Mohammed Faruq for aiding us with the Illumina sequencing platform. This work was primarily funded by the OLP1104 grant from CSIR to KC and partially by SNU core funding to KM. We thank CSIR-IGIB for the HPC facility. AG1 and AG2 thank UGC for their fellowships. MV is grateful to CSIR, and SD to SNU core funding, for their fellowships. We also thank Gopal G. Jayaraj and Satya Pandey for critically reading the manuscript.
Author Contributions
AG1, KC and KM designed the work. KC, KM and DD supervised the work and analysis. AG1, SD and MV did the yeast experiments. Sequencing was done by AG1. Analysis was primarily done by AG2 along with AG1 and KC. AG1 and KC wrote the manuscript with input from all authors. All authors discussed the results and commented on the manuscript. The authors declare no competing interests.