Genetic Analysis of a Metazoan Pathway using Transcriptomic Phenotypes

David Angeles-Albores; Carmie Puckett Robinson; Brian A. Williams; Paul W. Sternberg

doi:10.1101/112920

Abstract

RNA-seq is commonly used to identify genetic modules that respond to a perturbation. Although transcriptomes have been mainly used for target gene discovery, their quantitative nature makes them attractive structures with which to study genetic interactions. To understand whether whole-organism RNA-seq is suitable for genetic pathway reconstruction, we sequenced the transcriptomes of four single mutants and two double mutants of the hypoxia pathway in C. elegans. By comparing the expression levels of double mutants with their corresponding single mutants, we were able to determine, on a genome-wide level, that EGL-9 acts along VHL-1-dependent and independent branches to inhibit HIF-1. We were also able to observe transcriptome-wide suppression of the egl-9(lf) phenotype in an egl-9(lf) hif-1(lf) double mutant. As a by-product of our analysis, we identified a core hypoxic response consisting of 355 genes, and 45 genes that have hif-1-independent, vhl-1-dependent expression. Finally, we were able to identify 31 genes that exhibit non-canonical epistasis: for these genes, vhl-1(lf) mutants show opposing effects to egl-9(lf) mutants, but the egl-9(lf);vhl-1(lf) double mutant exhibits the egl-9(lf) phenotype. We suggest that this non-canonical epistasis reflects unexplored aspects of the hypoxia pathway. We discuss the implications and advantages of using transcriptomic phenotypes to perform pathway analysis.

Introduction

Genetic analysis of molecular pathways has traditionally been performed through epistasis analysis. Generalized epistasis indicates that two genes interact functionally; such interaction can involve the direct interaction of their products or the interaction of any consequence of their function (small molecules, physiological or behavioral effects)¹. If two genes interact, and the mutants of these genes have a quantifiable phenotype, the double mutant of interacting genes will have a phenotype that is not the sum of the phenotypes of the single mutants that make up its genotype. Epistasis analysis remains a cornerstone of genetics today².

Recently, biological studies have shifted in focus from studying single genes to studying all genes in parallel. In particular, RNA-seq³ enables biologists to identify genes that change expression in response to a perturbation. Gene expression profiling using RNA-seq has become much more sensitive thanks to deeper and more frequent sequencing due to lower sequencing costs⁴, better and faster abundance quantification ^5,6,7, and improved differential expression analysis methods^8,9. RNA-seq has been successfully used to identify genetic modules involved in a variety of processes, including T-cell regulation^10,11, the Caenorhabditis elegans (C. elegans) linker cell migration¹², and planarian stem cell maintenance^13,14. For the most part, the role of transcriptional profiling has been restricted to target gene identification.

Although transcriptional profiling has been primarily used for descriptive purposes, transcriptomic phenotypes have previously been used to make genetic inferences. Microarray analyses in S. cerevisiae and D. discoideum were used to show that transcriptomes can be interpreted to infer genetic relationships in simple eukaryotes^15,16. eQTL studies in many organisms, from yeast to humans, have established the usefulness of transcriptomic phenotypes for population genetics studies^17,18,19,20. In cell culture, single-cell RNA-seq has seen significant progress towards using transcriptomes as phenotypes with which to test genetic interactions^21,22. More recently, we have identified a new developmental state of C. elegans using whole-organism transcriptome profiling²³. To investigate the ability of whole-organism transcriptomes to serve as quantitative phenotypes for epistasis analysis in metazoans, we sequenced the transcriptomes of four well-characterized loss-of-function mutants in the C. elegans hypoxia pathway^24,25,26,27.

Metazoans depend on the presence of oxygen in sufficient concentrations to support aerobic metabolism. Genetic pathways evolved to rapidly respond to any acute or chronic changes in oxygen levels at the cellular or organismal level. Biochemical and genetic approaches identified the Hypoxia Inducible Factors (HIFs) as an important group of oxygen-responsive genes that are involved in a broad range of human pathologies²⁸.

Hypoxia Inducible Factors are highly conserved in metazoans²⁹. A common mechanism for hypoxia-response induction is heterodimerization between a HIFα and a HIFβ subunit; the heterodimer then initiates transcription of target genes³⁰. The number and complexity of HIFs varies throughout metazoans, with humans having three HIFα subunits and two HIFβ subunits, whereas in the roundworm C. elegans there is a single HIFα gene, hif-1 ²⁷ and a single HIFβ gene, ahr-1 ³¹. HIF target genes have been implicated in a wide variety of cellular and extracellular processes including glycolysis, extracellular matrix modification, autophagy and immunity^{32,33,34,35,28}.

Levels of HIFα proteins tend to be tightly regulated. Under conditions of normoxia, HIF-1α exists in the cytoplasm and partakes in a futile cycle of continuous protein production and rapid degradation³⁶. HIF-1α is hydroxylated by three proline hydroxylases in humans (PHD1, PHD2 and PHD3) but is only hydroxylated by one proline hydroxylase (EGL-9) in C. elegans³⁷. HIF-1 hydroxylation increases its binding affinity to Von Hippel Lindau Tumor Suppressor 1 (VHL-1), which allows ubiquitination of HIF-1 leading to its subsequent degradation. In C. elegans, EGL-9 activity is inhibited by binding of CYSL-1, and CYSL-1 activity is in turn inhibited at the protein level by RHY-1, possibly by post-translational modifications to CYSL-1³⁸ (see Fig. 1).

Figure 1.

Genetic and biochemical representation of the hypoxia pathway in C. elegans. Red arrows are arrows that lead to inhibition of HIF-1, and blue arrows are arrows that increase HIF-1 activity or are the result of HIF-1 activity. EGL-9 is known to exert vhl-1-dependent and independent repression on HIF-1 as shown in the genetic diagram. The vhl-1-independent repression of HIF-1 by EGL-9 is denoted by a dashed line and is not dependent on the hydroxylating activity of EGL-9. Technically, RHY-1 inhibits CYSL-1, which in turn inhibits EGL-9, but this interaction was abbreviated in the genetic diagram for clarity.

Here, we show that transcriptomes contain robust signals that can be used to infer relationships between genes in complex metazoans by reconstructing the hypoxia pathway in C. elegans using RNA-seq. Furthermore, we show that the phenomenon of phenotypic epistasis, a hallmark of genetic interaction, holds at the molecular systems level. We also demonstrate that transcriptomes contain sufficient information, under certain circumstances, to order genes in a pathway using only single mutants. Finally, we were able to identify genes that appear to be downstream of egl-9 and vhl-1, but do not appear to be targets of hif-1. Using a single set of genome-wide measurements, we were able to observe and quantitatively assess a significant fraction of the known transcriptional effects of hif-1 in C. elegans. A complete version of the analysis, with ample documentation, is available at https://wormlabcaltech.github.io/mprsq.

Results

The hypoxia pathway controls thousands of genes in C. elegans

We selected four single mutants within the hypoxia pathway for expression profiling: egl-9(lf) (sa307), rhy-1(lf) (ok1402), vhl-1(lf) (ok161), hif-1(lf) (ia4). We also sequenced the transcriptomes of two double mutants, egl-9(lf);vhl-1(lf) (sa307, ok161) and egl-9(lf) hif-1(lf) (sa307, ia4) as well as wild-type N2 as a control sample. Each genotype was sequenced in triplicate at a depth of 15 million reads. We performed whole-organism RNA-seq of these mutants at a moderate sequencing depth (~7 million mapped reads for each individual replicate) under normoxic conditions. For single samples, we identified around 22,000 different isoforms per sample, which allowed us to measure differential expression of 18,344 isoforms across all replicates and genotypes (this constitutes ~70% of the protein coding isoforms in C. elegans). We also included in our analysis a fog-2(lf) (q71) mutant which we have previously studied²³, because fog-2 is not reported to interact with the hypoxia pathway. We analyzed our data using a general linear model on logarithm-transformed counts. Changes in gene expression are reflected in the regression coefficient, β, which is specific to each isoform within a genotype. We considered a coefficient statistically significant when its q-value (the p-value adjusted for multiple testing) was less than 0.1. Genes that are significantly altered between wild-type and a given mutant have β values that are statistically significantly different from 0. These coefficients are not equal to the average log-fold change per gene, although they are loosely related to this quantity. Larger magnitudes of β correspond to larger perturbations. We use these coefficients as the quantitative phenotype in the analyses that follow.
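The significance rule above (q-value below 0.1) can be illustrated with a short sketch. The text does not say which multiple-testing correction was used, so the Benjamini-Hochberg procedure here is an assumption, and the function names and input layout are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order.

    Assumption: the source only says "p-values adjusted for multiple testing";
    BH is used here purely for illustration.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def significant_isoforms(results, threshold=0.1):
    """Keep isoforms whose q-value is below the threshold.

    `results` maps an isoform name to a (beta, p_value) pair (hypothetical layout).
    Returns a dict isoform -> beta.
    """
    names = list(results)
    qvalues = benjamini_hochberg([results[n][1] for n in names])
    return {n: results[n][0] for n, q in zip(names, qvalues) if q < threshold}
```

Calling `significant_isoforms` once per genotype would give the set of differentially expressed isoforms, with their β values, that the later comparisons rely on.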

In spite of the moderate sequencing depth, transcriptome profiling of the hypoxia pathway revealed that this pathway controls thousands of genes in C. elegans. The egl-9(lf) transcriptome showed differential expression of 1,806 genes. Similarly, 2,103 genes were differentially expressed in rhy-1(lf) mutants. The vhl-1(lf) transcriptome showed considerably fewer differentially expressed genes (689), possibly because vhl-1 is a weaker controller of hif-1 than egl-9²⁶. The egl-9(lf);vhl-1(lf) double mutant transcriptome showed 2,376 differentially expressed genes. The hif-1(lf) mutant also showed a transcriptomic phenotype involving 546 genes. The egl-9(lf) hif-1(lf) double mutant showed a similar number of genes with altered expression (404 genes, see Table 1).

Table 1.

Number of differentially expressed genes in each mutant.

Principal Component Analysis visualizes epistatic relationships between genotypes

Principal Component Analysis (PCA) is a well-known technique in bioinformatics that is used to identify relationships between high dimensional data points³⁹. We performed PCA on our data to examine whether each genotype clustered in a biologically relevant manner. PCA identifies the vector that can explain most of the variation in the data; this is called the first PCA dimension. Using PCA, one can identify the first n dimensions that can explain more than 95% of the variation in the data. Sample clustering in these n dimensions often indicates biological relationships between the data, although interpreting PCA dimensions can be difficult.
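As a small illustration of the 95% criterion, the sketch below picks the number of dimensions n from the variance explained by each component. It assumes the per-component variances have already been computed by a PCA routine; the function name is hypothetical and the PCA itself is not shown.

```python
def components_needed(variances, target=0.95):
    """Smallest n such that the n largest components explain more than
    `target` of the total variance.

    `variances` holds the variance explained by each principal component
    (any order); it is assumed to come from a PCA computed elsewhere.
    """
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    ordered = sorted(variances, reverse=True)
    for n, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total > target:
            return n
    return len(ordered)
```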

After applying PCA, we expected hif-1(lf) to cluster near egl-9(lf) hif-1(lf), because hif-1(lf) exhibits no phenotypic defects under normoxic conditions, in contrast to egl-9(lf), which exhibits an egg-laying (Egl) phenotype in the same environment. In egl-9(lf) hif-1(lf) mutants the Egl phenotype of egl-9(lf) mutants is suppressed and instead the grossly wild-type phenotype of hif-1(lf) is observed. On the other hand, we expected egl-9(lf), rhy-1(lf), vhl-1(lf) and egl-9(lf);vhl-1(lf) to form a separate cluster since each of these genotypes is Egl and has a constitutive hypoxic response. Finally, we included as a negative control a fog-2(lf) mutant we have analyzed previously ²³. This data was obtained at a different time from the other genotypes, so we included a batch-normalization term in our equations to account for this. Since fog-2 has not been described to interact with the hypoxia pathway, we expected that it should appear far away from either cluster.

The first dimension of the PCA was able to discriminate between mutants that have constitutive high levels of HIF-1 and mutants that have no HIF-1, whereas the second dimension was able to discriminate between mutants within the hypoxia pathway and outside the hypoxia pathway (see Fig. 2). Therefore, expression profiling measures enough signal to cluster genotypes in a meaningful manner in complex metazoans.

Figure 2.

Principal component analysis of various C. elegans mutants. Genotypes that have an activated hypoxia response (i.e., egl-9(lf), vhl-1(lf), and rhy-1(lf)) cluster far from hif-1(lf). hif-1(lf) clusters with the suppressed egl-9(lf) hif-1(lf) double mutant. The fog-2(lf) transcriptome, used as an outgroup, is far away from either cluster.

Reconstruction of the hypoxia pathway from first genetic principles

Having shown that the signal in the mutants we selected was sufficient to cluster mutants using the values of the regression coefficients β, we set out to reconstruct the hypoxia pathway from genetic first principles. In general, to reconstruct a pathway, we must first assess whether two genes act on the same phenotype. If they do not act on the same phenotype (the set of commonly differentially regulated genes between two mutants is empty), these mutants are independent. If they are not independent, then two mutants have a shared transcriptomic phenotype (STP)—a set of genes or isoforms that are differentially expressed in both mutants, without taking into account what direction they change in. In this case, we must measure whether these genes act additively or epistatically on the measured phenotype; if there is epistasis we must measure whether it is positive or negative, in order to assess whether the epistatic relationship is a genetic suppression or a synthetic interaction.
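The first step described above, deciding whether two mutants are independent or share a transcriptomic phenotype, amounts to a set intersection. The sketch below is a minimal illustration; the function names and the dict layout (isoform name mapped to its significant β) are hypothetical.

```python
def shared_transcriptomic_phenotype(de_a, de_b):
    """Return the STP: isoforms differentially expressed in both mutants.

    `de_a` and `de_b` map isoform names to significant regression
    coefficients. The direction of change is deliberately ignored.
    """
    return set(de_a) & set(de_b)


def are_independent(de_a, de_b):
    """Two mutants are treated as independent when their STP is empty."""
    return not shared_transcriptomic_phenotype(de_a, de_b)
```

Only when `are_independent` returns `False` does it make sense to go on to ask whether the two genes act additively or epistatically.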

Genes in the hypoxia pathway act on the same transcriptional phenotype

We observed that all the hypoxia mutants had significant shared transcriptomic phenotypes (fraction of the transcriptomes that was shared between mutants ranged from a minimum of 6.8% shared between hif-1(lf) and egl-9(lf);vhl-1(lf) to a maximum of 31% shared genes between egl-9(lf) and egl-9(lf);vhl-1(lf)). For comparison, we also analyzed a previously published fog-2(lf) transcriptome²³. The fog-2 gene is involved in masculinization of the C. elegans germline, which enables sperm formation, and is not known to be involved in the hypoxia pathway. The hypoxia pathway mutants and the fog-2(lf) mutant also showed shared transcriptomic phenotypes (3.6%–12% genes), but correlations between expression level changes were considerably weaker (see below), suggesting that there is minor cross-talk between these pathways.

We wanted to know whether it was informative to look at quantitative agreement within STPs. For each mutant pair, we rank-transformed the regression coefficients β of each isoform within the STP, calculated lines of best fit using Bayesian regression with a Student-T distribution to mitigate noise from outliers, and plotted the results in a rank plot (see Fig. 3). For transcriptomes associated with the hypoxia pathway, we found that these correlations tended to have values higher than 0.9 with a tight distribution around the line of best fit. The correlations for mutants from the hypoxia pathway with the fog-2(lf) mutant were considerably weaker, with magnitudes between 0.6 and 0.85 and greater variance around the line of best fit. Although hif-1 is known to be genetically repressed by egl-9, rhy-1 and vhl-1^24,25, all the correlations between mutants of these genes and hif-1(lf) were positive.
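The rank transformation can be sketched as follows. Note the substitution: instead of the robust Bayesian regression with a Student-T likelihood described above, this illustration fits an ordinary least-squares line to the ranks, which equals the Spearman correlation when there are no ties. It is therefore more sensitive to outliers than the method in the text. Function names are hypothetical.

```python
def rank_transform(values):
    """Replace each value by its rank (0 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order):
        ranks[i] = rank
    return ranks


def rank_slope(betas_a, betas_b):
    """Least-squares slope of rank(beta_b) against rank(beta_a).

    `betas_a` and `betas_b` are aligned lists of coefficients for the
    isoforms in one STP. Stand-in for the robust Bayesian fit in the text.
    """
    if len(betas_a) != len(betas_b) or len(betas_a) < 2:
        raise ValueError("need two aligned lists with at least two isoforms")
    ra, rb = rank_transform(betas_a), rank_transform(betas_b)
    mean = (len(ra) - 1) / 2  # both rank vectors are permutations of 0..n-1
    cov = sum((x - mean) * (y - mean) for x, y in zip(ra, rb))
    var = sum((x - mean) ** 2 for x in ra)
    return cov / var
```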

Figure 3.

Strong transcriptional correlations can be identified between genes that share a positive regulatory connection. We took the egl-9(lf) and the rhy-1(lf) transcriptomes, identified differentially expressed genes common to both transcriptomes and ranked each gene according to its differential expression coefficient β. We plotted the rank of each gene in rhy-1(lf) versus the rank of the same gene in the egl-9(lf) transcriptome. The result is an almost perfect correlation. Green, transparent large points mark inliers to the primary regressions (blue lines); red squares mark outliers to the primary regressions.

After we calculated the pairwise correlation within each STP, we weighted the result of each regression by the number of isoforms within the STP divided by the total number of differentially expressed isoforms present in the two mutant transcriptomes that contributed to that specific STP, N_overlap/N_{g1∪g2}. The weighted regressions recapitulated a module network (see Fig. 4). We identified a strong positive interaction between egl-9(lf) and rhy-1(lf). The magnitude of this weighted correlation derives from the size of the transcriptomes for these mutants (1,806 and 2,103 differentially expressed genes respectively) and from the extensive overlap between the two transcriptomes, which makes the weighting factor considerably larger than for other pairs. The weak correlation between hif-1(lf) and egl-9(lf) results from the small size of the hif-1(lf) transcriptome and the small overlap between the transcriptomes.
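The weighting step is simple arithmetic, shown here as a sketch. The function name is hypothetical, and the correlation passed in is assumed to have been estimated separately within the STP.

```python
def weighted_correlation(correlation, de_a, de_b):
    """Scale a within-STP correlation by N_overlap / N_union.

    `de_a` and `de_b` are collections of isoforms differentially expressed
    in each mutant; `correlation` is the regression result for their STP.
    """
    set_a, set_b = set(de_a), set(de_b)
    union = set_a | set_b
    if not union:
        raise ValueError("both transcriptomes are empty")
    return correlation * len(set_a & set_b) / len(union)
```

A pair with a tight correlation but a small overlap therefore receives a small weighted value, which is the behavior described above for hif-1(lf) and egl-9(lf).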

Figure 4.

A. Heatmap showing pairwise regression values between all single mutants. B. Correlation network drawn from A. Edge width is proportional to the logarithm of the magnitude of the weighted correlation between two nodes divided by the absolute value of the weighted correlation of smallest magnitude. Edges are also colored according to the heatmap in A. Inhibitors of hif-1 are tightly correlated and form a control module; hif-1 is positively correlated to its inhibitors, albeit weakly; and fog-2, a gene that is not reported to interact with the hypoxia pathway, has the smallest, negative correlation to any gene.

The fine-grained nature of transcriptional phenotypes means that these weighted correlations between transcriptomes of single mutants are predictive of genetic interaction.

A quality check of the transcriptomic data reveals excellent agreement with the literature

One way to establish whether genes are acting additively or epistatically to each other is to perform qPCR of a reporter gene in the single and double mutants. This approach was used to successfully map the relationships within the hypoxia pathway (see, for example, refs. 25 and 26). A commonly used hypoxia reporter gene is nhr-57, which is known to exhibit a several-fold increase in mRNA expression when HIF-1 accumulates^25,34,40. Likewise, increased HIF-1 function is known to cause increased expression of rhy-1 and egl-9⁴¹.

Transcriptome data also allow us to look selectively at the expression of a few genes at a time. We therefore queried the changes in expression of rhy-1, egl-9, and nhr-57. We included the nuclear laminin gene lam-3 as a representative negative control not believed to be responsive to alterations in the hypoxia pathway. nhr-57 was upregulated in egl-9(lf), rhy-1(lf) and vhl-1(lf), but remained unchanged in hif-1(lf). egl-9(lf);vhl-1(lf) had an expression level similar to egl-9(lf), whereas the egl-9(lf) hif-1(lf) mutant showed wild-type levels of reporter expression, as reported previously²⁵ (see Fig. 5).

Figure 5.

Top: Observed β values of select genes. We selected four genes (rhy-1, egl-9, nhr-57 and lam-3, shown on the x-axis) and plotted their regression coefficients, β, as measured for every genotype (represented by one of six colors) to study the epistatic relationships between each gene. Asterisks above a bar represent a regression coefficient statistically significantly different from 0, meaning that expression is altered relative to a wild-type control. Error bars show standard error of the mean value of β. nhr-57 is an expression reporter that has been used previously to identify hif-1 regulators^25,26. lam-3 is shown here as a negative control that should not be altered by mutations in this pathway. We measured modest increases in the levels of rhy-1 mRNA when hif-1 is knocked out.

We observed changes in rhy-1 expression consistent with previous literature²⁵ when HIF-1 accumulates. We also observed increases in egl-9 expression in egl-9(lf). egl-9 is known as a hypoxia responsive gene⁴¹. Although changes in egl-9 expression were not statistically significantly different from the wild-type in rhy-1(lf) and vhl-1(lf) mutants, the mRNA levels of egl-9 still trended towards increased expression in these genotypes. As with nhr-57, egl-9 and rhy-1 expression were wild-type in egl-9(lf) hif-1(lf), whereas the egl-9(lf);vhl-1(lf) mutant showed expression phenotypes identical to egl-9(lf). This dataset also showed that knockout of hif-1 resulted in a modest increase in the levels of rhy-1. This suggests that hif-1, in addition to being a positive regulator of rhy-1, also inhibits it, which constitutes a novel observation. Using a single reporter we would have been able to reconstruct an important fraction of the genetic relationships between the genes in the hypoxia pathway, but would likely have failed to observe other genetic interactions, such as the evidence for hif-1 negatively regulating rhy-1 transcript levels.

Transcriptome-wide epistasis

Ideally, any measurement of transcriptome-wide epistasis should conform to certain expectations. First, it should make use of the regression coefficients of as many genes as possible. Second, it should be summarizable in a single, well-defined number. Third, it should have an intuitive behavior, such that special values of the statistic should each have an unambiguous interpretation.

One way of displaying transcriptome-wide epistasis is to plot transcriptome data onto an epistasis plot (see Fig 6). In an epistasis plot, the X-axis represents the expected expression of a double mutant a⁻b⁻ if a and b interact additively. In other words, each individual isoform’s x-coordinate is the sum of the regression coefficients from the single mutants a⁻ and b⁻. The Y-axis represents the deviations from the additive (null) model, and can be calculated as the difference between the observed regression coefficient and the predicted regression coefficient. Only genes that are differentially expressed in all three genotypes are plotted. Assuming that the two genes interact via a simple phenotype (for example, if both genes affect a transcription factor that generates the entire transcriptome), these plots will generate specific patterns that can be described through linear regressions. The slope of these lines, s_a,b, is the transcriptome-wide epistasis coefficient.
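The coordinates of an epistasis plot follow directly from the definitions above. The sketch below computes them for isoforms that are differentially expressed in all three genotypes; the function name and the dict layout (isoform name mapped to β) are hypothetical.

```python
def epistasis_coordinates(beta_a, beta_b, beta_ab):
    """Return (isoforms, xs, ys) for an epistasis plot.

    x is the additive expectation beta_a + beta_b for each isoform;
    y is the deviation of the observed double-mutant coefficient from it.
    Only isoforms present in all three dicts are used.
    """
    shared = sorted(set(beta_a) & set(beta_b) & set(beta_ab))
    xs = [beta_a[g] + beta_b[g] for g in shared]
    ys = [beta_ab[g] - (beta_a[g] + beta_b[g]) for g in shared]
    return shared, xs, ys
```

For example, if both single mutants and the double mutant have the same coefficient for every isoform, each point lands at `y = -x / 2`.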

Figure 6.

(A) Schematic diagram of an epistasis plot. The X-axis on an epistasis plot is the expected coefficient for a double mutant under an additive model (null model). The Y-axis plots deviations from this model. Double mutants that deviate in a systematic manner from the null model exhibit transcriptome-wide epistasis (s). To measure s, we perform a linear regression on the data. The slope of the line of best fit is s. This coefficient is related to genetic architectures. Genes that act additively on a phenotype (Ph) will have s = 0 (orange line); whereas genes that act along an unbranched pathway will have s = −1/2 (blue line). Strong repression is reflected by s = −1 (red line). Cases where s > 0 correspond to synthetic interactions (purple line), and in the limit as s → ∞, the synthetic interaction must be an OR-gate. Cases where −1/2 < s < 0 correspond to circuits that have multiple positive branches; whereas cases where −1 < s < −1/2 correspond to cases where the branches have different valence. Cases where s < −1 represent inhibitory branches. (B) Epistasis plot showing that the egl-9(lf);vhl-1(lf) transcriptome deviates significantly from a null additive model. Points are colored qualitatively according to density (purple, low; yellow, high) and size is inversely proportional to the standard error (S.E.) of the y-axis (larger points, higher accuracy). The purple line is the line of best fit from an orthogonal distance regression. (C) Comparison of simulated epistatic coefficients against the observed coefficient. Green curve shows the bootstrapped observed transcriptome-wide epistasis coefficient for egl-9 and vhl-1. Dashed green line shows the mean value of the data. Using the single mutants, we simulated coefficient distributions for a linear model (light blue, centered at −0.5); an additive model (orange, centered at 0); a model where either egl-9 or vhl-1 masks the other phenotype (dark blue and black, respectively) and a complete suppression model (red, centered at −1). The observed coefficient overlaps the predicted epistasis curve for egl-9(lf);vhl-1(lf) = egl-9(lf) (green and dark blue).

Epistasis plots can be understood intuitively for simple cases of genetic interactions. If two genes act additively on the same set of differentially expressed isoforms then all the plotted points will fall along the line y = 0. If two genes interact in an unbranched pathway, then a⁻, b⁻ and a⁻b⁻ should have identical phenotypes, if all the genotypes are homozygous for genetic null alleles¹. It follows that the data points should fall along a line with slope equal to −1/2. On the other hand, in the limit of complete inhibition of a by b, the plots should show a line of best fit with slope equal to −1¹. Genes that interact synthetically (i.e., through an OR-gate) will fall along lines with slopes > 0. When there is epistasis of one gene over another, the points will fall along a line of best fit with slope s_ab=b or s_ab=a. This slope must be determined from the single-mutant data. From this information, we can use the single mutant data to predict the distribution of slopes that results for each case stated above, as well as for each epistatic combination (a⁻b⁻ = a⁻ or a⁻b⁻ = b⁻). The transcriptome-wide epistasis coefficient (s_a,b) emerges as a powerful way to quantify epistasis because it integrates information from many different genes or isoforms into a single number (see Fig. 6).
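The special slopes can be worked out from the plot coordinates. In the sketch below, β_{a,i}, β_{b,i} and β_{ab,i} denote the coefficients of isoform i in the two single mutants and the double mutant. This indexing is introduced here for illustration, and the last line is one reading of the "complete inhibition" limit rather than a derivation taken from the text.

```latex
\begin{align*}
&\text{Coordinates:}
  & x_i &= \beta_{a,i} + \beta_{b,i}, \qquad y_i = \beta_{ab,i} - x_i \\
&\text{Additive } (\beta_{ab,i} = \beta_{a,i} + \beta_{b,i}):
  & y_i &= 0 \;\Rightarrow\; s = 0 \\
&\text{Unbranched } (\beta_{a,i} = \beta_{b,i} = \beta_{ab,i} = \beta_i):
  & x_i &= 2\beta_i,\; y_i = -\beta_i \;\Rightarrow\; s = -\tfrac{1}{2} \\
&\text{Masking } (\beta_{ab,i} = \beta_{a,i}):
  & y_i &= -\beta_{b,i} \quad \text{(slope depends on the single-mutant data)} \\
&\text{Suppression } (\beta_{ab,i} = \beta_{b,i},\ \beta_{b,i} \to 0):
  & x_i &\to \beta_{a,i},\; y_i = -\beta_{a,i} \;\Rightarrow\; s \to -1
\end{align*}
```

The masking case shows why that slope has to be determined from the single-mutant data: it depends on how β_{b,i} compares with β_{a,i} + β_{b,i} across isoforms.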

In our experiment, we studied two double mutants, egl-9(lf) hif-1(lf) and egl-9(lf);vhl-1(lf). We wanted to understand how well an epistasis analysis based on transcriptome-wide coefficients agreed with the epistasis results reported in the literature, which were based on qPCR of single genes. Therefore, we performed orthogonal distance regression on the two gene combinations we studied (egl-9 and vhl-1; and egl-9 and hif-1) to determine the epistasis coefficient for each gene pair. We also generated models for the special cases mentioned above (additivity, a⁻b⁻ = a⁻, strong suppression, etc…) using the single mutant data. For every simulation, as well as for the observed data, we used bootstraps to generate probability distributions of the epistasis coefficients.
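A minimal sketch of the slope estimate and its bootstrap distribution is given below. It makes several simplifying assumptions that are not taken from the text: the line is forced through the origin, all points are weighted equally (no standard errors), and the closed-form total-least-squares solution stands in for a general orthogonal distance regression routine. Function names are hypothetical.

```python
import math
import random


def odr_slope(xs, ys):
    """Slope of the line through the origin that minimizes the summed
    squared perpendicular distances to the points (unweighted)."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxy == 0:
        if sxx >= syy:
            return 0.0  # best line is horizontal, e.g. the additive case
        raise ValueError("best-fit line is vertical; slope is undefined")
    disc = math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)
    return (syy - sxx + disc) / (2 * sxy)


def bootstrap_slopes(xs, ys, n_boot=1000, seed=0):
    """Resample points with replacement and refit the slope each time."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(odr_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return slopes
```

The list returned by `bootstrap_slopes` approximates the distribution of the epistasis coefficient. Running the same function on simulated double-mutant data would give the distributions for each special case.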

We compared the predictions for the transcriptome-wide epistasis coefficient, s_{egl-9,vhl-1}, under different assumptions with the observed slope (−0.42). The observed slope matched the simulated slope for the case where egl-9 is epistatic over vhl-1 (egl-9(lf) = egl-9(lf);vhl-1(lf), see Fig. 6) and did not overlap with any other prediction. Next, we predicted the distribution of s_{egl-9,hif-1} for different pathways and contrasted it with the observed slope. In this case, we saw that the uncertainty in the observed coefficient overlapped significantly with the strong suppression model, where EGL-9 strongly suppresses HIF-1, and also with the model where hif-1(lf) = egl-9(lf) hif-1(lf). In this case, both models are reasonable: HIF-1 is strongly suppressed by EGL-9, and we know from previous literature that the epistatic relationship, hif-1(lf) = egl-9(lf) hif-1(lf), is true for these mutants. In fact, as the repression of HIF-1 by EGL-9 becomes stronger, the epistatic model should converge on the limit of strong repression (see Epistasis).

Another way to test which model best explains the epistatic relationship between egl-9 and vhl-1 is to use Bayesian model selection to calculate an odds ratio between two models to explain the observed data. Models can be placed into two categories: parameter-free and fit. Parameter-free models are ‘simpler’ because their parameter space is smaller (0 parameters) than that of the fit models (n parameters). By Occam’s razor, simpler models should be preferred to more complicated models. However, simple models suffer from the drawback that systematic deviations from them cannot be explained or accommodated, whereas more complicated models can alter the fit values to maximize their explanatory power. In this sense, more complicated models should be preferred when the data shows systematic deviations from the simple model. Odds-ratio selection gives us a way to quantify the trade-off between simplicity and explanatory power.

We reasoned that comparing a fit model (y = α · x, where α is the slope of best fit) against a parameter-free model (y = γ · x, where γ is a single number) constituted a conservative approach towards selecting which theoretical model (if any) best explained the data. In particular, this approach will tend to strongly favor the line of best fit over simpler models for all but very small, non-systematic deviations. We decided that we would reject the theoretical models only if the line of best fit was 10³ times more likely than the theoretical models (odds ratio, OR > 10³). Comparing the odds ratio between the line of best fit and the different pathway models for egl-9 and vhl-1 showed similar results to the simulation. Only the theoretical model egl-9(lf) = egl-9(lf);vhl-1(lf) could not be rejected (OR = 0.46), whereas all other models were significantly less likely than the line of best fit (OR > 10⁴⁴). Therefore, egl-9 is epistatic to vhl-1. Moreover, since s_{egl-9,vhl-1} lies strictly between −0.5 and 0, we conclude that egl-9 acts on its transcriptomic phenotype in vhl-1-dependent and independent manners. A branched pathway that can lead to epistasis coefficients in this range is a pathway where egl-9 interacts with its transcriptomic phenotype via branches that have the same valence (both positive or both negative)²⁶. When we performed a similar analysis to establish the epistatic relationship between egl-9 and hif-1, we observed that the best alternative to a free-fit model was a model where hif-1 is epistatic over egl-9 (OR = 2551), but the free-fit model was still preferred. All other models were strongly rejected (OR > 10²⁵).
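To make the trade-off concrete, the sketch below computes a log odds ratio between a fit model and a fixed-slope model. Everything about the setup is an assumption made for illustration, not the authors' implementation: Gaussian noise with a known standard deviation `sigma`, a uniform prior on the slope over an arbitrary interval, equal prior odds for the two models, and a simple grid approximation of the marginal likelihood.

```python
import math


def log_likelihood(slope, xs, ys, sigma):
    """Gaussian log-likelihood of the line y = slope * x (known sigma)."""
    rss = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    norm = len(xs) * math.log(sigma * math.sqrt(2 * math.pi))
    return -0.5 * rss / sigma ** 2 - norm


def log_evidence_fit(xs, ys, sigma, lo=-2.0, hi=2.0, steps=4001):
    """Log marginal likelihood of y = alpha * x with alpha ~ Uniform(lo, hi).

    Averaging the likelihood over an even grid approximates its integral
    against the uniform prior density.
    """
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    logs = [log_likelihood(a, xs, ys, sigma) for a in grid]
    peak = max(logs)  # subtract the peak to avoid numerical underflow
    mean_like = sum(math.exp(v - peak) for v in logs) / steps
    return peak + math.log(mean_like)


def log_odds_fit_vs_fixed(xs, ys, gamma, sigma):
    """Natural log of the odds ratio: fit model versus fixed slope gamma."""
    return log_evidence_fit(xs, ys, sigma) - log_likelihood(gamma, xs, ys, sigma)
```

Under the threshold quoted above, a fixed-slope model would be rejected when this value exceeds `math.log(1e3)`. When the fixed slope already matches the data, the fit model pays for its extra parameter and the log odds ratio is negative.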

Epistasis can be predicted

Given our success in measuring epistasis coefficients, we wanted to know whether we could predict the epistasis coefficient between egl-9 and vhl-1 in the absence of the egl-9(lf) genotype. Since RHY-1 indirectly activates EGL-9, the rhy-1(lf) transcriptome should contain more or less equivalent information to the egl-9(lf) transcriptome. Therefore, we generated predictions of the epistasis coefficient between egl-9 and vhl-1 by substituting in the rhy-1(lf) data. We predicted s_{rhy-1,vhl-1} = −0.45. Similarly, we used the egl-9(lf);vhl-1(lf) double mutant to measure the epistasis coefficient while replacing the egl-9(lf) dataset with the rhy-1(lf) dataset. We found that the epistasis coefficient using this substitution was −0.40. This coefficient was different from −0.50 (OR > 10⁶²), reflecting the same qualitative conclusion that the hypoxia pathway is branched. Thus, we obtained a quantitatively close prediction of the epistasis coefficient for two mutants using the transcriptome of a related, upstream mutant, which shows that in the absence of a single mutant, an upstream locus can under some circumstances be used to estimate epistasis between two genes.
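The substitution can be sketched as a prediction made from single-mutant data alone. The sketch assumes that the double mutant phenocopies the upstream single mutant (the a⁻b⁻ = a⁻ case), so the predicted deviation from additivity is simply −β_b. As a simplification, the slope is estimated with an unweighted orthogonal fit through the origin. Function names and the dict layout are hypothetical.

```python
import math


def origin_odr_slope(xs, ys):
    """Unweighted orthogonal-distance slope for a line through the origin.
    Assumes sum(x * y) is non-zero."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)


def predicted_epistasis_slope(beta_upstream, beta_b):
    """Predict s when the double mutant is assumed to equal the upstream
    single mutant (for example rhy-1(lf) standing in for egl-9(lf)).

    Both arguments map isoform names to significant coefficients.
    """
    shared = sorted(set(beta_upstream) & set(beta_b))
    xs = [beta_upstream[g] + beta_b[g] for g in shared]
    # With beta_ab = beta_upstream, the deviation beta_ab - x is -beta_b.
    ys = [-beta_b[g] for g in shared]
    return origin_odr_slope(xs, ys)
```

Comparing this prediction with the slope measured from the actual double mutant is the check described above.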

Transcriptomic decorrelation can be used to infer functional distance

So far, we have shown that RNA-seq can accurately measure genetic interactions. However, genetic interactions are far removed from biochemical interactions: genetic interactions do not require two gene products to interact physically, nor even to be physically close to each other. RNA-seq cannot measure physical interactions between gene products, but we wondered whether expression profiling contains sufficient information to order genes along a pathway.

Single genes are often regulated by multiple independent sources. The connection between two nodes can in theory be characterized by the strength of the edges connecting them (the thickness of the edge); the sources that regulate both nodes (the fraction of inputs common to both nodes); and the genes that are regulated by both nodes (the fraction of outputs that are common to both nodes). In other words, we expected that expression profiles associated with a pathway would respond quantitatively to quantitative changes in activity of the pathway. Targeting a pathway at multiple points would lead to expression profile divergence as we compare nodes that are separated by more degrees of freedom, reflecting the flux in information between them.

We investigated the possibility that transcriptomic signals do in fact contain relevant information about the degrees of separation by weighting the robust Bayesian regression between each pair of genotypes by the size of the shared transcriptomic phenotype of each pair divided by the total number of isoforms differentially expressed in either mutant (N_Intersection/N_Union). We plotted the weighted correlation of each gene pair, ordered by increasing functional distance (see Fig. 7). In every case, we see that the weighted correlation decreases monotonically due mainly, but not exclusively, to a smaller STP. We believe that this result is not due to random noise or insufficiently deep sequencing. Instead, we propose a framework in which every gene is regulated by multiple different molecular species, which induces progressive decorrelation. This decorrelation in turn has two consequences. First, decorrelation within a pathway implies that two nodes may be almost independent of each other if the functional distance between them is large. Second, it may be possible to use decorrelation dynamics to infer gene order in a branching pathway, as we have done with the hypoxia pathway.
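
The weighting described above can be sketched as follows. The weight is the ratio N_Intersection/N_Union computed on the sets of differentially expressed isoforms; the isoform identifiers, the correlation value and the function names are hypothetical placeholders.

```python
def overlap_weight(de_a, de_b):
    """N_Intersection / N_Union for two sets of differentially expressed isoforms."""
    union = de_a | de_b
    return len(de_a & de_b) / len(union) if union else 0.0

def weighted_correlation(correlation, de_a, de_b):
    """Scale a pairwise regression coefficient by the fraction of shared isoforms."""
    return correlation * overlap_weight(de_a, de_b)

# Hypothetical isoform sets for two genotypes.
de_mutant_1 = {"iso1", "iso2", "iso3"}
de_mutant_2 = {"iso2", "iso3", "iso4"}

# A regression slope of 0.8 (placeholder) is down-weighted by 2 shared / 4 total.
print(weighted_correlation(0.8, de_mutant_1, de_mutant_2))
```

Under this weighting, two genotypes with a strong correlation but a small shared transcriptomic phenotype receive a low score, which is the behavior needed for the score to fall with functional distance.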

Figure 7.

Theoretically, transcriptomes can be used to order genes in a pathway under certain assumptions. Arrows in the diagrams above are intended to show the direction of flow, and do not indicate valence. A. A linear pathway in which rhy-1 is the only gene controlling egl-9, which in turn controls hif-1, does not contain information to infer the order between genes. B. If rhy-1 and egl-9 have transcriptomic effects that are separable from hif-1, then the rhy-1 transcriptome should contain contributions from egl-9, hif-1 and egl-9- and hif-1-independent pathways. This pathway contains enough information to infer order. C. If a pathway is branched both upstream and downstream, transcriptomes will show even faster decorrelation. Nodes that are separated by many edges may begin to behave almost independently of each other with marginal transcriptomic overlap or correlation. D. The hypoxia pathway can be ordered. We hypothesize the rapid decay in correlation is due to a mixture of upstream and downstream branching that happens along this pathway. Bars show the standard error of the weighted coefficient from the Markov chain Monte Carlo computations.

The circuit topology of the hypoxia pathway explains patterns in the data

We noticed that while some of the rank plots contained a clear positive correlation (see Fig. 3), other rank plots showed a discernible cross-pattern (see Fig. 8). In particular, this cross-pattern emerged between vhl-1(lf) and rhy-1(lf) or between vhl-1(lf) and egl-9(lf), even though genetically vhl-1, rhy-1 and egl-9 are all inhibitors of hif-1. Such cross-patterns could be indicative of feedback loops or other complex interaction patterns.

Figure 8.

A feedback loop can generate transcriptomes that are both correlated and anti-correlated. The vhl-1(lf)/rhy-1(lf) STP shows a cross-pattern. Green large points are inliers to the first regression. Red squares are outliers to the first regression. Only the red small points were used for the secondary regression. Blue lines are representative samples of the primary bootstrapped regression lines. Orange lines are representative samples of the secondary bootstrapped regression lines.

If the above is correct, then it should be possible to identify egl-9-independent, rhy-1-dependent target genes in a logically consistent way. One erroneous way to identify these targets is via subtractive logic. Using subtractive logic, we would identify genes that are differentially expressed in rhy-1(lf) mutants but not in egl-9(lf) mutants. Such a gene set would consist of almost 700 genes. One major drawback of subtractive logic is that it cannot be applied when feedback loops exist between the genes in question. Another problem is that the set of identified genes is statistically indistinguishable from false-positive and false-negative hits because its members have no distinguishing property beyond the condition that they should be differentially expressed in one mutant but not the other. In fact, this is exactly the behavior expected of false-positive or false-negative hits: presence in one, but not multiple, mutants. We need to consider the relationship between two genes before we can begin to identify targets whose expression is dependent on one gene and independent of the other.

rhy-1 and egl-9 share a well-defined relationship. RHY-1 inhibits CYSL-1, which in turn inhibits EGL-9³⁸. Therefore, loss of RHY-1 leads to inactivation of EGL-9, which leads to an increase in the cellular levels of HIF-1. HIF-1 in turn causes the mRNA levels of rhy-1 and egl-9 to increase, as they are involved in the hif-1-dependent hypoxia response. However, since rhy-1 has been mutated, the observed transcriptome is RHY-1 ‘null’; EGL-9 ‘null’; HIF-1 ‘on’. The situation is similar for egl-9(lf), except that RHY-1 is not inactive, and therefore the observed transcriptome is the result of RHY-1 ‘up’; EGL-9 ‘null’; and HIF-1 ‘on’.

From this pattern, we conclude that the egl-9(lf) and rhy-1(lf) transcriptomes should exhibit a cross-pattern when plotted against each other: The positive arm of the cross is the result of the EGL-9 ‘null’; HIF-1 ‘on’ dynamics; and the negative arm reflects the different direction of RHY-1 activity between transcriptomes. No negative arm is visible (with the exception of two outliers, which are annotated as pseudogenes in WormBase). Therefore, in this dataset we do not find genes that have egl-9 independent, rhy-1-dependent expression patterns.

We also identified a main hypoxia response induced by disinhibiting hif-1 (355 genes) by identifying genes that were commonly up-regulated amongst egl-9(lf), rhy-1(lf) and vhl-1(lf) mutants. Although the hypoxic response is likely to involve between three and seven times more genes (assuming the rhy-1(lf) transcriptome reflects the maximal hypoxic response), this is a conservative estimate that minimizes false positive results, since these changes were identified in four different genotypes with three replicates each. This response included five transcription factors (W02D7.6, nhr-57, ztf-18, nhr-135 and dmd-9). The full list of genes associated with the hypoxia response can be found in the Supplementary Table 1.
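
The selection of commonly up-regulated genes can be sketched as a set intersection. This is a minimal illustration only: the q-value threshold of 0.1, the gene names and the (β, q) values are assumptions, and the actual analysis operates on the full differential expression results.

```python
def upregulated(results, q_max=0.1):
    """Genes with a positive effect size and a q-value below q_max (assumed cutoff)."""
    return {gene for gene, (beta, q) in results.items() if beta > 0 and q < q_max}

def core_response(results_by_genotype, q_max=0.1):
    """Genes that are up-regulated in every genotype supplied."""
    gene_sets = [upregulated(r, q_max) for r in results_by_genotype.values()]
    return set.intersection(*gene_sets) if gene_sets else set()

# Hypothetical (beta, q) values; gene names are placeholders.
results_by_genotype = {
    "egl-9(lf)": {"geneA": (2.1, 0.001), "geneB": (1.0, 0.02), "geneC": (-1.5, 0.01)},
    "rhy-1(lf)": {"geneA": (1.8, 0.003), "geneB": (0.9, 0.30), "geneC": (-1.2, 0.02)},
    "vhl-1(lf)": {"geneA": (0.7, 0.040), "geneB": (1.1, 0.01), "geneC": (0.4, 0.05)},
}
print(sorted(core_response(results_by_genotype)))
```

Requiring a gene to pass the threshold in every genotype is what makes the estimate conservative: a gene missed in a single genotype is excluded even if it is a true target.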

hif-1-independent effects of egl-9 have been reported previously⁴⁰, which led us to question whether we could identify similar effects in our dataset. We have observed that hif-1(lf) displays a modest increase in the transcription of rhy-1, from which we speculated that EGL-9 would have increased activity in the hif-1(lf) mutant compared to the wild-type. Therefore, we searched for genes that were regulated in an opposite manner between hif-1(lf) and egl-9(lf) hif-1(lf), and that were regulated in the same direction between all egl-9(lf) genotypes. We did not find any genes that met these conditions.

We also searched for genes with hif-1-independent, vhl-1-dependent gene expression and found 45 genes, which can be found in the Supplementary Table 2. Finally, we searched for candidates directly regulated by hif-1. Initially, we searched for genes that were significantly altered in hif-1(lf) genotypes in one direction, but altered in the opposite direction in mutants that activate the HIF-1 response. Only two genes (R08E5.3 and nit-1) met these conditions. This could reflect the fact that HIF-1 exists at very low levels in C. elegans, so loss-of-function mutations in hif-1 might only have mild effects on its transcriptional targets. We reasoned that genes that are overexpressed in mutants that induce the HIF-1 response would be enriched for genes that are direct candidates. We found 195 genes which have consistently increased expression in mutants with a constitutive hypoxic response. These genes can be found in the Supplementary Table 3.

Enrichment analysis of the hypoxia response

To validate that our transcriptomes were correct, and to understand how functionalities may vary between them, we subjected each decoupled response to enrichment analysis using the WormBase Enrichment Suite⁴²,⁴³.

We used gene ontology enrichment analysis (GEA) on the main hypoxia response program. This showed that the terms ‘oxoacid metabolic process’ (q < 10⁻⁴, 3.0 fold-change, 24 genes), ‘iron ion binding’ (q < 10⁻², 3.8 fold-change, 10 genes), and ‘immune system process’ (q < 10⁻³, 2.9 fold-change, 20 genes) were significantly enriched. GEA also showed enrichment of the term ‘mitochondrion’ (q < 10⁻³, 2.5 fold-change, 29 genes) (see Fig. 9). Indeed, hif-1 has been implicated in all of these biological and molecular functions⁴⁴,⁴⁵,⁴⁶,⁴⁷. As a benchmark of the quality of our data, we selected a set of 22 genes known to be responsive to HIF-1 levels from the literature and asked whether these genes were present in our hypoxia response list. We found 8/22 known genes, which constitutes a statistically significant result (p < 10⁻¹⁰). The small number of reporters found in this list probably reflects the conservative nature of our estimates. We studied the hif-1-independent, vhl-1-dependent gene set using enrichment analysis but no terms were significantly enriched.
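
One common way to assess an overlap such as 8 of 22 known genes appearing in a 355-gene list is a hypergeometric tail probability. The sketch below is an illustration of that calculation, not necessarily the test used here. The background size of 20,000 genes is a hypothetical placeholder, and the resulting probability depends strongly on that choice, so it need not match the reported value.

```python
from math import comb

def hypergeom_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when n_draws genes are sampled without replacement from
    n_total genes, of which n_success belong to the response list."""
    upper = min(n_draws, n_success)
    favorable = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(n_total, n_draws)

# 8 of 22 literature genes found in a 355-gene response list;
# the background of 20,000 genes is an assumed placeholder.
p_value = hypergeom_tail(k=8, n_draws=22, n_success=355, n_total=20000)
print(f"{p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids the underflow that can affect floating-point evaluation of very small tail probabilities.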

Figure 9.

Gene ontology enrichment analysis of genes associated with the main hypoxia response. A number of terms reflecting catabolism and bioenergetics are enriched.

Identification of non-classical epistatic interactions

hif-1 has traditionally been viewed as existing in a genetic OFF state under normoxic conditions. However, our dataset indicates that 546 genes show altered expression when hif-1 function is removed in normoxic conditions. Moreover, we observed positive correlations between hif-1(lf) β coefficients and egl-9(lf), vhl-1(lf) and rhy-1(lf) β coefficients in spite of the negative regulatory relationships between these genes and hif-1. Such positive correlations could indicate a different relationship between these genes than has previously been reported, so we attempted to substantiate them through epistasis analyses.

To perform epistasis analyses, we first identified genes that exhibited violations of the canonical genetic model of the hypoxia pathway. To this end, we searched for genes that exhibited different behaviors between egl-9(lf) and vhl-1(lf), or between rhy-1(lf) and vhl-1(lf) (we assume that all results from the rhy-1(lf) transcriptome reflect a complete loss of egl-9 activity). We found 31 genes that satisfied this condition (see Fig. 10, Supplemental Table 4). Additionally, many of these genes exhibited a new kind of epistasis. Namely, egl-9 was epistatic over vhl-1. Identification of a set of genes that have a consistent set of relationships between themselves suggests that we have identified a new aspect of the hypoxia pathway.

Figure 10.

A. 27 genes in C. elegans exhibit non-classical epistasis in the hypoxia pathway, characterized by opposite effects on gene expression, relative to the wild-type, of the vhl-1(lf) compared to egl-9(lf) (or rhy-1(lf)) mutants. Shown are a random selection of 15 of the 27 genes for illustrative purposes. B. Representative genes showing that non-canonical epistasis shows a consistent pattern. vhl-1(lf) mutants have an opposite effect to egl-9(lf), but egl-9 remains epistatic to vhl-1 and loss-of-function mutations in hif-1 suppress the egl-9(lf) phenotype.

To illustrate this, we focused on three genes, nlp-31, ftn-1 and ftn-2, whose epistasis patterns we felt reflected the population well. ftn-1 and ftn-2 are both described in the literature as genes that are responsive to mutations in the hypoxia pathway. Moreover, these genes have been previously described to have aberrant behaviors⁴⁵,⁴⁶, specifically the opposite effects of egl-9(lf) and vhl-1(lf). These studies showed that loss of vhl-1 decreases expression of ftn-1 and ftn-2 using both RNAi and alleles, which allays concerns of strain-specific interference. Moreover, Ackerman and Gems (2012) showed that vhl-1 is epistatic to hif-1 for the ftn-1 expression phenotype, and that loss of HIF-1 is associated with increased expression of ftn-1 and ftn-2. We observed that hif-1 was epistatic to egl-9, and that egl-9 and hif-1 both promoted ftn-1 and ftn-2 expression.

Epistasis analysis of ftn-1 and ftn-2 expression reveals that egl-9 is epistatic to hif-1; that vhl-1 has opposite effects to egl-9, and that vhl-1 is epistatic to egl-9. Analysis of nlp-31 reveals similar relationships. nlp-31 expression is decreased in hif-1(lf), and increased in egl-9(lf). However, egl-9 is epistatic to hif-1. Like ftn-1 and ftn-2, vhl-1 has the opposite effect to egl-9, yet is epistatic to egl-9. We propose in the Discussion a model for how HIF-1 might regulate these targets.

HIF-1 in the cellular context

We identified the transcriptional changes associated with bioenergetic pathways in C. elegans by extracting from WormBase all genes associated with the tricarboxylic acid (TCA) cycle, the electron transport chain (ETC) and with the C. elegans GO term energy reserve. Previous research has described the effects of mitochondrial dysfunction in eliciting the hypoxia response⁴⁸, but transcriptional feedback from HIF-1 into bioenergetic pathways has not been studied as extensively in C. elegans as in vertebrates (see, for example, ³²,²⁸). We also searched for changes in ribosomal components and the proteasome, as well as for terms relating to immune response (see Fig. 11).

Figure 11.

A graphic summary of the genome-wide effects of HIF-1 from our RNA-seq data.

Bioenergetic pathways

Our data show that most of the enzymes involved in the TCA cycle and in the ETC are down-regulated when HIF-1 is induced, in agreement with the previous literature²⁸. However, the fumarase gene fum-1 and the mitochondrial complex II stood out as notable exceptions to the trend, as they were up-regulated in every single genotype that causes deployment of the hypoxia response. FUM-1 catalyzes the reaction of fumarate into malate, and complex II catalyzes the reaction of succinate into fumarate. Complex II has previously been identified as a source of reserve respiratory capacity in neonatal rat cardiomyocytes⁴⁹. We found two energy reserve genes that were down-regulated by HIF-1: aagr-1 and aagr-2, which are predicted to function in glycogen catabolism⁵⁰. Three distinct genes involved in energy reserve were up-regulated. These genes were ogt-1, which encodes an O-linked GlcNAc transferase; T04A8.7, encoding an ortholog of human glucosidase, acid beta (GBA); and T22F3.3, encoding an ortholog of human glycogen phosphorylase isozyme in the muscle (PYGM).

Protein synthesis and degradation

HIF-1 is also known to inhibit protein synthesis and translation in varied ways⁵¹. Most reported effects of HIF-1 on the translation machinery are posttranslational, and no reports to date show transcriptional control of the ribosomal machinery in C. elegans by HIF-1. We used the WormBase Enrichment Suite Gene Ontology dictionary⁴³ to extract 143 protein-coding genes annotated as ‘structural constituents of the ribosome’ and we queried whether they were differentially expressed in our mutants. egl-9(lf), vhl-1(lf), rhy-1(lf) and egl-9(lf);vhl-1(lf) showed differential expression of 91 distinct ribosomal constituents (not all constituents were detected in all genotypes). For every one of these genotypes, these genes were always down-regulated. In contrast, hif-1(lf) showed up-regulation of a single ribosomal constituent.

Next, we asked whether HIF-1 has any transcriptional effects on the proteasomal constituents; no such effects of HIF-1 on the proteasome have been reported in C. elegans. Out of 40 WormBase-annotated proteasomal constituents, we found 31 constituents that were differentially expressed in at least one of the four genotypes that induce a hypoxic response. Every gene we found was down-regulated in at least two out of the four genotypes we studied.

Discussion

The C. elegans hypoxia pathway can be reconstructed entirely from RNA-seq data

In this paper, we have shown that whole-organism transcriptomic phenotypes can be used to reconstruct genetic pathways and to discern previously overlooked or uncharacterized genetic interactions. We successfully reconstructed the hypoxia pathway, and inferred order of action (rhy-1 activates egl-9, egl-9 and vhl-1 inhibit hif-1), and we were able to infer from transcriptome-wide epistasis measurements that egl-9 exerts vhl-1-dependent and independent inhibition on hif-1.

HIF-1 and the cellular environment

In addition to reconstructing the pathway, our dataset allowed us to observe a wide variety of physiologic changes that occur as a result of the HIF-1-dependent hypoxia response. In particular, we observed down-regulation of most components of the TCA cycle and the mitochondrial electron transport chain with the exceptions of fum-1 and the mitochondrial complex II. The mitochondrial complex II catalyzes the reaction of succinate into fumarate. In mouse embryonic fibroblasts, fumarate has been shown to antagonize HIF-1 prolyl hydroxylase domain (PHD) enzymes, which are orthologs of EGL-9⁵². If the inhibitory role of fumarate on PHD enzymes is conserved in C. elegans, upregulation of complex II by HIF-1 during hypoxia may increase intracellular levels of fumarate, which in turn could lead to artificially high levels of HIF-1 even after normoxia resumes. The increase in fumarate produced by the complex could be compensated by increasing expression of fum-1. Increased fumarate degradation allows C. elegans to maintain plasticity in the hypoxia pathway, keeping the pathway sensitive to oxygen levels.

Interpretation of the non-classical epistasis in the hypoxia pathway

The observation of almost 30 genes that exhibit a specific pattern of non-classical epistasis suggests the existence of previously undescribed aspects of the hypoxia pathway. Some of these non-classical epistases had been observed previously⁴⁵,⁴⁶,⁴⁴, but no satisfactory mechanism has been proposed to explain this biology. The authors of refs. ⁴⁶ and ⁴⁵ suggest that HIF-1 integrates information on iron concentration in the cell to bind to the ftn-1 promoter, but could not definitively establish a mechanism. It is unclear why deletion of hif-1 induces ftn-1 expression, deletion of egl-9 also causes induction of ftn-1 expression, but deletion of vhl-1 removes this inhibition. Moreover, the authors of ref. ⁴⁴ have previously reported that certain genes important for the C. elegans immune response against pathogens reflect similar expression patterns. Their interpretation was that swan-1, which encodes a binding partner to EGL-9⁵³, is important for modulating HIF-1 activity in some manner. The lack of a conclusive double mutant analysis in this work means the role of SWAN-1 in modulation of HIF-1 activity remains to be demonstrated. Nevertheless, mechanisms that call for additional transcriptional modulators become less likely given the number of genes with different biological functions that exhibit the same pattern.

One way to resolve this problem without invoking additional genes is to consider HIF-1 as a protein with both activating and inhibiting states. In fact, HIF-1 already exists in two states in C. elegans: unmodified HIF-1 and HIF-1-hydroxyl (HIF-1-OH). Under this model, HIF-1-hydroxyl antagonizes the effects of HIF-1 for certain genes like ftn-1 or nlp-31. Loss of vhl-1 stabilizes HIF-1-hydroxyl. A subset of genes that are sensitive to HIF-1-hydroxyl will be inhibited as a result of the increase in the amount of this species, in spite of loss of vhl-1 function also increasing the level of non-hydroxylated HIF-1. On the other hand, egl-9(lf) selectively removes all HIF-1-hydroxyl, stimulating accumulation of HIF-1 and promoting gene activity. Whether deletion of hif-1 is overall activating or inhibiting will depend on the relative activity of each protein state under normoxia (see Fig. 12).

Figure 12.

A hypothetical model showing a mechanism where HIF-1-hydroxyl antagonizes HIF-1. A. Diagram showing that RHY-1 activates EGL-9. EGL-9 hydroxylates HIF-1 in an oxygen-dependent fashion. Under normoxia, HIF-1 is rapidly hydroxylated and only slowly does hydroxylated HIF-1 return to its original state. EGL-9 can also inhibit HIF-1 in an oxygen-independent fashion. HIF-1-hydroxyl is rapidly degraded in a VHL-1-dependent fashion. In our model, HIF-1 and HIF-1-hydroxyl have opposing effects on transcription. The width of the arrows represents the rates under normoxic conditions. B. Table showing the effects of loss-of-function mutations on HIF-1 and HIF-1-hydroxyl activity, showing how this can potentially explain the behavior of ftn-1 in each case. S.S. = steady state.

Multiple lines of circumstantial evidence suggest that HIF-1-hydroxyl plays a role in the functionality of the hypoxia pathway. First, HIF-1-hydroxyl is challenging to study genetically because no mimetic mutations are available with which to study the pure hydroxylated HIF-1 species. Second, although mutations in the von Hippel-Lindau gene stabilize the hydroxyl species, they also increase the quantity of non-hydroxylated HIF-1 by mass action. Finally, since HIF-1 is detected at low levels in cells under normoxic conditions⁵⁴, total HIF-1 protein (unmodified HIF-1 plus HIF-1-hydroxyl) is often tacitly assumed to be vanishingly rare and therefore biologically inactive.

Our data show hundreds of genes that change expression in response to loss of hif-1 under normoxic conditions. This establishes that there is sufficient total HIF-1 protein to be biologically active. Our analyses also revealed that hif-1(lf) shares positive correlations with egl-9(lf), rhy-1(lf) and vhl-1(lf), and that each of these genotypes also shows a secondary negative rank-ordered expression correlation with each other. These cross-patterns between all loss of function of inhibitors of HIF-1 and hif-1(lf) can be most easily explained if HIF-1-hydroxyl is biologically active.

A homeostatic argument can be made in favor of the activity of HIF-1-hydroxyl. At any point in time, the cell must measure the levels of multiple metabolites at once. The hif-1-dependent hypoxia response integrates information from O₂, α-ketoglutarate (2-oxoglutarate) and iron concentrations in the cell. One way to integrate this information is by encoding it only in the effective hydroxylation rate of HIF-1 by EGL-9. Then the dynamics in this system will evolve exclusively as a result of the total amount of HIF-1 in the cell. Such a system can be sensitive to fluctuations in the absolute concentration of HIF-1⁵⁵. Since the absolute levels of HIF-1 are low in normoxic conditions, small fluctuations in protein copy-number can represent a large fold-change in HIF-1 levels. These fluctuations would not be problematic for genes that must be turned on only under conditions of severe hypoxia; presumably, these genes would be associated with low-affinity sites for HIF-1, so that they are only activated when HIF-1 levels are far above random fluctuations.

For yet other sets of genes that must change expression in response to the hypoxia pathway, it may not make as much sense to integrate metabolite information exclusively via EGL-9-dependent hydroxylation of HIF-1. In particular, genes that may function to increase survival in mild hypoxia may benefit from regulatory mechanisms that can sense minor changes in environmental conditions and which therefore benefit from robustness to transient changes in protein copy number. Likewise, genes that are involved in iron or α-ketoglutarate metabolism (such as ftn-1) may benefit from being able to sense, accurately, small and consistent deviations from basal concentrations of these metabolites. For these genes, the information may be better encoded by using HIF-1 and HIF-1-hydroxyl as an activator/repressor pair. Such circuits are known to possess distinct advantages for controlling output in a manner that is robust to transient fluctuations in the levels of their components⁵⁶,⁵⁷.

Our RNA-seq data suggest that one of these atypical targets of HIF-1 may be RHY-1. Although rhy-1 does not exhibit non-classical epistasis, hif-1(lf) and egl-9(lf) hif-1(lf) both had increased expression levels of rhy-1. We speculate that if rhy-1 is controlled by both HIF-1 and HIF-1-hydroxyl, then this might imply that HIF-1 regulates the expression of its pathway (and therefore itself) in a manner that is robust to total HIF-1 levels.

Insights into genetic interactions from vectorial phenotypes

Here, we have described a set of straightforward methods that can in theory be applied to any vectorial phenotype. Genome-wide methods afford a lot of information, but genome-wide interpretation of the results is often extremely challenging. Each method has its own advantages and disadvantages. We briefly discuss these methods, their uses and their drawbacks.

Principal component analysis is computationally tractable and clusters can often be visually detected with ease. However, PCA can be misleading, especially when the dimensions represented do not explain a very large fraction of the variance present in the data. In addition, principal dimensions are the product of a linear combination of vectors, and therefore must be interpreted with extreme care. In this case, the first principal dimension separated genotypes that increase HIF-1 protein levels from those that decrease it, but this dimension is a mix of vectors of change in gene expression. Although PCA showed that there is information hidden in these genotypes, it was not enough by itself to provide biological insight.

Whereas PCA operates on all genotypes simultaneously, correlation analysis is a pairwise procedure that measures how predictable the gene expression changes are in a mutant given the vector of expression changes in another. Like PCA, correlation analysis is easy and fast to perform. Unlike PCA, the product of a correlation analysis is a single number with a straightforward interpretation. However, correlation analysis is particularly sensitive to outliers. Although a common strategy is to rank-transform expression data to mitigate outliers, rank-transformations do not remove the cross-patterns that appear when feedback loops or other complex interactions are present between two genes. Such cross-patterns can still lead to vanishing correlations if both patterns are equally strong. Therefore, correlation analyses must take into account the possible existence of systematic outliers. Moreover, correlation values must be measured for both interactions in cross-patterned rank plots. Weighted correlations could be informative for ordering genes along pathways. A drawback of correlation analysis is that the number of pairwise comparisons that must be made increases combinatorially, though strategies could be used to decrease the total number of effective comparisons.
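
The effect of a cross-pattern on rank correlation can be illustrated with synthetic data. In this sketch, half of the points lie on a positive arm (y = x) and half on a negative arm (y = −x). The data are constructed for illustration and contain no ties, and the small Spearman implementation is an assumption made to keep the example free of external libraries.

```python
def ranks(values):
    """Rank positions (0-based) for values without ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Synthetic cross-pattern: even x follow y = x, odd x follow y = -x.
x_all = [v for v in range(-50, 51) if v != 0]
y_all = [v if v % 2 == 0 else -v for v in x_all]

positive_arm = [(a, b) for a, b in zip(x_all, y_all) if a % 2 == 0]
negative_arm = [(a, b) for a, b in zip(x_all, y_all) if a % 2 != 0]

print("overall:", round(spearman(x_all, y_all), 3))
print("positive arm:", round(spearman(*zip(*positive_arm)), 3))
print("negative arm:", round(spearman(*zip(*negative_arm)), 3))
```

The overall coefficient is close to zero even though each arm is perfectly rank-correlated, which is why both interactions must be measured separately in cross-patterned rank plots.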

Epistasis plots are a novel way to visualize epistasis in vectorial phenotypes. Here, we have shown how an epistasis plot can be used to identify interactions between two single mutants and a double mutant. In reality, epistasis plots can be generated for any set of measurements involving a set of N mutants and an N-mutant genotype. Epistasis plots can accumulate an arbitrary number of points within them, possess a rich structure that can be visualized and have straightforward interpretations for special slope values.

Another way to analyze epistasis is via general linear models (GLMs) that include interaction terms between two or more genes. In this way, GLMs can quantify the epistatic effect of an interaction on single genes. We and others²²,²³ have previously used GLMs to identify gene sets that are epistatically regulated by two or more inputs. While powerful, GLMs suffer from the multiple comparison problem. Correcting for false positives using well-known multiple comparison corrections such as FDR⁵⁸ tends to increase false negative rates. Moreover, since GLMs attempt to estimate effect magnitudes for individual gene or isoform expression levels, they effectively treat each gene as an independent quantity, which prevents better estimation of the magnitude and direction of the epistasis between two genes.

Epistasis plots do not suffer from the multiple comparison problem because the number of tests performed is orders of magnitude smaller than the number of tests performed by GLMs. Ideally, in an epistasis plot we need only perform 3 tests (rejection of additive, unbranched and suppressive null models) compared with the tens of thousands of tests that are performed in GLMs. Moreover, the magnitude of epistasis between two genes can be estimated using hundreds of genes, which greatly improves the statistical resolution of the epistatic coefficient. This increased resolution is important because the size and magnitude of the epistasis has specific consequences for the type of pathway that is expected.
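
A minimal sketch of how a single epistasis coefficient can be estimated from many genes at once is given below. It assumes a particular plotting convention that is not restated in this section: the x-axis is the additive expectation (the sum of the two single-mutant effects) and the y-axis is the deviation of the double mutant from that expectation. The ordinary least-squares slope through the origin is a simplification, and the β values are hypothetical.

```python
def epistasis_coefficient(beta_a, beta_b, beta_ab):
    """Slope of the double mutant's deviation from additivity against the
    additive expectation, fit through the origin by least squares."""
    expected = [a + b for a, b in zip(beta_a, beta_b)]
    deviation = [ab - e for ab, e in zip(beta_ab, expected)]
    numerator = sum(e * d for e, d in zip(expected, deviation))
    return numerator / sum(e * e for e in expected)

# Hypothetical single-mutant effect sizes for four genes.
beta_a = [1.0, -2.0, 0.5, 3.0]
beta_b = [1.0, -2.0, 0.5, 3.0]

# Under the assumed convention, three idealized double mutants:
additive = [a + b for a, b in zip(beta_a, beta_b)]  # effects add
unbranched = list(beta_a)                           # double equals either single
suppressed = [0.0 for _ in beta_a]                  # double resembles wild-type

for name, beta_ab in [("additive", additive), ("unbranched", unbranched),
                      ("suppression", suppressed)]:
    print(name, epistasis_coefficient(beta_a, beta_b, beta_ab))
```

Because every gene contributes one point to the same fit, the slope is estimated from the whole shared transcriptomic phenotype rather than gene by gene, which is the source of the improved resolution discussed above.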

Any quantitative use of genome-wide datasets requires a good experimental setup. Here, we have demonstrated that whole-organism RNA-seq can be used to dissect molecular pathways in exquisite detail when paired with experimental designs that are motivated by classical genetics. Much more research will be necessary to understand whether epistasis has different consequences in the microscopic realm of transcriptional phenotypes than in the macroscopic world that geneticists have explored previously. Our hope is that these tools, coupled with the classic genetics experimental designs, will reveal hitherto unknown aspects of genetics theory.

Methods

Nematode strains and culture

Strains used were N2 wild-type Bristol, CB5602 vhl-1(ok161), CB6088 egl-9(sa307) hif-1(ia4), CB6116 egl-9(sa307);vhl-1(ok161), JT307 egl-9(sa307), ZG31 hif-1(ia4), RB1297 rhy-1(ok1402). All lines were grown on standard nematode growth media (NGM) plates seeded with OP50 E. coli at 20°C (Brenner 1974).

RNA Isolation

Unsynchronized lines were grown on NGM plates at 20°C and eggs harvested by sodium hypochlorite treatment. Eggs were plated on 6 to 9 6 cm NGM plates with ample OP50 E. coli to avoid starvation and grown at 20°C. Worms were staged and harvested based on the time after plating, vulva morphology and the absence of eggs. Approximately 30–50 non-gravid young adults were picked and placed in 100 μL of TE pH 8.0 at 4°C in 0.2 mL PCR tubes. After settling and a brief spin in a microcentrifuge, approximately 80 μL of TE (Ambion AM 9849) was removed from the top of the sample and individual replicates were snap frozen in liquid N₂. These replicate samples were then digested with Proteinase K (Roche Lot No. 03115 838001 Recombinant Proteinase K PCR Grade) for 15 min at 60°C in the presence of 1% SDS and 1.25 μL RNA Secure (Ambion AM 7005). RNA samples were then taken up in 5 volumes of Trizol (Tri Reagent Zymo Research) and processed and treated with DNase I using Zymo MicroPrep RNA Kit (Zymo Research Quick-RNA MicroPrep R1050). RNA was eluted in RNase-free water, divided into aliquots and stored at −80°C. One aliquot of each replicate was analyzed using a NanoDrop (Thermo Fisher) for impurities, Qubit for concentration and then analyzed on an Agilent 2100 BioAnalyzer (Agilent Technologies). Replicates were selected that had RNA integrity numbers (RIN) equal to or greater than 9.0 and showed no evidence of bacterial ribosomal bands, except for the ZG31 mutant where one of three replicates had a RIN of 8.3.

Library Preparation and Sequencing

10ng of quality checked total RNA from each sample was reverse-transcribed into cDNA using the Clontech SMARTer Ultra Low Input RNA for Sequencing v3 kit (catalog #634848) in the SMARTSeq2 protocol ⁵⁹. RNA was denatured at 70°C for 3 minutes in the presence of dNTPs, oligo dT primer and spiked-in quantitation standards (NIST/ERCC from Ambion, catalog #4456740). After chilling to 4°C, the first-strand reaction was assembled using the LNA TSO primer described in ⁵⁹, and run at 42°C for 90 minutes, followed by denaturation at 70°C for 10 minutes. The entire first strand reaction was then used as template for 13 cycles of PCR using the Clontech v3 kit. Reactions were cleaned up with 1.8X volume of Ampure XP SPRI beads (catalog #A63880) according to the manufacturer’s protocol. After quantification using the Qubit High Sensitivity DNA assay, a 3ng aliquot of the amplified cDNA was run on the Agilent HS DNA chip to confirm the length distribution of the amplified fragments. The median value for the average cDNA lengths from all length distributions was 1076bp. Tagmentation of the full length cDNA for sequencing was performed using the Illumina/Nextera DNA library prep kit (catalog #FC-121–1030). Following Qubit quantitation and Agilent BioAnalyzer profiling, the tagmented libraries were sequenced. Libraries were sequenced on Illumina HiSeq2500 in single read mode with the read length of 50nt to an average depth of 15 million reads per sample following manufacturer’s instructions. Base calls were performed with RTA 1.13.48.0 followed by conversion to FASTQ with bcl2fastq 1.8.4. Spearman correlation of the transcripts per million (TPM) for each genotype showed that every pairwise correlation within genotype was > 0.9.

Read Alignment and Differential Expression Analysis

We used Kallisto to perform read pseudo-alignment and performed differential analysis using Sleuth. We fit a general linear model for a transcript t in sample i:

$$y_{t,i} = \beta_{t,0} + \beta_{t,genotype} \cdot X_{t,i} + \beta_{t,batch} \cdot Y_{t,i} + \epsilon_{t,i}$$

where y_{t,i} are the logarithm-transformed counts; β_{t,0} is the intercept; β_{t,genotype} and β_{t,batch} are parameters of the model, which can be interpreted as biased estimators of the log-fold change; X_{t,i} and Y_{t,i} are indicator variables describing the conditions of the sample; and ε_{t,i} is the noise associated with a particular measurement.
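The structure of this linear model can be illustrated with a minimal sketch for a single transcript. This is not Sleuth, which is an R package and also uses bootstraps to account for quantification uncertainty. It is a plain ordinary-least-squares fit, and the function names, the 0.5 pseudocount and the toy design are assumptions made for illustration.

```python
import math


def solve(a, b):
    """Solve the square linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_transcript(counts, genotype, batch, pseudocount=0.5):
    """Least-squares estimates [intercept, beta_genotype, beta_batch].

    counts   : estimated counts of one transcript, one value per sample
    genotype : 0/1 indicator per sample (0 = wild type, 1 = mutant)
    batch    : 0/1 indicator per sample
    The pseudocount is an assumption of this sketch.
    """
    y = [math.log(c + pseudocount) for c in counts]
    design = [[1.0, float(g), float(b)] for g, b in zip(genotype, batch)]
    k = 3
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)
```

With four samples covering both genotypes and both batches, `fit_transcript(counts, [0, 0, 1, 1], [0, 1, 0, 1])` returns the intercept followed by the genotype and batch coefficients on the natural-log scale.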

Genetic Analysis, Overview

Genetic analysis of the processed data was performed in Python 3.5. Our scripts made extensive use of the Pandas, Matplotlib, Scipy, Seaborn, Sklearn, Networkx, Bokeh, PyMC3, and TEA libraries^{60,61,62,63,64,65,66,42,67}. Our analysis is available in a Jupyter Notebook⁶⁸. All code and required data (except the raw reads) are available at https://github.com/WormLabCaltech/mprsq along with version-control information. Our Jupyter Notebook and interactive graphs for this project can be found at https://wormlabcaltech.github.io/mprsq/. Raw reads were deposited in the Short Read Archive under the study accession number SRP100886.

Weighted Correlations

Pairwise correlations between transcriptomes were calculated by first identifying the set of differentially expressed genes (DEGs) common to both transcriptomes under analysis. DEGs were then rank-ordered according to their regression coefficient, β. Bayesian robust regressions were performed using a Student-T distribution. Bayesian analysis was performed using the PyMC3 library⁶⁴ (pm.glm.families.StudentT in Python). If the correlation had an average value > 1, the correlation coefficient was set to 1.

Weights were calculated as the proportion of genes that were < 1.5 standard deviations away from the primary regression out of the entire set of shared DEGs for each transcriptome.
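The steps above (shared DEGs, rank-ordering, regression, weighting) can be sketched as follows. This sketch swaps the Bayesian Student-T regression for an ordinary least-squares fit of the ranks, and it assumes that "standard deviations" refers to the standard deviation of the residuals around the fitted line. The function names and the dictionary input format are hypothetical.

```python
import statistics


def rank(values):
    """Return the 0-based rank of each value (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks


def weighted_rank_correlation(beta_a, beta_b, cutoff=1.5):
    """Slope of the rank-rank regression and the inlier weight.

    beta_a, beta_b : dicts mapping gene name -> regression coefficient for
                     the DEGs of each transcriptome (hypothetical format).
    """
    shared = sorted(set(beta_a) & set(beta_b))  # DEGs common to both
    n = len(shared)
    ra = rank([beta_a[g] for g in shared])
    rb = rank([beta_b[g] for g in shared])

    # Ordinary least squares on the ranks (stand-in for the robust fit).
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    sxx = sum((a - mean_a) ** 2 for a in ra)
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(ra, rb))
    slope = sxy / sxx
    intercept = mean_b - slope * mean_a

    # Weight: fraction of shared DEGs within `cutoff` residual SDs of the line.
    residuals = [b - (intercept + slope * a) for a, b in zip(ra, rb)]
    sd = statistics.pstdev(residuals)
    if sd == 0:
        weight = 1.0  # every gene lies exactly on the line
    else:
        weight = sum(abs(r) < cutoff * sd for r in residuals) / n
    return slope, weight
```

An ordinary least-squares fit is more sensitive to outliers than the robust regression described in the text, so the weights from this sketch should be read as illustrative only.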

Epistasis Analysis

For a double mutant X⁻Y⁻, we used the single mutants X⁻ and Y⁻ to find the expected value of the coefficient for a double mutant under an additive model for each isoform i. Specifically,

$$\beta_{Add,i} = \beta_{X,i} + \beta_{Y,i}.$$

Next, we found the difference, Δ_i, between the observed double mutant expression coefficient, β_{XY,obs,i}, and the predicted expression coefficient under an additive model for each isoform i:

$$\Delta_i = \beta_{XY,obs,i} - \beta_{Add,i}.$$

To calculate the transcriptome-wide epistasis coefficient, we plotted the points (β_{Add,i}, Δ_i) and found the line of best fit by orthogonal distance regression with the scipy.odr package in Python. We performed non-parametric bootstrap sampling of the ordered tuples with replacement using 5,000 iterations to generate a probability distribution of slopes of best fit.
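A minimal sketch of this calculation follows. Instead of scipy.odr, it uses the closed-form solution for an unweighted orthogonal regression line constrained to pass through the origin. Both the zero intercept and the absence of error weights are assumptions of the sketch, and the function names are hypothetical.

```python
import math
import random


def odr_slope(xs, ys):
    """Slope of the through-origin line that minimizes perpendicular distances."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope is undefined when sum(x * y) is zero")
    # Root of sxy*a^2 + (sxx - syy)*a - sxy = 0 whose sign matches sxy.
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)


def epistasis_coefficient(beta_x, beta_y, beta_xy, n_boot=5000, seed=0):
    """Return the best-fit slope and a list of bootstrapped slopes.

    beta_x, beta_y : coefficients of the two single mutants, per isoform
    beta_xy        : observed coefficients of the double mutant, per isoform
    """
    beta_add = [bx + by for bx, by in zip(beta_x, beta_y)]  # additive model
    delta = [obs - add for obs, add in zip(beta_xy, beta_add)]
    points = list(zip(beta_add, delta))

    rng = random.Random(seed)
    boot_slopes = []
    for _ in range(n_boot):
        # Resample the (beta_add, delta) tuples with replacement.
        sample = rng.choices(points, k=len(points))
        xs, ys = zip(*sample)
        boot_slopes.append(odr_slope(xs, ys))
    return odr_slope(beta_add, delta), boot_slopes
```

For example, if the two single mutants have identical coefficients and the double mutant matches them, every point satisfies Δ_i = −β_{Add,i}/2 and the sketch returns a slope of −0.5.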

There are as many models as epistatic relationships. For quantitative phenotypes, epistatic relationships (except synthetic interactions) can be generally expressed as:

$$P_{XY} = \sum_{g \in G} \lambda_g P_g,$$

where P_i is the quantitative phenotype belonging to the genotype i; G is the set of single mutants {X, Y} that make up the double mutant, XY; and λ_g is the contribution of the phenotype P_g to P_{XY}. Additive interactions between genes are the result of setting λ_g = 1 for every g in G. All other relationships correspond to setting λ_X = 0, λ_Y = 1 or λ_X = 1, λ_Y = 0.

A given epistatic interaction can be simulated by predicting the double mutant phenotype under that interaction and re-calculating the y-coordinates. The recalculated y-coordinates can then be used to predict the possible epistasis coefficients for the cases where X is epistatic over Y, and Y is epistatic over X.
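The recalculation of the y-coordinates can be sketched as below, assuming the phenotypes are the per-isoform regression coefficients of the single mutants. The function name and argument names are hypothetical.

```python
def predicted_delta(beta_x, beta_y, lambda_x, lambda_y):
    """Predicted y-coordinates under one epistatic model.

    The double mutant is predicted as lambda_x * beta_x + lambda_y * beta_y,
    and the y-coordinate is that prediction minus the additive expectation.
    """
    deltas = []
    for bx, by in zip(beta_x, beta_y):
        predicted_double = lambda_x * bx + lambda_y * by
        additive = bx + by
        deltas.append(predicted_double - additive)
    return deltas
```

Setting both weights to 1 gives y-coordinates of zero (the additive model). Setting `lambda_x=1, lambda_y=0` models X epistatic over Y and gives y-coordinates equal to −β_Y, while the reverse choice gives −β_X. Fitting a line to each set of simulated points then yields the epistasis coefficient expected under that model.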

To select between theoretical models, we implemented an approximate Bayesian Odds Ratio. We defined a free-fit model, M₁, that found the line of best fit for the data:

$$P(\alpha \mid D, M_1) \propto \prod_{(x_i, y_i, \sigma_i) \in D} \exp\left[-\frac{(y_i - \alpha \cdot x_i)^2}{2\sigma_i^2}\right] \cdot (1 + \alpha^2)^{-3/2} \qquad (4)$$

where α is the slope of the model to be determined, x_i and y_i were the x- and y-coordinates of each point respectively, and σ_i was the standard error associated with the y-value. We minimized the negative logarithm of equation 4 to obtain the most likely slope given the data, D (scipy.optimize.minimize in Python). Finally, we approximated the odds ratio as:

$$OR = \frac{P(D \mid \alpha^*, M_1) \cdot (2\pi)^{1/2} \sigma_{\alpha^*}}{P(D \mid M_i)}$$

where α^* is the slope found after minimization, σ_{α^*} is the standard deviation of the parameter at the point α^* and P(D | M_i) is the probability of the data given the parameter-free model, M_i.
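The odds-ratio approximation can be illustrated with a simplified sketch. It replaces the numerical minimization (scipy.optimize.minimize) with a closed-form weighted least-squares solution, which is valid only under the assumption of a flat prior on the slope. It also assumes Gaussian errors on the y-values and works with log odds for numerical stability. The function names are hypothetical.

```python
import math


def log_likelihood(alpha, xs, ys, sigmas):
    """Gaussian log-likelihood of the line y = alpha * x."""
    return sum(
        -0.5 * ((y - alpha * x) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
        for x, y, s in zip(xs, ys, sigmas)
    )


def log_odds_ratio(xs, ys, sigmas, alpha_model):
    """Approximate log odds of the free-fit model over a fixed-slope model.

    alpha_model is the slope predicted by a parameter-free model.
    Returns (log odds ratio, best-fit slope alpha_star).
    """
    # Closed-form maximum and curvature under a flat prior (assumption).
    sxx = sum((x / s) ** 2 for x, s in zip(xs, sigmas))
    sxy = sum(x * y / s ** 2 for x, y, s in zip(xs, ys, sigmas))
    alpha_star = sxy / sxx
    sigma_alpha = 1.0 / math.sqrt(sxx)

    # Evidence of the free-fit model: peak likelihood times the
    # Gaussian width of the peak, sqrt(2 * pi) * sigma_alpha.
    log_evidence_free = (
        log_likelihood(alpha_star, xs, ys, sigmas)
        + 0.5 * math.log(2 * math.pi)
        + math.log(sigma_alpha)
    )
    log_evidence_fixed = log_likelihood(alpha_model, xs, ys, sigmas)
    return log_evidence_free - log_evidence_fixed, alpha_star
```

A positive log odds ratio favors the free-fit model. A negative value means the parameter-free model explains the data about as well, so the extra parameter of the free fit is penalized by the width term.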

Enrichment Analysis

Tissue, Phenotype and Gene Ontology Enrichment Analysis were carried out using the WormBase Enrichment Suite for Python^{43,42}.

Acknowledgments

This work was supported by HHMI, with whom PWS is an investigator, and by the Millard and Muriel Jacobs Genetics and Genomics Laboratory at California Institute of Technology. All strains were provided by the CGC, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440). This article was written with support of the Howard Hughes Medical Institute. This article would not have been possible without help from Dr. Igor Antoshechkin, who performed all sequencing. We thank Hillel Schwartz for all of his careful advice. We would like to thank Jonathan Liu, Han Wang, and Porfirio Quintero for helpful discussion.

Footnotes

¹ Specifically, this follows from assuming that b⁻ is wild-type under the conditions assayed; and a⁻b⁻ = b⁻ = wild-type.

References

1. Huang, L. S. & Sternberg, P. W. Genetic dissection of developmental pathways. WormBook : the online review of C. elegans biology 1–19 (2006).
2. Phillips, P. C. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet 9, 855–867 (2008).
3. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008).
4. Metzker, M. L. Sequencing technologies - the next generation. Nature reviews. Genetics 11, 31–46 (2010).
5. Patro, R., Mount, S. M. & Kingsford, C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nature biotechnology 32, 462–464 (2014).
6. Bray, N. L., Pimentel, H. J., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nature biotechnology 34, 525–7 (2016).
7. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides accurate, fast, and bias-aware transcript expression estimates using dual-phase inference. bioRxiv 021592 (2016).
8. Pimentel, H. J., Bray, N. L., Puente, S., Melsted, P. & Pachter, L. Differential analysis of RNA-Seq incorporating quantification uncertainty. bioRxiv 058164 (2016).
9. Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology 31, 46–53 (2013).
10. Singer, M. et al. A Distinct Gene Module for Dysfunction Uncoupled from Activation in Tumor-Infiltrating T Cells. Cell 166, 1500–1511.e9 (2016).
11. Shalek, A. K. et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–40 (2013).
12. Schwarz, E. M., Kato, M. & Sternberg, P. W. Functional transcriptomics of a migrating cell in Caenorhabditis elegans. Proceedings of the National Academy of Sciences of the United States of America 109, 16246–51 (2012).
13. Van Wolfswinkel, J. C., Wagner, D. E. & Reddien, P. W. Single-cell analysis reveals functionally distinct classes within the planarian stem cell compartment. Cell Stem Cell 15, 326–339 (2014).
14. Scimone, M. L., Kravarik, K. M., Lapan, S. W. & Reddien, P. W. Neoblast specialization in regeneration of the planarian Schmidtea mediterranea. Stem Cell Reports 3, 339–352 (2014).
15. Hughes, T. R. et al. Functional Discovery via a Compendium of Expression Profiles. Cell 102, 109–126 (2000).
16. Van Driessche, N. et al. Epistasis analysis with global transcriptional phenotypes. Nature Genetics 37, 471–477 (2005).
17. Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic Dissection of Transcriptional Regulation in Budding Yeast. Science 296 (2002).
18. Schadt, E. E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).
19. Li, Y. et al. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans. PLoS Genetics 2, e222 (2006).
20. King, E. G., Sanderson, B. J., McNeil, C. L., Long, A. D. & Macdonald, S. J. Genetic Dissection of the Drosophila melanogaster Female Head Transcriptome Reveals Widespread Allelic Heterogeneity. PLoS Genetics 10, e1004322 (2014).
21. Adamson, B. et al. A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. Cell 167, 1867–1882.e21 (2016).
22. Dixit, A. et al. Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens. Cell 167, 1853–1866.e17 (2016).
23. Angeles-Albores, D. et al. Transcriptomic Description of an Endogenous Female State in C. elegans. bioRxiv (2016).
24. Epstein, A. C. R. et al. C. elegans EGL-9 and mammalian homologs define a family of dioxygenases that regulate HIF by prolyl hydroxylation. Cell 107, 43–54 (2001).
25. Shen, C., Shao, Z. & Powell-Coffman, J. A. The Caenorhabditis elegans rhy-1 Gene Inhibits HIF-1 Hypoxia-Inducible Factor Activity in a Negative Feedback Loop That Does Not Include vhl-1. Genetics 174, 1205–1214 (2006).
26. Shao, Z., Zhang, Y. & Powell-Coffman, J. A. Two Distinct Roles for EGL-9 in the Regulation of HIF-1-mediated gene expression in Caenorhabditis elegans. Genetics 183, 821–829 (2009).
27. Jiang, H., Guo, R. & Powell-Coffman, J. A. The Caenorhabditis elegans hif-1 gene encodes a bHLH-PAS protein that is required for adaptation to hypoxia. Proceedings of the National Academy of Sciences of the United States of America 98, 7916–7921 (2001).
28. Semenza, G. L. Hypoxia-inducible factors in physiology and medicine. Cell 148, 399–408 (2012).
29. Loenarz, C. et al. The hypoxia-inducible transcription factor pathway regulates oxygen sensing in the simplest animal, Trichoplax adhaerens. EMBO reports 12, 63–70 (2011).
30. Jiang, B. H., Rue, E., Wang, G. L., Roe, R. & Semenza, G. L. Dimerization, DNA binding, and transactivation properties of hypoxia-inducible factor 1. The Journal of biological chemistry 271, 17771–17778 (1996).
31. Powell-Coffman, J. A., Bradfield, C. A. & Wood, W. B. Caenorhabditis elegans Orthologs of the Aryl Hydrocarbon Receptor and Its Heterodimerization Partner the Aryl Hydrocarbon Receptor Nuclear Translocator. Proceedings of the National Academy of Sciences 95, 2844–2849 (1998).
32. Semenza, G. L., Roth, P. H., Fang, H. M. & Wang, G. L. Transcriptional regulation of genes encoding glycolytic enzymes by hypoxia-inducible factor 1. The Journal of Biological Chemistry 269, 23757–63 (1994).
33. Bishop, T. et al. Genetic Analysis of Pathways Regulated by the von Hippel-Lindau Tumor Suppressor in Caenorhabditis elegans. PLoS Biology 2 (2004).
34. Shen, C., Nettleton, D., Jiang, M., Kim, S. K. & Powell-Coffman, J. A. Roles of the HIF-1 hypoxia-inducible factor during hypoxia response in Caenorhabditis elegans. Journal of Biological Chemistry 280, 20580–20588 (2005).
35. Bellier, A., Chen, C. S., Kao, C. Y., Cinar, H. N. & Aroian, R. V. Hypoxia and the hypoxic response pathway protect against pore-forming toxins in C. elegans. PLoS Pathogens 5 (2009).
36. Huang, L. E., Arany, Z., Livingston, D. M. & Franklin Bunn, H. Activation of hypoxia-inducible transcription factor depends primarily upon redox-sensitive stabilization of its α subunit. Journal of Biological Chemistry 271, 32253–32259 (1996).
37. Kaelin, W. G. & Ratcliffe, P. J. Oxygen Sensing by Metazoans: The Central Role of the HIF Hydroxylase Pathway. Molecular Cell 30, 393–402 (2008).
38. Ma, D. K., Vozdek, R., Bhatla, N. & Horvitz, H. R. CYSL-1 Interacts with the O₂-Sensing Hydroxylase EGL-9 to Promote H₂S-Modulated Hypoxia-Induced Behavioral Plasticity in C. elegans. Neuron 73, 925–940 (2012).
39. Yeung, K. Y. & Ruzzo, W. L. Principal component analysis for clustering gene expression data. Bioinformatics (Oxford, England) 17, 763–774 (2001).
40. Park, E. C. et al. Hypoxia regulates glutamate receptor trafficking through an HIF-independent mechanism. The EMBO Journal 31, 1618–1619 (2012).
41. Powell-Coffman, J. A. Hypoxia signaling and resistance in C. elegans. Trends in Endocrinology and Metabolism 21, 435–440 (2010).
42. Angeles-Albores, D., Lee, R. Y. N., Chan, J. & Sternberg, P. W. Tissue enrichment analysis for C. elegans genomics. BMC Bioinformatics 17, 366 (2016).
43. Angeles-Albores, D., Lee, R. Y. N., Chan, J. & Sternberg, P. W. Phenotype and gene ontology enrichment as guides for disease modeling in C. elegans. bioRxiv (2017).
44. Luhachack, L. G. et al. EGL-9 Controls C. elegans Host Defense Specificity through Prolyl Hydroxylation-Dependent and -Independent HIF-1 Pathways. PLoS Pathogens 8, 48 (2012).
45. Ackerman, D. & Gems, D. Insulin/IGF-1 and hypoxia signaling act in concert to regulate iron homeostasis in Caenorhabditis elegans. PLoS Genetics 8 (2012).
46. Romney, S. J., Newman, B. S., Thacker, C. & Leibold, E. A. HIF-1 regulates iron homeostasis in Caenorhabditis elegans by activation and inhibition of genes involved in iron uptake and storage. PLoS Genetics 7 (2011).
47. Semenza, G. L. Hypoxia-inducible factor 1: Regulator of mitochondrial metabolism and mediator of ischemic preconditioning. Biochimica et Biophysica Acta - Molecular Cell Research 1813, 1263–1268 (2011).
48. Lee, S. J., Hwang, A. B. & Kenyon, C. Inhibition of respiration extends C. elegans life span via reactive oxygen species that increase HIF-1 activity. Current Biology 20, 2131–2136 (2010).
49. Pfleger, J., He, M. & Abdellatif, M. Mitochondrial complex II is a source of the reserve respiratory capacity that is regulated by metabolic sensors and promotes cell survival. Cell Death and Disease 6, 1–14 (2015).
50. Sikora, J. et al. Bioinformatic and biochemical studies point to AAGR-1 as the ortholog of human acid α-glucosidase in Caenorhabditis elegans. Molecular and Cellular Biochemistry 341, 51–63 (2010).
51. Brugarolas, J. et al. Regulation of mTOR function in response to hypoxia by REDD1 and the TSC1/TSC2 tumor suppressor complex. Genes and Development 18, 1–12 (2004).
52. Sudarshan, S. et al. Fumarate hydratase deficiency in renal cancer induces glycolytic addiction and hypoxia-inducible transcription factor 1α stabilization by glucose-dependent generation of reactive oxygen species. Mol Cell Biol 29, 4080–4090 (2009).
53. Shao, Z., Zhang, Y., Ye, Q., Saldanha, J. N. & Powell-Coffman, J. A. C. elegans SWAN-1 binds to EGL-9 and regulates HIF-1-mediated resistance to the bacterial pathogen Pseudomonas aeruginosa PAO1. PLoS Pathogens 6, 91–92 (2010).
54. Wang, G. L. & Semenza, G. L. Characterization of hypoxia-inducible factor 1 and regulation of DNA binding activity by hypoxia. The Journal of Biological Chemistry 268, 21513–8 (1993).
55. Goentoro, L., Shoval, O., Kirschner, M. W. & Alon, U. The Incoherent Feedforward Loop Can Provide Fold-Change Detection in Gene Regulation. Molecular Cell 36, 894–899 (2009).
56. Hart, Y., Antebi, Y. E., Mayo, A. E., Friedman, N. & Alon, U. Design principles of cell circuits with paradoxical components. Proceedings of the National Academy of Sciences of the United States of America 109, 8346–8351 (2012).
57. Hart, Y. & Alon, U. The Utility of Paradoxical Components in Biological Circuits. Molecular Cell 49, 213–221 (2013).
58. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America 100, 9440–5 (2003).
59. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nature protocols 9, 171–81 (2014).
60. Bokeh Development Team. Bokeh: Python library for interactive visualization (2014).
61. McKinney, W. pandas: a Foundational Python Library for Data Analysis and Statistics. Python for High Performance and Scientific Computing 1–9 (2011).
62. Oliphant, T. E. SciPy: Open source scientific tools for Python. Computing in Science and Engineering 9, 10–20 (2007).
63. Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2012).
64. Salvatier, J., Wiecki, T. & Fonnesbeck, C. Probabilistic Programming in Python using PyMC. PeerJ Computer Science 2, 1–24 (2015).
65. Van Der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: A structure for efficient numerical computation. Computing in Science and Engineering 13, 22–30 (2011).
66. Hunter, J. D. Matplotlib: A 2D graphics environment. Computing in Science and Engineering 9, 99–104 (2007).
67. Waskom, M. et al. seaborn: v0.7.0 (January 2016) (2016).
68. Pérez, F. & Granger, B. IPython: A System for Interactive Scientific Computing Python: An Open and General-Purpose Environment. Computing in Science and Engineering 9, 21–29 (2007).

Posted March 02, 2017.

Download PDF

Supplementary Material

Citation Tools

Subject Area

Genetics

Subject Areas

All Articles

Animal Behavior and Cognition (5201)
Biochemistry (11718)
Bioengineering (8724)
Bioinformatics (29132)
Biophysics (14936)
Cancer Biology (12051)
Cell Biology (17360)
Clinical Trials (138)
Developmental Biology (9406)
Ecology (14146)
Epidemiology (2067)
Evolutionary Biology (18269)
Genetics (12223)
Genomics (16768)
Immunology (11844)
Microbiology (28016)
Molecular Biology (11560)
Neuroscience (60822)
Paleontology (450)
Pathology (1864)
Pharmacology and Toxicology (3231)
Physiology (4940)
Plant Biology (10401)
Scientific Communication and Education (1680)
Synthetic Biology (2878)
Systems Biology (7333)
Zoology (1642)

[1] 1.↵
Huang, L. S. & Sternberg, P. W. Genetic dissection of developmental pathways. WormBook : the online review of C. elegans biology 1–19 (2006).

[2] 2.↵
Phillips, P. C. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet 9, 855–867 (2008).
OpenUrl CrossRef PubMed Web of Science

[3] 3.↵
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621–628 (2008).
OpenUrl CrossRef PubMed

[4] 4.↵
Metzker, M. L. Sequencing technologies - the next generation. Nature reviews. Genetics 11, 31–46 (2010).
OpenUrl CrossRef PubMed Web of Science

[5] 5.↵
Patro, R., Mount, S. M. & Kingsford, C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nature biotechnology 32, 462–464 (2014).
OpenUrl CrossRef PubMed

[6] 6.↵
Bray, N. L., Pimentel, H. J., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nature biotechnology 34, 525–7 (2016).
OpenUrl CrossRef PubMed

[7] 7.↵
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides accurate, fast, and bias-aware transcript expression estimates using dual-phase inference. bioRxiv 021592 (2016).

[8] 8.↵
Pimentel, H. J., Bray, N. L., Puente, S., Melsted, P. & Pachter, L. Differential analysis of RNA-Seq incorporating quantification uncertainty. bioRxiv 058164 (2016).

[9] 9.↵
Trapnell, C. et al. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology 31, 46–53 (2013).
OpenUrl CrossRef PubMed

[10] 10.↵
Singer, M. et al. A Distinct Gene Module for Dysfunction Uncoupled from Activation in Tumor-Infiltrating T Cells. Cell 166, 1500–1511.e9 (2016).
OpenUrl PubMed

[11] 11.↵
Shalek, A. K. et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–40 (2013).
OpenUrl CrossRef PubMed Web of Science

[12] 12.↵
Schwarz, E. M., Kato, M. & Sternberg, P. W. Functional transcriptomics of a migrating cell in Caenorhabditis elegans. Proceedings of the National Academy of Sciences of the United States of America 109, 16246–51 (2012).
OpenUrl Abstract/FREE Full Text

[13] 13.↵
Van Wolfswinkel, J. C., Wagner, D. E. & Reddien, P. W. Single-cell analysis reveals functionally distinct classes within the planarian stem cell compartment. Cell Stem Cell 15, 326–339 (2014).
OpenUrl CrossRef PubMed

[14] 14.↵
Scimone, M. L., Kravarik, K. M., Lapan, S. W. & Reddien, P. W. Neoblast specialization in regeneration of the planarian Schmidtea mediterranea. Stem Cell Reports 3, 339–352 (2014).
OpenUrl

[15] 15.↵
Hughes, T. R. et al. Functional Discovery via a Compendium of Expression Profiles. Cell 102, 109–126 (2000).
OpenUrl CrossRef PubMed Web of Science

[16] 16.↵
Van Driessche, N. et al. Epistasis analysis with global transcriptional phenotypes. Nature Genetics 37, 471–477 (2005).
OpenUrl CrossRef PubMed Web of Science

[17] 17.↵
Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic Dissection of Transcriptional Regulation in Budding Yeast. Science 296 (2002).

[18] 18.↵
Schadt, E. E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).
OpenUrl CrossRef PubMed Web of Science

[19] 19.↵
Li, Y. et al. Mapping Determinants of Gene Expression Plasticity by Genetical Genomics in C. elegans. PLoS Genetics 2, e222 (2006).
OpenUrl

[20] 20.↵
King, E. G., Sanderson, B. J., McNeil, C. L., Long, A. D. & Macdonald, S. J. Genetic Dissection of the Drosophila melanogaster Female Head Transcriptome Reveals Widespread Allelic Heterogeneity. PLoS Genetics 10, e1004322 (2014).
OpenUrl

[21] 21.↵
Adamson, B. et al. A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. Cell 167, 1867–1882.e21 (2016).
OpenUrl CrossRef PubMed

[22] 22.↵
Dixit, A. et al. Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens. Cell 167, 1853–1866.e17 (2016).
OpenUrl CrossRef PubMed

[23] 23.↵
Angeles-Albores, D. et al. Transcriptomic Description of an Endogenous Female State in C. elegans. bioRxiv (2016).

[24] 24.↵
Epstein, A. C. R. et al. C. elegans EGL-9 and mammalian homologs define a family of dioxygenases that regulate HIF by prolyl hydroxylation. Cell 107, 43–54 (2001).
OpenUrl CrossRef PubMed Web of Science

[25] 25.↵
Shen, C., Shao, Z. & Powell-Coffman, J. A. The Caenorhabditis elegans rhy-1 Gene Inhibits HIF-1 Hypoxia-Inducible Factor Activity in a Negative Feedback Loop That Does Not Include vhl-1. Genetics 174, 1205–1214 (2006).
OpenUrl Abstract/FREE Full Text

[26] 26.↵
Shao, Z., Zhang, Y. & Powell-Coffman, J. A. Two Distinct Roles for EGL-9 in the Regulation of HIF-1-mediated gene expression in Caenorhabditis elegans. Genetics 183, 821–829 (2009).
OpenUrl

[27] 27.↵
Jiang, H., Guo, R. & Powell-Coffman, J. A. The Caenorhabditis elegans hif-1 gene encodes a bHLH-PAS protein that is required for adaptation to hypoxia. Proceedings of the National Academy of Sciences of the United States of America 98, 7916–7921 (2001).
OpenUrl Abstract/FREE Full Text

[28] 28.↵
Semenza, G. L. Hypoxia-inducible factors in physiology and medicine. Cell 148, 399–408 (2012).
OpenUrl CrossRef PubMed Web of Science

[29] 29.↵
Loenarz, C. et al. The hypoxia-inducible transcription factor pathway regulates oxygen sensing in the simplest animal, Trichoplax adhaerens. EMBO reports 12, 63–70 (2011).
OpenUrl

[30] 30.↵
Jiang, B. H., Rue, E., Wang, G. L., Roe, R. & Semenza, G. L. Dimerization, DNA binding, and transactivation properties of hypoxia-inducible factor 1. The Journal of biological chemistry 271, 17771–17778 (1996).
OpenUrl Abstract/FREE Full Text

[31] 31.↵
Powell-Coffman, J. A., Bradfield, C. A. & Wood, W. B. Caenorhabditis elegans Orthologs of the Aryl Hydrocarbon Receptor and Its Heterodimerization Partner the Aryl Hydrocarbon Receptor Nuclear Translocator. Proceedings of the National Academy of Sciences 95, 2844–2849 (1998).
OpenUrl Abstract/FREE Full Text

[32] 32.↵
Semenza, G. L., Roth, P. H., Fang, H. M. & Wang, G. L. Transcriptional regulation of genes encoding glycolytic enzymes by hypoxia-inducible factor 1. The Journal of Biological Chemistry 269, 23757–63 (1994).
OpenUrl Abstract/FREE Full Text

[33] 33.↵
Bishop, T. et al. Genetic Analysis of Pathways Regulated by the von Hippel-Lindau Tumor Suppressor in Caenorhabditis elegans. PLoS Biology 2 (2004).

[34] 34.↵
Shen, C., Nettleton, D., Jiang, M., Kim, S. K. & Powell-Coffman, J. A. Roles of the HIF-1 hypoxia-inducible factor during hypoxia response in Caenorhabditis elegans. Journal of Biological Chemistry 280, 20580–20588 (2005).
OpenUrl

[35] 35.↵
Bellier, A., Chen, C. S., Kao, C. Y., Cinar, H. N. & Aroian, R. V. Hypoxia and the hypoxic response pathway protect against pore-forming toxins in C. elegans. PLoS Pathogens 5 (2009).

[36] 36.↵
Huang, L. E., Arany, Z., Livingston, D. M. & Franklin Bunn, H. Activation of hypoxia-inducible transcription factor depends primarily upon redox-sensitive stabilization of its α subunit. Journal of Biological Chemistry 271, 32253–32259 (1996).
OpenUrl Abstract/FREE Full Text

[37] 37.↵
Kaelin, W. G. & Ratcliffe, P. J. Oxygen Sensing by Metazoans: The Central Role of the HIF Hydroxylase Pathway. Molecular Cell 30, 393–402 (2008).
OpenUrl CrossRef PubMed Web of Science

[38] 38.↵
Ma, D. K., Vozdek, R., Bhatla, N. & Horvitz, H. R. CYSL-1 Interacts with the O₂-Sensing Hydroxylase EGL-9 to Promote H₂S-Modulated Hypoxia-Induced Behavioral Plasticity in C. elegans. Neuron 73, 925–940 (2012).
OpenUrl

[39] 39.↵
Yeung, K. Y. & Ruzzo, W. L. Principal component analysis for clustering gene expression data. Bioinformatics (Oxford, England) 17, 763–774 (2001).
OpenUrl CrossRef PubMed Web of Science

[40] 40.↵
Park, E. C. et al. Hypoxia regulates glutamate receptor trafficking through an HIF-independent mechanism. The EMBO Journal 31, 1618–1619 (2012).
OpenUrl FREE Full Text

[41] 41.↵
Powell-Coffman, J. A. Hypoxia signaling and resistance in C. elegans. Trends in Endocrinology and Metabolism 21, 435–440 (2010).
OpenUrl

[42] 42.↵
Angeles-Albores, D., N. Lee, R. Y., Chan, J. & Sternberg, P. W. Tissue enrichment analysis for C. elegans genomics. BMC Bioinformatics 17, 366 (2016).
OpenUrl CrossRef

[43] 43.↵
Angeles-Albores, D., Lee, R. Y., Chan, J. & Sternberg, P. W. Phenotype and gene ontology enrichment as guides for disease modeling in C. elegans. bioRxiv (2017).

[44] 44.↵
Luhachack, L. G. et al. EGL-9 Controls C. elegans Host Defense Specificity through Prolyl Hydroxylation-Dependent and -Independent HIF-1 Pathways. PLoS Pathogens 8, 48 (2012).
OpenUrl

[45] 45.↵
Ackerman, D. & Gems, D. Insulin/IGF-1 and hypoxia signaling act in concert to regulate iron homeostasis in Caenorhabditis elegans. PLoS Genetics 8 (2012).

[46] 46.↵
Romney, S. J., Newman, B. S., Thacker, C. & Leibold, E. A. HIF-1 regulates iron homeostasis in Caenorhabditis elegans by activation and inhibition of genes involved in iron uptake and storage. PLoS Genetics 7 (2011).

[47] 47.↵
Semenza, G. L. Hypoxia-inducible factor 1: Regulator of mitochondrial metabolism and mediator of ischemic preconditioning. Biochimica et Biophysica Acta - Molecular Cell Research 1813, 1263–1268 (2011).
OpenUrl

[48] 48.↵
Lee, S. J., Hwang, A. B. & Kenyon, C. Inhibition of respiration extends C. elegans life span via reactive oxygen species that increase HIF-1 activity. Current Biology 20, 2131–2136 (2010).
OpenUrl CrossRef PubMed Web of Science

[49] 49.↵
Pfleger, J., He, M. & Abdellatif, M. Mitochondrial complex II is a source of the reserve respiratory capacity that is regulated by metabolic sensors and promotes cell survival. Cell Death and Disease 6, 1–14 (2015).
OpenUrl

[50] 50.↵
Sikora, J. et al. Bioinformatic and biochemical studies point to AAGR-1 as the ortholog of human acid α-glucosidase in Caenorhabditis elegans. Molecular and Cellular Biochemistry 341, 51–63 (2010).
OpenUrl PubMed

[51] 51.↵
Brugarolas, J. et al. Regulation of mTOR function in response to hypoxia by REDD1 and the TSC1/TSC2 tumor suppressor complex. Genes and Development 18, 1–12 (2004).
OpenUrl FREE Full Text

[52] 52.↵
Sudarshan, S. et al. Fumarate hydratase deficiency in renal cancer induces glycolytic addiction and hypoxia-inducible transcription factor 1α stabilization by glucose-dependent generation of reactive oxygen species. Mol Cell Biol 29, 4080–4090 (2009).
OpenUrl Abstract/FREE Full Text

[53] 53.↵
Shao, Z., Zhang, Y., Ye, Q., Saldanha, J. N. & Powell-Coffman, J. A. C. elegans SWAN-1 binds to EGL-9 and regulates HIF-1-mediated resistance to the bacterial pathogen Pseudomonas aeruginosa PAO1. PLoS Pathogens 6, 91–92 (2010).
OpenUrl

[54] 54.↵
Wang, G. L. & Semenza, G. L. Characterization of hypoxia-inducible factor 1 and regulation of DNA binding activity by hypoxia. The Journal of Biological Chemistry 268, 21513–8 (1993).
OpenUrl Abstract/FREE Full Text

[55] 55.↵
Goentoro, L., Shoval, O., Kirschner, M. W. & Alon, U. The Incoherent Feedforward Loop Can Provide Fold-Change Detection in Gene Regulation. Molecular Cell 36, 894–899 (2009).
OpenUrl CrossRef PubMed Web of Science

[56]
Hart, Y., Antebi, Y. E., Mayo, A. E., Friedman, N. & Alon, U. Design principles of cell circuits with paradoxical components. Proceedings of the National Academy of Sciences of the United States of America 109, 8346–8351 (2012).

[57]
Hart, Y. & Alon, U. The Utility of Paradoxical Components in Biological Circuits. Molecular Cell 49, 213–221 (2013).

[58]
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America 100, 9440–9445 (2003).

[59]
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nature Protocols 9, 171–181 (2014).

[60]
Bokeh Development Team. Bokeh: Python library for interactive visualization (2014).

[61]
McKinney, W. pandas: a Foundational Python Library for Data Analysis and Statistics. Python for High Performance and Scientific Computing 1–9 (2011).

[62]
Oliphant, T. E. SciPy: Open source scientific tools for Python. Computing in Science and Engineering 9, 10–20 (2007).

[63]
Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011).

[64]
Salvatier, J., Wiecki, T. & Fonnesbeck, C. Probabilistic Programming in Python using PyMC. PeerJ Computer Science 2, 1–24 (2015).

[65]
Van Der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: A structure for efficient numerical computation. Computing in Science and Engineering 13, 22–30 (2011).

[66]
Hunter, J. D. Matplotlib: A 2D graphics environment. Computing in Science and Engineering 9, 90–95 (2007).

[67]
Waskom, M. et al. seaborn: v0.7.0 (January 2016) (2016).

[68]
Pérez, F. & Granger, B. IPython: A System for Interactive Scientific Computing. Computing in Science and Engineering 9, 21–29 (2007).