Abstract
Objectives Tumor hypoxia is associated with metastasis and resistance to chemotherapy and radiotherapy. Genes involved in oxygen-sensing are clinically relevant and have significant implications on prognosis.
Methods We identified of two signatures, signature 1 (good prognosis) and signature 2 (adverse prognosis), each consisting of 5 genes using three pancreatic cancer cohorts (n=681). We validated the signatures’ performance in predicting survival in ten cancers using Cox regression and receiver operating characteristic (ROC) analyses.
Results Signature 1 and signature 2 were associated with good and poor overall survival respectively. Prognosis of signature 1 in 8 cohorts representing 6 cancers (n=2,627): bladder (hazard ratio [HR]=0.68, P=0.039), papillary renal cell (HR=0.35, P=0.013), liver (HR=0.64, P=0.033 and HR=0.49, P=0.025), lung (HR=0.66, P=0.014) and pancreatic (HR=0.42, P<0.001 and HR=0.64, P=0.04) and endometrial (HR=0.40, P<0.001). Prognosis of signature 2 in 12 cohorts representing 9 cancers (n=4,134): bladder (HR=1.46, P=0.039), cervical (HR=1.97, P=0.035), head and neck (HR=1.39, P=0.038), renal clear cell (HR=1.47, P=0.012), papillary renal cell (HR=3.89, P=0.0015), liver (HR=5.10, P<0.0001 and HR=2.26, P<0.001), lung (HR=1.54, P=0.011), pancreatic (HR=2.09, P=0.002, HR=1.46, P=0.018, and HR=1.99, P<0.0001) and stomach (HR=1.78, P=0.004). Multivariate Cox regression confirmed independent clinical relevance of signatures in these cancers. ROC analyses confirmed superior performance of signatures to current tumor staging benchmarks. KDM8 is a potential tumor suppressor downregulated in liver and pancreatic cancers and is an independent prognostic factor. KDM8 expression negatively correlated with cell cycle regulators. Low KDM8 in tumors was associated with loss of cell adhesion phenotype through HNF4A signaling.
Conclusions Pan-cancer signatures of oxygen-sensing genes used for risk assessment in 10 cancers (n=6,761) could guide individualized treatment plans.
Introduction
Solid tumors face considerable demands in maintaining oxygen availability due to their unique vasculature systems[1]. Rapid neoplastic cell proliferation and overexpression of angiogenic factors leading to the formation of disorganized blood vessels result in insufficient oxygen supply to tumor cells[2,3]. Hence, there is a requirement for tumors to evolve systems that detect changes in oxygen homeostasis. The discovery of hypoxia-inducible factor (HIF) as one of the key oxygen-sensing gene represents a quantum leap forward in tumor biology[4,5]. HIF has co-evolved with the need for more sophisticated oxygen delivery system but intriguingly, it is found in primitive animals that lack oxygen-delivery structures altogether[6]. Its discovery has led to the development of drugs used to treat anemia and cancer[7,8].
In addition to HIF, 2-oxoglutarate (2OG) oxygenases represent another family of oxygen-sensing proteins. As suggested by its name, this group of enzymes has an absolute requirement for molecular oxygen. They catalyze a range of oxidative modifications and their activities are affected by nutrient and oxygen availability[9], both of which are altered within the tumor microenvironment. Several members from this gene family have been implicated in cancer. Ten-Eleven Translocation 2 (TET2) is frequently found to be mutated in leukemia[10] and other solid malignancies[11]. JmjC lysine demethylases (KDMs) epigenetically regulate cancer-related genes[12] and inactivating mutations are observed in multiple cancers; glioblastoma/neural, breast, colorectal and renal[13,14].
We hypothesize that 2OG oxygenases expression could help predict prognosis in solid malignancies that are characteristically oxygen-deprived. Additionally, we hypothesize this would be applicable to different types of cancers as they share a uniform need to overcome hypoxia for survival. Starting from an initial set of 61 2OG oxygenases, we developed two prognostic gene signatures, each consisting of a minimal 5 genes, that could facilitate risk stratification and predict overall survival in cancer patients. We confirmed their prognostic performance through a multi-cohort pan-cancer validation process.
Methods
Detailed methods are available in supplementary information.
Results
Multi-cohort analyses revealed two prognostic 2OG oxygenases signatures
We analyzed the prognostic significance of 61 2OG oxygenase genes in 25 cancers (n = 19,781) from The Cancer Genome Atlas[15](Table S1 & S2). Prognostic genes were defined as those whose expression levels were significantly correlated with patients’ overall survival. Since the highest number of prognostic genes (29/61) was observed in pancreatic cancer (n=178) and it is one of the most difficult to treat cancers, two additional pancreatic cancer cohorts from the International Cancer Genome Consortium [ICGC][16] (n=269 and n=234) were used in combination as training cohorts (Fig.1A). We defined two gene signatures; signature 1 and signature 2 as favorable and unfavorable prognostic factors respectively, by taking into consideration genes that were significant in univariate Cox analyses in two out of three pancreatic cancer cohorts (Fig.1A,1B). Signature 1 constitutes 5 genes: KDM8, KDM6B, P4HTM, ALKBH4 and ALKBH7. Likewise, 5 genes make up signature 2: KDM3A, P4HA1, ASPH, PLOD1 and PLOD2 (Fig.1A).
Patients were median-dichotomized based on mean expression scores of signatures 1 and 2 genes. Patients with high expression of signature 1 genes had significantly better survival in 6 cancers: bladder (hazard ratio [HR]=0.68, P=0.039), papillary renal cell (HR=0.35, P=0.013), liver (HR=0.64, P=0.033 and HR=0.49, P=0.025), lung (HR=0.66, P=0.014), pancreatic (HR=0.42, P<0.001 and HR=0.64, P=0.04) and endometrial (HR=0.40, P<0.001); consistent with the fact that signature 1 is a marker of good prognosis (Fig.2A; Table S3). In contrast, patients with high expression of signature 2 genes had significantly worse prognosis in 9 cancers: bladder (HR=1.46, P=0.039), cervical (HR=1.97, P=0.035), head and neck (HR=1.39, P=0.038), renal clear cell (HR=1.47, P=0.012), papillary renal cell (HR=3.89, P=0.0015), liver (HR=5.10, P<0.0001 and HR=2.26, P<0.001), lung (HR=1.54, P=0.011), pancreatic (HR=2.09, P=0.002, HR=1.46, P=0.018, and HR=1.99, P<0.0001) and stomach (HR=1.78, P=0.004) (Fig.2B).
Cross-platform subgroup and multivariate analyses confirmed the validity of signatures 1 and 2 as independent prognostic factors
To assess the independence of signatures 1 and 2 over current tumor staging systems, we performed subgroup analyses of their prognostic effect in patients with early (stages 1 and/or 2), intermediate (stages 2 and/or 3) and late (stages 3 and/or 4) cancer stages. Kaplan-Meier analyses revealed that signature 1 successfully identified high-risk (low-expression score) and low-risk (high-expression score) patients with early (bladder, liver, lung, pancreatic and endometrial), intermediate (liver and endometrial) late (papillary renal cell) disease stages (Fig.3A; Fig.S1A). For signature 2, patients with high-expression scores had a significantly poorer survival (Fig.3B). Signature 2 was also independent of disease stage as it successfully predicted survival in early (liver, lung and pancreatic), intermediate (bladder, liver, lung and stomach) and late (bladder, head and neck, papillary renal cell, liver and stomach) stages (Fig.3B; Fig.S1B).
Receiver operating characteristic (ROC) analyses were employed to determine the predictive performance (sensitivity and specificity) of signatures 1 and 2 on 5-year overall survival. Signature 1 performed best, as measured by area under the curve (AUC), in pancreatic cancer (AUC=0.754) followed by papillary renal cell (AUC=0.738), liver (AUC=0.652 and AUC=0.613), bladder (AUC=0.645), endometrial (AUC=0.635) and lung (AUC=0.625) cancers (Fig.3C; Fig.S1C). Signature 2 performance in papillary renal cell cancer (AUC=0.810) was the best, followed by cervical (AUC=0.692), liver (AUC=0.675 and AUC=0.625), pancreatic (AUC=0.668), renal clear cell (AUC=665), head and neck (AUC=0.632), lung (AUC=0.623), stomach (0.618) and bladder (AUC=0.605) cancers (Fig.3D; Fig.S1D). Performance of both signatures 1 and 2 is superior to current tumor staging benchmarks except for the following: signature 1 (liver, lung and endometrial) and signature 2 (bladder, renal clear cell, liver and lung) (Fig.3C,D; Fig.S1C,D). Remarkably, when used in combination with tumor, node and metastasis (TNM) staging parameters, both signatures consistently outperformed each of the individual classifiers (TNM staging or gene signatures when considered alone), reinforcing their incremental prognostic values (Fig.3C,D; Fig.S1C,D). Significantly, while tumor staging could not predict outcome in cervical cancer patients (AUC<0.5), signature 2 sufficiently served as an adverse prognostic factor (Fig.3D).
Univariate Cox regression analyses revealed that tumor stage was associated with survival except for cervical and pancreatic cancers (Table S3). This is expected given the low AUC values for tumor stage in both cancers: pancreatic (AUC=0.593) and cervical (AUC=0.455), suggesting that current staging systems for these cancers are inadequate (Fig.3C,D). Multivariate Cox analyses after adjusting for tumor staging showed that signatures 1 and 2 remained significantly associated with survival (Table S3). For 2 liver cancer cohorts we considered additional clinicopathological features. The GSE14520 cohort consisted of Chinese patients with hepatitis B-associated hepatocellular carcinoma[17] while LIRI-JP is a Japanese-based cohort of mixed etiology[18]. Tumor size, cirrhosis, TNM staging, Barcelona Clinic Liver Cancer staging and alpha-fetoprotein levels were all significantly associated with survival in GSE14520 (Table S3). Tumor size could also predict survival in LIRI-JP (Table S3). When these significant covariates along with signatures 1 or 2 were included in multivariate Cox models, the signatures remained significant risk factors: signature 1 (LIRI-JP: HR=0.54, P=0.043) and signature 2 (LIRI-JP: HR=4.539, P=1.8e-4 and GSE14520: HR=2.01, P=0.003) (Table S3). These results highlight the potentially superior prognostic ability of our signatures; signatures 1 and 2 identified high-and low-risk patients in 8 (n=2,627) and 12 (n=4,134) independent cohorts respectively to collectively represent 10 diverse cancers (Fig.1A).
Significance of somatic mutations in risk stratified patients
Patients were risk stratified into low-and high-risk groups using signatures 1 and 2. For signature 1, high-risk patients had significantly lower expression levels of good prognosis genes ALKBH4, ALKBH7, KDM8, KDM6B and P4HTM (Fig.S2A). In contrast, high-risk patients as stratified by signature 2 had significantly higher expression levels of adverse prognosis genes ASPH, KDM3A, P4HA1, PLOD1 and PLOD2 (Fig.S2B). To ascertain the relationship between tumor hypoxia and expression of signature genes, hypoxia scores were computed for each patient as mean expression values (log2) of 52 hypoxia signature genes[19]. Signature 1 expression scores in patients negatively correlated with hypoxia score (Fig.S3A). Since tumor hypoxia is associated with distant metastasis, recurrence and reduced therapeutic response[20], patients with high expression of signature 1 genes (low hypoxia score) correlated with less advance disease states consistent with it being a marker of good prognosis (Fig.S3A).
Conversely, signature 2 scores positively correlated with tumor hypoxia and hence poor survival outcomes (Fig.S4A). We anticipate that patients’ individual risks of death, as determined from signatures 1 and 2, will positively correlate tumor hypoxia. Indeed, risk scores for each patient, as calculated by taking the sum of Cox regression coefficient for each of the individual genes multiplied with its corresponding expression value[21], significantly correlated with hypoxia scores (Fig.S3B,S4B). Hence, high risk patients have more hypoxic tumors suggesting that our gene signatures are efficient and adequate in predicting death.
To ascertain the association between patients’ risks, as determined by our gene signatures, and somatic mutations, we retrieved the top five most commonly mutated genes for each cancer. Mutations in PCDHA1, a cell adhesion gene from the cadherin superfamily, were associated with poor survival in bladder (HR=1.65, P=0.027) and stomach cancers (HR=1.53, P=0.046) but improved survival in endometrial cancer (HR=0.52, P=0.042) (Table S3). Mutations in another gene from the protocadherin alpha cluster, PCDHA2, were also associated with adverse outcomes in stomach cancer (HR=1.60, P=0.025) (Table S3). Mutations in TTN (connectin) and the tumor suppressor TP53 were associated with poor survival in bladder (HR=1.61, P=0.016) and endometrial cancers (HR=1.78, P=0.041) (Table S3). Interestingly, another tumor suppressor PTEN, when mutated, is linked to better outcomes in endometrial cancer (HR=0.43, P=0.006) (Table S3). Similar observations were made for a lipid kinase gene PIK3CA in endometrial cancer (HR=0.36, P=0.002) (Table S3). Likewise, mucin MUC4 mutations improved survival in clear cell renal cancer patients (HR=0.57, P=0.012) (Table S3); an observation that is consistent with another study[22].
Multivariate Cox analyses on signatures 1 and 2 while controlling for significant somatic mutation variables revealed that the gene signatures were independent survival predictors: bladder (signature 1: HR=0.69, P=0.047 and signature 2: HR=1.58, P=0.043), renal clear cell (signature 2: HR=1.52, P=0.007), stomach (signature 2: HR=1.80, P=0.006) and endometrial (signature 1: HR=0.52, P=0.024) (Table S3). Patients stratified by signatures 1 or 2 and mutation status indicated that they collectively were significantly associated with survival (Fig.4). In bladder cancer, high risk patients (low signature 1 score) harboring mutant alleles of PCDHA1 had ~50% increased mortality at 5 years compared to low risk patients (high signature 1 score) with wild-type PCDHA1 (P=0.016) (Fig.4). Although results were less dramatic for PCDHA1 and signature 2, we still observed a ~25% elevated mortality at 5 years for these two patient groups with bladder cancer (P=0.04) (Fig.4). In patients with stomach cancer, high-risk patients (high signature 2 score) with mutant PCDHA1 had the worst outcomes (P=0.0017) (Fig.4). Conversely, PCDHA1 mutation was associated with good prognosis in endometrial cancer, hence high-risk patients with wild-type PCDHA1 had the lowest survival rates while survival is improved by ~20% in low-risk patients with mutant PCDHA1 (P=0.003) (Fig.4). PIK3CA (P<0.001) and PTEN (P=0.0013) mutations were associated with good outcomes in endometrial cancer (Fig.4). Mutations in another cadherin gene PCDHA2 when considered alongside signature 2 were also associated with survival in stomach cancer (P<0.001) (Fig.4). Survival rates were reduced by ~37% in high-risk patients with mutant PCDHA2 (Fig.4). Joint relation between TP53 mutation and signature 1 significantly influenced survival in endometrial cancer (P=0.0015) (Fig.4). Since MUC4 mutations were associated with better outcomes, survival rates were the lowest in high-risk patients (high signature 2 scores) with wild-type MUC4 (P=0.0025) (Fig.4).
Tumor suppressive roles of KDM8 through cell cycle regulation and cell adhesion maintenance
Of all the signature genes, KDM8 was identified as one of the most downregulated genes in tumors compared to non-tumor samples (Fig.S5A). Patients with high KDM8 levels had significantly lower risk of death in pancreatic (HR=0.57, P=0.016) and liver cancer cohorts (HR=0.52, P=0.0031; HR=0.49, P=0.026 and HR=0.64, P=0.039) (Fig.5). Prognostic significance of KDM8 was also independent of tumor stage (Fig.5). KDM8 expression decreased as tumor malignant grade increased in that the least advanced stage 1 tumors had the highest median KDM8 values (Fig.S5B). Moreover, KDM8 expression negatively correlated with hypoxia score, indicating that patients with low levels of KDM8 had more hypoxic tumors and poorer survival outcomes (Fig.S5C). Together, these observations suggest that KDM8 may function as a tumor suppressor. This hypothesis is corroborated by an independent report on the role of KDM8 in cell cycle regulation[23]. Indeed, we observed that KDM8 expression negatively correlated with the expression levels of canonical cell-cycle genes: cyclins (CCNA2, CCNB1, CCNB2, CCND1, CCNE1 and CCNE2) and cyclin-dependent kinases (CDK1, CDK2, CDK4, CDK6, CDK7 and CDK8), which were consistent across all liver and pancreatic cancer cohorts (Fig.S5D). This implied that KDM8 is required for tight control of the cell-cycle machinery and its reduction may lead to aberrant proliferation commonly seen in cancer cells.
To ascertain the biological consequences of deregulated KDM8 expression, we conducted different expression analysis on liver cancer patients categorized into low and high groups based on KDM8 expression. A total of 745 genes were differentially expressed (DEGs) between the two groups (fold change>2 or <-2, P<0.05) (Table S4). Significant enrichments of biological pathways involved in metabolism, immune regulation, VEGF production, inflammation and cell adhesion were observed (Fig.S5E;Table S5). Furthermore, DEGs were overrepresented as targets of HNF4A, HNF4G, FOXA1, FOXA2 and NR2F2 transcription factors (TFs) (Fig.S5F). These TFs play central roles in cell polarity maintenance and epithelial differentiation[24–26], hence downregulation of KDM8 may drive epithelial-mesenchymal transition (EMT) and tumor progression. HNF4A is one of the key TFs responsible for regulating a myriad of hepatic functions including cell junction assembly[26,27]. Pathway analysis of KDM8 DEGs revealed enrichment of processes related to cell adhesion, suggesting a potential crosstalk between KDM8 and HNF4A. Of the 745 DEGs, analysis on a hepatoma-based HNF4A chromatin immunoprecipitation-sequencing dataset demonstrated that 148 genes were directly bound by HNF4A[28]. To further reinforce the interplay between HNF4A and KDM8, we observed that 110 of the 745 DEGs were overrepresented in HNF4A-null mice[26] and 45 of these genes were direct HNF4A targets (Fig.S5G). Differential expression analysis between HNF4A-deficient and wild-type mice showed that a majority of the 45 genes were downregulated, as expected, suggesting that HNF4A directly activates their gene expression, many of which are involved in a multitude of cell adhesion processes (Fig.S5H).
Discussion
A multi-cohort retrospective study led to the identification of two novel pan-cancer prognostic gene signatures derived from oxygen-sensing genes. Cross-platform validations confirmed prognosis in 10 cancers to collectively include 6,761 patients spanning 20 diverse cohorts (Fig.1A). The gene signatures had opposing prognostic value; signature 1 is a marker of good prognosis while signature 2 is associated with poor outcomes. The key strengths of our signatures as powerful prognostic tools are: 1) pan-cancer utility, 2) involving a mere 5 genes each that provide continuous assessment of death risks and 3) superior to current tumor staging parameters. The prognostic capability of our signatures in diverse cancers is evidence that conserved biological processes; oxygen-sensing, cell-cycle deregulation and loss of cell polarity, governs disease progression and thus clinical outcomes.
Anti-tumorigenic functions have been reported for several genes from signature 1. Loss of KDM6B resulted in more aggressive pancreatic ductal adenocarcinoma[29]. In colorectal cancer, high KDM6B is a predictor of good prognosis and knock-down of KDM6B augmented cell proliferation and inhibited apoptosis[30]. Yet, KDM6B function is enigmatic. Others have reported that high KDM6B expression is associated with increased metastasis and invasion in renal clear cell carcinoma that correlated with poor prognosis[31]. KDM6B also promotes TGF-B induced EMT and invasiveness in breast cancer[32]. While we could neither confirm nor deny the validity of these studies, it is striking that our observation of favorable prognosis associated with high KDM6B in pancreatic ductal adenocarcinoma was consistent with the report from Yamamoto and colleagues[29]. We also did not observe any prognostic significance of signature 1, in which KDM6B is part of, in either breast or renal clear cell cancers, which indirectly substantiates findings from the two other reports on KDM6B not being a marker of good prognosis[31,32].
KDM8 is a gene associated with favorable prognosis. Our results suggest determinative crosstalk between KDM8 and HNF4A, particularly in the context of morphogenesis, cell adhesion, maintenance of cell polarity and epithelial formation. Moreover, survival rates for bladder cancer at 5 years dropped to ~12% in patients that simultaneously harbor low expression of signature 1 genes (high-risk), which included KDM8, and PCDHA1 mutations. Additive effects conferred by mutations in this cell-adhesion protein provides support for the hypothesis that KDM8 is likely a tumor suppressor and downregulation of this gene may lead to a loss of epithelial phenotype and cell adhesion to promote cancer invasion. Additionally, loss of KDM8 expression is correlated with increased expression of cell-cycle genes that may contribute to deranged cell cycle regulation and tumor progression (Fig.S5D). Like KDM6B, the function of KDM8 appears to be cell-type dependent. While KDM8 expression is downregulated in liver and pancreatic tumors compared to adjacent cells (Fig.S5A), it is overexpressed in breast cancer to induce EMT and invasion[33]. Nonetheless, KDM8 roles are not limited to cell cycle regulation. KDM8 exert tumor suppressive functions in hematopoietic cancer by mediating DNA repair[34]. Collectively, imbalance in the JmjC subfamily of lysine demethylases such as KDM8 and KDM6B are likely to result in broad-ranging but cell-type specific biological effects.
Overall, our gene signatures will enhance decision making in clinic and provide individualized treatment plans based on patients’ tumor biology to maximize treatment efficacy and prolong lifespan by directing resources to those most in need. Upon diagnosis, the signatures can be used for risk stratification and patients with poor prognosis can be subjected to additional neoadjuvant treatments. These signatures would need to be assessed prospectively and we aim to evaluate their use in clinical decision making of patients presented with these cancers.
Authors contribution
WHC and AGL designed the study and analyzed the data. All authors interpreted the data. WHC and AGL wrote the initial manuscript draft. All authors revised the manuscript and approved the final version.
Competing interests
None.
Acknowledgements
A special thank you to Graham Foster for his critical insights and help on manuscript preparation.