Abstract
Malignant pleural mesothelioma (MPM) is an aggressive cancer of the thorax with a median survival of one year. We constructed an ‘MPM interactome’ with over 300 computationally predicted PPIs and over 1300 known PPIs of 62 literature-curated genes whose activity affects MPM. Known PPIs of the 62 MPM associated genes were derived from BioGRID and HPRD databases. Novel PPIs were predicted by applying the HiPPIP algorithm, which computes features of protein pairs such as cellular localization, molecular function, biological process membership, genomic location of the gene, gene expression in microarray experiments, protein domains and tissue membership, and classifies the pairwise features as interacting or non-interacting based on a random forest model. To our satisfaction, the interactome is significantly enriched with genes differentially expressed in MPM tumors compared with normal pleura, and with other thoracic tumors. The interactome is also significantly enriched with genes whose high expression has been correlated with unfavorable prognosis in lung cancer, and with genes differentially expressed on crocidolite exposure. 28 of the interactors of MPM proteins are targets of 147 FDA-approved drugs. By comparing differential expression profiles induced by drug to profiles induced by MPM, potentially repurposable drugs are identified from this drug list. Development of PPIs of disease-specific set of genes is a powerful approach with high translational impact – the interactome is a vehicle to piece together an integrated view on how genes associated with MPM through various high throughput studies are functionally linked, leading to clinically translatable results such as clinical trials with repurposed drugs. The PPIs are made available on a webserver, called Wiki-Pi MPM at http://severus.dbmi.pitt.edu/wiki-MPM with advanced search capabilities.
One Sentence Summary: The mesothelioma interactome, with 367 novel protein-protein interactions, may shed light on the mechanisms of cancer genesis and progression.
Introduction
Internal organs such as heart and lung, and body cavities such as thoracic and abdominal cavities, are covered by a thin slippery layer of cells called the “mesothelium”. This protective layer prevents organ adhesion and plays a number of important roles in inflammation and tissue repair (1). The mesothelia that cover the heart, lung and abdominal cavity are specifically called pericardium, pleura and peritoneum, respectively. Mesothelioma is the cancer that affects these layers. Most types of mesothelioma invade surrounding tissues and blood vessels to form secondary tumors and metastasize to different locations in the body (2). Mesotheliomas of pleura account for ~90% of malignant mesotheliomas and have a short median survival, of only 1 year (3, 4).
Malignant pleural mesothelioma (MPM) is associated with exposure to asbestos over a long period of time (5). Disease onset occurs long after exposure, and this latency makes detection difficult (2). Only a small fraction of the population exposed to asbestos develops malignant mesothelioma, and the disease tends to cluster in families, which suggests the involvement of a genetic component (6). Germline mutations were observed in the BAP1 gene (7, 8), and inactivating mutations in the NF2 gene, in pleural mesothelioma (9, 10). Recently, twenty-four germline mutations were identified in 13 genes from 198 patients with malignant mesothelioma of pleural or peritoneal origin (11). Unlike other tumors, mesothelioma does not show a non-invasive pre-malignant phase, which necessitates expeditious discovery of genetic predispositions, molecular mechanisms and therapeutics for the disease (12).
60% of the disease-associated missense mutations perturb protein-protein interactions (PPIs) in human genetic disorders (13). PPIs are intricately involved in biological functions and disease mechanisms, and may be exploited for drug-discovery (13). The molecular mechanisms of disease are often revealed by the PPIs of disease-associated genes. For example, the involvement of transcriptional deregulation in the pathogenesis of MPM was identified through mutations detected in BAP1 and its interactions with several proteins including BRCA1 revealed by co-immunoprecipitation (14). The PPI of BRCA1 with BAP1 was also central in understanding its role in growth-control pathways and cancer (15). Studies on BAP1 and BRCA1 later provided the basis for several clinical trials including testing of the drug Vinorelbine as a second line therapy for patients with MPM (16).
Thus, knowledge of PPIs, if available to biologists specializing in the study of a certain disease, gene or pathway, would lead to insightful results that advance understanding of disease biology. Despite their importance, only about 10-15% of expected PPIs in the human protein interactome are currently known, while the remaining 85-90% of the estimated 600,000 PPIs remain to be discovered; for more than 5,000 of the human proteins, not even a single PPI is currently known (17). Discovery of PPIs is, therefore, essential to advancing research in biomedical science. Performing wet-lab experiments to detect all of these “missing” interactions is currently impossible due to limitations on funds, reagents, technical methods, and expertise over individual proteins. It becomes imperative that these unknown PPIs be predicted with computational methods or be detected with high-throughput biotechnological methods. We developed a computational model, called HiPPIP (high-precision protein-protein interaction prediction), that was deemed accurate by computational evaluations and experiments (18). Novel PPIs predicted using this model are making translational impact. For example, by constructing interactomes (i.e. PPI networks) of various disease-associated genes, we have highlighted the role of cilia in congenital heart disease (19) and the role of mitochondrial proteins in hypoplastic left heart syndrome (20). Even individual PPIs have the potential to make a large impact. For example, the PPI between oligoadenylate synthetase-like protein (OASL) and retinoic acid-inducible gene I (RIG-I) protein led to the discovery that OASL activates host response through RIG-I signaling during viral infections (21).
A number of databases and web applications make protein annotations and interactions available to biologists. Typically these web-based resources present comprehensive annotations of genes including their protein interactions (e.g., GeneCards, Wiki Genes, UniProt); some are exclusively designed for presenting lists of protein interactions of genes and also present a network view of the PPIs (e.g., BioGRID, STRING,…). We developed a web resource called Wiki-Pi that presents individual PPIs with comprehensive annotations of the two proteins involved in an interaction (22). It allows search and retrieval of PPIs of interest based on their biological annotations and has helped several biologists interpret their experimental results. For example, it helped show a link between downregulation of the gene NDRG1 in oligodendrocytes of multiple sclerosis patients and altered translation of myelin genes resulting in partial demyelination (23), and helped point to the role of KPNA1 in regulating gene expression (24).
Our goal here is to apply HiPPIP to predict novel PPIs of MPM associated genes, and to make their predicted as well as previously known PPIs (namely, the MPM interactome) available in Wiki-Pi, so as to accelerate biomedical research on MPM. We demonstrate here the various ways in which system level analyses of the MPM interactome could lead to biologically and clinically relevant translatable results.
Results
We collected MPM-associated genes from the Ingenuity Pathway Analysis (IPA) suite, which gave a list of 62 genes curated from the literature, referred to here as ‘MPM genes’; these genes have been reported to affect MPM through gene expression changes, or by harboring genetic variants, or by being targeted by drugs proven to be clinically active against MPM (see details in Data File 1) (25). PPIs of the 62 MPM genes were collected from two literature-curated databases, namely the Human Protein Reference Database (HPRD) (26) and the Biological General Repository for Interaction Datasets (BioGRID) (27). Next, we applied HiPPIP, a computational model that we developed previously, to predict novel PPIs of MPM genes. In HiPPIP, PPIs are predicted by computing features of protein pairs and developing a random forest model to classify the pairwise features as interacting or non-interacting. The protein annotations used in this work are: cellular localization, molecular function and biological process membership, genomic location of the gene, gene expression in hundreds of microarray experiments, protein domains and tissue membership of proteins. Computation of the features of protein pairs was described earlier (28). A random forest with 30 trees was trained using the feature offering maximum information gain out of 4 random features to split each node; the minimum number of samples in each leaf node was set to 10. The random forest outputs a continuous-valued score in the range [0,1]. The threshold used to assign a final label was varied over the range of the score for the positive class (i.e., 0 to 1) to find the precision and recall combinations that are observed. This prediction model is referred to as the High-precision Protein-Protein Interaction Prediction (HiPPIP) model. Applying HiPPIP, we discovered 367 novel PPIs of MPM genes, which are deemed highly accurate according to prior evaluation, including experimental validation, of the HiPPIP model (18).
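The threshold sweep described above can be illustrated with a minimal sketch. It assumes only that each protein pair has a classifier score in [0,1] and a binary label; the function name, the toy scores and the toy labels are hypothetical and are not taken from the HiPPIP implementation or its data.

```python
def precision_recall_sweep(scores, labels, thresholds):
    """Return (threshold, precision, recall) for each threshold.

    scores: classifier scores in [0, 1] for the positive (interacting) class.
    labels: 1 for an interacting pair, 0 for a non-interacting pair.
    """
    total_pos = sum(labels)
    rows = []
    for t in thresholds:
        # Labels of the pairs that would be called "interacting" at threshold t.
        predicted_pos = [y for s, y in zip(scores, labels) if s >= t]
        tp = sum(predicted_pos)
        # Precision is undefined when nothing is predicted positive.
        precision = tp / len(predicted_pos) if predicted_pos else None
        recall = tp / total_pos if total_pos else None
        rows.append((t, precision, recall))
    return rows


# Hypothetical scores and labels, for illustration only.
toy_scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_thresholds = [i / 10 for i in range(0, 11)]
sweep = precision_recall_sweep(toy_scores, toy_labels, toy_thresholds)
for threshold, precision, recall in sweep:
    print(threshold, precision, recall)
```

A high-precision operating point would then be chosen from these rows by picking a threshold at which precision is acceptably high, at the cost of lower recall.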
The MPM interactome has 1,387 known PPIs and 367 novel PPIs among the 62 MPM-associated genes and 1,620 interactors (Figure 1 and Data File 2). Nearly half of the MPM genes had fewer than 10 known PPIs each, whereas 150 novel PPIs are predicted for these together (Figure 2).
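As an illustration of how such an interactome can be assembled and summarized, the following sketch merges known and predicted PPI lists into one undirected network and reports the number of novel PPIs for seed genes that have few known PPIs. The function names are hypothetical, and the handful of pairs used are examples mentioned in this paper rather than the full data set in Data File 2.

```python
from collections import defaultdict


def build_interactome(known_ppis, novel_ppis):
    """Map each protein to its known and novel partners (undirected edges)."""
    partners = defaultdict(lambda: {"known": set(), "novel": set()})
    for kind, pairs in (("known", known_ppis), ("novel", novel_ppis)):
        for a, b in pairs:
            partners[a][kind].add(b)
            partners[b][kind].add(a)
    return partners


def sparsely_characterized(partners, seed_genes, max_known=10):
    """Seed genes with fewer than max_known known PPIs, with novel PPI counts."""
    return {
        g: len(partners[g]["novel"])
        for g in seed_genes
        if len(partners[g]["known"]) < max_known
    }


# A few pairs named in the text, used here only as a toy input.
known = [("BAP1", "BRCA1"), ("NEDD9", "LYN")]
novel = [("BAP1", "PARP3"), ("ALB", "KDR"), ("ALB", "PDGFRA"), ("HMGB1", "FLT1")]
net = build_interactome(known, novel)
summary = sparsely_characterized(net, ["BAP1", "LYN", "KDR"])
print(summary)
```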
Experimental Validation of Selected PPIs
We carried out experimental validations of five predicted PPIs, namely, BAP1-PARP3, ALB-KDR, ALB-PDGFRA, CUTA-HMGB1 and CUTA-CLPS. These PPIs were chosen for their biological relevance as well as for their proximity to MPM genes. The first three are novel interactions of MPM genes, and the last two are in close proximity to multiple MPM genes in this biophysical interaction network. All 5 PPIs were validated using protein pull-down followed by protein identification. The pulled-down sample was analyzed with either mass spectrometry (SF Table 1) or a size-based protein detection assay (see Methods) to identify whether the prey protein had been pulled down along with the bait protein. Each bait protein was also paired with a random prey protein to serve as a control (specifically, BAP1-phospholamban, ALB-FGFR2 and CUTA-FGFR2). All five predicted PPIs were validated to be true, while the control pairs tested negative. In addition to these 5 PPIs, another PPI from the MPM interactome, namely HMGB1-FLT1, was validated in our prior work through co-immunoprecipitation (18).
We hypothesize that the BAP1-PARP3 interaction may enhance cancer growth in MPM. BAP1 is a tumor suppressor protein playing a role in cell cycle progression, repair of DNA breaks, chromatin remodeling, and gene expression regulation, and variants in BAP1 have been implicated in hereditary and sporadic mesothelioma (29). PARP3 is involved in DNA repair, regulation of apoptosis and maintenance of genomic stability by stabilizing the mitotic spindle, and maintaining telomere integrity (30). In clinical trials (31), PARP inhibitors were found to influence cancers in which mutations in BRCA1 or BRCA2 are observed. BAP1 interacts with BRCA1, another tumor suppressor protein, inhibiting breast cancer growth (14). In the absence of BRCA1 activity or with a perturbation in its interaction with BAP1, cancerous growth is enhanced (31). Thus, the novel interaction of BAP1 with PARP3 in cancerous cells may promote cancerous growth, possibly through regulation of DNA repair and apoptosis. BAP1 and PARP3 were found to be mildly overexpressed in sarcomatoid MPM tumors compared with normal pleural tissue (log2(fold change), or log2FC, = 0.575, p-value = 0.028; and log2FC = 0.695, p-value = 0.0212, respectively) (GSE42977 (32)). Low levels of ALB have been correlated with poor prognosis in MPM patients (33). The two MPM genes that ALB is predicted to interact with, KDR and PDGFRA, are members of the PI3K/AKT pathway, which is aberrantly active in mesothelioma (34). High expression of CUTA has been correlated with favorable prognosis in lung cancer (Pathology Atlas). It was found to be overexpressed in MPM tumors vs. normal pleura (log2FC = 0.871, p-value = 0.0039) (GSE2549 (35)) and in MPM tumors vs. other thoracic tumors (log2FC = 0.454, p-value = 0.0029) (GSE42977 (32)). CLPS inhibits metastasis of the melanoma cell line, B16F10, to lungs by blocking the signaling pathway involving β1 integrin, FAK and paxillin (36).
CLPS has a novel interaction with NEDD9, which has been shown to mediate β1 integrin signaling and promote metastasis of non-small cell lung cancer cells (37). CD26, a cancer stem cell marker of malignant mesothelioma, has been shown to associate with the integrin α5β1 (or ITGA5, a novel interactor of the MPM gene, FGFR2) and promote cell migration and invasion in mesothelioma cells (37). Another cancer stem cell marker of malignant mesothelioma, CD9, inhibits this metastatic effect mediated by CD26. Depletion of CD26 and CD9 was shown to lead to decreased and increased expression, respectively, of NEDD9 and FAK in mesothelioma cell lines, hinting at the involvement of NEDD9 in mesothelioma tumor invasiveness (37). NEDD9 has a known interaction with LYN, an MPM gene, shown to play a negative role in the regulation of integrin signaling in neutrophils (38). CUTA has a novel interaction with HMGB1, which has been shown to activate the integrin αMβ2 (or ITGAM, a novel interactor of the MPM gene, TYMS) and the cell adhesion and migratory function of neutrophils mediated by αMβ2 (39). HMGB1 also has a novel interaction with the MPM gene, FLT1, shown to be involved in the migration of multiple myeloma cells by associating with β1 integrin, and mediating PKC activation (40).
Web Server
We made the MPM interactome available on a webserver called Wiki-Pi MPM (http://severus.dbmi.pitt.edu/wiki-MPM). It has advanced-search capabilities, and presents comprehensive annotations, namely their Gene Ontology annotations, diseases, drugs and pathways, of the two proteins of each PPI side-by-side. Here, a user can query for results such as “show me PPIs where one protein is involved in mesothelioma and the other is involved in immunity”, and then see the results with the functional details of the two proteins side-by-side. The PPIs and their annotations also get indexed in major search engines like Google and Bing, thus a user searching for ‘KDR and response to starvation’ would find the PPIs KDR-CAV1 and KDR-ALB, where the interactors are each involved in ‘response to starvation’. This is a unique feature of this web server, as such results are not supported by any other PPI web database. The novel PPIs have a potential to accelerate biomedical discovery in mesothelioma, and making them available on this web server brings them to the biologists in an easily-discoverable and usable manner.
Analysis with other High-Throughput Data
The interactome of MPM genes may serve as a vehicle to piece together an integrated view of functional interconnections among genes linked to MPM through various high throughput studies. We analyzed the overlap of the interactome with various high throughput datasets such as MPM-associated genetic variants, differential expression, and methylation in MPM, correlation of gene expression with lung cancer prognosis and differential gene expression on asbestos exposure, details of which are presented here.
We collected the genetic variants identified through mutational profiling of MPM tumors in Bueno et al. (41) and found that 253 genes in the MPM interactome had either germline mutations, or somatic single nucleotide variants (SNVs) or indels (insertions or deletions) (Data File 3). Of these 253 genes, 39 were novel interactors of MPM genes. MGMT carried germline mutations, while the following 38 genes carried somatic mutations: KIAA1524, SLC20A1, LATS2, ASTN2, BARX1, BRD2, CALML5, CAPRIN1, CLK1, CPS1, DPYD, EIF3H, EPB41L3, GMPS, GPR12, ITGAM, KMT2D, KRT4, MGAT4A, NBR2, NDUFV2, NFIB, NFX1, NUDC, PLCL1, PRKAG1, PRMT1, PTPRT, PTRH2, RBBP6, SGK3, SMCHD1, SPOCK1, TMPRSS15, TNC, XPO4, ZNF687 and PRDM2. Of these 39 novel interactors, twelve interact with MPM genes that also harbored a genetic variant. The PPIs in which both genes carried genetic variants are: CDKN2A-NFX1, FLT1-LATS2, TUBA3C-XPO4, PDGFRA-SPOCK1, TYMS-SMCHD1, TYMS-EPB41L3, GART-TMPRSS15, TYMS-NDUFV2, TYMS-ITGAM, RRM2-BARX1, RRM2-MGAT4A and ATIC-CPS1.
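Selecting the PPIs in which both partners carry a variant amounts to a simple filter over the list of pairs. The sketch below is a hypothetical illustration; the function name is ours, and the toy variant set is a small subset of the genes named above, so pairs filtered out here are excluded only because of the toy input.

```python
def ppis_with_both_variant(ppis, variant_genes):
    """Keep only the PPIs in which both partners are in the variant gene set."""
    return [(a, b) for a, b in ppis if a in variant_genes and b in variant_genes]


# Toy input: four novel PPIs named in this paper and a partial variant gene set.
toy_ppis = [
    ("CDKN2A", "NFX1"),
    ("FLT1", "LATS2"),
    ("FLT1", "HMGB1"),
    ("BAP1", "PARP3"),
]
toy_variant_genes = {"CDKN2A", "NFX1", "FLT1", "LATS2"}
both_variant = ppis_with_both_variant(toy_ppis, toy_variant_genes)
print(both_variant)
```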
We collected the methylation profile of pleural mesothelioma (42), and found nine novel interactors to be hypomethylated in pleural mesothelioma compared with non-tumor pleural tissue, namely, ACVR1B, CDKN2B, IL6, MGMT, NRG1, OAT, PHLDA2, PLAUR and TNC (SF Table 2). Some of them have little or no expression in lung tissue but are overexpressed in MPM. PLAUR is a prognostic biomarker of human MPM, and is also associated with shorter survival time in a rat xenograft model of malignant mesothelioma (43). Similarly, FGFR1 and its novel interactor NRG1 had elevated mRNA expression in H2722 mesothelioma cell lines and in MPM tissue, both contributing to increased cell growth under tumorigenic conditions (44, 45). TNC, which contributes to invasive growth, is a prognostic biomarker overexpressed in MPM and in malignant pleural effusion, with low expression in normal lung tissues (46, 47). Thus, these novel interactors, which are not normally expressed in lung tissue, may be hypomethylated in MPM, leading to their overexpression and contributing to MPM etiology. We computed the overlap of the MPM interactome with genes differentially expressed in mesothelioma tumors vs. normal pleural tissue adjacent to tumor (GSE12345 (48)) and found it to be statistically significant (genes with fold-change >2 or <½ were considered overexpressed and underexpressed, respectively, at a p-value <0.05). 336 of the 1,682 genes in the MPM interactome overlapped with this dataset (p-value of overlap = 2.263E-13), of which 53 genes were novel interactors. Similarly, 656 of the 1,682 genes in the MPM interactome were differentially expressed in MPM tumors vs. other thoracic cancers such as thymoma and thyroid cancer (GSE42977 (32)) (p-value of overlap = 1.36E-42). 113 of the 367 novel interactors were differentially expressed in this dataset (p-value of overlap = 0.0034).
This shows that the MPM interactome is enriched with genes whose expression helps in distinguishing MPM from other thoracic tumors and also with genes differentially expressed in mesothelioma tumors compared with adjacent normal pleural tissue (Figure 3 and Data File 3). 325 genes in the MPM interactome have high/medium expression in normal lung tissue (median transcripts-per-million (TPM)≥9), using RNA-sequencing data available in GTEx (Figure 3 and Data File 3) (49). Of these, 61 were novel interactors.
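The significance of an overlap between two gene sets is commonly assessed with a hypergeometric test. The sketch below shows that calculation under stated assumptions: the text above does not specify the exact test or the size of the background gene universe, so the function name and the toy numbers are illustrative only and are not a reproduction of the reported p-values.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, set_a, set_b, overlap):
    """P(X >= overlap) when set_b genes are drawn at random from the universe.

    universe: number of genes in the background.
    set_a:    number of genes in the interactome.
    set_b:    number of genes in the comparison set (e.g. differentially expressed).
    overlap:  observed number of genes shared by the two sets.
    """
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the upper tail of the hypergeometric distribution with exact integers.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers, for illustration only.
p_toy = hypergeom_overlap_pvalue(universe=20, set_a=5, set_b=5, overlap=3)
print(p_toy)
```

Using exact integer binomial coefficients avoids floating-point underflow for the very small p-values that arise with genome-scale sets.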
According to Pathology Atlas data, 63 genes from the interactome were found to be those whose high expression is positively correlated with unfavorable prognosis for lung cancer (odds ratio = 2.08; p-value = 1.91E-08) (50). Of these, 10 are novel interactors, namely, SPOCK1, SLC7A5, SCARB1, PLIN3, PLAUR, PIEZO1, KRT6A, GJB3, B3GNT3 and ARL2BP. In small cell lung cancer cells, expression of ARL2BP has been shown to be associated with high expression of YAP, which has oncogenic action in lung cancer (Data File 2) (51). In our interactome, ARL2BP has been predicted to interact with FLT1, a VEGF receptor expressed in MPM cells. VEGF levels in the serum of MPM patients have been inversely correlated with patient survival (i.e., higher levels indicate unfavorable prognosis), and non-small cell lung cancer tumors expressing FLT1 have been associated with higher malignancy and poor prognosis (52, 53).
A recent study has reported the genes that were differentially expressed in lungs of mice exposed to crocidolite and erionite fibers compared to a control group (54). Crocidolite and erionite are asbestos or asbestos-like particles that are capable of inducing lung cancer and/or mesothelioma in humans and animal models (54). Out of the 1,710 genes differentially expressed on crocidolite exposure, 160 genes were part of the MPM interactome (p-value of overlap = 3.63E-05), and 6 out of 78 genes differentially expressed on erionite exposure occurred in the MPM interactome (Figure 4 and Data File 3).
Pathway analysis
We compiled the list of pathways that any of the proteins of the MPM interactome are associated with, using the IPA suite (55). A number of pathways with highly significant association with genes in the MPM interactome were identified, such as NF-kB signaling (p-value = 1.25E-39), PI3K/AKT signaling (p-value = 1.58E-36), VEGF signaling (p-value = 3.98E-36) and natural killer cell signaling (p-value = 6.3E-32) (Table 1 and Data File 4). The top 30 pathways by statistical significance of association are also shown in Figure 5A. These pathways are highly relevant to mesothelioma etiology. For example, the PI3K/AKT signaling pathway, which regulates the cell cycle and is involved in cell proliferation, becomes aberrantly active in MPM (Figure 5B) (56). The supplementary data (Data Files 2 and 4) made available here allow a cancer biologist to study PPIs, including hitherto unknown novel PPIs, that connect MPM genes to a pathway that they are interested in studying.
Potentially Repurposable Drugs
Our previous work on schizophrenia interactome analysis led to the identification of drugs potentially repurposable for schizophrenia, one of which is currently in clinical trials (57). Following this methodology, we constructed the MPM drug-protein interactome, which shows the drugs that target any protein in the MPM interactome. There are 513 unique drugs that target 206 of these proteins (of which 28 are novel interactors that are targeted by 147 drugs) (Figure 6 and Data File 5). Using an approach of comparing differential expression induced by the disease versus a drug (58), we identified five drugs that could be potentially repurposable for MPM. These are: cabazitaxel, a cancer drug used in the treatment of refractory prostate cancer; primaquine and pyrimethamine, two anti-parasitic drugs; trimethoprim, an antibiotic; and gliclazide, an anti-diabetic drug. The method used for identifying repurposable drugs is detailed in Supplementary Methods.
The comparison of differential expression induced by the disease versus a drug (58) was carried out using the BaseSpace correlation software (https://www.nextbio.com) (59) (Supplementary Methods). Drugs were selected based on whether they were already tested against lung cancer in clinical trials and/or showed overall negative correlation with lung cancer expression studies, because mesothelioma and lung cancers have been shown to share common pathways that are initiated on exposure to asbestos fibres in mesothelial cells and lung epithelial cells, respectively (1).
Another criterion used was whether the genes targeted by the drugs showed high differential expression in MPM tumours/cell lines (GSE51024 (60) and GSE2549 (35)). Although in each case there would be some genes that are differentially expressed in the same direction for both the drug and the disorder (i.e. both the drug and the disease cause some genes to be overexpressed, or both cause other genes to be underexpressed), the overall effect on the entire transcriptome is anti-correlated. A correlation score is generated by the tool based on the strength of the overlap between the two datasets. Other statistical criteria, such as correction for multiple hypothesis testing, are applied and the correlated datasets are then ranked by statistical significance. A numerical score of 100 is assigned to the most significant result, and the scores of the other results are normalized with respect to the top-ranked result. We excluded drugs with unacceptable toxicity (e.g. minocycline) or unsuitable pharmacokinetics. The final list comprised 15 drugs, of which 11 have already been tested against mesothelioma in clinical trials/animal models, and several of them were found to display clinical activity. Gemcitabine and pemetrexed are being used as first-line therapy for mesothelioma, in combination with cisplatin (61, 62). Ipilimumab has been identified as a potential second- or third-line therapy in combination with nivolumab (63). Ixabepilone stabilizes cancer progression for up to 28 months (64). Zoledronate, which showed modest activity in MPM, induced apoptosis and S-phase arrest in human mesothelioma cells and inhibited tumor growth in the pleural cavity of an orthotopic animal model (65, 66). Sirolimus/cisplatin increased cell death and decreased cell proliferation in MPM cell lines (67). α-tocopheryl succinate increased survival of orthotopic animal models of malignant peritoneal mesothelioma (68).
Testing of vitamin E and its analogs is being carried out in various pre-clinical settings (69, 70). Eliminating those drugs which are being or have already been tested in mesothelioma with varying results, we arrived at a list of 5 potentially repurposable drugs, in descending order of negative correlation scores: pyrimethamine, cabazitaxel, primaquine, trimethoprim and gliclazide (Table 2). Cabazitaxel targets the MPM genes, TUBB1 and TUBA4A, and was effective in treating NSCLC that was resistant to docetaxel, a drug that targets TUBB1 along with other known interactors of MPM genes (71). Pyrimethamine and trimethoprim target two MPM genes involved in folate metabolism that were highly differentially expressed in MPM tumors (GSE51024 (60)): TYMS (log2FC = 1.82, p-value = 4.10E-17) and DHFR (log2FC = 0.89, p-value = 1.20E-14). MPM tumors have been shown to be responsive to anti-folates (72). Primaquine targets KRT7, a novel interactor of KRT5, whose high expression has been correlated with tumour aggressiveness and drug resistance in malignant mesothelioma (73-75). Primaquine may be repurposed for MPM treatment at least as an adjunctive drug with pemetrexed, the drug currently used for first-line therapy. Primaquine enhanced the sensitivity of the multi-drug resistant cell line KBV20C to cancer drugs (76). Gliclazide is an anti-diabetic drug that inhibits VEGFA, a known interactor of KDR, which is significantly upregulated in MPM tumours (log2FC = 1.83, p-value = 0.0018). Gliclazide inhibits neovascularization, a process mediated by VEGF (77). High levels of VEGF have been correlated with both asbestos exposure in MPM and an advanced stage of the disease (78, 79). Glibenclamide, a drug with a mechanism of action similar to that of gliclazide, increases caspase activity in MPM cell lines and primary cultures, leading to apoptosis mediated by TRAIL (TNF-related apoptosis inducing ligand) (80).
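The idea of ranking drugs by how strongly their expression profile opposes the disease profile, as described in this section, can be illustrated with a simplified sketch. This is not the scoring algorithm of the BaseSpace correlation software, which is not described here; the direction-counting score, the function names and the two drug profiles are hypothetical. Only the three disease log2FC values are taken from the text above.

```python
def direction_score(disease_fc, drug_fc):
    """Score in [-1, 1]; negative means the drug mostly opposes the disease."""
    shared = [g for g in disease_fc if g in drug_fc]
    if not shared:
        return 0.0
    same = sum(1 for g in shared if disease_fc[g] * drug_fc[g] > 0)
    opposite = sum(1 for g in shared if disease_fc[g] * drug_fc[g] < 0)
    return (same - opposite) / len(shared)


def normalize_to_100(raw_scores):
    """Rescale so that the largest-magnitude score becomes +/-100."""
    top = max(abs(s) for s in raw_scores.values())
    if top == 0:
        return {name: 0.0 for name in raw_scores}
    return {name: 100 * s / top for name, s in raw_scores.items()}


# Disease log2FC values as reported above; drug profiles are hypothetical.
disease = {"TYMS": 1.82, "DHFR": 0.89, "VEGFA": 1.83}
drugs = {
    "drug_a": {"TYMS": -1.0, "DHFR": -0.5, "VEGFA": 0.2},
    "drug_b": {"TYMS": -0.8, "DHFR": -0.3, "VEGFA": -1.1},
}
raw = {name: direction_score(disease, profile) for name, profile in drugs.items()}
normalized = normalize_to_100(raw)
print(normalized)
```

In this toy example, the drug whose profile opposes the disease for every shared gene receives the top-ranked normalized score, and the other drug is scaled relative to it.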
Discussion
Currently, only a handful of genes, such as BAP1, CDKN2A and NF2, are being actively studied for their relation to mesothelioma. In an effort to shed light onto other genes associated with MPM, whose functions are uncharacterized, we assembled the ‘MPM interactome’ with ~1,300 previously known PPIs and 367 computationally predicted PPIs. The primary objectives of this paper are to make the interactome and its annotations available to researchers and to demonstrate the power of interactome-scale analyses to generate biological results that contribute to understanding the underlying etiology as well as those that may be directly translated to clinical research. We demonstrated the biological validity of the interactome by showing that it had highly significant overlaps with relevant biological datasets such as MPM-associated genetic variants, differential expression and methylation in MPM, correlation of gene expression with lung cancer prognosis and differential gene expression on asbestos exposure. Next, pathway analysis on the interactome revealed several cancer-related pathways significantly enriched in the interactome. We then expanded the MPM interactome to include drugs that target the proteins in the interactome. An integrated computational analysis with this drug-protein interactome, based on comparing differential expression induced by the disease versus a drug led us to shortlist five potentially repurposable drugs for MPM - an example of a clinically translatable result.
The interactome accelerates discovery by revealing genes or pathways which are not intuitively associated with MPM but are valuable to understanding its underlying biology, such as the association of axon guidance signaling (p-value = 2.51E-37), and by allowing biologists to formulate novel biological hypotheses. An illustrative example demonstrating the use of the interactome and its annotations to formulate a novel hypothesis is shown in Figure 7. The hypothesis drawn here is that the interplay of genes in semaphorin signaling could be contributing to the development of MPM. While interactome-level analysis revealed that the axon guidance signaling pathway is significantly associated with MPM, a closer look at an individual novel PPI (TUBA4A-DPYSL2) in this pathway helped us to generate a testable hypothesis that inactivation of SEMA3A-mediated signaling may promote microtubule assembly and increase cell proliferation in MPM.
Specifically, genes involved in axon guidance signaling were extracted from the interactome. Literature-based evidence was then assembled to reconstruct these pathway-relevant interactions under non-disease conditions. Further MPM-related biological evidence (MPM-associated genetic variants and gene expression changes associated with MPM, lung cancer or asbestos exposure) related to the novel PPI and other relevant proteins was considered from Figure 3 and Data File 2 to hypothesize pathway perturbations in MPM. Below, we present some testable hypotheses of novel interactions in the MPM interactome. Semaphorins are known to be involved in the development of non-neural organs and are also implicated in the repulsion of growth cones of nerves during axon guidance (81). SEMA3A is an axon guidance molecule that inhibits branching morphogenesis in lung organ culture and is expressed in airway and alveolar epithelial cells of lungs (82). It is a tumor suppressor gene, underexpressed in malignant mesothelioma and in non-small cell lung cancer (83, 84). DPYSL2 (‘CRMP2’) is a known mediator of SEMA3A, expressed during the embryonic and alveolar stages of lung development in mice, alluding to its role in pulmonary innervation and alveolarization (85). SEMA3A signaling through the receptor complex NRP1/PLXNA1 inactivates DPYSL2 by phosphorylation, decreasing its binding affinity to tubulin molecules, preventing microtubule assembly and thereby interfering with proliferation of cells (86). Conversely, lower expression of SEMA3A in malignant mesothelioma may allow activation of DPYSL2, resulting in increased proliferation. In line with this, DPYSL2 is overexpressed in mesothelioma and in non-small cell lung cancer, possibly accompanied by a change in its phosphorylation status (87). Its novel interaction with TUBA4A (Figure 8) may contribute to promotion of microtubule assembly and increased cell proliferation.
The known interaction of TUBA4A with TBCE may contribute to microtubule assembly because TBCE has been shown to help in proper folding of β-tubulin molecules to maintain networks of neuronal microtubules (88). Overexpression of NCALD, a known interactor of TUBA4A, has been negatively correlated with neurite outgrowth (89). This interaction may be disrupted in MPM, as NCALD was found to be significantly downregulated in asbestos-related histology of lung cancer (90). Reduced activity of DPYSL2 also increases the expression of CDKN1A (‘p21’) and TP53 (‘p53’), causing cell cycle arrest and apoptosis through activation of CASP3. Hence, increased DPYSL2 activity may inhibit apoptosis (87). CDK5, a downstream effector in SEMA3A-mediated signaling, and NF2, a tumor suppressor gene inactivated in MPM, inhibit the RAC/FAK pathway regulating actin filament assembly required for cell migration and tumor invasiveness (91). This inhibition of actin assembly may be removed due to dampening of SEMA3A-mediated signaling, leading to inactivation of CDK5, and independent inactivation of NF2 in MPM, promoting cell migration. Upregulation of the RAC/FAK pathway and increased tumor invasiveness have been linked to inactivation of NF2 in malignant mesothelioma (92). The interaction of ACTB, which is usually synthesized in response to guidance cues, with CFL1, which promotes disassembly of actin filaments, may also be disrupted in MPM (93, 94). Overexpression of CFL1 suppresses the growth of non-small cell lung cancer tumors through suppression of cell motility and invasion (95). Thus, we conclude that the novel interaction of DPYSL2 with TUBA4A and its involvement in SEMA3A-mediated signaling may shed light on the role played by axon guidance signaling in mesothelioma.
The work flow illustrated here is applied to other proteins, PPIs and pathways in the MPM interactome to draw the following additional conclusions.
Novel interactions of FLT genes perturbed on crocidolite exposure may influence angiogenesis and metastasis
The novel interactions of FLT1 and FLT3 may provide insights into various aspects of angiogenesis that affect the pathophysiology of MPM in the context of differential expression upon exposure to asbestos or asbestos-like particles. A study has demonstrated the formation of vascular networks or tubules in human endothelial cells on exposure to crocidolite fibers (96). Inhibition of EGFR signaling, which regulates angiogenic growth factors, resulted in a significant reduction in the release of IL-8, VEGFA, VEGFR1 and VEGFR2 (96). Novel interactions of FLT1 with HMGB1 and LATS2, and of FLT3 with FMO1 and NFIB, may perturb the VEGF signaling pathway on crocidolite exposure (Figure 8). LATS2 is a tumor suppressor gene that is inactivated in one-third of mesothelioma cell lines or subjected to copy number deletion on exposure to asbestos fibers (97, 98). MicroRNA miR-93 promotes angiogenesis and metastasis of tumors by suppressing the expression of LATS2, thereby increasing cell survival, tube formation and invasion (99). The effect of LATS2 on angiogenesis and metastasis was previously unexplained.
The novel interaction of LATS2 with FLT1 uncovered in our study opens up the possibility of LATS2 being involved in VEGF signaling. The expression of miR-93 in a mouse metastatic model promoted the metastasis of tumor cells to lung tissue (99). miR-93 was also upregulated in non-small cell lung cancer (100). MicroRNAs serve as potential diagnostic and prognostic markers of mesothelioma and efforts have been underway to correlate their expression with asbestos exposure (90, 101). Hyperoxia in tissue has been shown to enhance formation of new blood vessels (102). In NFIB hemizygous mice, the expression of HIF1α, which targets pro-angiogenic factors, was significantly increased under hyperoxia conditions (103).
Novel interactions of PROTOR may contribute to aberrant activation of PI3K/AKT signaling
The novel interactions of PRR5 (or ‘PROTOR’) with WNT7B, SCUBE1 and TTC38 may shed light on the mechanism by which PI3K/AKT signaling becomes aberrantly active in MPM, which has not been understood before (Figure 8) (56). PRR5 has been implicated in malignant mesothelioma (104). It binds directly to mTOR and RICTOR within the mTORC2 complex, which phosphorylates and activates AKT in a TSC1/TSC2 complex-dependent manner, promoting cell growth and proliferation (105, 106). A role for PRR5 in the regulation of RICTOR-mediated recruitment of mTOR substrates or other signaling molecules has been conjectured (107). Silencing of PRR5 inhibits phosphorylation of AKT and S6K1, both of which are downstream targets of mTORC2 (107). Silencing of RICTOR also inhibits AKT activity (107). We hypothesize that the activation of AKT is brought about by WNT7B binding to PRR5, aided by SCUBE1 and TTC38.
It is known that Wnt ligands activate other signaling molecules including mTORC1 and mTORC2 (108, 109). WNT7B, which is implicated in the proliferation of the lung mesenchyme, is also the only Wnt protein that is expressed in the airway epithelium (109, 110). The novel interactor SCUBE1 belongs to a family of secreted proteins involved in inflammatory pathways and organ development, and is implicated in the pathogenesis of lung adenocarcinoma and non-small cell lung cancer (111, 112). SCUBE1 forms a hetero-oligomer with SCUBE2 (112), which promotes AKT activity (113). Proteins that possess tetratricopeptide repeat domains provide binding surfaces for PPIs and also promote the formation of multi-protein complexes (114). So in our case, the novel interaction of PRR5 with TTC38 may be serving to facilitate the formation of such a multi-protein complex that also constitutes WNT7B and SCUBE1. TTC38 was differentially expressed in a spheroid model of mesothelioma (115, 116). TTC38 interacts with TK1 which is involved in non-canonical Wnt pathway (117).
Novel interactions of ATIC may underlie malignant transformation of pleural cells aided by biosynthetic-cell survival pathways
Cancerous cells have different metabolic needs compared to normal cells. In order to support over-proliferation of cells, nutrients are consumed excessively and channeled into biosynthetic pathways (118). In such a scenario, many important oncogenic pathways converge to modulate metabolism of tumor cells (119). This convergence may even serve specific purposes over the course of cancer progression such as malignant transformation of tumor cells (Figure 8) (119). ATIC is a bi-functional enzyme that catalyzes the last two steps in biosynthesis of purines in addition to being involved in metabolism of folates (120). CPS1 is a mitochondrial enzyme that catalyzes the first committed step of the urea cycle and is critical in preventing accumulation of toxic ammonia. When CPS1 was knocked down in LKB1-inactivated lung adenocarcinoma cells, there was a reduction in the level of metabolites associated with purine synthesis, and in cell growth (121). LKB1 is a master kinase found to be mutated in 20% of non-small cell lung cancers and in mesothelioma cell lines. It regulates the activity of AMPK-related kinases, which are similar to AMPK. AMPK acts as a sensor of energy status in cells upon phosphorylation (122-124). AMPK blocks growth of cancer cells, inhibits anabolic processes and instead promotes catabolic processes in order to conserve cellular energy (125). LKB1-mutant cells showed a decrease in AMPK phosphorylation even upon treatment with a known AMPK activator, AICAR, which is an intermediate of the purine biosynthesis pathway acted upon by ATIC, the rate-limiting enzyme of this pathway (126). Inducing overexpression of CPS1 in LKB1-inactivated lung adenocarcinoma cell lines or CPS1-negative lung adenocarcinoma cell lines did not induce cell growth (121). This suggested that the role played by CPS1 in cell growth might be complemented by other genes in CPS1-negative cell lines.
CPS1 overexpression in LKB1-inactivated cells failing to induce cell growth may indicate that purine biosynthesis (even though shown to be affected by CPS1 knockdown) was regulated differently in cancer cells (121). The fact that the treatment of LKB1-mutant cells with AICAR failed to activate AMPK, which is necessary for inactivation of the purine biosynthesis pathway and thereby control of cell growth, might also point at a different pathway or set of interactions regulating purine biosynthesis and growth of these cancerous cells (126). Novel interactions of ATIC may throw some light in this direction, viz., a convergence of a metabolic pathway, purine biosynthesis in this case, with oncogenic pathways. ATIC has been predicted to interact with CIP2A. Close correlation has been found between expression levels of CIP2A and c-MYC, an oncogene that promotes cell growth (127). CIP2A inhibits PP2A which can then no longer de-phosphorylate and inactivate members of the MAP kinase family (128). CIP2A is also required for anchorage-independent growth and malignant transformation of human cells (129). A gene-gene interaction (GGI) in which MYC overexpression causes lethality in cells in which CPS1 has been mutated was previously reported (130). This GGI between MYC and CPS1, and the novel PPI between ATIC and CPS1 may be perturbed in MPM. Another novel interactor of ATIC is MAP3K7. Members of the MAP kinase pathway are well-characterized as downstream targets of IRS proteins contributing to malignant phenotype in breast cancer (131). MPM cell lines showed increased MAPK activity in response to IGF1 treatment (132). Another novel interactor of ATIC is DES, a biomarker used to distinguish between malignant mesothelioma and reactive mesothelioma cells, raising the suspicion that these novel interactions may serve malignant transformation of pleural cells (133). 
A combination of positive epithelial membrane antigen and negative DES as identified from immunohistochemistry is an indicator of malignant mesothelioma (133). VWC2L, another novel interactor of ATIC interacts with CHEK1, a regulator of the cell cycle also involved in breast cancer metastasis and TRAF2, a protein that suppresses death receptor 5 enhancing invasion and metastasis of cancer cells (unpublished AP-MS results obtained from BioGRID) (134-136). Since Pemetrexed, a drug used in treatment of mesothelioma that targets ATIC along with other key enzymes involved in synthesis of nucleic acids, interferes with folate metabolism, the involvement of the latter in this convergence of pathways cannot be ruled out (137).
Novel interactions of FLT genes may underlie induction of angiogenic responses and de-regulation of endocytic receptor trafficking in mesothelioma
Cell survival and growth are intricately linked to inflammatory pathways in pleural mesothelioma (138). Asbestos induces necrosis of cells, causing them to release HMGB1, a protein that recognizes danger-associated molecular patterns (DAMPs), i.e., signals associated with cellular stress (139). HMGB1 then activates the inflammasome NALP3, which induces pro-inflammatory responses leading to secretion of IL-1β and TNFα and activation of NF-κB, all of which contribute to an increase in cell survival and growth (139). Thus, the role of HMGB1 as a pro-inflammatory cytokine has been critical to explaining the pathogenesis of mesothelioma.
The emerging role of HMGB1 as an inducer of angiogenic factors is more interesting in terms of drug development (140–142). Overexpression of HMGB1 increases the angiogenic potential of endothelial cells through stimulation of VEGFR (141). It induced the expression of proangiogenic factors such as VEGFA, and VEGF receptors/co-receptors FLT1, KDR and NRP1, and stimulated abnormalities in angiogenesis at microvascular level, apart from increasing vessel density and dilation (33, 143, 144). Studies reveal a complex interplay between HMGB1 and VEGF signaling in tumor cells and tumor associated macrophages. While it was shown that HMGB1 indirectly influences VEGF receptors, the novel interaction of HMGB1 with FLT1, which has been experimentally validated here, highlights the possibility of a direct connection between this cytokine and the VEGF receptors (Figure 8). HMGB1 expression parallels FGF activation of endothelial cells, and from its known PPIs in the interactome, we could find that the receptors FGFR1, FGFR2 and FGFR3 interacted with VEGF receptors through intermediaries, viz. NCK1 and GRB2 (143). In particular, GRB2 is an adaptor protein, which acts as a critical link between growth factor receptors and the Ras signaling pathway, and is also involved in downstream signaling of FGFR2 (145). NCK1 is another adaptor protein that associates with growth factor receptors such as KDR or their cellular substrates (146). HMGB1 is reported to interact with TLR4 receptor to mediate pro-inflammatory responses (147). In the interactome, TLR4 was found to interact with HSP90B1 that participates in VEGF signaling by stabilizing and folding other proteins (148). It appears that the receptor activity of TLR4 and its indirect interaction with FLT1 may be mediated by HSP90B1 and some kinases including SRC. It is known that HMGB1 increases permeability of endothelial cells through SRC family tyrosine kinases (149).
The novel interaction of both the receptors FLT1 and FLT3 with CDK8 may shed some light on the manner in which expression of proangiogenic factors in the VEGF signaling pathway may be regulated (Figure 8). HIF1A employs CDK8 for its downstream activities and overexpression of HIF1A has been correlated with increased expression of its target genes, viz. proangiogenic factors of the VEGF signaling pathway (150). In the interactome, HIF1A was found to interact with FLT1 through HSP90AA1, which participates in VEGF signaling and mediates proper folding of target proteins aided by other co-chaperones (148). A novel interaction between the receptors FLT1 and FLT3 was also found (Figure 8). Such receptor-receptor interactions are not without precedent: the orphan receptor kinase TIE1, involved in vascular development, was identified in endothelial cells to bind to TEK (151). It has been speculated that receptors that are physiologically related, such as VEGF receptors, may display weak interactions among their transmembrane domains (152).
VEGF signaling pathway is tightly regulated by processes such as receptor endocytosis, internalization and intracellular trafficking (153). Endosomes play a critical role in these processes by allowing signaling from their compartments, serving as scaffolds to facilitate the assembly and degradation of signaling complexes, and trafficking receptors to many subcellular locations (153). For example, VEGFR2 is rapidly internalized to endosomes and degraded on binding of VEGF (153). Similarly, the assembly of VEGFR2 is contingent upon endocytic trafficking (153). We think that the novel interactions of RASSF9 and ARL2BP with FLT1 may be relevant in this respect (Figure 8). RASSF9 is an endosomal protein that helps in the trafficking of PAM through the secretory or endosomal pathways (154). ARL2BP is a regulator of vesicle formation during intracellular trafficking and was found to be overexpressed in conditions of TLR4 dependent inflammation and tumorigenesis (155). Moreover, crocidolite fibers that induce changes in the expression pattern of HMGB1, a ligand of the TLR4 receptor, are actively transported in endosomes to secondary lysosomes located near the nucleus in a size-dependent manner in epithelial cells of lung (156).
Novel interactions of IFNAR1 and GART may point at the influence of immune system over metastatic pathways converging to serve tumor metabolism in mesothelioma
During the different stages of cancer progression, tumor cells adopt a number of strategies to evade surveillance by immune cells (157). Pleural effusions containing malignant mesothelial cells are replete with a wide range of immune cells that have infiltrated these tumors; yet, the immunological status of mesothelioma patients is tolerant towards these cancerous cells (158). The complex interplay of tumor-immune interactions enables tumors to evade immune cells. Some of the novel interactions in the MPM interactome may throw light on immunological pathways that are deregulated in cancer to aid processes such as tumor invasion and metastasis.
In mice lacking IFNAR1, loss of type-1 interferon signaling is sufficient to promote metastasis of breast cancer to lung, independently of the growth of primary tumors (159). Reduced expression of CD69 and IFNγ, two primary biomarkers of Natural Killer (NK) cells, also showed that the homeostasis of the NK population in the immune system which is dependent on signaling through IFNAR1 receptors was impaired in the process (159). The cytokine IL-2 is used as an immunotherapeutic agent to induce anti-metastatic and cytotoxic effects in cancer cells and has been approved for the treatment of metastatic renal cell carcinoma and melanoma. In mice lacking IFNAR1 receptors, immunotherapy using IL-2 was unsuccessful. This showed that the cytotoxic effect mediated by this cytokine was abolished in the absence of IFNAR1 signaling (160). The interleukin-2 receptor, IL2Rβ, was found to be significantly downregulated in a small-cell lung cancer cell line, compared with normal lung tissue (SF Table 3). The exact mechanism underlying this influence of the immune system over metastatic process was unknown. Novel interaction of IFNAR1 with GART and novel interactions of GART- a protein involved in de novo purine biosynthesis- with a host of proteins that regulate tumor invasion viz. TIAM1, NMI, PXN, KPNA2, TMPRSS15 and JUN may throw some light in this direction (Figure 8). The said novel interactions may depict a convergence of many oncogenic pathways to modulate metabolism of tumor cells and serve specific purposes, viz. malignant transformation of tumor cells, as has been mentioned previously for the novel interactions of ATIC. However, this convergence may also be immunologically regulated by NK-cell receptors, viz. 
KIR2DL3 which interacts with LCK, a known interactor of IFNAR1, and novel interactors of PDCD1 such as MCL1 and COPS8 (Figure 8) potentially serving as downstream targets (probably influenced by DHFR and its novel interactor ASGR1) with HLA-DQA1 as the ligand on the metastatic cell.
In non-small cell lung cancer, a higher proportion of NK cells express KIR2DL3, which can be induced by IL-2 and reduces with advancing disease (161, 162). LCK belongs to the family of SRC kinases that regulate cell proliferation, differentiation, survival and cytoskeletal alterations (163). HLA-DQA1 is significantly downregulated in small cell lung cancer cell lines compared to their expression in control tissue samples of lung or lymph node (SF Table 3). Hence, these novel interactions may point to NK-tumor cell interactions in metastatic locations of the lung or lymph nodes critical to the progression of MPM.
High expression of GART is observed in cell lines of non-small cell lung cancer; GART and other genes involved in purine biosynthesis such as PPAT, PAICS, PKM2 and ATIC have been positively correlated with increased proliferation of lung cancer cells (164). One of the novel interactors of GART is TMPRSS15, a type II transmembrane serine protease that takes part in epithelial-mesenchymal transitions and whose activity is dysregulated in cancer in order to degrade and remodel intercellular junctions and the extracellular matrix (165). Silencing of TMPRSS15 leads to tumor migration and matrix degradation in lung cancer cell lines (165). NMI, another novel interactor of GART, negatively regulates epithelial-mesenchymal transitions by inhibiting p65 acetylation in the NF-κB pathway (166). Knockdown of KPNA2, a novel interactor of GART, suppresses the proliferative and migratory abilities of lung cancer cells, and downregulation of its regulatory miRNA miR-26b enhances cell migration and invasion in vitro, and metastasis in vivo (167). One of the downstream targets of KPNA2, which is overexpressed in non-small cell lung cancer, is another novel interactor of GART called JUN, a component of the transcription factor AP-1 that promotes progression of the cell cycle and is involved in breakdown of the ECM (168). TIAM1, a novel interactor, is a guanine nucleotide exchange factor that takes part in the activation of c-Jun N-terminal kinase, p38 MAPK and ERKs, and regulation of genes involved in cellular migration, invasion and metastases (169). PXN, a novel interactor of GART, is a cytoskeletal protein involved in membrane attachment of actin at sites of cell adhesion to the extracellular matrix (169). Mutations in PXN are associated with lung adenocarcinoma and its overexpression induced by suppression of miR-218 is correlated with increased cell proliferation and invasion (169).
Overexpression of PDCD1, playing a key role in immune evasion and the formation of tumor microenvironment has been correlated with poor prognosis and high invasiveness in non-small cell lung cancer (170). MCL1, a novel interactor of PDCD1, is highly expressed in NK cells and upregulated by IL-15 through IL-2Rβ/γ (which is also a receptor of IL-2) (171). Post mortem analyses of mice having MCL1 deleted and injected with murine melanoma cells revealed overwhelming spread of melanoma cells to lung (171). miR-146a that targets COPS8, another novel interactor of PDCD1, suppresses the expression of tumor promoting cytokines and growth factors in gastric adenocarcinoma (172). DHFR, which is involved in folate metabolism and cell growth, has also been predicted to interact with COPS8 (173). Folate metabolism is critical to cancer development as is indicated by the efficacy of anti-folates such as Pemetrexed in treatment of cancer (174). Pemetrexed, a drug used for treatment of MPM in combination with Cisplatin, targets DHFR (62).
Both mesothelioma and lung cancers share common pathways that are initiated on exposure to asbestos fibers in mesothelial cells and lung epithelial cells respectively (175). So, RNA-Sequencing data obtained from pathology atlas for normal lung or lymph tissue were compared with data from the small cell lung cancer cell line, SCLC-21H. It was observed that all the genes that perform anti-metastatic functions in normal tissue were down regulated in SCLC-21H, viz. IFNAR1, LCK, HLA-DQA1 and NMI (SF Table 3). On the other hand, genes that perform pro-metastatic functions, viz. KPNA2, GART, PDCD1 and COPS8 were found to be upregulated (SF Table 3). On comparing RNA-sequencing data obtained from GSE9586 it was found that the natural killer receptor KIR2DL3 was significantly down regulated under conditions favorable to metastasis, which in this case is a metastatic breast cancer cell line in which the microRNA miR-335 is not expressed (Log2FC = −2.659, P-value = 0.018). miR-335 inhibits metastatic cell invasion by regulating a set of genes in tumors that are associated with risk of metastasis to distant sites (176).
Limitations of results and interpretations
It is beyond the scope of our expertise to validate the large number of computationally predicted PPIs in a tissue or cell line of interest. However, we demonstrated the validity of computational predictions on a small number of PPIs on purified proteins with appropriate controls. The computational model has also been validated through additional experiments previously and the novel PPIs predicted in other contexts have translated into results of biomedical significance (19-21). This will catalyze further investigations into these particular PPIs and may lead to biologically or clinically translatable results.
Secondly, biologists often formulate hypotheses around individual ‘components’ of a system, for example, a single protein or a pathway, based on their knowledge and by studying existing databases and literature; these hypotheses are then validated through resource-intensive and time-consuming experiments. In this paper, we presented a large set of PPIs relevant to MPM. By making these results available on a searchable web database, we enable biologists to choose the PPI of interest from the MPM interactome by querying for it (http://severus.dbmi.pitt.edu/wiki-MPM). Our website provides advanced search capabilities that allow a user to ask questions such as “show me the PPIs in which one protein is involved in mesothelioma and the other is involved in inflammation” and then see the results with the functional details of the two proteins side-by-side. This will help biologists to generate testable hypotheses around individual PPIs and advance the biology surrounding each PPI by performing suitable experiments. A further advantage is that these resources will also be indexed in popular internet search engines like Google and Bing, so the novel and known PPIs relevant to mesothelioma can also be found through them.
Translation of results for clinical applications
We presented 5 drugs, namely cabazitaxel, primaquine, pyrimethamine, trimethoprim and gliclazide, which may potentially be repurposed for treating MPM. Preclinical studies may be conducted in vitro to validate these computational results.
Conclusions
Diagnosis of mesothelioma, which has a long latency period of more than 30 years, is difficult due to a lack of distinctive symptoms in patients. It recurs after surgical procedures such as extrapleural pneumonectomy, radiation therapy and post-surgical chemotherapy (177). Moreover, the median survival time for mesothelioma is 12.1 months after first-line therapy with pemetrexed/cisplatin. An increase in its incidence has been predicted in western countries and economically emerging nations based on studies on asbestos utilization (177). In this scenario, it is essential that we accelerate biomedical discovery in this field, to catalyze the emergence of clinically translatable results. In this paper, we show that interactome-level analysis may be a step in the right direction. The MPM interactome with MPM-associated proteins and their interacting partners will help biologists, bioinformaticians and clinicians to piece together an integrated view on how genes associated with MPM through various high throughput studies are functionally linked. Such biological insights may lead to clinically translatable results such as testable hypotheses centered on individual protein-protein interactions based on pathway analysis, and drugs repurposable for mesothelioma.
Materials and Methods
Data collection
Genes associated with malignant pleural mesothelioma were obtained from IPA (Ingenuity Pathway Analysis) (178). A search on IPA using the keyword “malignant pleural mesothelioma” retrieved genes that are causally relevant to the disease. The genes are retrieved from the Ingenuity Knowledge Base, a structured collection of nearly 5 million findings with experimental basis which have been manually curated from biomedical literature or incorporated from other databases (178).
Prediction model-HiPPIP
PPIs were predicted by computing features of protein pairs and developing a random forest model to classify the pairwise features as interacting or non-interacting. Protein annotations that were used in this work are: cellular localization, molecular function and biological process membership, location of the gene on the genome, gene expression in hundreds of microarray experiments, protein domains and tissue membership of proteins. Computation of features of protein-pairs is described earlier in Thahir et al (28). A random forest with 30 trees was trained using the feature offering maximum information gain out of 4 random features to split each node; minimum number of samples in each leaf node was set to be 10. The random forest outputs a continuous valued score in the range of [0,1]. The threshold to assign a final label was varied over the range of the score for positive class (i.e., 0 to 1) to find the precision and recall combinations that are observed. This prediction model is referred to as High-confidence Protein-Protein Interaction Prediction (HiPPIP) model.
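The node-splitting rule described above (choose, among 4 randomly drawn features, the one offering maximum information gain, with at least 10 samples per leaf) can be illustrated with a minimal sketch. This is not the HiPPIP implementation: the function names, the threshold-style split on numeric features and the binary 0/1 labels are assumptions made only for illustration.

```python
import math
import random

def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(values, labels, cut):
    """Entropy reduction obtained by splitting samples on values <= cut."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

def best_split(rows, labels, n_candidates=4, min_leaf=10, rng=random):
    """Return (gain, feature index, cut) maximizing gain among random candidate features.

    rows: list of numeric feature vectors (hypothetical pairwise features).
    Returns None if no split leaves at least min_leaf samples on both sides.
    """
    n_features = len(rows[0])
    candidates = rng.sample(range(n_features), min(n_candidates, n_features))
    best = None
    for f in candidates:
        values = [r[f] for r in rows]
        for cut in sorted(set(values))[:-1]:
            n_left = sum(1 for v in values if v <= cut)
            if n_left < min_leaf or len(values) - n_left < min_leaf:
                continue  # both children must hold at least min_leaf samples
            gain = information_gain(values, labels, cut)
            if best is None or gain > best[0]:
                best = (gain, f, cut)
    return best
```

In a full random forest this selection would be repeated at every node of each of the 30 trees, and the fraction of trees voting for the positive class would give the score in [0,1].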
Evaluation of PPI prediction model
Evaluations on held-out test data showed a precision of 97.5% and a recall of 5% at a threshold of 0.75 on the output score. Next, we created ranked lists for each of the hub genes (i.e., genes that had more than 50 known PPIs), where we considered all pairs that received a score greater than 0.5 to be novel interactions. The predicted interactions of each of the hub genes were arranged in descending order of the prediction score, and precision versus recall was computed by varying the threshold on the predicted score from 1 to 0. Next, by scanning these ranked lists from top to bottom, the number of true positives versus false positives was computed.
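The ranked-list evaluation for a single hub gene can be sketched as follows. The function name and data layout are hypothetical, and the sketch assumes that any predicted partner absent from the set of known PPIs counts as a false positive, which is a conservative simplification because some unlabeled pairs may be genuine but undiscovered interactions.

```python
def scan_ranked_list(scored_pairs, known_partners):
    """Scan predictions in descending score order, tracking TP, FP, precision and recall.

    scored_pairs: list of (partner, score) tuples for one hub gene.
    known_partners: set of partners with a known PPI (treated as positives).
    Returns a list of (score, tp, fp, precision, recall), one entry per rank.
    """
    ranked = sorted(scored_pairs, key=lambda p: p[1], reverse=True)
    tp = fp = 0
    curve = []
    for partner, score in ranked:
        if partner in known_partners:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / len(known_partners) if known_partners else 0.0
        curve.append((score, tp, fp, precision, recall))
    return curve

# Toy example with placeholder partner names and made-up scores.
toy_curve = scan_ranked_list(
    [("A", 0.9), ("B", 0.8), ("C", 0.6), ("D", 0.3)],
    known_partners={"A", "C", "E"},
)
```

Each entry of the returned curve corresponds to placing the threshold just below that rank's score, so reading the list from top to bottom is equivalent to lowering the threshold from 1 towards 0.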
Novel PPIs in the MPM interactome
Each MPM gene, say Z, is paired with each of the other human genes (G1, G2 … GN), and each pair is evaluated with the HiPPIP model. The predicted interactions of each of the MPM genes (namely, the pairs whose score is greater than the threshold 0.5) were extracted. These PPIs, combined with the previously known PPIs of MPM genes collectively form the MPM interactome. Interactome figures were created using Cytoscape (179).
Note that 0.5 was chosen as the threshold not because it is the midpoint between the two classes, but because the evaluations with hub proteins showed that pairs receiving a score greater than 0.5 can be considered interacting pairs with high confidence. This choice was further supported by experimentally validating a few novel PPIs above this score.
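The assembly of the interactome described in this section can be sketched as below. Here `score_fn` is a hypothetical stand-in for the HiPPIP scoring model, and the gene names and scores in the toy example are placeholders rather than real data.

```python
def build_interactome(mpm_genes, all_genes, score_fn, known_ppis, threshold=0.5):
    """Pair each MPM gene with every other gene and keep high-scoring novel pairs.

    score_fn(a, b): hypothetical callable returning a score in [0, 1].
    known_ppis: iterable of (gene, gene) pairs already reported in databases.
    Returns (novel_pairs, known_pairs_involving_mpm_genes) as sets of frozensets.
    """
    known = {frozenset(p) for p in known_ppis}
    mpm = set(mpm_genes)
    novel = set()
    for z in mpm_genes:
        for g in all_genes:
            if g == z:
                continue  # skip self-pairs
            pair = frozenset((z, g))
            if pair in known or pair in novel:
                continue  # already known, or already scored via the other MPM gene
            if score_fn(z, g) > threshold:
                novel.add(pair)
    relevant_known = {p for p in known if p & mpm}
    return novel, relevant_known

# Toy example: placeholder genes and made-up scores.
toy_scores = {("Z1", "G1"): 0.9, ("Z1", "G2"): 0.5, ("Z1", "G3"): 0.2}

def toy_score(a, b):
    return toy_scores.get((a, b), toy_scores.get((b, a), 0.0))

novel, known = build_interactome(
    ["Z1"], ["Z1", "G1", "G2", "G3"], toy_score, [("Z1", "G3"), ("G1", "G2")]
)
```

Pairs are stored as unordered sets so that Z-G and G-Z are treated as the same interaction, and the strict inequality mirrors the "score greater than 0.5" criterion stated above.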
In vitro pull-down assays
An in vitro pull-down assay method was employed to determine physical interactions between proteins and to perform initial screening of some of the predicted novel protein-protein interactions. This technique relies on utilizing a tag-fused protein (e.g., His-tag, biotin-tag) immobilized on an affinity column or a resin as the ‘bait’ protein and a passing-through solution containing the ‘prey’ protein that binds to the ‘bait’ protein. The subsequent elution pulls down both the target (prey) and the tagged protein (bait) for further analysis by immunoblotting to confirm the predicted interactions. The pull-down assays to validate some of the chosen predicted interactions were conducted using the Pull-Down PolyHis Protein:Protein Interaction Kit (Pierce™) according to the manufacturer’s instructions. Briefly, the His-tagged bait proteins (i.e., CLPS, HMGB1, VEGFR2/KDR, PDGFRA) and untagged prey proteins (i.e., CUTA and ALB), purchased from either MyBiosource or Abcam, were diluted to the desired concentration (50 μg/ml) in tris-buffered saline solution (TBS; 25mM Tris·HCl, 0.15M NaCl; pH 7.2). 200 μl of TBS solution containing each bait protein was incubated with 25 μl of Cobalt Resin (HisPur™; Pierce) at room temperature for 30 min to capture the His-tagged bait protein. The beads were then washed 5x with 200 μl of wash buffer (TBS containing 10mM Imidazole), with centrifugation at 1250 x g for 1 minute, to remove unbound bait protein. 200 μl of TBS buffer containing the desired untagged prey protein was then added to the above bait-containing beads. This mixture was incubated at RT for ~1 hour. Following this, the beads were washed thrice in the same manner described above and finally the bound complex in each case was eluted using 300 μl of elution buffer (wash buffer containing 290mM Imidazole).
The eluted samples in each case were concentrated to ~50 μl and were further analyzed using Wes™ Simple Western, a capillary western blot technology which separates and analyzes proteins by size from 2-440 kDa. A total protein Simple Western size-based assay was performed according to the ProteinSimple user manual. In brief, 4 μl of concentrated sample from the elutions of the pull-down assay was mixed with a master mix (ProteinSimple, Santa Clara, CA) to a final concentration of 1× sample buffer, 1× fluorescent molecular weight markers and 40mM DTT. The prepared protein samples were then heated at 95 °C for 5 minutes before loading onto the plate. These samples, along with biotinylated ladder, diluent, biotin labeling reagent, streptavidin-HRP, wash buffer and luminol-peroxide mix, were dispensed into the corresponding wells as indicated in the plate diagram. The prepared plate was centrifuged at RT for 5 min at 2500 rpm. The run was carried out at room temperature using the instrument’s default settings. The chemiluminescence generated was captured by a CCD camera. The digital image was analyzed with Compass software (ProteinSimple).
The pseudo-gel and electropherogram profiles, showing bands and peaks corresponding to both ‘bait’ and ‘prey’ proteins in the pull-down samples, suggest the potential for the interaction of ALB with KDR/VEGFR2 and PDGFRA (Figure 9). It has to be noted that some of the selected proteins under reducing conditions may exhibit higher apparent mass than absolute molecular weights, due to post-translational modifications (e.g., KDR and PDGFRA) or in the presence of metals (e.g., CUTA) (180). In the case of CUTA interactions, a band or peak around 74 kDa was observed in the washes after CUTA protein binding as ‘prey’ (Figure 2, lane 6/9). It has been previously reported that CUTA can form multimers under certain experimental conditions including the presence of metals/metal ions. The monomeric form of CUTA migrates at 17-18 kDa, CLPS at 13 kDa, and HMGB1 at 28 kDa and 32 kDa. While the HMGB1-CUTA pull-down sample resulted in a broad band/peak around 74-80 kDa and 36-52 kDa, CLPS-CUTA showed a broad band/peak around 13-24 kDa along with several minor peaks at 34 kDa, 74 kDa and 115 kDa. This may suggest a strong interaction of CLPS with CUTA, which interferes with the agglomeration state of CUTA. This is further supported by the fact that the intensity of the bands and the peak height corresponding to CUTA in the wash after binding to CLPS as ‘bait’ are lower or almost non-existent compared to the wash after adding to HMGB1. CUTA-CLPS was also validated through LC-MS (SF Table 1).
Protein identification methods
Peptide sequencing experiments were performed using an EASY-nLC 1000 coupled to a Q Exactive Orbitrap Mass Spectrometer (Thermo Scientific, San Jose, CA) operating in positive ion mode. An EasySpray C18 column (2 μm particle size, 75 μm diameter by 15cm length) was loaded with 500 ng of protein digest in 22 μL of solvent A (water, 0.1% formic acid) at a pressure of 800 bar. Separations were performed using a linear gradient ramping from 5% solvent B (75% acetonitrile, 25% water, 0.1% formic acid) to 30% solvent B over 120 minutes, flowing at 300 nL/min.
The mass spectrometer was operated in data-dependent acquisition mode. Precursor scans were acquired at 70,000 resolution over 300-1750 m/z mass range (3e6 AGC target, 20 ms maximum injection time). Tandem MS spectra were acquired using HCD of the top 10 most abundant precursor ions at 17,500 resolution (NCE 28, 1e5 AGC target, 60 ms maximum injection time, 2.0 m/z isolation window). Charge states 1, 6-8 and higher were excluded for fragmentation and dynamic exclusion was set to 20.0 s.
Mass spectra were searched for peptide identifications using Proteome Discoverer 2.1 (Thermo Scientific) with the Sequest HT and MSAmanda algorithms, and peptide spectral matches were validated using Percolator (target FDR 1%). Initial searches were performed against the complete UniProt database (downloaded 19 March, 2018). Peptide matches were restricted to 10 ppm MS1 tolerance, 20 mmu MS2 tolerance, and 2 missed tryptic cleavages. Fixed modifications were limited to cysteine carbamidomethylation, and dynamic modifications were methionine oxidation and protein N-terminal acetylation. Peptide and protein grouping and results validation were performed using Scaffold 4.8.4 (Proteome Software, Portland, OR) along with the X! Tandem algorithm against the previously described database. Proteins were filtered using a 99% FDR threshold.
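The mass tolerances above can be illustrated with a short sketch. It assumes the usual definitions: a ppm error is the mass difference relative to the theoretical value scaled by 10^6, and 1 mmu equals 0.001 mass units. The function names and m/z values are hypothetical and are not taken from Proteome Discoverer.

```python
def within_ms1_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """Return True if the precursor mass error is within tol_ppm parts per million."""
    ppm_error = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return abs(ppm_error) <= tol_ppm


def within_ms2_tolerance(observed_mz, theoretical_mz, tol_mmu=20.0):
    """Return True if the fragment mass error is within tol_mmu milli mass units."""
    mmu_error = (observed_mz - theoretical_mz) * 1000.0
    return abs(mmu_error) <= tol_mmu


# Hypothetical precursor: about 8 ppm away from the theoretical m/z, so accepted.
print(within_ms1_tolerance(500.004, 500.0))
# Hypothetical fragment: about 30 mmu away, so rejected.
print(within_ms2_tolerance(200.03, 200.0))
```

A relative (ppm) tolerance widens with m/z, whereas an absolute (mmu) tolerance is the same across the mass range.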
Differential gene expression in the small cell lung cancer cell line SCLC-21H
Gene expression data of the small-cell lung cancer cell line, SCLC-21H, and normal lung tissue were obtained from Human Cell Atlas (181) and Human Tissue Atlas (182) respectively. For each gene, expression was measured in duplicate in SCLC-21H and in nine replicates in the normal lung tissue. The deviation of the gene expression values of the different replicates from the mean value was calculated as the standard deviation, $s = \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)}$, where $x_i$ is the expression value of the $i$-th replicate, $\bar{x}$ is the mean of the replicated gene expression values within a group (i.e. within the test or control group) and $n$ is the sample size.
The P-value for the significance of the observed difference between the means of the test and control groups was calculated using the t-test, with the value t calculated as $t = (\bar{x}_{\mathrm{test}} - \bar{x}_{\mathrm{control}}) / se$, where $\bar{x}$ is the mean of a group and $se$ is the standard error of the difference between the means of the test and control groups.
The fold change in expression for each gene was calculated as the ratio of the average TPM (transcripts per million) value of the gene in SCLC-21H (test) to the corresponding value in normal lung tissue (control). Genes with fold change >2 or <½ were considered significantly overexpressed and underexpressed respectively at P-value<0.05.
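The calculation described above can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the sample standard deviation (n - 1 denominator) and an unpooled standard error of the difference between means, neither of which is confirmed by the text. The TPM values are hypothetical, and the P-value is passed in because it would be obtained from a t-distribution, which is not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_gene(test_tpm, control_tpm):
    """Return (fold_change, t) for one gene from replicate TPM values."""
    mean_test, mean_ctrl = mean(test_tpm), mean(control_tpm)
    # Sample standard deviation within each group (assumed n - 1 denominator).
    sd_test, sd_ctrl = stdev(test_tpm), stdev(control_tpm)
    # Standard error of the difference between means (unpooled form, an assumption).
    se = sqrt(sd_test ** 2 / len(test_tpm) + sd_ctrl ** 2 / len(control_tpm))
    t = (mean_test - mean_ctrl) / se
    fold_change = mean_test / mean_ctrl
    return fold_change, t


def classify(fold_change, p_value, alpha=0.05):
    """Apply the fold-change (>2 or <1/2) and P-value (<0.05) cut-offs."""
    if p_value >= alpha:
        return "not significant"
    if fold_change > 2:
        return "overexpressed"
    if fold_change < 0.5:
        return "underexpressed"
    return "not significant"


# Hypothetical gene: duplicate test values and nine control replicates.
test = [40.0, 44.0]
control = [10.0, 12.0, 9.0, 11.0, 10.0, 13.0, 8.0, 12.0, 14.0]
fold_change, t = summarize_gene(test, control)
print(round(fold_change, 2), round(t, 2))
```

The sketch does not guard against a control mean of zero or a zero standard error, both of which would need handling on real data.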
Analysis of DNA methylation in MPM tumors
The dataset GSE16559 (42) deposited in GEO was used to analyze the methylation profile of pleural mesotheliomas. In this study, genes found to be differentially methylated in mesothelioma were identified from a set of 773 cancer-related genes associated with 1413 autosomal CpG loci. Methylation values (or M-values) were computed as $M = \log_2 [\beta / (1 - \beta)]$ for both control (non-tumor pleural tissue) and test (pleural mesothelioma) cases, where $\beta$ is the ratio of methylated probe intensity to overall intensity. The difference between the M-values of test and control cases was then computed, and genes with a difference >1 and <−1 were considered to be hypermethylated and hypomethylated respectively at P-value<0.05.
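The M-value calculation can be illustrated with a small sketch. It assumes the standard logit form, M = log2(β / (1 − β)), and symmetric cut-offs of +1 and −1 on the test-minus-control difference. The β values are hypothetical, and the P-value is supplied by the caller rather than computed.

```python
from math import log2


def m_value(beta):
    """Convert a beta value (0 < beta < 1) into an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return log2(beta / (1.0 - beta))


def methylation_call(beta_test, beta_control, p_value, alpha=0.05, cutoff=1.0):
    """Classify a locus from the difference in M-values (test minus control)."""
    delta_m = m_value(beta_test) - m_value(beta_control)
    if p_value >= alpha:
        return "unchanged"
    if delta_m > cutoff:
        return "hypermethylated"
    if delta_m < -cutoff:
        return "hypomethylated"
    return "unchanged"


# Hypothetical locus: beta rises from 0.5 in control to 0.8 in tumor.
print(methylation_call(0.8, 0.5, p_value=0.01))
```

Loci with β of exactly 0 or 1 have no finite M-value, so the sketch rejects them instead of applying an offset.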
Analysis of differential gene expression in pleural mesothelioma tumors and normal tissue in lungs
The overlap of the MPM interactome with genes differentially expressed in pleural mesothelioma tumors compared with normal pleural tissue adjacent to mesothelioma was computed using the dataset GSE12345 (48). A total of 22876 genes were assayed in this dataset, out of which 3162 were differentially expressed in pleural mesothelioma tumors. 142 genes in the MPM interactome were not assayed in this dataset and, for computing the overlap, these were added to the set of genes assayed in the expression study, giving a total of 22918 genes. The overlap of the MPM interactome with genes that distinguish MPM tumors from other thoracic cancers was computed using the dataset GSE42977 (32), which contains the expression of MPM tumors vs. other thoracic cancers such as thymoma and thyroid cancer. A total of 24996 genes were assayed in this dataset, out of which 6176 were differentially expressed in MPM tumors. 222 genes in the MPM interactome were not assayed in this dataset and, for computing the overlap, these were added to the set of genes assayed in the expression study, giving a total of 25218 genes. Genes with fold-change >2 or <½ were considered overexpressed and underexpressed respectively at P-value<0.05. Genes that have high/medium expression in normal lung tissue (median TPM>9) were identified using RNA-sequencing data available in GTEx (49).
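One way to assess such an overlap is sketched below. The text does not state which statistical test was used, so the hypergeometric upper-tail probability here is an assumption chosen for illustration. The gene universe is built as described for GSE42977 (assayed genes plus interactome genes not assayed). The function name and the toy numbers in the last line are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(universe, de_genes, set_size, overlap):
    """P(X >= overlap) when set_size genes are drawn without replacement
    from a universe that contains de_genes differentially expressed genes."""
    total = comb(universe, set_size)
    upper = min(de_genes, set_size)
    favorable = sum(
        comb(de_genes, k) * comb(universe - de_genes, set_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total


# Gene universe for GSE42977: assayed genes plus unassayed interactome genes.
assayed, not_assayed = 24996, 222
universe = assayed + not_assayed
print(universe)

# Toy numbers only: 2 of 2 sampled genes fall in a set of 5 out of 10.
print(hypergeom_upper_tail(universe=10, de_genes=5, set_size=2, overlap=2))
```

Adding the unassayed interactome genes enlarges the universe without adding differentially expressed genes, which makes the test more conservative than one restricted to assayed genes.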
Correlating expression of MPM genes with lung cancer prognosis
Data on the correlation between gene expression and the fraction of the patient population surviving after treatment for lung cancer were taken from Pathology Atlas (50). A total of 19622 genes were assayed in this dataset, out of which the expression of 354 genes was correlated with unfavorable prognosis. Log-rank P-values, which indicate the significance of this correlation, were examined. Genes with log-rank P-value<0.001 were considered to be prognostic. Unfavorable prognosis indicates a positive correlation of high gene expression with reduced patient survival.
Interactome of MPM genes differentially expressed on particle exposure
Genes differentially expressed in the lungs of mice exposed to crocidolite and erionite fibers were obtained from the dataset GSE100900(54). A total of 24434 genes were assayed in this dataset, out of which 1710 were differentially expressed on crocidolite exposure. Those genes that were differentially expressed on crocidolite and erionite exposure and their interactions with MPM genes were selected from the MPM interactome. This network was further extended to show how these genes interact with other known genes in the MPM interactome. Interactome figures were created using Cytoscape (179).
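The selection of this sub-network can be sketched as below. It is one reading of the description and not the authors' pipeline. Interactions are modelled as unordered gene pairs, all gene names are hypothetical placeholders, and the extension step is assumed to mean adding every interaction that touches an exposure-responsive gene already in the core network.

```python
def exposure_subnetwork(interactions, mpm_genes, exposure_genes):
    """Keep interactions that link an MPM gene to an exposure-responsive gene."""
    selected = []
    for a, b in interactions:
        if (a in mpm_genes and b in exposure_genes) or (
            b in mpm_genes and a in exposure_genes
        ):
            selected.append((a, b))
    return selected


def extend_subnetwork(interactions, core, exposure_genes):
    """Add every interaction that involves an exposure gene present in the core."""
    core_exposure = {g for pair in core for g in pair if g in exposure_genes}
    return [(a, b) for a, b in interactions if a in core_exposure or b in core_exposure]


# Hypothetical interactome fragment.
interactions = [
    ("MPM1", "EXP1"),
    ("MPM1", "OTHER1"),
    ("EXP1", "OTHER2"),
    ("OTHER1", "OTHER2"),
]
core = exposure_subnetwork(interactions, {"MPM1"}, {"EXP1"})
extended = extend_subnetwork(interactions, core, {"EXP1"})
print(core)
print(extended)
```

The resulting edge lists could then be exported as a simple two-column table for visualization in a network tool.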
Identification of repurposable drugs in the MPM drug-protein interactome
To identify repurposable drugs already tested in non-small cell lung cancer (NSCLC), drugs tested in (completed) clinical trials of NSCLC were obtained from NIH Clinical Trials (https://clinicaltrials.gov/). Then, the list of drugs targeting MPM genes that had negative correlation with lung cancer expression studies was compared with drugs tested in NSCLC to identify overlaps. Drugs potentially repurposable for MPM were chosen from the list of overlapping drugs based on literature review.
To identify repurposable drugs targeting MPM genes or novel interactors, drugs targeting MPM genes or novel interactors that had negative correlation with lung cancer expression studies were identified. Next, the genes targeted by these drugs were chosen and their differential expression in MPM tumour/cell lines (GSE51024 (60) and GSE2549(35)) was checked. Genes with high fold changes and low P-values were chosen and literature review was conducted to check whether their corresponding drugs were potentially repurposable.
To identify repurposable drugs targeting known interactors, those known interactors which had high fold changes in MPM tumours/cell lines (GSE51024 (60) and GSE2549 (35)) were chosen. Drugs targeting these known interactors, that had negative correlation with lung cancer expression studies, were identified. Literature review was conducted to check whether these drugs were potentially repurposable.
Negative correlation between lung cancer and drugs was studied using the BaseSpace correlation software (https://www.nextbio.com), which uses a non-parametric rank-based approach to compute the extent of enrichment of a particular set of genes (or ‘bioset’) in another set of genes (59).
Funding
This work has been funded by U24OH009077 (Becich) from National Institute of Occupational Safety and Health (NIOSH) and R01MH094564 (Ganapathiraju) from National Institute of Mental Health (NIMH), of National Institutes of Health (NIH), USA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIOSH or NIMH, NIH, USA.
Author contributions
In sequence of work: MKG conceptualized and supervised the study and carried out interactome construction and analysis of pathway and drug associations. KBK carried out studies of overlap of the interactome with various high-throughput data, literature-based evidence gathering, and formulation of biological hypotheses. Experimental validations were carried out by NY and GB. Methods of experimental validation were provided by NY and GB. Manuscript has been written by KBK and edited by MKG. Manuscript has been read and approved by all authors.
Competing interests
None.
Data and materials availability
On journal/preprint archive website and at http://severus.dbmi.pitt.edu/wiki-MPM.
Acknowledgments
We thank Prof. Michael Becich (PI, National Mesothelioma Virtual Bank) for support and for funding for MKG and for comments on the manuscript. We thank Prof. N. Balakrishnan for support and for funding for KBK. MKG thanks Dr. David Boone (Department of Biomedical Informatics) and Dr. J. Richard Chaillet (Office of Research Health Sciences) of University of Pittsburgh for detailed and valuable feedback on the writing of the manuscript. MKG thanks Sai Supreetha Varanasi for system administration assistance in hosting the website.