Summary
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced extensive mass spectrometry-based proteomics data for selected breast, colon, and ovarian tumors from The Cancer Genome Atlas (TCGA). We have incorporated the CPTAC proteomics data into the cBioPortal to support easy exploration and integrative analysis of these proteomic datasets in the context of the clinical and genomics data from the same tumors. cBioPortal is an open-source platform for exploring, visualizing, and analyzing multidimensional cancer genomics and clinical data. The public instance of the cBioPortal (http://cbioportal.org/) hosts more than 100 cancer genomics studies, including all of the data from TCGA. Its biologist-friendly interface provides many rich analysis features, including a graphical summary of gene-level data across multiple platforms, correlation analysis between genes or other data types, survival analysis, and network visualization. Here, we present the integration of the CPTAC mass spectrometry-based proteomics data into the cBioPortal, consisting of 77 breast, 95 colorectal, and 174 ovarian tumors that have already been profiled by TCGA for mutations, copy number alterations, gene expression, and DNA methylation. As a result, the CPTAC data can now be easily explored and analyzed in the cBioPortal alongside the clinical and genomics data. Integrating CPTAC data into cBioPortal overcomes limitations of the TCGA proteomics array data and provides a user-friendly web interface, a web API, and an R client for querying the mass spectrometry data together with genomic, epigenomic, and clinical data.
Footnotes
Abbreviations: CPTAC, Clinical Proteomic Tumor Analysis Consortium, a consortium funded by the National Cancer Institute that aims to produce high-quality tumor proteomics data using mass spectrometry; TCGA, The Cancer Genome Atlas, a collaborative effort by the National Cancer Institute and the National Human Genome Research Institute to map the key genomic changes in cancer; CDAP, Common Data Analysis Pipeline, the standard pipeline for mass spectrometry data processing for CPTAC; MS, mass spectrometry; MS/MS, tandem mass spectrometry; RPPA, reverse phase protein array, a method of protein detection and quantitation that uses an antibody-based microarray; HUGO, Human Genome Organization, the host of a committee that sets standards for gene nomenclature; LTQ, linear trap quadrupole, a linear ion trap mass spectrometer produced by Thermo Scientific; iTRAQ, isobaric tag for relative quantitation, a labelling method for protein quantitation; FASTA, a format for encoding protein and nucleotide sequences; TSV, tab-separated values, a file format for encoding data using tabs as a cell delimiter; PTM, post-translational modification; API, application programming interface, a set of class objects and methods that allow programmers to build software or perform analysis on web resources; IHC, immunohistochemistry, a staining procedure to detect proteins in tissue samples.