Abstract
Cancers converge onto shared patterns that arise from constraints placed by the biology of the originating cell lineage and microenvironment on recurrent programs driven by oncogenic events. This structure should be transferable to molecular stratification. We exploit expression data resources and a parsimonious and computationally efficient network analysis method to define consistent expression modules in colon and breast cancer. Comparison between cancer types identifies principles of gene co-expression: cancer hallmarks, functional and structural gene batteries, copy number variation and biology of originating lineage. Mapping outcome data at gene and module level onto these networks generates a detailed interactive resource. Testing the utility of the resulting modules in TCGA data defines specific associations of module expression with mutation state, identifying striking associations such as mast cell gene expression and mutation pattern in breast cancer. These analyses provide evidence for a generalizable framework to enhance molecular stratification in cancer.