Abstract
Although a growing amount of (multi-)omics data is available, extracting knowledge from these datasets remains a difficult problem. Classical enrichment-style analyses require predefined pathways or gene sets, which are tested for significant deregulation to assess whether a pathway is functionally involved in the biological process under study. De novo identification of these pathways can reduce the bias inherent in predefined pathways or gene sets. At the same time, defining and efficiently identifying such pathways de novo from large biological networks is a challenging problem. We present a novel algorithm, DeRegNet, for the identification of maximally deregulated subnetworks on directed graphs based on deregulation scores derived from (multi-)omics data. DeRegNet can be interpreted as maximum likelihood estimation given a certain probabilistic model for de novo subgraph identification. We use fractional integer programming to solve the resulting combinatorial optimization problem. We show that the approach outperforms related algorithms on simulated data with known ground truths. On a publicly available liver cancer dataset, we show that DeRegNet can identify biologically meaningful subgraphs suitable for patient stratification. DeRegNet is freely available as open-source software.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
↵1 Since sᵀx ≤ eᵀx and eᵀx > 0 under the assumption that the subgraphs are constrained to have at least one node and p′ > 0.
↵2 For example, the requirement that the subgraphs be of a certain predefined size k ∈ ℕ.
↵3 I.e., adding any node not in the subgraph would make the resulting subgraph no longer strongly connected.
↵4 While the (absolute) gap λ_abs is defined as .
↵5 For example, an omics readout for every case in the cohort.
↵6 Some cases dropped out due to incomplete or missing survival data.
↵8 Given two (induced) subgraphs V′, V′′ ⊂ V and node scores s′, s′′ : V → {−1, 0, 1}, the deregulation-aware node overlap is defined as .