Abstract
Identifying driver genes is a central problem in cancer biology and has received considerable attention from researchers. However, existing methods for detecting driver genes from somatic mutation data struggle to distinguish positive selection signals from highly heterogeneous background mutational processes. Here, we present a powerful statistical approach, driverMAPS (Model-based Analysis of Positive Selection), for driver gene identification. The key feature of driverMAPS is its modeling of mutation rates at the base level, reflecting both background mutational processes and positive selection. Its selection model captures elevated mutation rates in functionally important sites using multiple external annotations, as well as spatial clustering of mutations. Its background mutation model accounts for both known covariates and local, gene-specific variation caused by unknown factors. Applying driverMAPS to TCGA data across 20 tumor types identified 159 new potential driver genes. Cross-referencing this list with data from external sources provides strong support for these findings. The novel genes include the mRNA methyltransferases METTL3-METTL14, and we experimentally validated the functional importance of somatic mutations in METTL3, confirming it as a potential tumor suppressor gene in bladder cancer.