Abstract
Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations. CPMs implicitly encode all the possible tumor progression paths or evolutionary trajectories during cancer progression, which can help with diagnostic, prognostic, and treatment purposes. Here we examine whether CPMs can be used to predict the true distribution of tumor progression paths and to estimate evolutionary unpredictability. Using simulations, we show that the agreement between the true and the predicted distributions of paths is generally poor, unless sample sizes are very large and fitness landscapes are single peaked (have a single global fitness maximum). Under other fitness landscapes, performance is poor and improves only slightly with increasing sample size. Detection regime can be a key determinant of performance, and evolutionary unpredictability hurts performance except under regimes with very low sample variability. Estimates of evolutionary unpredictability from CPMs tend to overestimate the true unpredictability, and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of eleven cancer data sets supports the relevance of detection regime and shows estimates of evolutionary unpredictability in regions where useful prediction might be possible for at least some data sets, but the evolutionary trajectory predictions themselves are unreliable. Our results indicate that, currently, it is doubtful that useful predictions of tumor progression paths can be obtained from CPMs, and they emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer.