Abstract
We introduce Pyvolve, a flexible Python module for simulating genetic data along a phylogeny according to continuous-time Markov models of sequence evolution. Pyvolve is easily incorporated into Python bioinformatics pipelines, and it can simulate sequences according to most standard models of nucleotide, amino-acid, and codon sequence evolution. All model parameters are fully customizable. Users can additionally specify custom evolutionary models, with custom rate matrices and/or states to evolve. This flexibility makes Pyvolve a convenient framework not only for simulating sequences under a wide variety of conditions, but also for developing and testing new evolutionary models. Moreover, Pyvolve offers several novel sequence simulation features, including a new rate matrix scaling algorithm and branch-length perturbations. Pyvolve is an open-source project freely available, along with a detailed user manual and example scripts, under a FreeBSD license from http://github.com/sjspielman/pyvolve.