PT - JOURNAL ARTICLE
AU - Diego Mallo
AU - Leonardo de Oliveira Martins
AU - David Posada
TI - SimPhy: Phylogenomic Simulation of Gene, Locus and Species Trees
AID - 10.1101/021709
DP - 2015 Jan 01
TA - bioRxiv
PG - 021709
4099 - http://biorxiv.org/content/early/2015/06/30/021709.short
4100 - http://biorxiv.org/content/early/2015/06/30/021709.full
AB - We present SimPhy, a fast and flexible software package for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer (all three potentially leading to species tree/gene tree discordance) and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific), which can be fixed or sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and can simulate partitioned nucleotide, codon and protein multilocus sequence alignments under a wide range of substitution models using the program INDELible. We validate SimPhy’s output against theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can help in understanding interactions among different evolutionary processes by conducting a simulation study that characterizes the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, pre-compiled executables, a detailed manual and example cases.