PT - JOURNAL ARTICLE
AU - Massimo Maiolo
AU - Xiaolei Zhang
AU - Manuel Gil
AU - Maria Anisimova
TI - Progressive Multiple Sequence Alignment with the Poisson Indel Process
AID - 10.1101/123513
DP - 2017 Jan 01
TA - bioRxiv
PG - 123513
4099 - http://biorxiv.org/content/early/2017/04/03/123513.short
4100 - http://biorxiv.org/content/early/2017/04/03/123513.full
AB - Sequence alignment lies at the heart of many evolutionary and comparative genomics studies. However, the optimal alignment of multiple sequences is NP-hard, so exact algorithms become impractical for more than a few sequences. Thus, state-of-the-art alignment methods employ progressive heuristics, breaking the problem into a series of pairwise alignments guided by a phylogenetic tree. Changes between homologous characters are typically modelled by a continuous-time Markov substitution model. In contrast, the dynamics of insertions and deletions (indels) are not modelled explicitly, because the computation of the marginal likelihood under such models has exponential time complexity in the number of taxa. Recently, Bouchard-Côté and Jordan [PNAS (2012) 110(4):1160–1166] introduced a modification to a classical indel model, describing indel evolution on a phylogenetic tree as a Poisson process. The model, termed PIP, allows the joint marginal probability of a multiple sequence alignment and a tree to be computed in linear time. Here, we present a new dynamic programming algorithm to align two multiple sequence alignments by maximum likelihood in polynomial time under PIP, and apply it in a progressive algorithm. To our knowledge, this is the first progressive alignment method that uses a rigorous mathematical formulation of an evolutionary indel process and has polynomial time complexity.