TY  - JOUR
T1  - Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part IV: Incorporation of substitutions and other mutations
JF  - bioRxiv
DO  - 10.1101/023622
SP  - 023622
AU  - Kiyoshi Ezawa
AU  - Dan Graur
AU  - Giddy Landan
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/08/04/023622.abstract
N2  - Background: Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do, and thus it is imperative to develop a stochastic evolutionary model that enables us to reliably calculate the probability of sequence evolution through indel processes. In a separate paper (Ezawa, Graur and Landan 2015a), we established the theoretical basis of our ab initio perturbative formulation of a continuous-time Markov model of the evolution of an entire sequence via insertions and deletions along the time axis. In other separate papers (Ezawa, Graur and Landan 2015b,c), we also developed various analytical and computational methods to concretely calculate alignment probabilities via our formulation. In terms of frequencies, however, substitutions are usually more common than indels. Moreover, many experiments suggest that other mutations, such as genomic rearrangements and recombination, also play important roles in sequence evolution. Results: Here, we extend our ab initio perturbative formulation of a genuine evolutionary model so that it can incorporate other mutations. We give a sufficient set of conditions under which the probability of evolution via both indels and substitutions is factorable into the product of an overall factor and local contributions. We also show that, under a set of conditions, the probability can be factorized into two sub-probabilities, one via indels alone and the other via substitutions alone. Moreover, we show that our formulation can be extended so that it can also incorporate genomic rearrangements, such as inversions and duplications. We also discuss how to accommodate some other types of mutations within our formulation. Conclusions: Our ab initio perturbative formulation, thus extended, could in principle describe the stochastic evolution of an entire sequence along the time axis via major types of mutations. [This paper and three other papers (Ezawa, Graur and Landan 2015a,b,c) describe a series of our efforts to develop, apply, and extend the ab initio perturbative formulation of a general continuous-time Markov model of indels.]
ER  - 