TY - JOUR
T1 - Efficient strategies for calculating blockwise likelihoods under the coalescent
JF - bioRxiv
DO - 10.1101/016469
SP - 016469
AU - Konrad Lohse
AU - Martin Chmelik
AU - Simon H. Martin
AU - Nicholas H. Barton
Y1 - 2015/01/01
UR - http://biorxiv.org/content/early/2015/10/27/016469.abstract
N2 - The inference of demographic history from genome data is hindered by a lack of efficient computational approaches. In particular, it has proven difficult to exploit the information contained in the distribution of genealogies across the genome. We have previously shown that the generating function (GF) of genealogies can be used to analytically compute likelihoods of demographic models from configurations of mutations in short sequence blocks (Lohse et al., 2011). Although the GF has a simple, recursive form, the size of such likelihood computations explodes quickly with the number of individuals and applications of this framework have so far been limited to small samples (pairs and triplets) for which the GF can be written down by hand. Here we investigate several strategies for exploiting the inherent symmetries of the coalescent. In particular, we show that the GF of genealogies can be decomposed into a set of equivalence classes which allows likelihood calculations from non-trivial samples. Using this strategy, we used Mathematica to automate block-wise likelihood calculations based on the GF for a very general set of demographic scenarios that may involve population size changes, continuous migration, discrete divergence and admixture between multiple populations. To give a concrete example, we calculate the likelihood for a model of isolation with migration (IM), assuming two diploid samples without phase and outgroup information, and compare the power of our approach to that of minimal pairwise samples.
We demonstrate the new inference scheme with an analysis of two individual butterfly genomes from the sister species Heliconius melpomene rosina and Heliconius cydno.
ER -