Abstract
Modern techniques for elucidating population genomic variation generate potentially overwhelming quantities of data. The nature and scale of these data demand computationally efficient methods that determine genetic relatedness in an unbiased, de novo manner. We present the k-mer Weighted Inner Product (kWIP), a novel assembly- and alignment-free estimator of genetic similarity. We show that kWIP can recapitulate observed relationships among samples across diverse datasets and reconstruct the true relatedness among samples in simulated sequence data. kWIP is licensed under the GNU GPL and is available from https://github.com/kdmurray91/kwip.
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.