LNISKS: Reference-free mutation identification for large and complex crop genomes

Radosław Suchecki; Ajay Sandhu; Stéphane Deschamps; Victor Llaca; Petra Wolters; Nathan S. Watson-Haigh; Margaret Pallotta; Ryan Whitford; Ute Baumann

doi:10.1101/580829

Abstract

Mutation discovery is often key to the identification of genes responsible for major phenotypic traits. In the context of bulked segregant analysis, common reference-based computational approaches are not always suitable, as they rely on a genome assembly which may be incomplete or highly divergent from the studied accession. Reference-free methods based on short sequences of length k (k-mers), such as NIKS, exploit redundancy of information across pools of recombinant genomes. Building on concepts from NIKS, we introduce LNISKS, a mutation discovery method suited to large and repetitive crop genomes. In our experiments, it identified mutations rapidly and with high confidence from over 700 Gbp of bread wheat genomic sequence data. LNISKS is publicly available at https://github.com/rsuchecki/LNISKS.
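
To make the k-mer idea concrete, the sketch below counts canonical k-mers in two toy read pools and reports those seen repeatedly in one pool but never in the other. It illustrates the general principle only and is not the LNISKS or NIKS implementation: the function names, the `min_count` threshold, the reads, and the value of k are all assumptions chosen for illustration.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT symbols."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def pool_specific(counts_a, counts_b, min_count=2):
    """K-mers seen at least min_count times in pool A and never in pool B.

    min_count is a hypothetical threshold meant to discard k-mers that are
    more likely sequencing errors than real variants.
    """
    return {
        kmer
        for kmer, count in counts_a.items()
        if count >= min_count and counts_b.get(kmer, 0) == 0
    }


# Toy pools: the mutant reads differ from the wild-type reads by one G>A change.
wild_type_reads = ["ACGTACGGTCA", "ACGTACGGTCA"]
mutant_reads = ["ACGTACAGTCA", "ACGTACAGTCA"]

k = 5
wild_type_counts = count_kmers(wild_type_reads, k)
mutant_counts = count_kmers(mutant_reads, k)

mutant_specific = pool_specific(mutant_counts, wild_type_counts)
print(sorted(mutant_specific))
```

In this toy case every mutant-specific k-mer overlaps the single changed base, which is the kind of redundancy across pools that reference-free approaches rely on. A real data set would need far larger k, error-aware thresholds, and memory-efficient counting.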