TY  - JOUR
T1  - K-mer similarity, networks of microbial genomes and taxonomic rank
JF  - bioRxiv
DO  - 10.1101/125237
SP  - 125237
AU  - Guillaume Bernard
AU  - Paul Greenfield
AU  - Mark A. Ragan
AU  - Cheong Xin Chan
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/04/07/125237.abstract
N2  - Alignment-free (AF) methods have recently been adopted to infer phylogenetic trees. However, the evolutionary relationships among microbes, impacted by common phenomena such as lateral genetic transfer and rearrangement, cannot be adequately captured in a strictly tree-like structure. Bacterial and archaeal genomes consist of highly conserved regions, e.g. ribosomal RNA genes (commonly used as phylogenetic markers), more-variable regions and extrachromosomal elements, i.e. plasmids (that contain genes critical under a selective condition e.g. antibiotic resistance). The impact of these elements on genome-scale inference of microbial phylogeny remains little known. Here, using an AF approach, we inferred phylogenomic networks of microbial life based on 2785 completely sequenced bacterial and archaeal genomes, and systematically assessed the impact of ribosomal RNA genes and plasmid sequences in this network. Our results indicate that k-mer similarity can correlate with taxonomic rank of microbes. Using a relational database approach, we linked the implicated k-mers to annotated genomic regions (thus functions), and defined core functions in specific phyletic groups and genera. We found that, in most phyla, highly conserved functions are often related to Amino acid metabolism and transport, and Energy production and conversion. Our findings indicate that AF phylogenomics can be used to infer reticulate relationships in a scalable manner and provide new perspective into microbial biology and evolution.
ER  - 