TY  - JOUR
T1  - Average genome size estimation enables accurate quantification of gene family abundance and sheds light on the functional ecology of the human microbiome
JF  - bioRxiv
DO  - 10.1101/009001
SP  - 009001
AU  - Stephen Nayfach
AU  - Katherine S. Pollard
Y1  - 2014/01/01
UR  - http://biorxiv.org/content/early/2014/09/11/009001.abstract
N2  - Average genome size (AGS) is an important, yet often overlooked, property of microbial communities. We developed MicrobeCensus to rapidly and accurately estimate AGS from short-read metagenomics data and applied our tool to over 1,300 human microbiome samples. We found that AGS differs significantly within and between body sites and tracks with major functional and taxonomic differences. For example, in the gut, AGS ranges from 2.5 to 5.8 megabases and is positively correlated with the abundance of Bacteroides and polysaccharide metabolism. Furthermore, we found that AGS variation can bias comparative analyses, and that normalization improves detection of differentially abundant genes. List of abbreviations: AGS, average genome size of a microbial community; CV, coefficient of variation; Mb, megabase; CPU, central processing unit; NCBI, National Center for Biotechnology Information; HMP, Human Microbiome Project; T2D, type-2 diabetes metagenomics sequencing project; MetaHIT, Metagenomics of the Human Intestinal Tract; OTU, operational taxonomic unit; KEGG, Kyoto Encyclopedia of Genes and Genomes; KO, KEGG Orthology Group.
ER  - 