TY  - JOUR
T1  - RefSoil: A reference database of soil microbial genomes
JF  - bioRxiv
DO  - 10.1101/053397
SP  - 053397
AU  - Jinlyung Choi
AU  - Fan Yang
AU  - Ramunas Stepanauskas
AU  - Erick Cardenas
AU  - Aaron Garoutte
AU  - Ryan Williams
AU  - Jared Flater
AU  - James M Tiedje
AU  - Kirsten S. Hofmockel
AU  - Brian Gelder
AU  - Adina Howe
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/05/14/053397.abstract
N2  - A database of curated genomes is needed to better assess soil microbial communities and their processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these datasets are biased towards medical and biotechnology research and can result in misleading annotations. We have curated a database of 922 genomes of soil-associated organisms (888 bacteria and 34 archaea). Using this database, we evaluated phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that if cultured or sequenced would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single cell genomics in a pilot experiment to sequence 14 genomes. This effort demonstrates the value of RefSoil in the guidance of future research efforts and the capability of single cell genomics as a practical means to fill the existing genomic data gaps.
ER  - 