RT Journal Article
SR Electronic
T1 RefSoil: A reference database of soil microbial genomes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 053397
DO 10.1101/053397
A1 Jinlyung Choi
A1 Fan Yang
A1 Ramunas Stepanauskas
A1 Erick Cardenas
A1 Aaron Garoutte
A1 Ryan Williams
A1 Jared Flater
A1 James M Tiedje
A1 Kirsten S. Hofmockel
A1 Brian Gelder
A1 Adina Howe
YR 2016
UL http://biorxiv.org/content/early/2016/05/14/053397.abstract
AB A database of curated genomes is needed to better assess soil microbial communities and their processes associated with differing land management and environmental impacts. Interpreting soil metagenomic datasets with existing sequence databases is challenging because these datasets are biased towards medical and biotechnology research and can result in misleading annotations. We have curated a database of 922 genomes of soil-associated organisms (888 bacteria and 34 archaea). Using this database, we evaluated phyla and functions that are enriched in soils as well as those that may be underrepresented in RefSoil. Our comparison of RefSoil to soil amplicon datasets allowed us to identify targets that if cultured or sequenced would significantly increase the biodiversity represented within RefSoil. To demonstrate the opportunities to access these underrepresented targets, we employed single cell genomics in a pilot experiment to sequence 14 genomes. This effort demonstrates the value of RefSoil in the guidance of future research efforts and the capability of single cell genomics as a practical means to fill the existing genomic data gaps.