RT Journal Article
SR Electronic
T1 Using null models to infer microbial co-occurrence networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 070789
DO 10.1101/070789
A1 Nora Connor
A1 Albert Barberán
A1 Aaron Clauset
YR 2016
UL http://biorxiv.org/content/early/2016/08/23/070789.abstract
AB Although microbial communities are ubiquitous in nature, relatively little is known about the structural and functional roles of their constituent organisms’ underlying interactions. A common approach to study such questions begins with extracting a network of statistically significant pairwise co-occurrences from a matrix of observed operational taxonomic unit (OTU) abundances across sites. The structure of this network is assumed to encode information about ecological interactions and processes, resistance to perturbation, and the identity of keystone species. However, common methods for identifying these pairwise interactions can contaminate the network with spurious patterns that obscure true ecological signals. Here, we describe this problem in detail and develop a solution that incorporates null models to distinguish ecological signals from statistical noise. We apply these methods to the initial OTU abundance matrix and to the extracted network. We demonstrate this approach by applying it to a large soil microbiome data set and show that many previously reported patterns for these data are statistical artifacts. In contrast, we find the frequency of three-way interactions among microbial OTUs to be highly statistically significant. These results demonstrate the importance of using appropriate null models when studying observational microbiome data, and suggest that extracting and characterizing three-way interactions among OTUs is a promising direction for unraveling the structure and function of microbial ecosystems.
Author Summary: Microbes are ubiquitous in the environment.
We know that microbial communities – the groups of microbes that live together, interact, and depend on one another – vary across environments. Multiple processes, ranging from competition between microbes to environmental stress, are believed to alter microbial community composition. Here, we describe a set of statistical techniques that can more accurately identify the underlying taxa relationships that structure the observed abundances of microbes across habitats. Using a large data set of soil samples collected across North and South America, we both illustrate the statistical artifacts that incorrect methods can introduce and describe proper techniques based on appropriate null models for studying how the abundances of taxa vary across soil samples. These tools improve our ability to distinguish ecologically meaningful interactions from simple statistical noise in such observational data. Our application of these tools suggests some previous claims about the network structure of microbial communities may be statistical artifacts. Furthermore, we find that three-way interactions among microbial taxa are significantly more common than we would expect at random, and thus may provide a novel means for identifying ecologically meaningful interactions.
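The general idea in the abstract (extract a co-occurrence network from an OTU abundance matrix, then ask whether a network statistic exceeds what a null model produces) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the rank-correlation threshold of 0.6, the null model that shuffles each OTU's abundances independently across sites, the use of triangle counts as a stand-in for three-way co-occurrence, and all function names.

```python
import numpy as np

def rank_rows(x):
    # Ordinal ranks within each row (ties broken by position); enough for a sketch.
    return np.argsort(np.argsort(x, axis=1), axis=1).astype(float)

def cooccurrence_network(abundance, threshold=0.6):
    # Rows are OTUs, columns are sites. Link two OTUs when the correlation
    # of their ranks across sites exceeds the (hypothetical) threshold.
    corr = np.corrcoef(rank_rows(abundance))
    adj = corr > threshold
    np.fill_diagonal(adj, False)
    return adj

def count_triangles(adj):
    # trace(A^3) counts every triangle six times in an undirected graph.
    a = adj.astype(np.int64)
    return int(np.trace(a @ a @ a) // 6)

def shuffle_null(abundance, rng):
    # Assumed null model: permute each OTU's abundances independently across
    # sites, keeping each OTU's abundance distribution but breaking associations.
    return rng.permuted(abundance, axis=1)

def triangle_p_value(abundance, n_null=200, threshold=0.6, seed=0):
    # Compare the observed triangle count with counts from null matrices.
    rng = np.random.default_rng(seed)
    observed = count_triangles(cooccurrence_network(abundance, threshold))
    null_counts = np.array([
        count_triangles(cooccurrence_network(shuffle_null(abundance, rng), threshold))
        for _ in range(n_null)
    ])
    # One-sided empirical p-value with the usual +1 correction.
    p = (1 + np.sum(null_counts >= observed)) / (1 + n_null)
    return observed, null_counts, p
```

A call such as `triangle_p_value(abundance)` on an OTU-by-site array returns the observed triangle count, the null counts, and an empirical p-value. Other null models (for example, ones that also preserve per-site totals) would give different baselines, which is the kind of choice the abstract argues must be made carefully.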