Abstract
The human gut microbiome is a complex ecosystem, in which hundreds of microbial species and metabolites coexist, in part due to an extensive network of cross-feeding interactions. However, both the large-scale trophic organization of this ecosystem, and its effects on the underlying metabolic flow, remain unexplored. Here, using a simplified model, we provide quantitative support for a multi-level trophic organization of the human gut microbiome, where microbes consume and secrete metabolites in multiple iterative steps. Using a manually-curated set of metabolic interactions between microbes, our model suggests about four trophic levels, each characterized by a high level-to-level metabolic transfer of byproducts. It also quantitatively predicts the typical metabolic environment of the gut (fecal metabolome) in approximate agreement with the real data. To understand the consequences of this trophic organization, we quantify the metabolic flow and biomass distribution, and explore patterns of microbial and metabolic diversity in different levels. The hierarchical trophic organization suggested by our model can help mechanistically establish causal links between the abundances of microbes and metabolites in the human gut.
Introduction
The human gut microbiome is a complex ecosystem with several hundreds of microbial species [1, 2] consuming, producing and exchanging hundreds of metabolites [3, 4, 5, 6, 7]. With the advent of high-throughput genomics and metabolomics techniques, it is now possible to simultaneously measure the levels of individual metabolites (the fecal metabolome), as well as the abundances of individual microbial species [8]. Quantitatively connecting these levels with each other requires knowledge of the relationships between microbes and metabolites in their shared environment: who produces what, and who consumes what? [9, 10] In recent studies, information about these relationships for all of the common species and metabolites in the human gut has been gathered using both manual curation from published studies [6] and automated genome reconstruction methods [3]. This has laid the foundation for mechanistic models that would allow one to relate metabolome composition to microbiome composition [11, 12].
More generally, the construction of mechanistic models has been hindered by the complexity of dynamical processes taking place in the human gut, which in addition to cross-feeding and competition, includes differential spatial distribution and species motility, interactions of microbes with host immune system and bacteriophages, changes in activity of metabolic pathways in individual species in response to environmental parameters, etc. This complexity can be tackled on several distinct levels. For 2-3 species it is possible to construct a detailed dynamical model taking into account the spatial organization and flow of microbes and nutrients within the lower gut [13, 14], or optimizing the intracellular metabolic flows as well as competition for extracellular nutrients using dynamic flux balance analysis (dFBA) models [15, 4].
For around 10 microbial species, and a comparable number of metabolites, it is possible to construct a Consumer Resource Model (CRM) taking into account microbial competition for nutrients, the generation of metabolic byproducts, and the different tolerance of species to various environmental factors like pH. Using the existing experimental data on consumption and production kinetics of different metabolites, it is possible to fit some (but not all) of around 80 parameters in such a model [16].
However, modeling the hundreds of species and metabolites typically present in an individual’s gut microbiome requires thousands of parameters, which cannot be estimated from the current experimental data. Therefore, any such model must instead resort to a few “global parameters” that appropriately coarse-grain the relevant ecosystem dynamics. Here, we propose such a coarse-grained model of the human gut microbiome, hierarchically organized into several distinct trophic levels. In each level, metabolites are consumed by a subset of microbial species in the microbiome, and partially converted to microbial biomass. The remainder of these metabolites is excreted as metabolic byproducts, which then form the next level of metabolites. The metabolites in this level can then be consumed as nutrients by another subset of microbial species. Our model needs two global parameters: (1) the fraction of nutrients converted to metabolic byproducts by any microbial species, and (2) the number of trophic levels into which the ecosystem is hierarchically organized.
While previous studies have suggested that such cross-feeding of metabolic byproducts is common in the microbiome, the extent to which this ecosystem is hierarchically organized has not been quantified. Our model suggests that both the gut microbiome and its relevant metabolites are organized into roughly 4 trophic levels, which interconnect these microbes and metabolites in quantitative agreement with their experimentally measured levels. We also show that this model can predict the flow of biomass and metabolites through these trophic levels, quantify the relative contribution of the observed microbes and metabolites to these levels, and thereby allow us to study how microbial competition and cooperation for nutrients maintain diversity at each level.
Model and Results
Multi-level trophic model of the human gut microbiome
Our model aims to approximate the metabolic flow through the intricate cross-feeding network of microbes in the lower intestine (hereafter, “gut”) of human individuals (figure 1A). This flow begins with metabolites entering the gut, which are subsequently consumed and processed by multiple microbial species. We assume that each microbial species grows by converting a certain fraction of its metabolic inputs (nutrients) to its biomass and secretes the rest as metabolic byproducts (figure 1B). We define the byproduct fraction, f, one of the two key parameters of our model, as the fraction of nutrients secreted as byproducts. The complementary biomass fraction, 1 - f, is the fraction of nutrient inputs converted to microbial biomass. The metabolic byproducts produced from the nutrients entering the gut can be further consumed by some species in the microbiome, in turn generating a set of secondary metabolic byproducts. We call each step of this process of metabolite consumption and byproduct generation a trophic level. Due to factors such as limited gut motility, and a finite length of the lower gut, we assume that this process only continues for a finite number of levels, N𝓁, the second key parameter of our model. At the end of this process, metabolites left unconsumed after passing through N𝓁 trophic levels are assumed to leave the gut as a part of the feces (figure 1B).
In order to quantitatively describe all the steps of this process, our model requires the following information:
The metabolic capabilities of different microbial species in the gut, i.e. which microbes can consume which metabolites, and secrete which others. For this, we used a manually curated database connecting 567 common human gut microbes to 235 gut-relevant metabolites they are capable of either consuming or producing as byproducts [6] (see Methods for details).
The nutrient intake to the gut, which is the first set of metabolites that are consumed by the microbiome. Since the levels of these metabolites in a given individual are generally unknown, we first curated a list of 19 metabolites likely to constitute the bulk of this nutrient intake, and subsequently fitted their levels to best describe the observed microbial abundances in the gut of each individual (see Methods). We collected such microbial abundance data from various sources, in particular: 380 samples from the large-scale whole-genome sequencing (WGS) studies of healthy individuals (Human Microbiome Project (HMP) [1] and the MetaHIT consortium [2, 17]), and 41 samples from a recent 16S rRNA study of 10-year-old children in Thailand [18].
The kinetics of nutrient uptake and byproduct release, i.e. the rates we refer to as λ’s, at which different microbial species obtain and secrete different metabolites in the gut environment. Since this information is unknown for most of our microbes and metabolites, we made some simplifying assumptions. We assumed that, in a given level, when species consume the same metabolite, they receive it in proportion to their abundance in the microbiome. When secreting metabolic byproducts, we assumed equal splitting, such that every metabolite secreted by a given species was released in the same fraction. However, we later verified that the predictions of our model were relatively insensitive to the exact values of these parameters, by repeating our simulations with randomized values of these parameters (see figure S1).
Calibrating the key parameters of the model
To calibrate the two key parameters of our model, f and N𝓁, we used data from the 41 individuals from a recent 16S rRNA sequencing study of Thai children [18] for which both 16S rRNA metagenomic profiles and quantitative levels of 214 metabolites in the fecal metabolome were available. In each individual we fitted the nutrient intakes of the 19 metabolites to best agree with experimental microbial abundances. A representative example comparing the predicted and measured bacterial abundances is shown in Fig. 2B. The Pearson correlation coefficient for the data shown in this plot is 0.94, while across individual participants it was 0.81 ± 0.17.
We carried out these fits of microbial abundances for each of the 41 individuals studied in Ref. [18] for a broad range of the two parameters of our model: the byproduct fraction f ranging between 0.1 and 0.9 and the number of trophic levels N𝓁 between 2 and 10. For each individual and each pair of parameters f and N𝓁 we used our model to predict the fecal metabolome profile. This predicted metabolome was subsequently compared to the experimental data of Ref. [18] measured in the same individual. Around 19 of our predicted metabolites (variable across individuals) were actually among the ones experimentally measured in Ref. [18]. The quality of this comparison was quantified using the Pearson correlation coefficient (see Fig. 2A). The model with parameters f = 0.9 and N𝓁 = 4 agrees with the experimental data better than all other values we tried (Pearson correlation 0.7 ± 0.2; median P value 8 × 10⁻⁴). Hence, we used this combination of parameters in all subsequent simulations of our model.
We found the predicted and experimentally observed metabolic profiles to be in reasonable agreement with each other. Fig. 2C shows the predicted and observed fecal metabolome data plotted against each other for the same individual used in Fig. 2B. Note that, while the agreement between the observed and predicted microbial abundances shown in Fig. 2B is the outcome of our fitting the levels of intake metabolites, the fecal metabolome is an independent prediction of our model. It naturally emerges from the trophic organization of the metabolic flow and agrees well with the experimentally observed metabolome. Thus our simplified model supports the organization of the microbiome into roughly four trophic levels with byproduct fraction around 0.9.
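The parameter scan described above can be sketched as a simple grid search. This is a minimal illustration for a single individual, not the authors' code: `predict_metabolome` is a hypothetical callable standing in for the fitted trophic model, the grid spacing is an assumption (the text gives only the ranges), and correlating raw rather than transformed metabolite levels is also an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def calibrate(predict_metabolome, measured, f_values, level_values):
    """Scan (f, n_levels) pairs and return (best_f, best_n_levels, best_r).

    predict_metabolome(f, n_levels) is a hypothetical function returning a
    dict metabolite -> predicted level; measured is a dict of observed levels.
    Only metabolites present in both dicts are compared.
    """
    best = None
    for f in f_values:
        for n_levels in level_values:
            predicted = predict_metabolome(f, n_levels)
            shared = [m for m in measured if m in predicted]
            if len(shared) < 3:
                continue  # too few overlapping metabolites to correlate
            r = pearson([predicted[m] for m in shared],
                        [measured[m] for m in shared])
            if best is None or r > best[2]:
                best = (f, n_levels, r)
    return best


# Assumed grid: f in steps of 0.1 from 0.1 to 0.9, and 2 to 10 levels.
F_GRID = [round(0.1 * k, 1) for k in range(1, 10)]
LEVEL_GRID = list(range(2, 11))
```

In the study itself the comparison was summarized over all 41 individuals; a per-individual scan like this one would be repeated for each person and the correlations averaged.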
Predictions of the multi-level trophic model
Metabolite and biomass flow through trophic levels
With a well-calibrated and tested model we are now in a position to apply it to a broader set of human microbiome data. To this end we chose data for 380 healthy adult individuals from several countries (Europe [2], USA [1], and China [17]). For each individual, we used our model to predict its metabolome (that has not been measured experimentally) and quantified the flow of nutrients (or metabolic activity) through 4 trophic levels in our model averaged over these individuals.
Fig. 3A shows the cascading nature of this flow: metabolites enter the gut as nutrient intake, shown as the leftmost turquoise bar in Fig. 3A. Roughly, a fraction 1 − f = 0.1 of this nutrient intake is converted into microbial biomass (red bar), while the remaining fraction f = 0.9 is excreted as metabolic byproducts. Some fraction of these metabolic byproducts (blue bar) cannot be consumed by any of the microbes in the individual’s microbiome and hence ultimately leaves the individual as part of their fecal metabolome. The metabolic byproducts that can be consumed by the microbiome (turquoise bar) serve as the nutrient intake for microbes in the next level (i.e. level 3). This scenario repeats itself over the next levels until level 4, beyond which we assume all the byproducts enter the fecal metabolome. Note that, even though some of these byproducts can be consumed by gut microbes, our previous calibration (Fig. 2A) suggests that this does not happen. We believe this may be due to the finite time of flow of nutrients through the gut. Fig. 3B shows the normalized contributions of the nutrient intake to microbial biomass (red) and fecal metabolome (blue) split across trophic levels. We observe a contrasting pattern across levels, with the contribution to microbial biomass decreasing along levels, whereas the fraction of unused metabolites (contribution to the fecal metabolome) increases. It is also worth noting that the same microbial and metabolic species get contributions from multiple trophic levels, i.e. the same microbes that consume nutrients and excrete byproducts in earlier levels can also grow on metabolites generated in later levels. Thus, even though the dominant contribution to a species’ biomass is typically derived from a specific trophic level, species can grow by consuming metabolites from multiple levels.
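As a rough worked example of this cascade, consider the idealized limit in which every byproduct can be consumed in the next round. This is an assumption made only for illustration; in the model some byproducts are unconsumable and leave earlier, so the biomass figures below are upper bounds. Per unit of nutrient intake, the biomass produced in consumption round k and the material remaining after K rounds would then be:

```latex
B_k = (1 - f)\, f^{\,k-1}, \qquad
\sum_{k=1}^{K} B_k = 1 - f^{K}, \qquad
\text{remaining byproducts} = f^{K}.
% With f = 0.9:
%   B_1 = 0.1, \quad B_2 = 0.09, \quad B_3 = 0.081, \quad B_4 = 0.0729,
%   \sum_{k=1}^{4} B_k = 1 - 0.9^4 = 0.3439, \qquad 0.9^4 = 0.6561 .
```

In this limit the biomass contribution falls geometrically from round to round, which is consistent with the decreasing biomass contribution across levels described above.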
Quantifying diversity across trophic levels
The diversity of microbial communities can be separately defined both phylogenetically and functionally. Phylogenetic diversity counts the number of abundant microbial species inferred from the metagenomic profile. On the other hand, functional diversity quantifies the variety of collective metabolic activities of these species, which in our case could be inferred from the metabolome profile. Our model allows us to quantify both types of diversity on a level-by-level basis. Instead of just calculating the presence or absence of microbial species or metabolites at each level, we weighed each microbe or metabolite by their relative contribution to the metabolic activity at that trophic level. At each level, we calculated the effective α-, β- and γ-diversity, separately for microbes and metabolites (see Methods for details).
Fig. 4 shows the effective α-, β- and γ- diversity for microbes (grouped at the species and genus levels) and metabolites, averaged over our 380 healthy individuals. The microbes first appear in the second trophic level feeding off the nutrient intake metabolites in the first level. We found that the α-diversity (the average number of abundant entities weighted by their contribution to each level) systematically increases with the level number for both microbes and metabolites. There is no clear trend in the γ-diversity of microbes grouped at the species level (the “pan-microbiome” diversity, i.e. the number of abundant species in the combined metagenomes of 380 individuals).
Finally, the β-diversity of microbial species, defined as the ratio between γ- and α-diversity, is the highest (∼ 4) in the first level, while being considerably lower (∼ 2.5) in the next two levels. The β-diversity addresses the following important question: how variable are the abundant species between individuals?
While we found that the β-diversity of microbial species could be as large as 4 (Fig. 4), when we grouped organisms by their genus, β diversity decreased down to ∼ 2 across all levels (Fig. 4E). This drop in β-diversity was the most pronounced in the uppermost trophic level. The overall reduction of β-diversity shown in Fig. 4E relative to Fig. 4D suggests that the chief driver of species variability in the gut microbiome is within-genus competition. Such a pattern has previously been explained by a “lottery-like” process of microbial competition within the gut [19].
We also quantified the diversity of metabolites across 4 trophic levels. We found that the β diversity of metabolites was the highest in the uppermost level of nutrients (∼ 2) and lower in the next three levels (∼ 1). While this declining trend was similar to that observed for microbial diversity, surprisingly, the value of β diversity for nutrients was much smaller than for microbes (about 2.5 times lower across all levels). This suggests the picture of functional stability — in spite of taxonomic variability — in all trophic levels of the human gut microbiome, namely that even though the species composition of the microbiome can be quite different for different individuals, their metabolic function is quite similar. These results supplement similar findings of the HMP project [1] by breaking them up into trophic levels and by using metabolome diversity instead of metabolic pathways diversity to quantify the extent of functional similarity.
Discussion
Above we introduced and studied a mechanistic, consumer-resource model of the human gut microbiome quantifying the flow of metabolites and the gradual building up of microbial biomass across several trophic levels. What distinguishes our model is its ability to simultaneously capture the metabolic activities of hundreds of species consuming and producing hundreds of metabolites. Using only the metabolic capabilities — who eats what, and makes what — of different species in the microbiome, we uncovered roughly four trophic levels in the human gut microbiome. At each of these levels, some microbes consume nutrients, and convert them partially to their biomass, while the remainder gets secreted as metabolic byproducts. These metabolic byproducts can then serve as nutrients for microbes in the next trophic level.
Understanding such a trophic organization of microbial ecosystems is important because it helps identify causal relationships between microbes and metabolites at two consecutive trophic levels and helps to separate them from purely correlative connections, either at the same or at more distant levels. Thus it extends the previously introduced concept of a “microbial metabolic influence network” [6] by highlighting its hierarchical structure in which species/metabolites assigned to higher trophic levels could affect a large number of species/metabolites located downstream from them. The concept of trophic levels is widely discussed in macroecology helping to make sense of flow of nutrients and energy in large food webs, but rarely highlighted in the microbial ecosystems literature.
Our model also allows us to quantify the diversity of both species and metabolites contributing to different trophic levels. One conclusion we made was that the functional convergence of the microbiome holds roughly equally across all trophic levels. Indeed, at each level we observed the microbial diversity across different individuals was considerably higher than their metabolic diversity. Our model also provides additional support to the “lottery” scenario described in Ref. [19], especially in the first trophic level. According to this scenario, there are multiple species nearly equally capable of occupying a certain ecological niche, which in our model corresponds to the set of nutrients they consume and secrete as byproducts. The first species to occupy this niche prevents equivalent microbes from entering it. This is reflected in a high β-diversity of microbial species combined with a low to moderate β-diversity of microbial genera to which they belong and low β-diversity of their metabolic byproducts.
The flow of metabolites through the species-to-species cross-feeding network is reminiscent of the flow of web traffic modeled by Google’s original PageRank algorithm [20]. In the PageRank algorithm, each web page redirects f = 0.85 of its traffic along hyperlinks to other web pages thereby contributing to their network traffic. Interestingly, in our model, each bacterial species redistributes or converts a fraction f = 0.9 of its nutrients to other byproducts, which is close to that found by Page and Brin for web traffic [20].
Our model is focused on studying the effects of cross-feeding and competition of different microbes for their nutrients. It therefore ignores a number of important factors known to impact the composition of the human gut microbiome. These include interactions with the host and its immune system [21] as well as with viruses [22], and environmental parameters other than nutrients, such as pH [14], spatial organization [23], etc. Instead, our model uses only two adjustable parameters: the byproduct fraction f and the number of trophic levels N𝓁, assumed to be common to all species. This very small number of parameters has been a conscious choice on our part. We are perfectly aware that species differ from each other in their byproduct ratios, and that the metabolic flows are not equally split among multiple byproducts. This can be easily captured by a variant of our model in which different nutrient inputs and byproduct outputs of a given microbial species are characterized by different kinetic rates. However, this would immediately increase the number of parameters from 2 to more than 3,600. To calibrate a model with such a huge number of parameters one needs much more experimental data than we have access to right now. However, we tested the sensitivity of our model to variation in these parameters by repeating our simulations for 100 random sets of nutrient uptake and byproduct release rates (λ’s in our model), and found that this did not qualitatively change our central result (i.e. that the human gut microbiome is composed of roughly N𝓁 = 4 trophic levels with a byproduct fraction f = 0.9). Surprisingly, our metabolome predictions were also relatively insensitive to variation in these parameters (Figure S1). The exact nature of the robustness of these metabolome predictions is beyond the scope of this paper, and the subject of future work.
Methods
Obtaining data for microbial metabolic capabilities
For information about the metabolic capabilities of human gut microbes, we adopted a recently published manually-curated database, NJS16, which includes such data for 570 common gut microbial species and 244 relevant metabolites from Ref. [6]. This database recorded, for each microbial species, which metabolites it could consume, and which it secreted as byproducts. Since we were interested in those metabolites that could be used for microbial growth, we removed metabolites such as ions (e.g. Na+, Ca2+) from NJS16. Moreover, we constrained our analyses to microbes only, and therefore removed the 3 types of human cells from NJS16. This left us with a database with 567 microbes, 235 metabolites and 4,248 interactions connecting these microbes with corresponding metabolites (see table S1 for the complete table of interactions).
Obtaining metagenomic and metabolomic data
To calibrate the key parameters of our model, we used a previously published dataset, namely a 16S rRNA sequencing study of 41 human individuals from rural and urban areas in Thailand [18]. From these data, we collected the reported 16S rRNA OTU abundances as well as their corresponding taxonomy. We explicitly removed all OTUs that did not have an assigned species-level taxonomy. The remaining OTUs explained roughly 71%(±15%) of the bacterial abundances per sample.
We then mapped these species names to species names listed in the NJS16 database. We found an exact match for 110 species out of 208 in this table. In order to improve the species coverage from the abundance data, we manually mapped the remaining species in the following manner. For those genera in NJS16 whose member species had identical metabolic capabilities, we assumed that the capabilities of other, unmapped species from these genera were the same as these species. For several well-studied bacterial genera, such as Bacteroides, we determined a “core” set of metabolic capabilities (i.e. those metabolites that could either be consumed or secreted by all species in that genus), and assigned them to all unmapped species in that genus (i.e. those with known abundances, but otherwise understudied metabolic capabilities in NJS16). This allowed us to map an additional 20 microbial species from the abundance data, and incorporate them into our model. Note that we did this additional mapping only for those genera where species metabolic capabilities were identical.
To quantify the metabolome levels in each individual, we used the available quantitative metabolome profiles (obtained via CE-TOF MS) corresponding to the 41 individuals whose metagenomic samples we had. Here, we mapped the reported metabolites to our database of metabolic capabilities using KEGG identifiers, which revealed 84 such measured metabolites.
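The genus-level mapping can be illustrated with a small set-intersection sketch. This is an assumption about how such a "core" could be computed, not the authors' procedure (which was manual); the species and metabolite names below are placeholders, and capabilities are represented here as hypothetical (metabolite, role) pairs.

```python
def core_capabilities(species_capabilities):
    """Capabilities shared by every curated species of one genus.

    species_capabilities: dict species -> set of (metabolite, role) pairs,
    where role is "consume" or "secrete" (a representation assumed here).
    If all species have identical capabilities, the core equals that set.
    """
    sets = list(species_capabilities.values())
    return set.intersection(*sets) if sets else set()


def assign_to_unmapped(unmapped_species, species_capabilities):
    """Give each unmapped species of the genus the genus-level core set."""
    core = core_capabilities(species_capabilities)
    return {name: set(core) for name in unmapped_species}


# Placeholder genus with two curated species and one unmapped species.
curated = {
    "species_1": {("sugar_x", "consume"), ("acid_y", "secrete")},
    "species_2": {("sugar_x", "consume"), ("acid_y", "secrete"),
                  ("acid_z", "secrete")},
}
mapped = assign_to_unmapped(["species_3"], curated)
```

Taking the intersection is the conservative choice: an unmapped species is credited only with capabilities that every curated relative shares.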
To make predictions about metabolic flow and effective diversity from our model, we used additional metagenomic datasets, namely those from the Human Microbiome Project (HMP) [1] and MetaHIT [2, 17], for which we had microbial abundances, but no fecal metabolome. This resulted in an additional 380 human individuals, for which we obtained tables of MetaPhlAn2 microbial abundances, and mapped species names to those in NJS16 using the same procedure described above. Here, out of a total of 532 microbial species detected over these data, we could map and incorporate 316 species. Of these, 207 were mapped through an exact taxonomic match, and 109 by a genus-capability match. These incorporated species covered, on average, 90% of the total microbial abundance in each individual sample.
Determining the components of the nutrient intake to the gut
The inputs of our model are the relative abundances of microbial species in each individual, which are known (and described above), and the levels of various nutrients reaching their lower gut, which we fit using the model. For simplicity, we do not explicitly include the various polysaccharides (dietary fibers, starch, etc.) known to constitute the bulk of an individual’s diet. Instead, we use their breakdown products as the direct nutrient intake to the gut. The reason for this is our limited quantitative understanding of the processes by which these polysaccharides are converted to these breakdown products, e.g. the levels of extracellular enzymes, variability in their composition (their lability), etc. This curated nutrient intake consisted of 19 metabolites, such as arabinose, raffinose, and xylose (see table S2 for the complete list of metabolites).
Simulating the trophic model
For a specific individual, our model comprises multiple iterative “rounds” of metabolite consumption by microbes and the subsequent generation of metabolic byproducts, with each round constituting a trophic level. At each level, all metabolites produced in the previous level could be consumed by all microbial species detected in the specific individual’s gut. Note that at the first level, these metabolites were given by the nutrient intake to the gut, as described above. Any metabolite that could be consumed by multiple microbial species was split across those species in proportion to their measured relative abundances. Those metabolites that could not be consumed at any level were assumed to eventually exit the gut, and form part of the individual’s fecal metabolome. Upon metabolite consumption in any trophic level, we assumed that all microbial species that consumed these metabolites converted a fraction (1 − f) of the total consumed metabolites to their biomass. The remaining fraction, f (assumed fixed for all species), was converted to byproducts for the next level. Here, we assumed that each of the species produced all the byproducts it was capable of in equal amounts. After N𝓁 such iterative rounds (calibrated separately, see the next section), we assumed that this process ends. We added up all the biomass accumulated by each microbial species across all trophic levels as their total biomass, and added up all the unconsumed metabolite levels as the total fecal metabolome. Finally, we normalized the microbial biomass and the metabolite amounts separately, to obtain the relative microbial abundances and relative metabolome profiles, respectively.
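The iterative procedure described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (their code is available at the URL given under Code availability): the function and argument names are hypothetical, `n_levels` is taken to mean the number of consumption rounds, and a species with no listed byproducts is assumed to simply lose its byproduct flux.

```python
def normalize(amounts):
    """Rescale a dict of non-negative amounts so that the values sum to 1."""
    total = sum(amounts.values())
    return {k: v / total for k, v in amounts.items()} if total > 0 else dict(amounts)


def simulate_trophic_model(intake, consumes, produces, abundance, f=0.9, n_levels=4):
    """Return (relative biomass, relative fecal metabolome) as two dicts.

    intake:    dict metabolite -> amount entering the first level
    consumes:  dict species -> set of metabolites it can consume
    produces:  dict species -> set of metabolites it can secrete
    abundance: dict species -> measured relative abundance
    """
    biomass = {s: 0.0 for s in abundance}
    fecal = {}
    pool = dict(intake)
    for _ in range(n_levels):
        next_pool = {}
        for met, amount in pool.items():
            eaters = [s for s in abundance
                      if abundance[s] > 0 and met in consumes.get(s, ())]
            if not eaters:
                # Nobody can use this metabolite: it exits to the feces.
                fecal[met] = fecal.get(met, 0.0) + amount
                continue
            total = sum(abundance[s] for s in eaters)
            for s in eaters:
                # Shared metabolites are split in proportion to abundance.
                uptake = amount * abundance[s] / total
                biomass[s] += (1 - f) * uptake
                products = produces.get(s, ())
                # Equal splitting of the byproduct flux among all products.
                # Assumption: with no listed products this flux is dropped.
                for p in products:
                    next_pool[p] = next_pool.get(p, 0.0) + f * uptake / len(products)
        pool = next_pool
    # Whatever remains after the last round is assumed to leave the gut.
    for met, amount in pool.items():
        fecal[met] = fecal.get(met, 0.0) + amount
    return normalize(biomass), normalize(fecal)
```

For a two-species chain in which species A eats m0 and secretes m1, and species B eats m1 and secretes m2, two rounds with f = 0.9 give unnormalized biomasses of 0.1 and 0.09 and leave 0.81 units of m2 in the fecal pool.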
Fitting and inferring the nutrient intake to the gut
Simulating the model required us to know the nutrient intake to the gut, for which there are no available experimental measurements. Therefore, we inferred the amounts of these 19 intake metabolites by fitting the microbial abundances predicted by our model with those measured from each individual’s microbiome. We used a nonlinear optimization technique (implemented as lsqnonlin in MATLAB R2018a, Mathworks Inc.) for this fit, from which we obtained the amounts of the gut nutrient intake supplied to the first trophic level, that minimized the sum of squares of the logarithm of the difference between the observed species abundances, and those predicted by our model. Typically, we fit 19 metabolite amounts for each human individual, who had roughly 80 microbial species.
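The fit objective can be sketched in a few lines. The wording above admits more than one reading; this sketch assumes the residual for each species is the difference between the logarithms of the predicted and observed abundances, and the small `floor` that guards against taking the logarithm of zero is likewise an assumption, not something stated in the text.

```python
from math import log


def log_residuals(predicted, observed, floor=1e-6):
    """One residual per observed species: log(predicted) - log(observed).

    predicted, observed: dicts species -> relative abundance.
    floor is a hypothetical lower cutoff so that log(0) never occurs.
    """
    return [
        log(max(predicted.get(species, 0.0), floor)) - log(max(value, floor))
        for species, value in observed.items()
    ]


def sum_of_squares(residuals):
    """Scalar objective that a least-squares solver would minimize."""
    return sum(r * r for r in residuals)
```

A residual function of this shape, evaluated as a function of the 19 intake amounts, is what one would hand to a nonlinear least-squares routine such as MATLAB's lsqnonlin (used in the study) or SciPy's `least_squares`, typically with the intake amounts bounded below by zero.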
Calculating level-by-level diversity
To quantify the diversity of microbes and metabolites at each trophic level across the 380 individuals we studied, we used three measures popular in the ecosystems literature: namely the α-, β- and γ-diversity [24, 25, 26]. For each individual, we calculated the α-diversity of microbes and metabolites on each of the trophic levels. For this we first quantified the relative contributions of a given level to microbial abundances, and separately to the fecal metabolome profile. The contribution of a given trophic level 𝓁 to the relative abundance of a species (microbial or, separately, metabolic) i in a specific individual j is given by $p_i(\ell, j)$, normalized by $\sum_i p_i(\ell, j) = 1$. The α-diversity is
$$\alpha(\ell) = \left\langle \frac{1}{\sum_i p_i(\ell, j)^2} \right\rangle_j,$$
where $\langle \cdot \rangle_j$ represents taking the average across the 380 individuals used in our analysis.
Across all individuals, we calculated the γ-diversity of microbes and metabolites in their gut, which quantified the “global” diversity across all individuals, as:
$$\gamma(\ell) = \frac{1}{\sum_i p_i(\ell)^2},$$
where $p_i(\ell) = \langle p_i(\ell, j) \rangle_j$ is the mean relative abundance of species (or metabolite) i at the trophic level 𝓁 across all individuals used in our analysis.
Finally, to quantify the between-individual variability in microbial and metabolite diversity, we calculated the overall β-diversity, which is the ratio of the global to local diversity, as:
$$\beta(\ell) = \frac{\gamma(\ell)}{\alpha(\ell)}.$$
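A level-by-level diversity calculation could look like the sketch below. It takes the "effective number" of entities to be the inverse Simpson index, 1/Σ p², which is an assumption of this sketch; the function names and the input layout (one row of contributions per individual, for a single trophic level) are hypothetical.

```python
def inverse_simpson(p):
    """Effective number of entities, 1 / sum(p_i^2), after normalizing p."""
    total = sum(p)
    return 1.0 / sum((x / total) ** 2 for x in p)


def level_diversity(contributions):
    """Return (alpha, beta, gamma) for one trophic level.

    contributions[j][i]: contribution of entity i (a microbe or a metabolite)
    in individual j at this level; each row is normalized to sum to 1.
    """
    n_individuals = len(contributions)
    normalized = [[x / sum(row) for x in row] for row in contributions]
    # alpha: effective diversity within each individual, then averaged.
    alpha = sum(inverse_simpson(row) for row in normalized) / n_individuals
    # gamma: effective diversity of the profile averaged over individuals.
    n_entities = len(normalized[0])
    mean_p = [sum(row[i] for row in normalized) / n_individuals
              for i in range(n_entities)]
    gamma = inverse_simpson(mean_p)
    # beta: ratio of global to local diversity.
    return alpha, gamma / alpha, gamma
```

Two individuals that each carry a single, different species give α = 1, γ = 2 and β = 2, while two individuals with identical even profiles of two species give α = γ = 2 and β = 1, matching the reading of β as between-individual variability.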
Code availability
All computer code and extracted data files used in this study are available at the following URL: https://github.com/eltanin4/trophic_gut.
Supplementary Figures and Tables
Table S1 Microbial and metabolite interactions used in the model. Table of all 4,248 interactions between microbes and metabolites used in the model, from Ref. [6].
Table S2 Components of the nutrient intake to the gut. List of all 19 metabolites used to fit the gut nutrient intake in the model.
Conflicts of interest
The authors declare that there are no competing interests.
Acknowledgments
A.G. acknowledges support from the Simons Foundation and the American Physical Society. We thank Parth Pratim Pandey for useful discussions.