Abstract
The assembly and development of the gut microbiome in infants has important consequences for immediate and long-term health. Preterm infants represent an abnormal case for bacterial colonization because of early exposure to bacteria and frequent use of antibiotics. To better understand the assembly of the gut microbiota in preterm infants, fecal samples were collected from 32 very low birthweight preterm infants over the first six weeks of life. Infant health outcomes included healthy, late-onset sepsis, and necrotizing enterocolitis (NEC). We characterized the bacterial composition by 16S rRNA gene sequencing and metabolome by untargeted gas chromatography mass spectrometry. Preterm infant fecal samples lacked beneficial Bifidobacterium and were dominated by Enterobacteriaceae, Enterococcus, and Staphylococcus due to the near uniform antibiotic administration. Most of the variance between the microbial community compositions could be attributed to which baby the sample came from (Permanova R2=0.48, p<0.001), while clinical status (healthy, NEC, or late-onset sepsis), and overlapping time in the NICU did not explain a significant amount of variation in bacterial composition. Fecal metabolomes were also found to be unique to the individual (Permanova R2=0.43, p<0.001) and weakly associated with bacterial composition (Mantel statistic r = 0.23 ± 0.05 (p = 0.03 ± 0.03). No measured metabolites were found to be associated with necrotizing enterocolitis, late-onset sepsis or a healthy outcome. Overall, preterm infants gut microbial communities were personalized and reflected antibiotic usage.
Importance Preterm infants face health problems likely related to microbial exposures including sepsis and necrotizing enterocolitis. However, the role of the gut microbiome in preterm infant health is poorly understood. Microbial colonization differs from healthy term babies because it occurs in the NICU and is often perturbed by antibiotics. We measured bacterial compositions and metabolomic profiles of 77 fecal samples from thirty-two preterm infants to investigate the differences between microbiomes in health and disease. Rather than finding microbial signatures of disease, we found the preterm infant microbiome and metabolome were both personalized, and that the preterm infant gut microbiome is enriched in microbes that commonly dominate in the presence of antibiotics. These results contribute to the growing knowledge of the preterm infant microbiome and emphasize that a personalized view will be important to disentangling the health consequences of the preterm infant microbiome.
Introduction
Early life exposure to microbes and their metabolic products is a normal part of development, with enormous and under-explored impact on the immune system. The intestinal microbiota of infants initially assembles by exposure to the mother’s microbiota and microbes in the environment (1–4). In healthy breast-fed infants Bifidobacteria longum spp. infantis capable of digesting human-milk oligosaccharides dominate the infant gut (5). When infants are born preterm, they are exposed to environmental and human associated microbes earlier in their development than normal, and rarely harbor Bifidobacteria spp. in their gut communities. We do not yet understand the effects of altering the timing of initial bacterial exposure. With numerous emerging health consequences related to the microbiome, understanding factors that influence this initial assembly of the microbiome will be important.
Preterm infants are routinely treated with antibiotics, enriching for microbes that can colonize in the presence of antibiotics (4, 6, 7). While antibiotics have tremendously reduced infant mortality, their effect on microbiota assembly and resulting health consequences is not fully understood. Prenatal and postnatal antibiotics have been shown to reduce the diversity of the infant intestinal microbiota (8, 9). Children under two years old are prescribed antibiotics at a higher rate than any other age group, and 85% of extremely low birthweight infants (< 1000 g) are given at least one course of antibiotics (10). Even if an infant is not exposed to antibiotics after birth, approximately 37% of pregnant women use antibiotics over the course of the pregnancy (11).
Perturbing the microbiota of infants can have immediate and long-lasting health consequences. In the short term, infants can be infected by pathogenic bacteria that results in sepsis, which is categorized as early-onset or late-onset depending on the timing after birth. Preterm infants are also at high risk to develop necrotizing enterocolitis (NEC), which is a devastating disease that causes portions of the bowel to undergo necrosis. NEC is one of the leading causes of mortality in preterm infants, who make up 90% of NEC cases (12). The incidence of NEC among low birthweight preterm infants is approximately 7% and causes death in about one third of cases. The exact causes of NEC are not known, but an excessive inflammatory response to intestinal bacteria may be involved (13).
Many of the long-term consequences of microbial colonization are believed to be mediated by interactions between the intestinal microbiota and the immune system. In addition to direct interactions, the microbiota interacts with the immune system through the production of metabolites that can be taken up directly by immune and epithelial cells (14, 15). For example, bacterial production of short chain fatty acids can affect health and integrity of the intestinal epithelia and immune cells (16–18). However, few studies use metabolites alongside bacterial community profiling. In fact, the healthy composition of an infant fecal metabolome is understudied.
In this retrospective study, we follow the changes in the gut microbiota over time in 32 very low birth weight (< 1500 grams) preterm infants born at Children’s Hospital Orange County. We simultaneously track the bacterial composition and metabolite profile over time. Infants were classified into three groups based on health outcomes: healthy infants, late-onset sepsis, and NEC. The composition of the intestinal microbiota was measured by 16S rRNA gene sequencing of fecal samples taken over time. Preterm infant guts were dominated by Enterobacteriaceae and Enterococcus, and Staphylococcus. Untargeted metabolomics analysis of the fecal samples by gas chromatography mass spectrometry (GC-MS) revealed a personalized metabolome that was weakly associated with the bacterial composition.
Results
Patient cohort
A total of 77 fecal samples were collected from 32 very low birth weight infants in the NICU at Children’s Hospital Orange County from 2011 to 2014 (Table 1, Figure 1). Birthweights ranged from 620 – 1570 grams. Fecal samples were collected between day 7 and 75 of life. Sampling time and number of fecal samples varied. Three or more longitudinal samples were available from ten of the infants, while one or two samples were available from the remaining 22 infants. Three infants developed NEC, eight developed late-onset sepsis, and 21 remained healthy. Twelve infants were delivered vaginally while the remaining 22 were delivered by cesarean section. All infants were fed by either breastmilk or a combination of breastmilk and formula. Twenty-four infants received antibiotics at some point during the sampling period, the most common being ampicillin and gentamycin.
Microbial Community Characterization
We sequenced the 16S rRNA gene content of each fecal sample to determine bacterial composition. The total bacterial load of each fecal sample was measured by qPCR of the 16S rRNA gene and scaled to the total weight of stool that DNA was extracted from. Among all infants, bacterial abundances vary over four orders of magnitude and were not associated with health outcome (Supplemental figure 1). The high variation in bacterial load is likely due to the near uniform use of antibiotics. Bacterial communities were composed of mostly Enterobacteriaceae, Enterococcus, Staphylococcus, and Bacteroides (Figure 2a). Most samples were dominated by one to three genera of bacteria. Only three infants (two fed breastmilk, one fed breastmilk and formula) were colonized at greater than 1 % relative abundance by Bifidobacteria, which emerging evidence suggests is a key member of the infant microbiome. We computationally confirmed that the primers used are able to detect 90% of known Bifidobacteria species (19). No single bacterial OTU or community composition was consistently found for infants that became sick (NEC or late-onset sepsis) compared to the infants that remained healthy.
Longitudinal sampling revealed that over the course of days, the bacterial composition could change dramatically (Figure 2a, 2b). Permutational Multivariate Analysis of Variance (PERMANOVA) was applied to determine which of the known clinical factors explained the most variance in the bacterial community composition. The individual explained 48% (p < 0.001) of the variance in the samples, meaning that about half of the total variance among all tested fecal samples could be attributed to the infant the fecal sample came from (Supplemental table 1). None of the other factors explained a significant amount of variation in the bacterial composition, including infant health, overlapping dates in the NICU, delivery mode, or feeding mode. Four of the infants in the study are twins. Twin set 1 (infants 12 and 13) had a similar microbial composition while the other three sets did not (Supplemental Figure 2).
Diversity of the bacterial communities was low as expected for preterm infants. Alpha diversity as measured by Shannon index increased overall with age, but the trend was not significant (linear model R2 = 0.005, p = 0.52) (Figure 3a). Other clinical factors including health outcome, feeding (breastmilk versus breastmilk and formula), antibiotic use, and delivery mode were tested for an effect on the alpha diversity (Figure 3b-e). None of the factors were associated with a difference in alpha diversity except recorded antibiotic use, in which Shannon diversity was unexpectedly lower on average in infants that did not have a record of receiving antibiotics (Wilcoxon rank sum test p=0.06). It should be noted that although six infants did not have a record of antibiotic use, records may be incomplete due to hospital transfers. All four infants that were colonized with Bacteroides were born vaginally, although five other vaginally born infants were not colonized. Only vaginally born infants were colonized by Bacteroides (four out of nine infants) while none of the twenty-two infants born by C-section were colonized.
Metabolomics
Metabolite profiles of infant fecal samples were analyzed by gas chromatography mass spectrometry, which measures small primary metabolites. Over 400 small molecules were detected from each fecal sample and 224 metabolites were known compounds. Metabolites were grouped into the following categories: amino acid metabolism, bile acids, central metabolism, fatty acids, fermentation products, lipid metabolism, nucleotide metabolism, organic acids, sterols, sugars, sugar acids, sugar alcohols, and vitamin metabolism (Figure 4, Supplemental table 2). No metabolites or categories of metabolites were found to be associated with necrotizing enterocolitis or late-onset sepsis. The metabolite profile of each infant was seen to vary over time, similar to the amount of variation seen in the bacterial composition (Figure 5). PERMANOVA analysis to determine which factors explain the most variance in the metabolite profile indicate that the individual explains 43% (p < 0.001) of the variation (Supplemental table 1).
To determine which metabolites might be useful for tracking bacterial metabolism in the infant gut, we examined metabolites with consistent abundance among infants versus those that varied (Supplemental figure 3). In general, sugars, central metabolites, and amino acids were variable while fatty acids, sterols, organic acids, and bile acids were more consistent. Infant 23, which developed necrotizing enterocolitis at day 16 of life, had low abundances of amino acid metabolites the two days prior to disease onset (Figure 4). However, several of the healthy control infants also had similarly low abundances of amino acid metabolites. The individual signal of each infant’s metabolome is far more evident than any trends due to clinical factors (Supplemental table 1).
Bacterial composition associated with metabolite profile
Bacterial metabolism in the gut is expected to contribute to the abundances of metabolites detected in fecal samples. We wanted to know if fecal samples with a similar bacterial composition were also similar in their metabolite profile. We employed a Mantel test using Pearson correlations between distances among bacterial compositions of samples and distances among metabolite profiles of samples. Because bacterial compositions and metabolite profiles are personalized, using multiple samples from a single infant would skew the result. Therefore, one sample from each infant was randomly selected 100 times to remove the effect of the individual and the Mantel test was applied to each subset. The average Mantel statistic of r = 0.23 ± 0.05 (p = 0.03 ± 0.03) indicates a weak but significant association between the bacterial composition and metabolite profile. Also, within individual infants, shifts in the bacterial composition are accompanied by shifts in the metabolome. Infants 17, 23, and 31 have dramatic shifts in both bacterial composition and metabolome profile over time, while infants 10 and 37 remain stable in both bacterial composition and metabolome.
To investigate the correlations driving this overall association, we calculated correlations between bacterial abundances and metabolite intensities (Figure 6). Staphylococcus had the most positive correlations including several classes of sugar metabolites, organic acids, and central metabolites. Fatty acids, lipid metabolism, and amino acids were positively correlated with the commonly abundant gut colonizers Enterobacteriaceae and Bacteroides, and negatively correlated with the commonly low abundance colonizers Staphylococcus and Enterococcus. We also looked more specifically at individual metabolites correlated with bacterial abundances (Figure 6). Bacteroidetes were found to be positively correlated with succinate (r = 0.85). Many other weak correlations (r < 0.5) exist between bacterial abundances and metabolite intensities, but the sample size is not large enough to distinguish signal from noise.
Discussion
Bacterial compositions in this cohort were consistent with the emerging picture from other studies that show the preterm infant gut harbors communities dominated by facultative anaerobes including Enterobacteriaceae, Enterococcus, and Staphylococcus (1, 2, 20). These communities appear to be enriched in commonly antibiotic resistant organisms (21). While we expected to find associations between bacterial community composition and health outcome in this cohort, we were surprised to find that there were not clear signatures of microbiome composition linked to NEC or sepsis. In larger cohorts, associations between particular bacteria or metabolites with NEC have been reported, however, they are not universal signatures across patients, and may reflect patient variation more than disease etiology (22–25). In fact, the strongest signal in both the microbiome and metabolome data from this cohort was the infant from whom the sample was taken. Overall, preterm infant microbiomes in this study were shaped by antibiotics, which have a strong impact on all patients, regardless of health outcome.
Although the bacterial composition of infant guts varied over time, we saw longitudinal samples from individual infants remained highly personalized over several weeks; nearly half of the variation in the microbial community compositions can be explained by which individual the sample came from. The stability of animal-associated microbiomes is an active area of research, with studies finding that the individual microbiome of an adult remains stable through time (26), but can be perturbed by extreme changes in diet or antibiotics (27–29). The bacterial composition in the adult gut largely returns to its previous state one month after antibiotic treatment, but altering the initial assembly of the microbiota in infants can have long lasting health consequences (7, 27, 30, 31). Previous work has found ampicillin and gentamycin (the most common antibiotics taken by infants in this study) to have an inconsistent effect on bacterial diversity, sometimes increasing and sometimes decreasing diversity (1). Similarly, in these infants, ampicillin and gentamycin resulted in more variation in bacterial diversity, but there was no clear trend of increasing or decreasing diversity. However, antibiotics change the dominant members of the microbiota which could have profound effects on immune development and growth (7, 31–33).
Evidence is emerging that a healthy infant gut microbiota is dominated by Bifidobacteria with the capacity to digest human milk oligosaccharides in breastmilk (5, 34, 35). The lack of a core Bifidobacteria community in infants could leave the microbiota open to colonization by facultative anaerobes like we observed in these infants (36). Proteobacteria such as Enterobacteriaceae are commonly seen to increase in abundance after antibiotic administration (21). Indeed, infants in this study had microbiomes that were shaped by antibiotic use. Although six of the thirty-two infants in this study did not have recorded antibiotic use around sampling time, the microbiota can still be affected by prenatal antibiotics taken by the mother (7, 31, 37).
Microbiome studies have become widespread, so that a typical bacterial composition is well characterized in a range of sample cohorts. However, the same cannot be said for the metabolome. There is a dearth of knowledge about what a consensus healthy infant fecal metabolome should be, making comparisons for the cohort in this study difficult. To add to the complexity, each metabolomic approach detects subsets of metabolites, and depends on sample extraction and other method choices. Increasing the frequency of metabolomics data collection in microbiome studies would be a huge step forward for the field. Baseline knowledge about the typical connections between metabolite abundances and bacterial metabolism should be collected, so that molecules that have consistent abundances in a healthy state could give context to data generated from clinical samples in different disease states.
Untargeted metabolomics can survey many metabolites in a biological sample to provide a snapshot of the active metabolism in a system such as the human gut. Metabolite profiles of preterm infants in this study were found to be personalized to a similar degree as the bacterial composition. This is in contrast to a previous study on full term infants that showed the metabolomic profile to be stable, and weakly associated with bacterial composition, over the first few years of life (38). Personalized metabolic signatures of disease hold great promise to complement microbiota profiling in human systems (18, 36). However, analyzing metabolomic data is challenging because an array of inputs contribute to the abundances of metabolites in fecal samples including bacterial metabolism, host biology, and food intake.
We report a number of correlations between bacteria and metabolites in preterm infant feces, and bacterial metabolism has been previously shown to contribute to metabolite abundances in humans and mice (15, 39, 40). Short chain fatty acids are now commonly measured and associated with bacterial fermentation in the gut (41). In this study, the only short-chain fatty acid detected was succinate, which we found to be correlated with the presence of Bacteroides, which produces acetate and succinate from carbohydrate fermentation (42). We also detected several medium-chain fatty acids, which were generally correlated with the abundance of Bacteroides and Enterobacteriaceae. None of the twenty-two C-section born infants in this study were colonized by Bacteroides, possibly due to a lack of vertical transmission from the mother during birth (3).
Overall, we find preterm infant microbiomes are shaped by shared exposures especially to antibiotics, leading to the dominance of antibiotic resistant facultative anaerobes such as Enterococcus spp.. The anaerobic, milk degrading Bifidobacteria were largely absent, even in preterm infants with access to breastmilk, possibly due to a lack of exposure to microbes from family members in the sterile hospital environment along with antibiotics. Our understanding of the health consequences of microbial colonization under these antibiotic-enriched circumstances is still in its infancy.
Materials and methods
Sample Collection
Stool samples from diapers of preterm infants in the neonatal intensive care unit at Children’s Hospital Orange County were collected by nurses over three years from 2011 to 2014. Samples were immediately stored at −20 °C then transferred to −80 °C no more than three days post-collection. Samples were kept at −80 °C and thawed once for DNA extraction and metabolomics. A total of 77 stool samples were collected from 32 preterm infants.
DNA extraction and 16S rRNA gene sequencing
Stool samples were thawed once and DNA was extracted from 10 mg using a fecal Zymo Fecal DNA MiniPrep Kit (#D6010). The V3 and V4 region of the 16SrRNA gene was amplified with two-stage PCR. The first PCR amplified the V3 to V4 region of the 16S rRNA gene using 341F and 805R primers: forward primer (5’- CCTACGGGNGGCWGCAG-3’) and reverse primer (5’- GACTACHVGGGTATCTAATCC −3’) (43). These primers also added an overhang so that barcodes and Illumina adaptors could be added in the second PCR. The first PCR was done as follows: 30 cycles of 95 °C 30 seconds; 65 °C 40 seconds; 72 °C 1 minute. Immediately after completion of the first PCR, primers with sample specific barcodes and Illumina adapter sequences were added and a second PCR was performed as follows: 9 cycles 94 °C for 30 seconds; 55 °C 40 seconds; 72 °C 1 minute. PCR reactions were cleaned using Agencourt AMPure XP magnetic beads (#A63880) using the recommended protocol. Amplicons were run on an agarose gel to confirm amplification and then pooled. Amplicon pool was run on an agarose gel and the 500bp fragment was cut out and gel extracted using Millipore Gel Extraction Kit (#LSKGEL050). The sequencing library was quantified using Quant-iT Pico Green dsDNA Reagent and sent to Laragen Inc. for sequencing on the Illumina MiSeq platform with 250bp paired-end reads producing a total of 2.4 million paired-end reads.
qPCR for bacterial load
The bacterial load of each fecal sample was measured with quantitative PCR of a conserved region of the 16S gene. The following primers were used: (5’- TCC TAC GGG AGG CAG CAG T-3’), (5’- GGA CTA CCA GGG TAT CTA ATC CTG TT-3’). PerfeCTa SYBER Green SuperMix Reaction Mix (Quantabio #95054) was used to quantify DNA from samples. Relative abundance of 16S rRNA genes relative to the mass of stool was compared for each sample. Total fecal DNA was measured with Quant-iT Pico Green dsDNA Assay Kit (ThermoFisher #P11496).
Sequence processing
Sequences were quality filtered using PrinSeq to remove adapters as well as any sequences less than 120 base-pairs, containing any ambiguous bases, or with a mean PHRED quality score of less than 30 (44). Reads were found to drop steeply in quality after 140 base pairs, so all reads were trimmed to 140 base pairs. The forward read contained the V3 region in the high quality first 140 base pairs, while the V4 region was captured in the low-quality region of the reverse reads. Therefore, we used only the forward reads for subsequent analyses.
Bacterial community analysis
Quantitative Insights Into microbial Ecology (QIIME) was used for de novo OTUs picking using the Swarm algorithm with a clustering threshold of 8 (45, 46). This resulted in 2,810 OTUs among all samples. OTUs containing only one sequence were filtered out, leaving 212 OTUs. Taxonomy was assigned to each OTU using QIIME and the Greengenes 13_8 database. An OTU table was constructed and used for downstream analysis. Ten rarefactions were performed on the OTU table down to 2000 reads per sample, which was the largest number of reads that allowed retention of most samples. QIIME was used to calculate alpha diversity by Shannon index and beta diversity by the average weighted UniFrac distance of the ten rarefactions. Community composition barplots, Principal Coordinate Analysis (PCoA) plots, and alpha diversity plots were created using R and the ggplot2 package (47, 48). All R scripts are included in the supplemental information.
Untargeted metabolomics by GC-MS
When fecal samples were thawed for DNA extraction, approximately 50 mg was collected and refrozen at −80 ° for metabolomics. Samples were sent on dry ice to the West Coast Metabolomics Center (WCMC) at UC Davis for untargeted metabolomics by gas chromatography time-of-flight mass spectrometry. Metabolites were extracted from fecal samples with a 3:3:2 mixture of isopropanol, acetonitrile, and water respectively before derivatization and GC-MS analysis by Fiehn Lab standard operating procedures (49–51). Metabolite profiles were compared by calculating Manhattan distances between samples based on all metabolite intensities and visualized by PCoA using the vegan and ape packages in R (52, 53).
Permutational multivariate analysis of variance (PERMANOVA)
PERMANOVA was used to determine factors that explained variance in bacterial community composition and metabolite profile. PERMANOVA was performed using the adonis function in the vegan package in R. The input for PERMANOVA was UniFrac distances of the 16S data and Manhattan distances of the metabolite profiles. Briefly, PERMANOVA quantifies the variation among samples explained by the given groupings compared to randomized groupings. To measure the variance explained by individual infant, we excluded samples that had fewer than three longitudinal samples, leaving ten infants. When performing PERMANOVA for factors other than individual, we accounted for the longitudinal sampling by repeatedly subsampling one sample from each infant and averaging the results.
Correlations between bacteria and metabolites
Pearson correlations between bacterial abundances and normalized metabolite intensities were calculated using the cor function in R. Correlations were calculated between the relative abundances of all bacterial classes and all metabolite intensities among all samples in all infants. Only the four most highly abundant general of bacteria were used to ensure no results were skewed by taxa present in only one or a few samples. For each class of metabolite, the average of all correlations between metabolites in that class and each taxon was calculated so that trends between bacterial taxa and classes of metabolites could be visualized by heatmap.
Mantel test
To determine if fecal samples with similar bacterial compositions also have similar metabolite profiles, a Mantel test was performed. To account for the effect of longitudinal sampling, each dataset was randomly subsampled down to one sample per infant. A Bray-Curtis dissimilarity matrix was computed for the bacterial composition and Manhattan distances calculated for metabolite intensities. The Mantel function in the vegan package of R was used to calculate the Mantel statistic for a Pearson correlation between the two dissimilarity matrices. The average and standard deviation of the Mantel statistic r and p-value for the 100 Mantel tests was reported.
Data availability
Raw sequence data will be uploaded to the SRA. OTU tables, raw metabolomics data, a markdown file of sequence processing workflow, and R scripts used for analyses are available at https://github.com/swandro/preterm_infant_analysis.
Acknowledgements
This project was supported by a UCI Single Investigator grant, the CORCCL SIIG-2014-2015-51, a pilot grant from the UC Davis West Coast Metabolomics Core as part of NIH-DK097154, and start-up funds for the Whiteson Lab in UC Irvine’s Molecular Biology and Biochemistry Department. Thanks to CORCL for funding.
Thank you to Ying (Lucy) Lu for help with qPCR experiments. Thank you Celine Mouginot and Adam Martiny for providing 16S primers and protocols. Thanks to Ilhem Messaoudi and members of the Whiteson lab for editing. Thanks to Megan Showalter and Prof. Oliver Fiehn of the UC Davis West Coast Metabolomics Core were very helpful in carrying out untargeted GC-MS profiling.