Summary
The bacterioplankton diversity in large rivers has thus far been undersampled, despite the importance of streams and rivers as components of continental landscapes. Here, we present a comprehensive dataset detailing the bacterioplankton diversity along a midstream transect of the Danube River and its tributaries. Using 16S rRNA-gene amplicon sequencing, our analysis revealed that bacterial richness and evenness gradually declined downriver in both the free-living and particle-associated bacterial communities. These shifts were also supported by the beta diversity analysis, in which the effects of tributaries were negligible with regard to the overall variation. In addition, the river was largely dominated by bacteria that are commonly observed in freshwater and typical of lakes, whereas only a few taxa attributed to lotic systems were detected. These freshwater taxa, which comprised members of the acI lineage, the freshwater SAR11 group (LD12) and the genus Polynucleobacter, increased in proportion downriver and were accompanied by a decrease in soil and groundwater bacteria. When examining our results in a broader ecological context, we propose that patterns of bacterioplankton diversity in large rivers can be explained by the River Continuum Concept published in 1980, with a modification for planktonic microorganisms.
for submission to Environmental Microbiology
Introduction
Streams and rivers link terrestrial and lentic systems with their marine counterparts and provide numerous essential ecosystem services. They supply drinking water, are used for irrigation, industry, and hydropower, and serve as transport routes or for recreation. Of general importance is the role of lotic systems in biogeochemical nutrient cycling. Until recently, rivers and streams were mainly considered as pipes shuttling organic material and nutrients from the land to the ocean. This view has begun to change as lotic and lentic systems are now considered more akin to “leaky funnels” in regard to the cycling of elements. Indeed, they play an important role in the temporary storage (sequestration) and transformation of terrestrial organic matter (Ensign and Doyle, 2006; Cole et al., 2007; Withers and Jarvie, 2008; Battin et al., 2009). As a result of recognising the role of rivers and streams in the carbon cycle (see for example the report by IPCC in 2013; http://www.ipcc.ch/), the study of the diverse, ongoing processes in the water column and sediments of lotic networks has been receiving increasing interest (Kronvang et al., 1999; Beaulieu et al., 2010; Seitzinger et al., 2010; Aufdenkampe et al., 2011; Benstead and Leigh, 2012; Raymond et al., 2013).
When attempting to model the mechanisms of nutrient processing in freshwater systems, bacteria are regarded as the main transformers of elemental nutrients and viewed as substantial contributors to the energy flow (Cotner and Biddanda, 2002; Battin et al., 2009; Findlay, 2010; Madsen, 2011). However, in the case of open lotic systems such as rivers, there remains a major lack of knowledge concerning the diversity of bacterial communities and the link between diversity and ecosystem functioning (Battin et al., 2009). There is currently no agreement on the distinctness of the river bacterioplankton from that of other freshwater systems or the variability of its diversity along entire river transects. More generally, the question of what regulates this diversity remains open.
When summarising previous studies, it can be concluded that bacteria affiliated with the phyla of Proteobacteria (particularly Betaproteobacteria), Bacteroidetes, Cyanobacteria and Verrucomicrobia have dominated the bacterial communities in rivers (Crump et al., 1999; Zwart et al., 2002; Cottrell et al., 2005; Winter et al., 2007; Lemke et al., 2008; Mueller-Spitz et al., 2009; Newton et al., 2011; Liu et al., 2012). These explorative studies on freshwater bacteria suggest that the abundant taxa comprising the riverine bacterioplankton form a cohesive group and can thus be regarded as “typical” freshwater bacteria (Zwart et al., 2002; Lozupone and Knight, 2007; Newton et al., 2011). Nevertheless, the previous studies were constrained by their low sequencing depth and focus on the dominant members of the communities.
Yet, a reasonable sequencing depth is a requirement to correctly estimate the community diversity and identify fine-scale changes that occur as responses to the fluctuating environmental conditions. In one study, a minimum sequencing depth of 1 000 and 5 000 16S rRNA gene reads per sample was suggested for a proper analysis of beta and alpha diversity, respectively (Lundin et al., 2012). These methodological constraints have been overcome with the widespread availability of second-generation sequencing technologies (Shokralla et al., 2012). By targeting the short hyper-variable regions of the universal 16S rRNA gene and proceeding with ultra-deep sequencing, one not only allows for a proper investigation of the diversity and the richness of a community but also uncovers the ability to detect and investigate rare populations that may bear critical functions (Sogin et al., 2006; Gilbert et al., 2012; Sjöstedt et al., 2012).
Regarding large rivers, microbial community studies using second-generation sequencing are scarce, with only a few available concerning bacterioplankton. These publications, which include studies of the Amazon River (Brazil), the Upper Mississippi River (USA), the Columbia River Estuary (USA) and the Yenisei River, mainly reveal taxonomic patterns (Ghai et al., 2011; Fortunato et al., 2013; Staley et al., 2013; Kolmakova et al., 2014). Moreover, the longitudinal development of the bacterioplankton communities along the rivers could not be addressed comprehensively because only a few sites were analysed in each case. Considering the environmental gradients that develop along such rivers (Sekiguchi et al., 2002; Winter et al., 2007; Velimirov et al., 2011), it is expected that the bacterial communities will show a similar variation in their composition and function as one travels from the source to the mouth.
This variability has been hypothesised to originate from the import of bacteria through terrestrial illuviation and merging tributaries as well as from anthropogenic contributions such as wastewater treatment plant pollution. More diffuse phenomena such as soil erosion and agriculture should also be considered (Zampella et al., 2007; Tu, 2011; Besemer et al., 2012). In the case of macroorganisms, an attempt to summarise the large-scale diversity patterns observed from headwater streams to large rivers has been undertaken with the publication of the River Continuum Concept (RCC). The RCC proposes that diversity increases from headwaters to medium-sized stream reaches, with a subsequent decrease towards the river mouth. It is suggested that this pattern is due to the gradient of physical factors formed by the drainage network, the dynamics in chemical properties and the resulting biological activity (Vannote et al., 1980).
Here, we attempted to extend the RCC to include river bacterioplankton by utilising the results from a second-generation sequencing experiment detailing the bacterial community composition along a large river. Furthermore, we revealed how the variability in bacterioplankton diversity is related to the environmental variables along a continuous river transect spanning 2600 km from medium-sized reaches to the river mouth. We separately investigated the free-living communities and particle-associated communities by extracting two different size fractions (0.2-3.0 μm and >3.0 μm) for each sample. These two fractions have been shown to exhibit significant differences in activity and community dynamics in previous studies, justifying this distinction (Crump et al., 1999; Velimirov et al., 2011). The study site was the Danube River, the second largest river in Europe by discharge and length.
The Danube River drains a basin of approximately 801 000 km²; the basin is home to 83 million inhabitants and extends across 19 countries (Sommerwerk et al., 2010).
Results
General description of sequences
In total, DNA was extracted and sequenced from 132 filtered water samples originating from the Danube River and its tributaries. In addition, the same procedure was applied to 5 negative control samples. The sequencing yielded 2 030 029 read pairs ranging from 3451 to 24 873 per sample. After quality filtering and mate-pair joining as outlined in Sinclair et al. (in review; see Supporting information), 1 572 361 sequence reads (further referred to as “reads”) were obtained.
The OTU clustering resulted in 8697 OTUs after the removal of all Plastid-, Mitochondrion-, Thaumarchaeota-, Crenarchaeota- and Euryarchaeota-assigned OTUs. These undesirable sequences represented 19.1% of the reads and accounted for 625 OTUs. Next, for the alpha diversity analysis, any sample with fewer than 7000 reads was excluded, resulting in 8241 OTUs in the remaining 88 samples. By contrast, for the beta diversity analysis, all samples were randomly rarefied to the lowest number of reads in any one sample. This brought every sample down to 2347 reads, and any OTU containing fewer than two reads was discarded, which brought the total OTU count to 5082.
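The rarefaction step for the beta diversity analysis can be illustrated with a minimal Python sketch. It is not the script used in this study (the rarefaction itself was run with the ‘rrarefy’ function of vegan); the function names `rarefy` and `rarefy_table` and the samples-by-OTUs table layout are assumptions made for illustration only.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    # A multivariate hypergeometric draw is sampling reads without replacement.
    return rng.multivariate_hypergeometric(counts, depth)

def rarefy_table(table, rng, min_otu_reads=2):
    """Rarefy every sample (row) to the smallest sample size, then drop
    OTUs (columns) that retain fewer than `min_otu_reads` reads overall."""
    table = np.asarray(table, dtype=np.int64)
    depth = int(table.sum(axis=1).min())      # lowest read count of any sample
    rarefied = np.vstack([rarefy(row, depth, rng) for row in table])
    keep = rarefied.sum(axis=0) >= min_otu_reads
    return rarefied[:, keep], depth

# Hypothetical toy table: 3 samples x 4 OTUs (not data from this study).
toy = [[10, 0, 5, 1], [3, 3, 0, 0], [100, 50, 25, 25]]
rarefied, depth = rarefy_table(toy, np.random.default_rng(0))
```

Because the subsampling is random, the surviving OTU set depends on the seed; in this dataset the equivalent procedure brought each sample to 2347 reads.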
Core microbial community
The majority of bacteria-assigned OTUs (4402 out of 8697) were represented by fewer than ten reads in the entire dataset. Consequently, 3243 of 8697 OTUs (~37%) were present in only one to four samples, and an additional 2219 OTUs (~26%) were present in as few as five to nine samples. In contrast to these rare OTUs, the core community of the Danube River, defined as all OTUs that appeared in at least 90% of all samples, comprised 89 OTUs for the free-living bacterioplankton (0.2-3.0 μm) and 141 OTUs for the particle-associated microbes (>3.0 μm). The cumulative contribution of OTUs based on their occurrence along the entire river transect is shown in Fig. 1A for both analysed size fractions. On average, 81% of all reads of the free-living river community and 63% of all reads of the particle-associated river community were part of their respective core community. A significant increase in the relative contribution of the core communities could be observed towards the river mouth for both fractions (see Fig. 1B).
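The core-community definition used above (OTUs present in at least 90% of samples) and the share of reads it holds in each sample can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis script of this study; `core_community` is a hypothetical helper and the input is assumed to be a samples-by-OTUs count matrix for a single size fraction.

```python
import numpy as np

def core_community(table, min_occupancy=0.9):
    """Return a boolean mask of core OTUs and, per sample, the fraction of
    reads that belong to those core OTUs.

    table: samples x OTUs matrix of read counts for ONE size fraction.
    """
    table = np.asarray(table, dtype=float)
    occupancy = (table > 0).mean(axis=0)   # share of samples containing each OTU
    is_core = occupancy >= min_occupancy
    core_share = table[:, is_core].sum(axis=1) / table.sum(axis=1)
    return is_core, core_share

# Hypothetical toy data: 2 samples x 3 OTUs; only the first OTU is in both.
is_core, core_share = core_community([[8, 2, 0], [6, 0, 4]])
```

Averaging `core_share` over the samples of one fraction gives a figure comparable to the 81% and 63% reported above, and regressing it against river kilometre corresponds to the trend shown in Fig. 1B.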
Variability of diversity along the river
The Chao1 richness estimator and Pielou’s evenness index were calculated for both size fractions after adjusting all samples down to 7000 reads and discarding those that did not obtain enough reads (n=44). The estimated richness was persistently higher in the particle-associated fraction when compared to the free-living fraction with averages of 2025 OTUs and 1248 OTUs, respectively. We observed the highest diversity of all samples in the medium-sized stretches of the upstream parts of the Danube River. The richness and evenness gradually decreased downstream in both size fractions, as confirmed by the regression analysis (Table 1). The gradual development of the communities can be visualised by applying non-metric multidimensional scaling (NMDS) to the beta diversity distance matrix (Fig. 3.). In both size fractions, a significant relationship between community composition and river kilometre was observed (Table 2). The additional environmental variables that corresponded with the compositional dynamics are given in Table 2, excluding tributaries. As shown in the NMDS, the tributaries did not follow the general patterns and often formed outliers in the ordination plot.
Moreover, based on their bacterial composition, a clear separation was formed between the two filter fractions, as confirmed by PERMANOVA analysis (R2=0.156, p-value<0.01). The apparent synchrony in the gradual development of the two size fractions along the river’s course could also be demonstrated using a Procrustes test (R=0.96, p<0.001). Nevertheless, the application of a permutation test to the beta dispersion values of each size fraction revealed a significantly higher variability in the >3.0 μm fraction when compared to the 0.2-3.0 μm fraction (p-value=0.002) (see Fig. S2.).
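For readers unfamiliar with the two alpha diversity measures, the following Python sketch shows how Chao1 and Pielou’s evenness can be computed from the OTU counts of one rarefied sample. The bias-corrected form of Chao1 is used here because it stays defined when no doubletons occur; whether this variant matches the one applied in this study is an assumption, and the function names are illustrative.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def pielou_evenness(counts):
    """Pielou's J = H' / ln(S), with H' the Shannon index (natural log)
    and S the number of observed OTUs."""
    counts = [c for c in counts if c > 0]
    s = len(counts)
    if s < 2:
        return float("nan")   # evenness is undefined for a single OTU
    n = sum(counts)
    shannon = -sum((c / n) * math.log(c / n) for c in counts)
    return shannon / math.log(s)

# Hypothetical counts: 6 observed OTUs, 3 singletons, 1 doubleton.
richness = chao1([10, 5, 1, 1, 1, 2, 0])      # 6 + 3*2 / (2*2) = 7.5
evenness = pielou_evenness([5, 5, 5, 5])      # perfectly even community
```

Both measures depend on sequencing depth, which is why all samples were first adjusted to 7000 reads before comparison.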
Typical river bacterioplankton
We used the 9322 total OTUs to perform a similarity search against the database of freshwater bacteria 16S rRNA sequences developed by Newton and colleagues (2011). The analysis revealed that a large proportion of the bacterial population inhabiting the Danube could be assigned to previously described freshwater taxa (Fig. 4.). In particular, these included representatives of the LD12-tribe belonging to the class Alphaproteobacteria, as well as the acI-B1-, acI-A7- and acI-C2-tribes belonging to the phylum Actinobacteria.
Interestingly, in the free-living size fraction, an increase in the relative abundance of the four previously mentioned tribes was clearly observed towards the river mouth (Fig. 4A.), contributing up to 35% of the community. Correspondingly, it is possible to observe a clear decrease in the proportion of atypical freshwater taxa in the free-living fraction (labelled “Everything else”) with an increasing number of OTUs assigned to the tribe-level as one goes down the river (Fig. 4B.). In the particle-associated fraction, these typical freshwater taxa are much less abundant (Fig. 4B.). Nevertheless, the OTUs that could be assigned to typical freshwater taxa increased downriver.
In a similar manner, the 8697 bacterial OTUs were BLASTed against the NCBI-NT database; next, any environmental descriptive terms occurring in the search results were retrieved and classified according to the Environmental Ontology (EnvO) terminology. By running a PERMANOVA analysis, we confirmed that the bacterial communities of the different size fractions have distinct habitat preferences (PERMANOVA; R2=0.42, p<0.0001). Restricting the analysis to only ‘groundwater’ and ‘soil’ terms indicated that the proportion of bacteria potentially originating from these two sources decreased towards the river mouth (Fig. 5A. and B.). By using only the contribution of ‘river’ and ‘sediment’ terms, contrary to our expectations, we could not demonstrate any trend along the river transect. It is worth noting that by applying this procedure, most OTUs were not dominated by the ‘river’ environmental term and only a total of four OTUs received an ontology comprising 50% or more of the ‘river’ keyword.
Discussion
The tremendous diversity within the microbial communities inhabiting all types of aquatic environments is being revealed by a rapidly increasing number of studies applying high-throughput sequencing technologies to environmental samples (e.g. Sogin et al., 2006; Andersson et al., 2009; Galand et al., 2009; Eiler et al., 2012; Peura et al., 2012). At the same time, the factors modulating this diversity are also being described (Besemer et al., 2012; Hanson et al., 2012; Lindström and Langenheder, 2012; Szekely et al., 2013). However, very few studies investigating river bacterioplankton are available, and all the studies are based on either relatively small sample sets (Ghai et al., 2011; Fortunato et al., 2013; Staley et al., 2013; Kolmakova et al., 2014) or are of low resolution (Sekiguchi et al., 2002; Winter et al., 2007; Liu et al., 2011, 2012). Here, we describe the diversity of lotic bacterioplankton along a 2600 kilometre transect using high spatial and taxonomic resolution and explain the observed patterns in the context of the River Continuum Concept (RCC; Vannote et al., 1980).
Towards a typical freshwater bacteria community along the river
In addition to an obvious gradual change in beta diversity, we recorded a significant decrease in bacterial richness and evenness in the free-living and particle-associated communities along the river. The gradual change in beta diversity not only significantly correlated with river kilometre but also correlated with alkalinity, dissolved silicates, and nitrate. In addition, the particle-associated community composition correlated significantly with phytoplankton biomass, total suspended solids and total bacterial production. As particles derived from autochthonous matter increased downriver, these correlations, together with the accompanied change in the particle-associated communities, point towards a distinction between communities inhabiting autochthonous and allochthonous particles.
Another distinction was the remarkably higher richness found in particle-attached communities when compared to the free-living bacterioplankton fraction. We ascribe this phenomenon to the higher availability of distinct ecological niches inside and directly on the particles. The suspended particles not only included detritus derived from terrestrial and aquatic sources or mobilised sediments but also included living organisms such as planktonic algae or zooplankton. Therefore, the high diversity of particles in combination with the resulting spectrum of microenvironments (including anoxic habitats) provides an explanation for the higher richness observed in the particle-associated fraction. Furthermore, differences in diversity between the two size fractions were apparent in the results of the EnvO term analysis, indicating the distinct habitat preferences of bacteria. Taking a closer look at these results, we found that the proportion of bacteria originating from soils and groundwater sources was constantly higher in the particle-associated communities, which is likely due to the quantity of suspended particles from soils.
In addition to the riparian zones, merging tributaries or microbial pollution sources could be providing allochthonous particles and bacteria to the river. However, we argue that the gradual exchange of soil and groundwater bacteria with typical freshwater bacterioplankton along the midstream of the river is mostly unaffected by the merging tributaries. This can be explained by estimating the mixing behaviour of the most important tributaries, as conducted by Velimirov and colleagues (2011). In this publication, the authors proposed that the tributaries and other point sources have a negligible effect on the composition of the midstream communities due to the long mixing times of the incoming water and the restrained dilution that this entails. This hypothesis was based on their prior observation of a gradual change in bacterial counts, cell volumes and morphotype composition, which were all significantly correlated with several physicochemical parameters. A few years later, Kolmakova and colleagues (2014) also reported that in the case of the Yenisei River, large incoming tributaries could conserve a parallel flow and only merge into the main stream over several kilometres.
Focusing on the taxonomic composition, our data show that “typical” freshwater bacteria, including members of the acI lineage (cf. Newton et al., 2011), the freshwater SAR11 group (LD12) and the Polynucleobacter genus, formed a major part of the bacterial “core community”, particularly in the free-living fraction. The close resemblance between riverine bacterial communities and those of lakes strongly corroborates the existence of a so-called “typical freshwater bacteria” group (Zwart et al., 2002; Lozupone and Knight, 2007; Newton et al., 2011).
Explaining patterns in river bacterioplankton
The observed shift towards a more typical freshwater bacteria-dominated community is most likely driven by decreasing inputs of allochthonous bacteria to the midstream from soils and groundwaters on the one hand and by competitive advantages of these taxa on the other hand. The first explanation is supported by previous observations where bacterial communities were similar to or at least heavily impacted by soil communities (Besemer et al., 2012; Crump et al., 2012). In this regard, it was stated that the inputs of allochthonous organisms outweigh the rate of local extinction (e.g. Leibold et al., 2004). The second argument, a downstream rise in competitiveness of downstream-specific OTUs, is suggested by the observed simultaneous decrease in evenness and bacterial richness in both size fractions along the river transect. Such a rise of a few OTUs with a competitive advantage downriver was already predicted by the RCC (Vannote et al., 1980).
The RCC proposes that more refractory and relatively high molecular weight compounds are exported downstream and accumulate along the river, whereas labile allochthonous organic compounds are rapidly used by heterotrophic organisms or physically absorbed in the headwaters. In this case, the highest diversity of soluble organic compounds was proposed to be due to a maximum interface with the landscape (Vannote et al., 1980). Our assumption that downstream-specific OTUs possess a competitive advantage in utilising nutrient-poor organic compounds is also supported by the increasing relative abundance of typical freshwater taxa such as LD12 and acI, which represent small cells with an oligotrophic lifestyle (Salcher et al., 2011; Garcia et al., 2013). A general trend towards smaller cells along the Danube River was previously described by Velimirov et al. (2011), which potentially highlights the decreasing availability of nutrients (larger surface-to-volume-ratio). In addition to the selection for smaller cells based on competitive advantages of oligotrophic bacteria, the starvation of copiotrophic cells originating from terrestrial sources, which are better adapted to higher quality and nutrient-rich compounds (Barcina et al., 1997), could contribute to the trend towards smaller cell volumes.
To demonstrate the role of organic matter sources in the apparent decline of richness towards the river mouth, an assessment of the organic matter composition and bioavailability should be included in future studies. Furthermore, loss factors such as sedimentation and (selective) top-down control such as grazing and viral lysis have been shown to vary over environmental gradients and can substantially influence microbial diversity (Ayo et al., 2001; Langenheder and Jürgens, 2001; Weinbauer, 2004; Pernthaler, 2005; Bouvier and Del Giorgio, 2007).
Necessary adjustments to the RCC for the application to river bacterioplankton
When combining our results with previous results (Besemer et al., 2012, 2013; Crump et al., 2012; Staley et al., 2013), we propose that the RCC, although developed for macroorganisms, can be transferred to river bacterioplankton. For macroorganisms, the RCC proposes the highest diversity in medium-sized streams, which is mainly based on their model parameter of diel temperature variability. However, for bacterioplankton, we observed a continuous decrease from headwaters to river mouth, which could be interpreted to be in conflict with the RCC. Since Vannote and colleagues did not consider bacterioplankton, we argue that the RCC is open for interpretation in this respect. Nevertheless, this requires careful consideration of the following important points: (i) the primarily passive transport of bacterioplankton contrasts with the motility and sessility of macroorganisms, such as aquatic invertebrates, fish or macrophytes; (ii) the large contact zone of small rivers and the surrounding environment (soil and groundwater) is constantly contributing allochthonous microbes to the river bacterioplankton community (Besemer et al., 2012; Crump et al., 2012); (iii) soil communities harbour a much higher diversity when compared to planktonic communities (e.g., Crump et al., 2012); (iv) these allochthonous bacteria should be at least temporarily capable of proliferating in their new lotic environment when compared to, e.g., terrestrial insects that fall or are washed into streams or rivers.
The elevated contribution of allochthonous bacteria to the upstream river bacterioplankton is corroborated by our results of the SEQenv analysis (Fig. 5A and B), where an increased impact from soil and groundwater bacteria to the communities was detected. The importance of the impact from the riparian zone on suspended microbial communities was also reported in previous studies on headwater stream networks and the runoff-process from hill slopes via headwaters to a lake, suggesting terrestrial environments as critical reservoirs of microbial diversity for downstream surface waters (Besemer et al., 2012, 2013; Crump et al., 2012). Crump and colleagues found that the dominant bacteria in an arctic lake were all first observed in soil waters and other upslope environments draining into the system. Additional support for a steady decrease in diversity from headwaters to river mouths was provided by a similar decreasing trend in microbial diversity of benthic microbial biofilms from headwaters to mid-sized streams, which are proposed to be settled by the suspended bacterial community (Besemer et al., 2012, 2013).
Taking these features into account, we propose that patterns in bacterioplankton diversity can indeed be incorporated into the RCC. By highlighting the riparian zone, substrate availability and flow as important determinants of community structure, Vannote and colleagues already provided a conceptual framework to explain the patterns of bacterioplankton diversity in both size fractions along the Danube River. In addition, an increase in the competitiveness of several freshwater taxa attributable to an increase in stability and uniformity of the system along the river continuum is in accordance with the RCC. Furthermore, our study shows that the influence of dispersal from soil, groundwaters and other allochthonous sources in determining the patterns of diversity decreased downriver, whereas internal processes, such as the impact of environmental conditions in rivers, increased in importance. Although we were able to show that the contribution of dispersal and environmental conditions in determining community composition was linked to hydrology, the link between the patterns of diversity and ecosystem function remains to be determined.
Experimental Procedures
Supporting data
Within the frame of the Joint Danube Survey 2, a wide range of chemical and biological parameters was collected (Liska et al., 2008). All data, sampling methods and analytical methods are publicly available via the official website of the International Commission for the Protection of the Danube River (ICPDR; http://www.icpdr.org/wq-db/). Selected data from JDS 1 & 2 were published previously in several studies (Kirschner et al., 2009; Janauer et al., 2010; Velimirov et al., 2011; von der Ohe et al., 2011).
Study sites and sample collection
Samples were collected within the frame of the second Joint Danube Survey project (JDS 2) in 2007. The overall purpose of the Joint Danube Surveys is to produce a comprehensive evaluation of the chemical and ecological status of the entire Danube River on the basis of the European Union Water Framework Directive (WFD) (Liska et al., 2008). Between 15 August and 26 September 2007, 75 sites were sampled along the mainstream of the Danube River along its navigable course from river kilometre (rkm) 2600 to the river mouth at rkm 0 (Kirschner et al., 2009; Fig. S1.). In addition, 21 samples from the Danube’s major tributaries and branches were included. At the most upstream sites, the Danube River is representative of a typical stream of the rhithron and characterised by its tributaries Iller, Lech and Isar (Kavka and Poetsch, 2002). The trip took 43 days, which is equivalent to the average retention time of a water body in this part of the Danube River (for discussion of this issue, see Velimirov et al., 2011). Samples were collected with sterile 1 L glass flasks from a water depth of approximately 30 cm. Glass flasks were sterilised by rinsing with 0.5% HNO3 and autoclaving. For DNA extraction of the particle-associated bacterioplankton, 120-300 mL of river water, depending on the biomass concentration, was filtered through 3.0 μm pore-sized polycarbonate filters (Cyclopore, Whatman, Germany) by vacuum filtration. The filtrate, which represented the bacterioplankton fraction smaller than 3.0 μm (later referred to as “free-living” bacterioplankton), was collected in a sterile glass bottle and subsequently filtered through 0.2 μm pore-sized polycarbonate filters (Cyclopore, Whatman, Germany). The filters were stored at -80 °C until DNA extraction.
DNA extraction and quantification of bacterial DNA using quantitative PCR (qPCR)
Genomic DNA was extracted using a slightly modified protocol of a previously published phenol-chloroform, bead-beating procedure (Griffiths et al., 2000), with isopropanol instead of polyethylene glycol for DNA precipitation. Total DNA concentration was assessed using the Quant-iT™ PicoGreen® dsDNA Assay Kit (Life Technologies Corporation, USA), and 16S rRNA genes were quantified using domain-specific quantitative PCR. Quantitative PCR reactions contained 2.5 μL of 1:4 and 1:16 diluted DNA extract as the template, 0.2 μM of primers 8F and 338 (Frank et al., 2007; Fierer et al., 2008) targeting the V1-V2 region of most bacterial 16S rRNA genes and iQ™ SYBR® Green Supermix (Bio-Rad Laboratories, Hercules, USA). All primer information is available in Table S1. Ratios of measured 16S rRNA gene copy numbers between the two sample dilutions that deviated markedly from 1 after multiplication with the respective dilution factor were interpreted as an indicator of PCR inhibition.
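The dilution-based inhibition check described above can be written as a short Python sketch. The text does not state how large a deviation from 1 counts as “marked”, so the two-fold tolerance `max_fold` below is a hypothetical placeholder, as are the function names.

```python
def inhibition_ratio(copies_1_4, copies_1_16):
    """Ratio of back-calculated 16S rRNA gene copies from the 1:16 and the
    1:4 dilution. Without inhibition both estimates agree (ratio near 1)."""
    undiluted_from_4 = copies_1_4 * 4      # multiply by the dilution factor
    undiluted_from_16 = copies_1_16 * 16
    return undiluted_from_16 / undiluted_from_4

def is_inhibited(copies_1_4, copies_1_16, max_fold=2.0):
    """Flag a sample when the ratio deviates from 1 by more than `max_fold`
    in either direction. The threshold is an assumption, not a study value."""
    ratio = inhibition_ratio(copies_1_4, copies_1_16)
    return ratio > max_fold or ratio < 1.0 / max_fold

# Hypothetical measurements (copies per reaction), not data from this study.
clean = is_inhibited(1000, 250)      # 4000 vs 4000 back-calculated copies
suspect = is_inhibited(200, 250)     # 800 vs 4000: the less diluted well lags
```

A ratio well above 1 is the typical signature of inhibition, because the inhibitor is diluted out faster than the template stops being detectable.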
Preparation of 16S rRNA gene amplicon libraries
For the preparation of amplicon libraries, 16S rRNA genes were amplified and barcoded in a two-step procedure to reduce the PCR bias that is introduced by long primers and sequencing adaptor-overhangs (Berry et al., 2011). We followed the protocol as described by Sinclair et al. (unpublished, see Supporting information). In short, 16S rRNA gene fragments of most bacteria were amplified by applying primers Bakt_341F and Bakt_805R (Herlemann et al., 2011; Table S1) targeting the V3-V4 variable regions. In 25 μL reactions containing 0.5 μM primer Bakt_341F and Bakt_805R, 0.2 μM dNTPs (Invitrogen), 0.5 U Q5 HF DNA polymerase and the provided buffer (New England Biolabs, USA), genomic DNA was amplified in duplicate in 20 cycles. To use equal amounts of bacterial template DNA, and thereby increase comparability and reduce PCR bias, the final volume of environmental DNA extract used for each sample was calculated based on the 16S rRNA gene copy concentration in the respective sample determined earlier by quantitative PCR (see above). For 105 samples, the self-defined optimum volume of environmental DNA extract corresponding to 6.4 × 10⁵ 16S rRNA genes was spiked into the first-step PCR reactions; however, for 27 samples, lower concentrations were used due to limited amounts of bacterial genomic DNA or PCR inhibition detected by quantitative PCR (see above). These 132 samples included eight biological replicates. Prior to the analysis, we removed four samples due to their extremely low genomic DNA concentrations and 16S rRNA gene copy numbers. Duplicates of PCR products were pooled, diluted 1:100 and used as templates in the subsequent barcoding PCR. In this PCR, diluted 16S rRNA gene amplicons were amplified using 50 primer pairs with unique barcode pairs (Sinclair et al., in review; Table S1). The barcoding PCRs for most samples were conducted in triplicate, analogous to the first PCR (n=100).
The remaining 32 samples, which had weak bands in the first-step PCR due to low genomic template DNA concentrations or high sample dilution, were amplified in 6-9 replicates to increase amplicon DNA yield. Barcoded PCR amplicons were pooled in an equimolar fashion after purification using the Agencourt AMPure XP purification system (Beckman Coulter, Danvers, MA, USA) and quantification of the amplicon concentration using the Quant-iT™ PicoGreen® dsDNA Assay Kit (Life Technologies Corporation, USA). Finally, a total of 137 samples, including 5 negative controls, were combined into four pools for sequencing.
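The template normalisation used in the first-step PCR (a target of 6.4 × 10⁵ 16S rRNA gene copies per reaction, derived from the qPCR copy concentration) amounts to a simple calculation, sketched below in Python. The cap `max_volume_ul` on the extract volume that fits into a reaction is a hypothetical parameter introduced for illustration; the study states only that lower template amounts were used where DNA was limiting or inhibition was detected.

```python
TARGET_COPIES = 6.4e5  # 16S rRNA gene copies per first-step PCR (from the text)

def template_volume_ul(copies_per_ul, max_volume_ul):
    """Return (volume in uL, copies actually added) for one sample.

    copies_per_ul: 16S rRNA gene copies per uL of extract, from qPCR.
    max_volume_ul: largest extract volume that can be added (assumed value).
    """
    if copies_per_ul <= 0:
        raise ValueError("copy concentration must be positive")
    wanted = TARGET_COPIES / copies_per_ul
    volume = min(wanted, max_volume_ul)   # low-biomass samples hit the cap
    return volume, volume * copies_per_ul

# Hypothetical concentrations, not measurements from this study.
rich = template_volume_ul(3.2e5, max_volume_ul=10.0)   # reaches the target
poor = template_volume_ul(1.0e4, max_volume_ul=10.0)   # capped below target
```

Samples that fall into the capped case correspond to those amplified from a lower template amount and, where bands were weak, in additional replicates.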
Illumina® sequencing
The sequencing was performed on an Illumina® MiSeq at the SciLife Lab Uppsala. For each pool, the library preparation was performed separately following the TruSeq protocol with the exception of the initial fragmentation and size selection procedures. This involves the binding of the standard sequencing adapters in combination with separate Illumina®-specific MID barcodes that enable the combination of different pools on the same sequencing run (Sinclair et al., unpublished). After pooling, random PhiX DNA was added to provide calibration and help with the cluster generation on the MiSeq’s flow cell.
16S rRNA gene amplicon data analysis
The sequence data were processed as outlined in Sinclair et al. (in review). In short, after sequencing the libraries of 16S rRNA amplicons, the read pairs were demultiplexed and joined using the PANDAseq software (Masella et al., 2012). Next, reads that did not bear the correct primer sequences at the start and end of their sequences were discarded. Reads were then filtered based on their PHRED scores. Chimera removal and OTU (operational taxonomic unit) clustering at 3% sequence dissimilarity were performed by pooling all reads from all samples together and applying the UPARSE algorithm (Edgar, 2013). Here, any OTU containing fewer than two reads was discarded. Each OTU was subsequently taxonomically classified by running a similarity search against the SILVAmod database and employing the CREST assignment algorithm (Lanzén et al., 2012). Plastid, mitochondrial and archaeal OTUs were removed. In addition, OTUs were also taxonomically annotated against the freshwater database (Newton et al., 2011) using the same method. If necessary, OTU rarefying for the purpose of standardising sequence numbers was performed using the ‘rrarefy’ function implemented in the R package vegan (Oksanen et al., 2013). Biodiversity measure calculation, statistical analyses and plot generation were conducted in R (R Core Team, 2013) and with Python scripts. The habitat index for the top 5000 OTUs was determined using the SEQenv pipeline (http://environments.hcmr.gr/seqenv.html). The SEQenv pipeline retrieves hits to highly similar sequences from public repositories (NCBI Genbank) and uses a text mining module to identify Environmental Ontology (EnvO; http://environmentontology.org/) terms mentioned in the associated contextual information records (“Isolation Source” field entry for genomes in Genbank or associated PubMed abstracts). 
At the time of running SEQenv on our dataset (version 0.8), there were approximately 1200 EnvO terms organised into three main branches (namely, environmental material, environmental feature, and biome). However, we used SEQenv to retrieve a subset of these terms, i.e., those that contain “Habitat” (ENVO:00002036). Raw sequence data were submitted to the NCBI Sequence Read Archive (SRA) under accession number SRP045083.
Acknowledgements
This study was supported by the Austrian Science Fund (FWF) as part of the DKplus “Vienna Doctoral Program on Water Resource Systems” (W1219-N22) and the FWF project P25817-B22, as well as the research project “Groundwater Resource Systems Vienna” in cooperation with Vienna Water (MA31). AE and LS are funded by the Swedish Foundation for Strategic Research (ICA10-0015). Infrastructure (cruise ships, floating laboratory) and logistics for collecting, storing and transporting samples were provided by the International Commission for the Protection of the Danube River (ICPDR). The analyses were performed using resources provided by the SNIC through the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under project “b2011035”.
Conflict of Interest Statement
The authors declare no conflict of interest.