ABSTRACT
The study of microbial evolution is hindered by the fact that microbial populations leave few fossils. We hypothesized that bacterial cells preserved in ancient ice could be used as a molecular fossil record if their DNA could be extracted and sequenced. Channels formed along triple junctions of ice crystals contain liquid “veins” in which microbial cells may be preserved intact. Since vertical motion through the ice matrix is impossible, microbes found in ice cores are representative of microbes present at the time the ice was formed. We detected chlorophyll fluorescence in intact ice cores taken from Greenland and Antarctica. Flow cytometric analysis localized at least some of this fluorescence to particles < 1 μm in diameter. Metagenomic analysis of meltwater indeed revealed sequences similar to modern strains of the picocyanobacterial genera Synechococcus and Prochlorococcus, and some of these sequences were distinct from any sequences known from modern oceans or glacial environments. Our study is a first proof-of-concept of the use of ice cores as records of microbial evolution, and we suggest that future genetic studies with higher vertical resolution in the cores might shed light on the pace and character of evolution of these ecologically important cells.
INTRODUCTION
Studies of the evolutionary history of life on Earth are often informed by a combination of fossil evidence and molecular phylogenetics. The former method allows robust estimates of the timing of major evolutionary events, but usually is limited to hard-bodied organisms and, like all purely morphological studies, is prone to researcher bias. Molecular comparisons (e.g., of homologous DNA sequences) are much more amenable to statistical examination but are generally only available for extant organisms. “Molecular fossils” – e.g., ancient DNA that remains sufficiently intact for analysis – are rare, due to the continuing degradation of large biomolecules after an organism’s death. Most studies of ancient biomolecules either deal with a small number of individuals [e.g. studies of mammoth and Neanderthal DNA, 1, 2] or with mixed populations that are difficult to connect directly with any extant species [e.g. bacterial communities in Antarctic ice, 3]. It remains difficult to extrapolate from these studies to large-scale population genetic questions, particularly of biogeochemically important microbial populations.
Here we explore the feasibility of studying the long-term evolution of microorganisms by exploiting the discovery of chlorophyll-containing cells in ice cores from Antarctic and Arctic glaciers [4]. Microbial cells may become preserved in nutrient-rich liquid veins within the ice matrix at temperatures too low for either growth or significant movement in the ice matrix [5]. Therefore, cells in a given ice stratum are representative of strains that were alive in the oceans at the time of deposition onto glacial ice. The cell envelopes of a large fraction of the preserved cells remain intact [4], leading to the possibility that their genomes may also remain intact. The recovery of high-molecular weight DNA from ancient, datable microbial communities with readily identifiable modern relatives would constitute a fossil record of microbial evolution, with the oldest genomes at the bottom of an ice core and the youngest at the top.
In this manuscript we present results from our first attempt to extract and sequence microbial DNA from ice cores from Antarctica and Greenland. Despite substantial numbers of sequences from probably contaminants, we were nevertheless able to confirm the presence of unicellular marine cyanobacteria closely related to the widespread modern taxa Synechococcus and Prochlorococcus in 3 of the 4 ice cores we examined. We believe these results are a proof-of-concept of the use of ice cores as records of microbial evolution, and based on our estimates of cyanobacterial abundance throughout the depths of both Greenland and Antarctic ice and the time frames represented by those cores, we expect future, refined efforts to allow evaluation of the evolution of marine bacteria on a million-year timescale.
RESULTS AND DISCUSSION
Analysis of Chl fluorescence in ice cores
Using a scanning fluorometer (the BFS, see Methods), we detected Chl fluorescence throughout intact ice cores from the Western Antarctic Ice Sheet Divide Ice Core project (WDC) and the Greenland (GISP2D) project. Chl fluorescence in WDC ice sharply decreased in the top 200 m, followed by a relatively abrupt shallowing of the slope from 1200 m down to ∼3100 m. (Fig. 1A). Similar trends were observed in intact GISP2D cores (Fig. 1A).
FCM of melted ice from WDC, GISP2D, and numerous other cores (Table 1) revealed micron-sized Chl-containing particles similar in size and shape to marine cyanobacteria of the genera Prochlorococcus and Synechococcus in every sample that we analyzed (e.g. Fig. S1), suggesting that at least some of the Chl detected by the BFS was contained in intact cells. Events in the Chl/SSC gate calibrated with cultured marine picophytoplankton were extremely rare in sterile controls, and all ice core samples were well above the average control value of 9.9 counts mL-1 (Fig. 2C).
In contrast to the data from intact cores (Fig. 1A), the abundance of Chl-containing particles increased with depth for both GISP2D (Fig. 2A, Spearman ρ = 0.332, P of true ρ < 0, 0.089) and WDC ice (Fig. 2B, ρ = 0.494, P of true ρ < 0, 0.013). The opposite trends in intact core Chl fluorescence and abundance of Chl-containing cells counted by FCM was possibly explained by a decrease in the fluorescence intensity of individual particles with depth for GISP2D ice (Fig. S2A, Spearman ρ = -0.448, P of true ρ > 0, 0.031); however this trend was not observed for WDC ice (Fig. S2B, ρ = 0, P of true ρ > 0, 0.5). Another explanation derives from the differential preservation of different size classes of chlorophyll-containing cells. Larger photoautotrophs such as diatoms and seaweed fragments would be excluded from ice veins and would likely die during ice crystal formation. Chl fluorescence from these non-viable cells would slowly decay, explaining the downward trend in whole-core fluorescence (Fig. 1), but importantly none of the Chl from these cells would be detected by our FCM methods. In contrast, some small cells, including the smallest phytoplankton, would essentially be cryopreserved within the veins and these cells would comprise a larger fraction of total Chl as the larger cells decayed.
Phylogenetic placement of microorganisms preserved in polar ice
We successfully amplified DNA from all four depths (two from the GISP2D and two from the WDC ice cores) that we attempted (Table 1). A total of 55,505 quality-controlled pyrosequences were examined (Table S2), of which nearly 50% belonged to NTUs [nodal taxonomic units, 6] that were present in negative control samples and were thus removed. Of the remaining 28,336 sequences, nearly 97% belonged to NTUs that were closely related to Escherichia coli. Few E. coli-like sequences were present in negative controls, suggesting that E. coli was present in or on the ice cores and was not effectively removed by our decontamination procedures. Due to their high abundance in the sequencing run relative to all other sequences as well as the ubiquity of E. coli DNA in biology labs, we reasoned that these E. coli sequences were almost certainly human-introduced contaminants and removed them from subsequent analyses. This level of contamination was unfortunate, but is not uncommon for studies of ancient DNA or other systems for which the yield of target DNA is low [7, 8].
The remaining 883 sequences were distributed among 16 bacterial taxonomic groups (Table S3). Taxonomic composition differed between the samples (Fig. 3A). Bacteria often associated with soil and freshwater (e.g. Actinobacteria, Caulobacter, TM7, Firmicutes) were numerically dominant in WDC samples and were also abundant in GISP2D samples (Fig. 3A). Sequences identified as Prochlorococcus, a highly abundant unicellular cyanobacterium from temperate and tropical waters, were discovered in three of four ice core samples, including both depths from GISP2D ice. Indeed, Prochlorococcus-like sequences were numerically dominant amongst the retained sequences from deep GISP2D ice (Fig. 3A).
Reasoning that marine cyanobacteria were unlikely to be contaminants, and since we saw evidence of Prochlorococcus-like cells by FCM, we looked more closely at these sequences. Placement of the 9 unique Prochlorococcus-like partial 16S rRNA pyrosequences in a phylogenetic tree (Fig. 3B) clearly placed these sequences within the eMIT9312 ecotype of Prochlorococcus, with an average genetic distance of 0.02 ± 0.005 (standard deviation) substitutions • bp-1 in comparison to the type strain MIT9312 (Table S4). Only one of the sequences had an exact match in a cultured strain (Prochlorococcus MIT9301), although 7 others had an exact match with uncultured bacteria from environmental studies (Table 2). One sequence, however, had no exact match in any of the databases we examined (Table 2). This sequence (marked with an asterisk in Fig. 3B) differed by a SNP and a single-base pair deletion from its closest match in the NCBI database. Moreover, it was detected independently in both GISP2D libraries, suggesting that these changes were not PCR or sequencing artifacts. Additionally, none of these sequences matched cultivated strains in use by our laboratories, nor were any Prochlorococcus-like sequences found in the 19,934 sequences found in negative controls, strongly suggesting that these warm-water cyanobacteria were in fact preserved within the ice cores.
We further recovered eight unique cyanobacteria-like ITS sequences from shallow GISP2D ice and deep WDC ice. Phylogenetic placement of these sequences (Fig. 3C) revealed greater diversity than the 16S sequences. The two ITS sequences from GISP2D clearly clustered with Synechococcus, a sister group to Prochlorococcus. One WDC sequence clustered in the same eMIT9312 Prochlorococcus ecotype as the 16S sequences. However, the other five WDC sequences formed a distinct clade within Prochlorococcus that was quite divergent from known sequences. While this clade was separated from all other sequences with 100% posterior probability, its branching order in relation to the other ecotypes of Prochlorococcus was unclear. Interestingly, three of these five sequences had better hits in an Antarctic metagenome [9] than in any other database (Table 2) suggesting the exciting possibility that currently undescribed cold-adapted picocyanobacterial strains might exist in these frigid regions.
A fossil record preserved in glacial ice?
To glaciologists it might seem that cyanobacterial genomes would evolve little during only the last million years relative to their origin some 2.5 to 3 billion years ago. However, microorganisms are capable of very rapid evolution due to their vast population sizes and ability to undergo horizontal gene transfer during encounters with other cells in the ocean [10-13]. In contrast, growth, and hence evolution, effectively ceases after preservation in glacial ice. Also, the vertical distances bacteria may travel in veins is less than 1 μm yr-1, or < 10 cm in 100,000 yr [14]. Thus, microbial genomes in ice cores exist as molecular fossils, and their existence potentially opens a window into the microbial biogeochemistry of the past.
The rapid evolvability of bacterial populations is clear from evolutionary theory as well as from laboratory experiments [15], but it is difficult to observe it in action in situ. The time course and pace of microbial evolution have been documented in microbial pathogens [16], acid mine drainage communities [17], and laboratory populations [18]. The existence of a robust molecular fossil record linked to a well-studied modern microbial group would allow us to study these same processes on a much grander scale. Prochlorococcus and Synechococcus are globally distributed in oceans, and their modern population genetics are perhaps better understood than those of any other microbe in the environment. If our DNA sequences came from tropical cyanobacterial populations that were originally swept up from ocean surfaces by wind currents and deposited on ice thousands of miles away [4], then deeper examination of the microbiota of these ice cores should allow us to extend the ecological analysis of Prochlorococcus and Synechococcus to a multi-millennial or even million-year time scale. If instead we have discovered a novel clade of psychrophilic Prochlorococcus-like cyanobacteria, we can still study the pace of long-term divergence from, or introgression into, open ocean populations. Most importantly, because Prochlorococcus and Synechococcus are crucial players in Earth’s carbon cycle, their genome dynamics over successive glaciations might help us understand not only how microbes adapt to their changing environments, but also how the oceans will respond to ongoing anthropogenic changes. Our study represents a first step toward this ultimate goal, and we are optimistic that methodological refinements will greatly improve the resolution and reliability of future attempts to probe this valuable resource.
METHODS
Ice source and handling prior to analysis
We analyzed polar ice samples from a wide range of depths from the Greenland Ice Sheet Project 2 (GISP2D) and the Western Antarctic Ice Sheet Divide Ice Core (WDC) projects, as well as samples from the South Pole, Vostok Station, and a number of other sites in Antarctica and Greenland (Table 1). Fluorimetry analysis was conducted only on GISP2D and WDC ice, and was performed at the storage facility at the National Ice Core Laboratory (NICL) in Denver, CO. Small pieces of cores for use in flow cytometry and DNA analysis were stored in Price’s laboratory at −40° C until analysis.
Non-destructive analysis of intact ice cores using the Berkeley Fluorescence Spectrometer (BFS)
Bramall [19] developed a scanning spectrofluorimeter powered by a 375 nm laser to map the 465 nm emission from the F420 pigment in methanogens inside solid ice cores at NICL [20]. During one week almost every year from 2008-2013, we were allowed to continuously occupy a room at -25°C at NICL that we were able to darken completely. One version of the BFS using a 7-channel 224 nm laser allowed tryptophan, chlorophyll (Chl), and volcanic ash to be mapped as a function of depth in ice cores. A strong peak in channel 6 was used to detect Chl and its decomposition product pheophytin [21] and strong emission in channel 7 was used to detect fluorescence from mineral grains transported from volcanic eruptions. With a laser beam ∼1 mm in diameter and brief bursts of laser light at intervals ∼1 mm perpendicular to the flat surface of an ice core cut longitudinally in half and polished, Price and co-workers were able to acquire 7 channels of data along a core ∼1 meter long in ∼2 minutes. Data from different scanning sessions were normalized as described previously [4, 20].
Flow cytometry
Abundance of ice core bacteria were assessed using flow cytometry (FCM). In order to reduce surface contamination due to microbes not originating in the ice, we removed all but ∼4 ml from ∼50 ml ice samples with either stirred 6% household bleach or stirred sterile water. Then the remaining ice was melted and passed through a 1 μm pore size polycarbonate prefilter to eliminate volcanic ash particles, most mineral grains, and cells or cell debris too large to occupy veins in ice crystals. Microorganisms small enough to pass through this filter were analyzed with a Fortessa digital flow cytometer (BD). Chl autofluorescence was detected by emission at 670 nm (deep red) following excitation by a 488 nm argon laser, and phycoerythrin-like (PE) autofluorescence was detected by emission at 593 nm (orange) following excitation by a 567 nm laser. Particles were characterized in pairwise plots of Chl vs. PE and of Chl vs. side scatter (SSC, a measure of size and internal structure; see Fig. S1 for representative examples). Cyanobacterial gate boundaries were set using fluorescent beads of sizes 0.3, 0.5, and 1 μm. Cultures of Prochlorococcus (SS120, NATL1A, MIT9313, and MED4), Synechococcus WH8102, and Ostreococcus OTH95 and 601 were run as standards when initially adjusting voltage settings on the flow cytometer. After trial and error, events were triggered on values of SSC greater than 30 units, which accepted Prochlorococcus and Synechococcus cells and eliminated points labeled as noise (Fig. S1). FCM data were plotted and analyzed using Flowjo (TreeStar, Inc.).
Statistical analysis of FCM data was performed in the statistical computing environment R v. 2.15.1.
DNA Extraction and Sequencing
In order to reduce the probability of contamination by foreign DNA on the exterior of the ice samples or on laboratory surfaces, all materials that came in contact with the ice core or melt water were sterilized either by heating at 450° C for 4 hours or by treatment with 10% HCl. Ice melting operations and meltwater filtering were conducted sequentially in a laminar flow hood. During the day of a run, several cores were removed from plastic bags, cut into samples of working volume ∼50 ml and rinsed with warm nanopure water to remove up to 50% of the volume of the outer surface. The remaining interior portion of the core was allowed to melt slowly in the dark. Microorganisms in the meltwater were collected on 0.1 μm polycarbonate membrane filters. Nucleic acids were extracted and separated as described elsewhere [22].
Two different amplification strategies were employed for DNA sequence analysis. Total community composition was assessed by amplifying partial 16S rRNA genes. These amplicons were processed as described elsewhere [6] except that a different reverse primer (519R appended with the 454B sequence (GCCTTGCCAGCCCGCTCAGTGWATTACCGCGGCKGCTG) was used. Pyrosequences were generated using Roche 454 GS FLX Titanium technology. Different barcoded forward primers were used for each sample, two nanopure controls, and a PCR negative control.
In order to target picocyanobacteria, the 16S/23S internal transcribed spacer (ITS) region was amplified using primers described previously (16S-1247F and 23S-1608R) that target Synechococcus and Prochlorococcus [23]. Samples with positive PCR signal were cloned into the pGEM-T-easy vector according to manufacturer’s instructions (Promega) and transformed into E. coli using an electroporator (Gibco). Purified plasmids from colonies containing inserts were screened by PCR using the universal internal priming sites 16S-1492F and 23S-241R and Sanger sequenced with these same primers using an ABI 373 sequencer.
Other studies have reported substantial degradation of DNA recovered from ancient ice [3, 24]. Our methods avoided this problem, however, by using an initial PCR amplification step, such that only reasonably intact DNA molecules were available for sequencing.
All sequence data have been deposited at Genbank as BioProject PRJNA239379.
Phylogenetic analysis
16S pyrosequences were processed using the PhyloAssigner pipeline and reference phylogenetic tree described by Vergin et al. [6]. The taxonomic identification of the pyrosequencing reads (‘tags’) followed the approach of Sogin et al. [25]. The tags were compared by BLASTN [26] with a reference database of hypervariable region tags (RefHVR_V6, http://vamps.mbl.edu/) based on the SILVA database version 95 [27], and the 100 best matches were aligned to the tag sequences using MUSCLE [28]. Reference sequences were defined as those having the minimum global distance (number of insertions, deletions and mismatches divided by the length of the tag) to the tag sequence, and all reads showing the best match to the same reference V6 tag were grouped together as the same operational taxonomic unit (OTU) [‘best match definition’, 25, 29, 30]. Taxonomy was assigned to each reference sequence with the RDP Classifier [31]. This pipeline was, however, not precise enough to classify all bacteria from the deep Arctic Ocean, for which few sequence records are available in the databases [6]. To increase the taxonomic resolution, pyrosequencing reads were compared against our nearly full-length sequences using BLASTN. This additional step improved the classification up to the class/order level for 30,760 sequences (9.8% of the sequences) that were originally classified at the domain level, and 55,550 sequences (17.7%) that were first defined at the phylum level.
ITS amplicons from ice cores and metagenomes were aligned with a sample set of Prochlorococcus and Synechococcus ITS regions from fully-sequenced strains using MUSCLE. ITS alignments were then trimmed to remove 16S and 23S rRNA sequences as well as the two tRNA coding sequences [23]. Picocyanobacterial sequences identified from the pyrosequencing runs were aligned with Prochlorococcus and Synechococcus 16S sequences obtained from fully sequenced genomes using SINA [32]. Based on these alignments, we removed 16S pyrosequences that were potentially artifacts of the 454 sequencing process. Specifically, we eliminated any pyrosequences containing single nucleotide insertions relative to other Prochlorococcus or Synechococcus strains. We also eliminated any sequences that were only detected once. After this curation, 9 unique 16S pyrosequences and 8 ITS sequences remained for phylogenetic analysis.
16S and ITS alignments were used to infer phylogenetic relationships using a Markov chain Monte Carlo simulation as implemented in a multi-core build of MrBayes v3.2.1 [33]. Two parallel runs were used, each consisting of 1 “cold” chain and 3 “heated” chains. Analyses were carried out until the standard deviation between the runs was < 0.01. Phylogenetic trees were visualized using Geneious 5.5.9 (Biomatters).
BLAST analysis was employed to detect best matches for each of the ice core putative Prochlorococcus/Synechococcus 16S and ITS sequences remaining following curation. Sequences were compared against the NCBI nr database as well as a number of metagenomic datasets (Table S1). The raw read data from each metagenome was downloaded from CAMERA [34] and made into a BLAST database using BLAST+ v2.2.28 [35]. Ice core sequences were then used as queries against these databases using BLASTN and the single best match was returned. Because the average read length for most of the metagenomes was shorter than the sequences we were analyzing (Table S1), we split the sequences into sections of ∼240 bp and analyzed each section separately, then summarized the best hits for each fragment to get a total percent identity. Summarized BLAST results, alignment files, and tree files are available for download at datadryad.org (DOI Pending).
ACKNOWLEDGEMENTS
We are grateful to the members of the WDC and GISP2 project members for their years of work collecting and preserving ice cores; Andrei Kurbatov, Jeffrey Severinghaus, David Marchant, and Geoffrey Hargreaves (curator, NICL) for providing ice samples; Richard E. Lenski for helpful discussions about evolutionary theory and population genetics; Riccardo Cavicchioli for informing us of the paper by Wilkins et al. (2013) on metagenomics of Prochlorococcus and Synechococcus in the Southern Ocean; Ben Temperton for assistance with 454 sequencing and the PhyloAssigner pipeline; and Kartoosh Heydari for assistance with the Li ka-Shing Fortessa flow cytometer. Price and Bay acknowledge partial support from NSF Grant ANT-1142178; Giovannoni and Vergin acknowledge support by a grant from the Marine Microbiology Initiative of the Gordon and Betty Moore Foundation to Giovannoni; Morris was partially supported by NSF Grants DBI-0939454 and OCE-1316101, and by a postdoctoral fellowship from the NASA Astrobiology Institute.