Abstract
Survival and growth of the anaerobic gut fungi (AGF, Neocallimastigomycota) in the herbivorous gut necessitate the possession of multiple abilities absent in other fungal lineages. We hypothesized that horizontal gene transfer (HGT) was instrumental in forging the evolution of AGF into a phylogenetically distinct gut-dwelling fungal lineage. Patterns of HGT were evaluated in the transcriptomes of 27 AGF strains, 22 of which were isolated and sequenced in this study, and 4 AGF genomes broadly covering the breadth of AGF diversity. We identified 327 distinct incidents of HGT in AGF transcriptomes, with subsequent gene duplication resulting in an HGT frequency of 2.9-4.1% in AGF genomes. The majority of HGT events were AGF specific (90.8%) and wide (67.3%), indicating their occurrence at early stages of AGF evolution. The acquired genes allowed AGF to expand their substrate utilization range, provided new venues for electron disposal, augmented their biosynthetic capabilities, and facilitated their adaptation to anaerobiosis. The majority of donors were anaerobic fermentative bacteria prevalent in the herbivorous gut. In addition, acquisition incidents from marine invertebrates provide interesting clues to the habitat of AGF ancestors prior to terrestrialization. This work strongly indicates that HGT indispensably forged the evolution of AGF as a distinct fungal phylum and provides a unique example of the role of HGT in shaping the evolution of a high rank taxonomic eukaryotic lineage.
Introduction
Horizontal gene transfer (HGT) is defined as the acquisition, integration, and retention of foreign genetic material into a recipient organism (Doolittle 1999). HGT represents a relatively rapid process for trait acquisition, as opposed to gene creation either from preexisting genes (via duplication, fission, fusion, or exon shuffling) or through de novo gene birth from non-coding sequences (Andersson et al 2015, Carvunis et al 2010, Innan and Kondrashov 2010, Cai 2008, Kaessmann 2010). In prokaryotes, the occurrence, patterns, and frequency of HGT, as well as its impact on genomic architecture (Ochman et al 2000), metabolic abilities (Caro-Quintero and Konstantinidis 2015, Youssef et al 2015), physiological preferences (Omelchenko et al 2005, Puigbo et al 2008), and ecological fitness (Wiedenbeck and Cohan 2011), have been widely investigated, and the process is now regarded as a major driver of genome evolution in bacteria and archaea (Philippe and Douady 2003, Syvanen 2012). Although eukaryotes are perceived to evolve principally through modifying existing genetic information, analysis of HGT events in eukaryotic genomes has been eliciting increasing interest and scrutiny. In spite of additional barriers that need to be overcome in eukaryotes, e.g. crossing the nuclear membrane, germ line sequestration in sexual multicellular eukaryotes, and epigenetic nucleic acid modification mechanisms (Andersson et al 2015, Fitzpatrick 2012), it is now widely accepted that HGT contributes significantly to eukaryotic genome evolution (Husnik and McCutcheon 2017, Keeling and Palmer 2008).
HGT events have been convincingly documented in multiple phylogenetically disparate eukaryotes, including the Excavata (Eichinger et al 2005, Hirt et al 2002, Nixon et al 2002, Qian and Keeling 2001), SAR supergroup (Eme et al 2017, Kishore et al 2013, Ricard et al 2006, Wisecaver et al 2013), Algae (Schönknecht et al 2013b), Plants (Richardson and Palmer 2007), and Opisthokonta (Gladyshev et al 2008, Danchin 2010, Marcet-Houben and Gabaldon 2010, Sun et al 2010). Reported HGT frequency in eukaryotic genomes ranges from a handful of genes, e.g. (McCarthy and Fitzpatrick 2016), to up to 9.6% in Bdelloid rotifers (Gladyshev et al 2008).
The kingdom Fungi represents a phylogenetically coherent clade that evolved ≈ 900-1481 Mya from a unicellular flagellated ancestor (Douzery et al 2004, Parfrey et al 2011, Taylor and Berbee 2006). To date, multiple efforts have been reported on the detection and quantification of HGT in fungi. A survey of 60 fungal genomes reported HGT frequencies of 0-0.38% (Marcet-Houben and Gabaldon 2010), and similarly low values were observed in the genomes of five early-diverging pathogenic Microsporidia and Cryptomycota (Alexander et al 2016b). As such, the prevailing consensus is that HGT events in fungal genomes are infrequent and sporadic (Fitzpatrick 2012, Schönknecht et al 2013b). This has been attributed to the osmotrophic lifestyle of fungi (Berbee et al 2017), which is less conducive to HGT than the phagocytic lifestyle of several microeukaryotes with relatively higher HGT frequency (Doolittle 1998).
The anaerobic gut fungi (AGF, Neocallimastigomycota) represent a phylogenetically distinct basal fungal lineage. The AGF appear to exhibit a restricted distribution pattern, being encountered in the gut of ruminant and non-ruminant herbivores (Gruninger et al 2014). In the herbivorous gut, the life cycle of the AGF (Figure S1) involves the discharge of motile flagellated zoospores from sporangia in response to animal feeding, the chemotaxis and attachment of zoospores to ingested plant material, spore encystment, and the subsequent production of rhizoidal growth that penetrates and digests plant biomass through the production of a wide array of cellulolytic and lignocellulolytic enzymes.
Survival, colonization, and successful propagation of AGF in the herbivorous gut necessitate the acquisition of multiple unique physiological characteristics and metabolic abilities absent in other fungal lineages. These include, but are not limited to, development of a robust plant biomass degradation machinery, adaptation to anaerobiosis, and exclusive dependence on fermentation for energy generation and recycling of electron carriers (Boxma et al 2004, Youssef et al 2013). Therefore, we hypothesized that sequestration into the herbivorous gut was conducive to the broad adoption of HGT as a relatively faster adaptive evolutionary strategy for niche adaptation by the AGF (Figure S1). Further, since no part of the AGF life cycle occurs outside the animal host and no reservoir of AGF outside the herbivorous gut has been identified (Gruninger et al 2014), acquisition would mainly occur from donors that are prevalent in the herbivorous gut (Figure S1). Apart from earlier observations on the putative bacterial origin of a few catabolic genes in two AGF isolates (Garcia-Vallvé et al 2000, Harhangi et al 2003), and preliminary BLAST-based queries of a few genomes (Haitjema et al 2017, Youssef et al 2013), little is currently known about the patterns, determinants, and frequency of HGT in the Neocallimastigomycota. To address this hypothesis, we systematically evaluated the patterns of HGT acquisition in the transcriptomes of 27 AGF strains and 4 AGF genomes broadly covering the breadth of AGF genus-level diversity. Our results document the high level of HGT in AGF in contrast to HGT paucity across the fungal kingdom.
The identity of genes transferred, distribution pattern of events across AGF genera, phylogenetic affiliation of donors, and the expansion of acquired genetic material in AGF genomes highlight the role played by HGT in forging the evolution and diversification of the Neocallimastigomycota as a phylogenetically, metabolically, and ecologically distinct lineage in the fungal kingdom.
Materials and Methods
Organisms
Type strains of the Neocallimastigomycota are unavailable through culture collections due to their strict anaerobic and fastidious nature, as well as the frequent occurrence of senescence in AGF strains (Ho and Barr 1995). As such, obtaining a broad representation of the Neocallimastigomycota necessitated the isolation of representatives of various AGF genera de novo. Samples were obtained from the feces, rumen, or digesta of domesticated and wild herbivores around the city of Stillwater, OK and Val Verde County, Texas (Table 1). Isolation procedures are explained in detail in the Supplementary text.
Sequencing and assembly
Transcriptomic sequencing was conducted for 22 AGF strains. Sequencing multiple taxa provides stronger evidence for the occurrence of HGT in a target lineage (Richards and Monier 2017), and allows for the identification of phylum-wide versus genus- and species-specific HGT events. Transcriptomic, rather than genomic, sequencing was chosen for AGF-wide HGT identification efforts since enrichment for polyadenylated (poly(A)) transcripts prior to RNA-seq provides an additional safeguard against possible prokaryotic contamination, an issue that often plagued eukaryotic genome-based HGT detection efforts (Boothby et al 2015, Koutsovoulos et al 2016), as well as to demonstrate that HGT genes identified are transcribed in AGF. Protocols for RNA extraction, transcriptomic sequencing and assembly, and peptide model prediction are described in detail in the supplementary text.
HGT identification
A combination of BLAST similarity searches, comparative similarity index (HGT index, hU), phylogenetic analyses, and parametric gene composition approaches was used to identify HGT events in the analyzed transcriptomic datasets (Fig. 1). We define an HGT event as the acquisition of a foreign gene/pfam by AGF from a single lineage/donor. Details on the criteria used for identification of HGT events are described in the supplementary text. The GC content and intron distribution were assessed in all identified events and compared to averages of an equal number of randomly chosen genes from AGF genomes using Student's t-test to identify possible deviations in such characteristics, as often observed with HGT genes (Soucy et al 2015). As a control, the frequency of HGT occurrence in the genomes of a filamentous ascomycete (Colletotrichum graminicola, GenBank Assembly accession number GCA_000149035.1) and a microsporidian (Encephalitozoon hellem, GenBank Assembly accession number GCA_000277815.3) was determined using our pipeline (Table S1), and the results were compared to previously published results (Alexander et al 2016a, Jaramillo et al 2015).
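To illustrate the comparative similarity index, the sketch below assumes the commonly used formulation in which hU is the difference between the bit score of the best non-fungal BLAST hit and that of the best fungal hit; the exact definition and criteria applied in this study are those given in the supplementary text. The function names, the input layout, and any bit scores passed in are hypothetical; only the hU > 30 cutoff is taken from the criteria used here.

```python
from typing import Iterable, Tuple

HU_THRESHOLD = 30.0  # candidates are retained only when hU > 30

def hgt_index(hits: Iterable[Tuple[str, float]]) -> float:
    """Return hU = best non-fungal bit score - best fungal bit score.

    `hits` holds (group, bit_score) pairs, where group is "fungal" or
    "non-fungal" (a hypothetical layout). A group with no hit scores 0.
    """
    best = {"fungal": 0.0, "non-fungal": 0.0}
    for group, bit_score in hits:
        if group not in best:
            raise ValueError(f"unknown group: {group!r}")
        best[group] = max(best[group], bit_score)
    return best["non-fungal"] - best["fungal"]

def is_hgt_candidate(hits: Iterable[Tuple[str, float]]) -> bool:
    """True when the transcript passes the hU cutoff (assumed first filter)."""
    return hgt_index(hits) > HU_THRESHOLD
```

A transcript passing this cutoff would only be a candidate; phylogenetic confirmation and the composition-based checks described above would still be required.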
Identification of HGT events in carbohydrate active enzymes (CAZymes) transcripts
In AGF genomes, carbohydrate active enzymes (CAZymes) are often encoded by large multi-module genes with multiple adjacent CAZyme or non-CAZyme domains. A single gene can hence harbor multiple CAZyme pfams of different (fungal or non-fungal) origins (Haitjema et al 2017, Youssef et al 2013). As such, our initial efforts for HGT assessment in CAZyme-encoding transcripts using an entire gene/transcript strategy yielded inaccurate results since similarity searches only identified pfams with the lowest e-value or highest number of copies, while overlooking additional CAZyme pfams in the transcript (Figure S2). To circumvent the multi-modular nature of AGF CAZyme transcripts, we opted for the identification of CAZyme HGT events on trimmed domains, rather than entire transcripts. Details on the identification of CAZyme-containing transcripts and criteria used for detection of CAZyme HGT events are explained in the supplementary text.
Neocallimastigomycota-specific versus non-specific HGT events
For all HGT events identified in the Neocallimastigomycota, orthologues (30% identity, >100 amino acids alignment) were extracted from the genomes of other basal fungi, i.e. members of Blastocladiales, Chytridiomycota, Cryptomycota, Microsporidia, Mucoromycota, and Zoopagomycota, and the phylogenetic affiliation of these orthologues was assessed. An HGT event was judged to be Neocallimastigomycota-specific if: 1. orthologues were absent in all basal fungal genomes, or 2. orthologues were identified in basal fungal genomes, but these orthologues were of clear fungal origin or displayed an affiliation different from that observed in the Neocallimastigomycota. On the other hand, events were judged to be non-specific to the Neocallimastigomycota if phylogenetic analysis of basal fungal orthologues indicated a non-fungal origin with a donor affiliation similar to that observed in the Neocallimastigomycota (Figure 1).
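The decision rule above can be sketched as a small function. This is an illustration only: the function name, the string labels, and the use of exact label matching to stand in for a "similar" donor affiliation are assumptions, whereas in practice the affiliation was judged from phylogenetic trees.

```python
from typing import List

def is_agf_specific(agf_donor: str, orthologue_affiliations: List[str]) -> bool:
    """Classify an HGT event as Neocallimastigomycota-specific or not.

    agf_donor: non-fungal donor lineage inferred for the AGF event.
    orthologue_affiliations: inferred affiliation of each orthologue found in
        other basal fungal genomes ("Fungi" when of clear fungal origin);
        an empty list means that no orthologues were found.
    """
    # Criterion 1: orthologues are absent from all basal fungal genomes.
    if not orthologue_affiliations:
        return True
    # Criterion 2: every orthologue is either fungal or affiliated with a
    # donor different from the one observed in the Neocallimastigomycota.
    # Since agf_donor is non-fungal, both cases reduce to a label mismatch.
    return all(aff != agf_donor for aff in orthologue_affiliations)
```

An event for which any basal fungal orthologue shares the AGF donor affiliation is returned as non-specific, mirroring the second half of the rule.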
Data accession
Sequences of individual transcripts identified as horizontally transferred are deposited in GenBank under the accession numbers MH043627-MH043936 and MH044722-MH044724. The whole transcriptome shotgun sequences were deposited in GenBank under the BioProject PRJNA489922, and Biosample accession numbers SAMN09994575-SAMN09994596. Transcriptomic assemblies were deposited in the SRA under project accession number SRP161496. Alignments, as well as Newick tree files for all HGT genes, are provided as supplementary files (Supp. files 1 and 2). Trees of specific HGT events discussed in the results and discussion sections are presented in the supplementary document (S5-S56).
Results
Isolates
The transcriptomes of 22 different isolates were sequenced. These isolates belonged to six out of the nine currently described AGF genera: Anaeromyces (n=5), Caecomyces (n=2), Neocallimastix (n=2), Orpinomyces (n=3), Pecoramyces (n=4), Piromyces (n=4), as well as the recently proposed genus Feramyces (n=2) (Hanafy et al 2018) (Table 1, Supplementary Figure 3). Out of the three AGF genera not included in this analysis, two are currently represented by a single strain that was either lost (genus Oontomyces (Dagar et al 2015)), or appears to exhibit an extremely limited geographic and animal host distribution (genus Buwchfawromyces (Callaghan et al 2015)). The third unrepresented genus (Cyllamyces) has recently been suggested to be phylogenetically synonymous with Caecomyces (Wang et al 2017). As such, the current collection is a broad representation of currently described AGF genera.
Sequencing
Transcriptomic sequencing yielded 15.2-110.8 million reads (average, 40.87 million) that were assembled into 31,021-178,809 total transcripts, 17,539-132,141 distinct transcripts (clustering at 95%), and 16,500-70,061 predicted peptides (average 31,611) (Table S2). Assessment of transcriptome coverage using BUSCO (Simão et al 2015) yielded high completion values (82.76-97.24%) for all assemblies (Table S2). For strains with a sequenced genome (Pecoramyces ruminantium, Piromyces finnis, Piromyces sp. E2, Anaeromyces robustus, and Neocallimastix californiae), genome coverage (percentage of genes in a strain’s genome for which a transcript was identified) ranged from 70.9 to 91.4% (Table S2).
HGT events
A total of 327 distinct HGT events were identified in the Neocallimastigomycota pantranscriptome analyzed (Table S3). The average number of events per genus was 251±16 and ranged from 232 in the genus Caecomyces to 276 in the genus Pecoramyces pantranscriptomes (Fig. 2A). The majority of HGT acquisition events identified (297, 90.83%) appear to be Neocallimastigomycota-specific, i.e. identified only in genomes belonging to the Neocallimastigomycota, but not in other basal fungal genomes (Table S3), strongly suggesting that such acquisitions occurred after, or concurrent with, the evolution of Neocallimastigomycota as a distinct fungal lineage. In addition, the majority of these identified genes were Neocallimastigomycota-wide, being identified in strains belonging to at least six out of the seven examined genera (220 events, 67.3%), suggesting the acquisition of such genes prior to genus level diversification within the Neocallimastigomycota. Only 47 events (14.4%) were genus-specific, with the remainder (60 events, 18.3%) being identified in the transcriptomes of 3-5 genera (Table S3, Figure S4, and Fig. 2B).
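As a quick consistency check on the distribution reported above, the following sketch recomputes each percentage from the event counts and confirms that the three distribution classes partition all 327 events. The variable and function names are illustrative; the counts are those reported in this section.

```python
TOTAL_EVENTS = 327

# Event counts by distribution across the seven examined genera.
distribution_counts = {
    "AGF-wide (at least 6 genera)": 220,
    "intermediate (3-5 genera)": 60,
    "genus-specific": 47,
}
AGF_SPECIFIC_EVENTS = 297  # absent from other basal fungal genomes

def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

distribution_pct = {
    label: percent(count, TOTAL_EVENTS)
    for label, count in distribution_counts.items()
}
agf_specific_pct = percent(AGF_SPECIFIC_EVENTS, TOTAL_EVENTS)

# The three classes should account for every identified event.
classes_sum_to_total = sum(distribution_counts.values()) == TOTAL_EVENTS
```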
The vast majority (89.9%) of events were successfully mapped to at least one of the four AGF genomes (Table S4), with a fraction (8/33) of the unmapped transcripts being specific to a genus with no genome representative (Feramyces, Caecomyces). Compared to a random subset of 327 genes in each of the sequenced genomes, horizontally transferred genes in AGF genomes exhibited significantly (P<0.0001) fewer introns (1.55±3.67 vs 3.32±0.83), as well as higher GC content (30.94±4.6 vs 27.7±5.5) (Table S4). Further, HGT genes/pfams often displayed high levels of gene/pfam duplication and expansion within the genome (Table S4), resulting in an HGT frequency of 2.88% in Pecoramyces ruminantium (470 HGT genes out of 16,347 total genes), 3.74% in Piromyces finnis (429 HGT genes out of 11,477 total genes), 4.00% in Anaeromyces robustus (517 HGT genes out of 12,939 total genes), and 4.13% in Neocallimastix californiae (864 HGT genes out of 20,939 total genes).
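The genome-level HGT frequencies quoted above follow directly from the gene counts: frequency = 100 × (HGT genes) / (total genes). A minimal sketch of that arithmetic is given below; the dictionary layout and function name are illustrative, and the counts are those reported in this section.

```python
# (HGT genes, total genes) per sequenced AGF genome, as reported above.
genome_counts = {
    "Pecoramyces ruminantium": (470, 16347),
    "Piromyces finnis": (429, 11477),
    "Anaeromyces robustus": (517, 12939),
    "Neocallimastix californiae": (864, 20939),
}

def hgt_frequency(hgt_genes: int, total_genes: int) -> float:
    """Percentage of genes in a genome that are of HGT origin."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100.0 * hgt_genes / total_genes

frequencies = {
    genome: round(hgt_frequency(hgt, total), 2)
    for genome, (hgt, total) in genome_counts.items()
}

for genome, freq in frequencies.items():
    print(f"{genome}: {freq:.2f}%")
```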
Donors
A bacterial origin was identified for the majority of HGT events (82.6%), with four bacterial phyla (Firmicutes, Proteobacteria, Bacteroidetes, and Spirochaetes) identified as donors for 203 events (62.1% of total, 75.2% of bacterial events) (Fig. 3A). Specifically, the contribution of members of the Firmicutes (142 events) was paramount, the majority of which were most closely affiliated with members of the order Clostridiales (119 events). In addition, minor contributions from a wide range of bacterial phyla were also identified (Fig. 3A). The majority of the putative donor taxa are strict/facultative anaerobes, many of which are also known to be major inhabitants of the herbivorous gut and often possess polysaccharide-degradation capabilities (He et al 2018, Stewart et al 2018). Archaeal contributions to HGT were extremely rare (5 events). On the other hand, multiple (50) events with eukaryotic donors were identified. Remarkably, eukaryotic marine lineages, e.g. Cnidaria (stony corals and sea anemones), Arthropoda (crustaceans and horseshoe crabs), Mollusca (oysters and scallops), Osteichthyes (bony fish), Brachiopoda, Echinodermata (sea urchins, sea cucumbers, and starfish), Porifera (sponges), and Trichoplax (the only extant member of the Placozoa; known to inhabit marine environments especially on substrates such as stony corals and mollusk shells (Pearse and Voigt 2007)) contributed 10 out of the 50 eukaryotic HGT events, despite their physical separation from the current AGF habitat (Table S3, supplementary document). In a few instances, a clear non-fungal origin was identified for a specific event, but the precise inference of the donor based on phylogenetic analysis was not feasible (Table S3).
Metabolic characterization
Functional annotation of HGT genes/pfams indicated that the majority (61.8%) of events encode metabolic functions such as extracellular polysaccharide degradation and central metabolic processes. Bacterial donors were slightly overrepresented in metabolic HGT events (88.1% of the metabolism-related events, compared to 82.6% of the total events). Genes involved in cellular processes and signaling were the second most common category of HGT events (11%), while genes involved in information storage and processing only made up 6.12% of the HGT events identified (Figs. 3B-E). Below we present a detailed description of the putative abilities and functions enabled by HGT events.
Central catabolic abilities
Multiple HGT events encoding various central catabolic processes were identified in AGF transcriptomes and successfully mapped to the genomes (Fig. 4, Table S3, Figs S5-S17). A group of events appear to encode enzymes that allow AGF to channel specific substrates into central metabolic pathways. For example, genes encoding enzymes of the Leloir pathway for galactose conversion to glucose-1-phosphate (galactose-1-epimerase, galactokinase (Fig. 5A), galactose-1-phosphate uridylyltransferase, and UDP-glucose-4-epimerase) were identified, in addition to genes encoding ribokinase, as well as xylose isomerase and xylulokinase for ribose and xylose channeling into the pentose phosphate pathway. As well, genes encoding deoxyribose-phosphate aldolase (DeoC) enabling the utilization of purines as carbon and energy sources were also horizontally acquired in AGF. Further, several of the glycolysis/gluconeogenesis genes, e.g. phosphoenolpyruvate synthase, as well as phosphoglycerate mutase were also of bacterial origin. Fungal homologues of these glycolysis/gluconeogenesis genes were not identified in the AGF transcriptomes, suggesting the occurrence of xenologous replacement HGT events.
In addition to broadening substrate range, HGT acquisitions provided additional venues for recycling reduced electron carriers via new fermentative pathways in this strictly anaerobic and fermentative lineage. The production of ethanol, D-lactate, and hydrogen appears to be enabled by HGT (Fig. 4). The acquisition of several aldehyde/alcohol dehydrogenases, and of D-lactate dehydrogenase, for ethanol and lactate production from pyruvate was identified. Although these two enzymes are encoded in other fungi as part of their fermentative capacity (e.g. Saccharomyces and Schizosaccharomyces), no homologues of these fungal genes were identified in AGF pantranscriptomes. Hydrogen production in AGF, as well as in many anaerobic eukaryotes with mitochondria-related organelles, involves pyruvate decarboxylation to acetyl CoA, followed by the use of electrons generated for hydrogen formation via an anaerobic Fe-Fe hydrogenase. In AGF, while pyruvate decarboxylation to acetyl CoA via pyruvate-formate lyase and the subsequent production of acetate via acetyl-CoA:succinyl transferase appear to be of fungal origin, the Fe-Fe hydrogenase and its entire maturation machinery (HydEFG) seem to be horizontally transferred, being phylogenetically affiliated with similar enzymes in Thermotogae, Clostridiales, and the anaerobic jakobid excavate Stygiella incarcerata (Fig. 5B). It has recently been suggested that Stygiella acquired the Fe-Fe hydrogenase and its maturation machinery from bacterial donors including Thermotogae, Firmicutes, and Spirochaetes (Leger et al 2016), suggesting either a single early acquisition event in eukaryotes, or alternatively that independent events for the same group of genes have occurred in different eukaryotes.
Anabolic capabilities
Multiple anabolic genes that expanded AGF biosynthetic capacities appear to be horizontally transferred (Fig. S18-S31). These include several amino acid biosynthesis genes, e.g. cysteine biosynthesis from serine; glycine and threonine interconversion; and asparagine synthesis from aspartate. In addition, horizontal gene transfer allowed AGF to synthesize NAD de novo via the bacterial pathway (starting from aspartate via L-aspartate oxidase (NadB; Fig. 5C) and quinolinate synthase (NadA) rather than the five-enzyme fungal pathway starting from tryptophan (Lin et al 2010)). HGT also allowed AGF to salvage thiamine via the acquisition of phosphomethylpyrimidine kinase. Additionally, several genes encoding enzymes in purine and pyrimidine biosynthesis were horizontally transferred (Fig. 4). Finally, horizontal gene transfer allowed AGF to synthesize phosphatidyl-serine from CDP-diacylglycerol, and to convert phosphatidyl-ethanolamine to phosphatidyl-choline.
Adaptation to the host environment
Horizontal gene transfer also appears to have provided means of guarding against toxic levels of compounds known to occur in the host animal gut (Fig. S32-S38). For example, methylglyoxal, a reactive electrophilic species (Lee and Park 2017), is inevitably produced by ruminal bacteria from dihydroxyacetone phosphate when experiencing growth conditions with excess sugar and limiting nitrogen (Russell 1993). Genes encoding enzymes mediating methylglyoxal conversion to D-lactate (glyoxalase I and glyoxalase II-encoding genes) appear to be acquired via HGT in AGF. Further, HGT allowed several means of adaptation to anaerobiosis. These include: 1) acquisition of the oxygen-sensitive ribonucleoside-triphosphate reductase class III (Fig. 5D) that is known to only function during anaerobiosis to convert ribonucleotides to deoxyribonucleotides (Jordan and Reichard 1998), 2) acquisition of squalene-hopene cyclase, which catalyzes the cyclization of squalene into hopene, an essential step in biosynthesis of the cell membrane steroid tetrahymanol that replaced the molecular O2-requiring ergosterol in the cell membranes of AGF, 3) acquisition of several enzymes in the oxidative stress machinery including Fe/Mn superoxide dismutase, glutathione peroxidase, rubredoxin/rubrerythrin, and alkylhydroperoxidase.
In addition to anaerobiosis, multiple horizontally transferred general stress and repair enzymes were identified (Fig S39-S46). HGT-acquired genes encoding 2-phosphoglycolate phosphatase, known to metabolize the 2-phosphoglycolate produced in the repair of DNA lesions induced by oxidative stress (Pellicer et al 2003) to glycolate, were identified in all AGF transcriptomes studied (Fig. 4, Table S3). Surprisingly, two genes encoding antibiotic resistance enzymes, chloramphenicol acetyltransferase and aminoglycoside phosphotransferase, were identified in all AGF transcriptomes, presumably to improve AGF fitness in the eutrophic rumen habitat that harbors antibiotic-producing prokaryotes (Table S3). While it is unusual for eukaryotes to express antibiotic resistance genes, basal fungi such as Allomyces, Batrachochytrium, and Blastocladiella were shown to be susceptible to chloramphenicol and streptomycin (Bishop et al 2009, Rooke and Shattock 1983). Other horizontally transferred repair enzymes include DNA-3-methyladenine glycosylase I, methylated-DNA--protein-cysteine methyltransferase, galactoside and maltose O-acetyltransferase, and methionine-R-sulfoxide reductase (Table S3).
HGT in AGF carbohydrate active enzymes machinery
Within the analyzed AGF transcriptomes, CAZymes belonging to 39 glycoside hydrolase (GH), 5 polysaccharide lyase (PL), and 10 carbohydrate esterase (CE) families were identified (Fig. 6). The composition of the CAZyomes of the various AGF strains examined was broadly similar, with the following ten notable exceptions: the presence of GH24 and GH78 transcripts only in Anaeromyces and Orpinomyces; the presence of GH28 transcripts only in Pecoramyces, Neocallimastix, and Orpinomyces; the presence of GH30 transcripts only in Anaeromyces and Neocallimastix; the presence of GH36 and GH95 transcripts only in Anaeromyces, Neocallimastix, and Orpinomyces; the presence of GH97 transcripts only in Neocallimastix and Feramyces; the presence of GH108 transcripts only in Neocallimastix and Piromyces; and the presence of GH37 transcripts predominantly in Neocallimastix, GH57 transcripts predominantly in Orpinomyces, GH76 transcripts predominantly in Feramyces, and CE7 transcripts predominantly in Anaeromyces (Fig. 6).
HGT appears to be rampant in the AGF pan-CAZyome: A total of 90 events (27.5% of total HGT events) were identified, with the majority occurring in all AGF genera examined (Fig. 6, Table S3). In 48% of GH families, 55% of CE families, and 20% of PL families, a single event (i.e. attributed to one donor) was observed (Fig. 6, Table S3).
Duplication of these events in AGF genomes was notable, with 152, 373, 201, and 191 copies of HGT CAZyme pfams identified in Anaeromyces, Neocallimastix, Piromyces and Pecoramyces genomes, representing 40.7%, 45.3%, 52.9%, and 41.2% of the overall organismal CAZyme machinery (Table S4). The contribution of Viridiplantae, Flavobacteriales, Fibrobacteres, and Gamma-Proteobacteria was either exclusive to CAZyme-related HGT events or significantly higher in CAZyme, compared to other, events (Fig. 3A).
Transcripts acquired by HGT represented >50% of transcripts in 13 (Caecomyces) to 20 (Anaeromyces) GH families; 3 (Caecomyces) to 5 (Anaeromyces, Neocallimastix, Orpinomyces, and Feramyces) CE families; and 2 (Caecomyces and Feramyces) to 3 (Anaeromyces, Pecoramyces, Piromyces, Neocallimastix, and Orpinomyces) PL families (Fig. 6). It is important to note that in all these families, multiple transcripts appeared to be of bacterial origin based on BLAST similarity search but did not meet the strict criterion of hU >30, and so were deemed not horizontally transferred. As such, the contribution of HGT transcripts to overall transcripts in these families is probably an underestimate. Only GH9, GH20, GH37, GH45, and PL3 families appear to lack any detectable HGT events. A PCA biplot comparing the CAZyome in AGF genomes to that of other basal fungal lineages strongly suggests that the acquisition and expansion of many of these foreign genes play an important role in shaping the lignocellulolytic machinery of AGF (Fig. 7). The majority of CAZyme families defining the AGF CAZyome were predominantly of non-fungal origin (Fig. 7). This pattern clearly attests to the value of HGT in shaping the AGF CAZyome via acquisition and extensive duplication of acquired gene families.
Collectively, HGT had a profound impact on AGF plant biomass degradation capabilities. The AGF CAZyome encodes enzymes putatively mediating the degradation of twelve different polysaccharides (Fig. S57). In all instances, GH and PL families with >50% horizontally transferred transcripts contributed to backbone cleavage of these polymers, although in many polymers, e.g. cellulose, glucoarabinoxylan, and rhamnogalacturonan, multiple different GHs can contribute to backbone cleavage. Similarly, GH, CE, and PL families with >50% horizontally transferred transcripts contributed to 10 out of 13 side-chain-cleaving activities, and 3 out of 5 oligomer-to-monomer breakdown activities (Fig. S57).
Discussion
Here, we present a systematic analysis of HGT patterns in 27 transcriptomes and 4 genomes belonging to the Neocallimastigomycota. Our analysis identified 327 events, representing 2.9-4.1% of genes in examined AGF genomes. Further, we consider these values to be conservative estimates due to the highly stringent criteria employed. Only events with hU of >30 were considered, and all putative events were further subjected to manual inspection and phylogenetic tree construction to confirm incongruence with organismal evolution and bootstrap-supported affiliation to donor lineages. Further, events identified in fewer than 50% of strains in a specific genus were excluded, and parametric gene composition approaches were implemented in conjunction with sequence-based analysis. Nevertheless, the observed HGT frequency in this study is in contrast to the reported paucity of HGT events across the fungal kingdom (Alexander et al 2016b, Marcet-Houben and Gabaldon 2010), and hence the Neocallimastigomycota represent a notable exception within the Mycota. Multiple factors could be postulated to account for the observed high HGT frequency in AGF. The sequestration of AGF into the anaerobic, prokaryote-dominated herbivorous gut necessitated the implementation of relatively faster adaptive mechanisms for survival in this new environment, as opposed to the slower strategies of neofunctionalization and gene birth. Indeed, niche adaptation and habitat diversification events are widely considered important drivers for HGT in eukaryotes (de Koning et al 2000, Keeling and Palmer 2008, Ricard et al 2006, Schönknecht et al 2013a). Further, AGF are constantly exposed to a rich milieu of cells and degraded DNA in the herbivorous gut. Such close physical proximity between donors/extracellular DNA and recipients is also known to greatly facilitate HGT (Beiko et al 2005, Moliner et al 2010, Shterzer and Mizrahi 2015).
Finally, AGF release asexual, motile, free zoospores into the herbivorous gut as part of their life cycle (Gruninger et al 2014). According to the weak-link model (Huang 2013), these weakly protected and exposed structures provide an excellent entry point for foreign DNA into eukaryotic genomes. It is important to note that AGF zoospores also appear to be naturally competent, capable of readily taking up nucleic acids from their surrounding environment (Calkins et al 2016).
The distribution of HGT events across various AGF taxa (Fig. 2), identities of HGT donors (Fig. 3), and abilities imparted (Figs. 4-5) could offer important clues regarding the timing and impact of HGT on Neocallimastigomycota evolution. The majority of events (67.3%) were Neocallimastigomycota-wide and were mostly acquired from lineages known to inhabit the herbivorous gut, e.g. Firmicutes, Proteobacteria, Bacteroidetes, and Spirochaetes (Figs. 2-3). This pattern strongly suggests that such acquisitions occurred after (or concurrent with) AGF sequestration into the herbivorous gut, but prior to AGF genus level diversification. Many of the functions encoded by these events represented novel functional acquisitions that impart new abilities, e.g. galactose metabolism, methylglyoxal detoxification, pyruvate fermentation to D-lactate and ethanol, and chloramphenicol resistance (Fig. 3). Others represented acquisition of novel genes or pfams augmenting existing capabilities within the AGF genomes, e.g. acquisition of GH5 cellulases to augment the fungal GH45, and acquisition of additional GH1 and GH3 beta gluco- and galactosidases to augment similar enzymes of apparent fungal origin in AGF genomes (Fig. 6-7, Fig. S47). Novel functional acquisition events enabled AGF to survive and colonize the herbivorous gut by: 1. expanding substrate-degradation capabilities (Fig. 5A, 6, 7, S5-S17, Table S3), hence improving fitness by maximizing carbon and energy acquisition from available plant substrates, 2. providing additional venues for electron disposal via lactate, ethanol, and hydrogen production, and 3. enabling adaptation to anaerobiosis (Fig. 4, S32-S38, Table S3).
A smaller number of observed events (n=47) were genus-specific (Fig. 2, Table S3). This group was significantly enriched in CAZymes (53.2% of genus-specific horizontally transferred events have a predicted CAZyme function, as opposed to 27.5% in the overall HGT dataset), and was acquired almost exclusively from donors that are known to inhabit the herbivorous gut (Creevey et al 2014): 35 out of the 47 events were acquired from the orders Clostridiales, Bacillales, Lactobacillales, and Negativicutes within Firmicutes; Burkholderiales and Vibrionales within the Beta- and Gammaproteobacteria; Flavobacteriales and Bacteroidales within Bacteroidetes; and the Spirochaetes, Actinobacteria, and Lentisphaerae. A further 5 out of the 47 events were acquired from Viridiplantae. Such a pattern suggests that these events occurred relatively recently, in the herbivorous gut after AGF genus-level diversification. We reason that the lower frequency of such events reflects the relaxed pressure for acquisition and retention of foreign genes at this stage of AGF evolution.
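The proportions quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the totals reported in this paper (327 HGT events overall, 47 of them genus-specific). The helper names are hypothetical, and the event counts it derives are only implied by the rounded percentages; they are not taken from Table S3.

```python
# Back-of-the-envelope check of the proportions quoted in the text.
# Totals come from the paper; derived counts are implied, not reported.

TOTAL_EVENTS = 327    # distinct HGT events identified (Abstract)
GENUS_SPECIFIC = 47   # genus-specific events


def implied_count(percent, total):
    """Nearest whole number of events matching a rounded percentage."""
    return round(percent / 100 * total)


def percent_of(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100 * count / total, 1)


# CAZyme share: 53.2% of genus-specific events vs 27.5% of all events.
cazyme_genus_specific = implied_count(53.2, GENUS_SPECIFIC)
cazyme_overall = implied_count(27.5, TOTAL_EVENTS)

# Donor breakdown of the genus-specific events, as quoted in the text.
gut_donor_share = percent_of(35, GENUS_SPECIFIC)
plant_donor_share = percent_of(5, GENUS_SPECIFIC)

print(cazyme_genus_specific, cazyme_overall)
print(gut_donor_share, plant_donor_share)
```

Under these assumptions, the quoted percentages are consistent with whole-number event counts, and gut-associated and plant donors together account for 40 of the 47 genus-specific events.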
Finally, a few Neocallimastigomycota-wide HGT events were characterized by donors that are not typically encountered in the herbivorous gut (Comtet-Marre et al 2017, Creevey et al 2014, Neves et al 2017, Qi et al 2011, Ricard et al 2006, Romero-Pérez et al 2011, Tapio et al 2016). Remarkably, many of these donors are marine inhabitants (Table S3, Fig. 3, Figs. S47-S56). In general, these HGT events of marine origin were Neocallimastigomycota-wide (identified in 5-7 genera), and encoded functions that are beneficial for survival in a wide range of habitats (e.g. DNA repair, motility, and signal transduction, with only one metabolism-related event). We reason that this observation could offer interesting clues regarding the AGF ancestor prior to its sequestration into the gut. The presence of such genes could be a marker of an ancient symbiotic relationship between AGF ancestors and marine eukaryotes prior to fungal terrestrialization. The marine ancestral origin of the kingdom Fungi is currently undisputed (Berbee et al 2017), but the nature and mechanism of the process, e.g. a single ancestral terrestrialization followed by diversification, or multiple independent terrestrialization events, is still unclear. It is notable that the majority (80%) of these HGT events were Neocallimastigomycota-specific, reflecting either a unique symbiotic relationship between AGF ancestors and these marine hosts or the loss of these genes in all other currently known basal fungal lineages.
Gene acquisition by HGT necessitates physical contact between donor and recipient organisms. Many of the traits acquired by AGF through HGT originate from prokaryotes that are prevalent in the herbivorous gut microbiota (Fig. 3). However, since many of these traits are absolutely necessary for survival in the gut, the establishment of AGF ancestors in this seemingly inhospitable habitat before acquiring them would, in theory, have been infeasible. This dilemma is common to all HGT processes enabling niche adaptation and habitat diversification (Eme et al 2017). We put forth two evolutionary scenarios that could resolve this dilemma not only for AGF, but also for other gut-dwelling anaerobic microeukaryotes, e.g. Giardia, Blastocystis, and Entamoeba, where HGT was shown to play a vital role in enabling survival in anaerobic conditions (Andersson et al 2003, Eme et al 2017, Grant and Katz 2014). The first is a coevolution scenario in which the progressive evolution of the mammalian gut, from a short and predominantly aerobic structure characteristic of carnivores/insectivores to the longer, more complex, and compartmentalized structure encountered in herbivores, was associated with a parallel, stepwise acquisition by AGF ancestors of the genes required for plant polymer metabolism and anaerobiosis, hence assuring their survival and establishment in the current herbivorous gut. The second possibility is that AGF ancestors were indeed acquired into a complex and anaerobic herbivorous gut, but initially represented an extremely minor component of the gut microbiome and survived in locations of the alimentary tract with relatively higher oxygen concentrations, e.g. the mouth, saliva, or esophagus, or in micro-niches in the rumen where transient oxygen exposure occurs. Subsequently, HGT enabled the expansion of their niche and improved their competitiveness and relative abundance in the herbivorous gut to the current levels.
In conclusion, our survey of HGT in AGF demonstrates that the process is absolutely crucial for the survival and growth of AGF in their unique habitat. This is reflected not only in the large number of events, massive duplication of acquired genes, and overall high HGT frequency observed in AGF genomes, but also in the nature of the abilities imparted by the process. HGT events not only facilitated AGF adaptation to anaerobiosis, but also allowed them to drastically improve their polysaccharide degradation capacities, gain new venues for electron disposal via fermentation, and acquire new biosynthetic abilities. As such, we reason that the process should not merely be regarded as a conduit for supplemental acquisition of a few additional beneficial traits. Rather, we posit that HGT enabled AGF to forge a new evolutionary trajectory, resulting in Neocallimastigomycota sequestration, evolution as a distinct lineage in the fungal tree of life, and subsequent genus- and species-level diversification. This provides an excellent example of the role of HGT in forging the formation of high-rank taxonomic lineages during eukaryotic evolution.
Conflict of Interest
The authors declare no conflict of interest.
Acknowledgments
This work was funded by NSF-DEB grant numbers 1557102 (to N.Y. and M.E.) and 1557110 (to J.E.S.).