Abstract
Central to understanding early animal evolution are the questions of when and how many times in the ancestry of extant animals “eumetazoan” traits - nervous and digestive systems, striated musculature, and potentially defined mesoderm or its precursors - have arisen. The phylogenetic placement of the only two major animal clades lacking these traits, poriferans (sponges) and placozoans, is crucial to this point, with the former having received much attention in recent years, and the latter relatively neglected. Here, adding new genome assemblies from three members of a previously unsampled placozoan lineage, and including a comprehensive dataset sampling the extant diversity of all other major metazoan clades and choanoflagellate outgroups, we test the positions of placozoans and poriferans using hundreds of orthologous protein-coding sequences. Surprisingly, we find strong support under well-fitting substitution models for a relationship between Cnidaria and Placozoa, contradicting a clade of Bilateria + Cnidaria (= Planulozoa) seen in previous work. This result is stable to Dayhoff 6-state recoding, a strategy commonly used to reduce artefacts from amino acid compositional heterogeneity among taxa, a problem to which the AT-rich Placozoa may be particularly susceptible. We also find that such recoding is sufficient to derive strong support for a first-splitting position of Porifera. In light of these results, it is necessary to reconsider the homology of eumetazoan traits not only between ctenophores and bilaterians, but also between cnidarians and bilaterians. Whatever traits are homologous between these taxa must also have occurred in the evolutionary history of Placozoa (or occur cryptically in modern forms), and the common ancestor of Cnidaria and Bilateria may extend deeper into the Precambrian than is presently recognized.
Introduction
The discovery1 and mid-20th century rediscovery2 of the enigmatic, amoeba-like placozoan Trichoplax adhaerens did much to ignite the imagination of zoologists interested in early animal evolution3. As a microscopic animal adapted to extracellular grazing on the biofilms over which it creeps4, Trichoplax has a simple anatomy suited to exploit passive diffusion for many physiological needs, with only six morphological cell types discernible even to intensive scrutiny5,6, and no muscular, nervous, or digestive systems. Reproduction is apparently primarily through asexual fission and somatic growth, although there is genetic evidence of recombination7 and early abortive embryogenesis has been described8,9, with speculation that sexual phases of the life cycle occur only under poorly-understood field conditions10.
Given their simple morphology and dearth of embryological clues, molecular data are crucial in placing placozoans phylogenetically. Early phylogenetic analyses through nuclear rRNA and mitochondrial marker genes gave somewhat contradictory and/or poorly supported placements of Placozoa in the larger metazoan tree11–13. However, analyses of these markers strongly rejected some long-standing hypotheses, such as the notion that placozoans may be highly modified cnidarians14. Another important result from mitochondrial marker analyses was the revelation of a large degree of molecular diversity in placozoan isolates from around the globe, clearly indicating the existence of many morphologically cryptic haplotypes presumably corresponding to species, which are partitioned into several divergent clades showing dramatic variations in the structure and size of complete mitogenomes10,15,16. In particular, haplotypes appear to be divided between two divergent groups (clades A & B) with up to 27% genetic distance in 16S rRNA alignments17. An apparently definitive answer to the question of placozoan affinities was provided by production of a reference nuclear genome assembly from Trichoplax adhaerens haplotype H1, a clade B representative7, which strongly supported a position relatively far from the metazoan root, as the sister group of a clade of Bilateria and Cnidaria (sometimes called Planulozoa). However, this effort also revealed a surprisingly advanced (more accurately, bilaterian-like18) developmental gene toolkit in placozoans, a paradox for such a simple animal.
As metazoan phylogenetics has pressed onward into the genomic era, perhaps the largest controversy has been the debate over the identity of the sister group to the remaining metazoans, traditionally thought to be Porifera, but considered to be Ctenophora by Dunn et al.19 and subsequently by additional studies20–23. Others have suggested this result arises from inadequate taxon sampling, flawed matrix husbandry, and use of poorly fitting substitution models24–27. A third view has emphasized that using different sets of genes can lead to different conclusions, with only a small number sometimes sufficient to drive one result or another28,29. This controversy, regardless of its eventual resolution, has spurred serious contemplation of possibly independent origins of several hallmark eumetazoan traits such as striated muscle, digestive systems, and particularly, nervous systems21,30–35.
In contrast to these upsets, as new genomic and transcriptomic data from non-bilaterians and metazoan outgroups have accrued, the position of placozoans as sister group to Planulozoa has remained relatively stable (a poorly supported result in a single early analysis notwithstanding36). However, to date, the reference H1 haplotype assembly has represented the sole branch of this deeply branching metazoan clade in almost all analyses, and the role of model violations such as nonstationarity of amino acid frequency has been inadequately explored. Here, we provide a novel test of the phylogenetic position of placozoans, adding newly sequenced genomes from three clade A placozoans, spanning the root of this divergent second group in the phylum15. We analyse them jointly under well-fitting models with a wide taxonomic diversity of available genomes and transcriptomes, thereby sampling the total available diversity of major metazoan clades and their closest outgroups: Bilateria, Cnidaria, Porifera, Ctenophora, and Choanoflagellata.
Results and Discussion
Orthology assignment on sets of predicted proteomes derived from 59 genome and transcriptome assemblies yielded 4,294 orthogroups with at least 20 sequences each, sampling all 5 major metazoan clades and outgroups, from which we obtained 1,388 well-aligned orthologues. Within this set, individual maximum-likelihood (ML) gene trees were constructed, and a set of 430 most-informative orthologues were selected on the basis of tree-likeness scores37. This yielded an amino acid matrix of 73,547 residues with 37.55% gaps or missing data, with an average of 371.92 and 332.75 orthologues represented for Cnidaria and Placozoa, respectively (with a maximum of 383 orthologues present for the H4 clade representative; Figure 1).
Surprisingly, our Bayesian analyses of this matrix place Cnidaria and Placozoa as sister groups excluding Bilateria with full posterior probability under the general site-heterogeneous CAT+GTR+Γ4 model (Figure 1). Under ML approximation of the CAT mixture model family38 with LG substitution matrices (Figure S1), we again recover Cnidaria+Placozoa, but support for this clade is strong only when secondary NNI search correction on UFbootstrap trees39 is not performed (Figure S1), indicating possible model misspecification (unsurprising, because models with fixed substitution matrices such as CAT+LG have been shown to fit less well than the more general CAT+GTR model in cross-validation tests40). Intriguingly, both Bayesian and ML analyses show little internal branch diversity within Placozoa, indicating either a dramatic deceleration of substitution rates within the crown group or, more likely, a recent extinction of all but one lineage in the ancestry of modern placozoans. Accordingly, deleting all clade A placozoans from our analysis has no effect on topology and only a marginal effect on support in ML analysis (Figure S1).
Compositional heterogeneity of amino acid frequencies along the tree is a source of phylogenetic error not modelled by even complex site-heterogeneous substitution models such as CAT+GTR40–43. Furthermore, previous analyses28 have shown that placozoans and choanoflagellates in particular, both of which taxa our matrix samples intensively, deviate strongly from the mean amino acid composition of Metazoa, perhaps as a result of genomic GC content discrepancies. A posterior predictive simulation test of amino acid stationarity from our own converged CAT+GTR+Γ4 analyses confirms this (Table S1), showing summed absolute differences between global and taxon-specific amino acid frequencies (z-scores) outside the simulated null distribution for all taxa (p < 0.05) in the real dataset, with particularly extreme values (z > 60) seen for most placozoans, choanoflagellates, and the calcisponge Leucosolenia complicata. An attempt to remove compositionally heterogeneous sites through a matrix-trimming algorithm (BMGE’s χ2-test based operation called with the ‘-s FAST’ flag44) removed all but 9,776 amino acid sites from our matrix (downstream analyses therefore not undertaken), further confirming that strong compositional heterogeneity among taxa is likely present. As an alternative step to at least partially ameliorate compositional bias, we therefore recoded the amino-acid matrix into 6 “Dayhoff” categories proposed to encompass biochemically similar residues, a strategy previously shown to reduce the effect of compositional variation among taxa, albeit with some loss of information45,46. Analysis of this recoded matrix under the CAT+GTR model again recovered full support (pp=1) for Cnidaria+Placozoa (Figure 2). Indeed, in this analysis the only major change from the full alphabet analyses is in the relative positions of Ctenophora and Porifera, with the latter here constituting the sister group to the remaining Metazoa with full support.
Accordingly, we suggest that compositional heterogeneity may be driving at least some of the discrepancies in the current debate over the basal most divergences of the metazoan tree, in both our analyses and others.
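To make the recoding step concrete, the following minimal Python sketch maps amino acids onto six Dayhoff classes. The grouping shown (AGPST, DENQ, HKR, ILMV, FWY, C) is the one commonly cited for Dayhoff-6 recoding and is assumed here; the function names and state labels are hypothetical, and the analyses reported above relied on the recoding built into PhyloBayes rather than on this code.

```python
from collections import Counter

# Six Dayhoff classes (assumed grouping; the state labels 0-5 are arbitrary).
DAYHOFF6_GROUPS = ["AGPST", "DENQ", "HKR", "ILMV", "FWY", "C"]
RECODE = {
    residue: str(state)
    for state, group in enumerate(DAYHOFF6_GROUPS)
    for residue in group
}


def recode_dayhoff6(sequence):
    """Replace each residue with its class label; anything else becomes '-'."""
    return "".join(RECODE.get(residue, "-") for residue in sequence.upper())


def state_frequencies(sequence):
    """Relative frequency of each non-gap character in a sequence."""
    counts = Counter(ch for ch in sequence if ch != "-")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: n / total for state, n in counts.items()}
```

Under this scheme a lysine-rich and an arginine-rich sequence such as "MKKKTC" and "MRRRTC" both recode to "322205", which illustrates how recoding can mask compositional differences among taxa while discarding within-class information.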
Concordance among gene trees (or the lack thereof) has also been emphasized as an important alternative metric of phylogenetic confidence in large-scale inference29,47. We used novel quartet-based statistics48 to measure internode certainty among the 430 genes along the CAT+GTR+Γ4 tree. The Cnidaria+Placozoa clade had Lowest Quartet-IC (LQ-IC) and Extended Quadripartition (EQP-IC) scores close to 0 (Figure 1), indicating little agreement among gene trees in favour of this clade - but also no strong preference for any particular alternative topology, such as Planulozoa. We interpret this to indicate that support for at least some ancient relationships emerges only in combined analyses, and can be masked at the level of the individual gene, where errors in gene tree estimation may predominate49. Indeed, many other major clades whose monophyly is not in doubt (e.g., Bilateria, Porifera) also have EQP-IC scores close to 0.
With only 6 morphologically salient cell types known in the asexually dividing adults6, placozoans are frequently dubbed the simplest extant metazoans. Evolutionary interpretations of this simplicity have been diametrically opposed. Some favour the possibility that the simplicity is plesiomorphic50,51, inherited from a common metazoan ancestor which had likewise not yet developed a basement membrane, musculature, or nervous, excretory, and internal digestive systems. Others propose that Placozoa must have undergone secondary simplification, indicating its scant significance to understanding any evolutionary path outside its own. The phylogenetic position of this taxon is key to this debate.
The position we have recovered for Placozoa as a sister group to Cnidaria is consistent with many rRNA-centric phylogenetic analyses11,12 and may be said to resolve the paradox of a taxon whose gene complement closely resembles the inferred complexity of that of the cnidarian-bilaterian ancestor, yet which had been previously regarded as splitting off before the divergence of these two taxa7. It is tempting to interpret the existence of a Cnidaria+Placozoa clade as supporting the hypothesis of secondary loss of eumetazoan traits in Placozoa, particularly when considered jointly with our analysis (Figure 2) and others25–27 which support Porifera as the sister group to all other metazoans. An alternative and probably more controversial interpretation of this relationship is that traits commonly held to be homologous between cnidarians and bilaterians - e.g., nervous systems52 - might have been independently developed in both taxa from an ancestor with a grade of organisation similar in some respects to modern placozoans. Just as the phylogenetic controversy over the position of ctenophores has prompted many to seriously consider possibly independent origins of some of these traits in this taxon, we suggest that a similar critical logic be applied towards presumed homologies between Bilateria and Cnidaria. Indeed, there are already indications of independent origins of striated muscle in these two clades35. In considering the relative advantages of hypotheses of primary vs. secondary absence of eumetazoan features in Placozoa, we emphasise that these two interpretations are not mutually exclusive: homology should be examined point-by-point for individual characters, avoiding the broad conclusion that any given lineage is ancestrally simple vs secondarily simplified at a whole-organism level. We see great promise for single-cell RNA-seq and its ability to achieve unbiased cell type identification as a means of resolving homologies across such widely divergent taxa53.
Much developmental work has already been conducted on cnidarian model organisms, especially Nematostella vectensis and Hydra magnipapillata, on the assumption that such work would help understand the condition from which the bilaterian lineage evolved31,52,54–58. This phylogeny supporting Placozoa + Cnidaria implies that both are equally important outgroups to understanding the bilaterian ancestor; much more experimental work therefore needs to be directed to placozoans. It may be especially fruitful to compare Placozoa and Xenacoelomorpha, the latter now firmly understood (from studies with taxon sampling adequate to address this question) to form the sister group of all remaining Bilateria59,60.
This result also implies, however, that proposed “deep” developmental correspondences between Cnidaria and Bilateria must also extend to Placozoa, at least ancestrally. For instance, much work in cnidarian models has focused on identifying germ layer homology with undisputed triploblastic animals, resulting for instance in the identification of conserved mechanisms of mesoderm specification during gastrulation in circumblastoporal cells expressing brachyury and regulated by the BMP-cWNT signalling55,58,61; brachyury-expressing cells are also found peripherally in adult placozoans62. Most intriguingly, a new model of germ layer homology has been recently put forward on the basis of lineage tracing and transcription factor expression, suggesting that the bilaterian endoderm is best understood as homologous to cnidarian and perhaps ctenophore pharyngeal ectoderm in particular63. Extending this model, it may be possible to interpret the secretory/digestive cell layer in placozoans, comprising lower (ventral) epithelial cells, digestive lipophil cells, and gland cells6 as homologous to cnidarian pharyngeal ectoderm and by extension, bilaterian endoderm (Figure 3). In effect, under this hypothesis, one may interpret the entire placozoan lower epithelium as “pharyngeal”, with the margin of the body representing the outline of the mouth (see also36). In this light, the original interpretation of the internal, contractile, but also digestive4 placozoan fibre cells as homologous to bilaterian mesoderm3,51, and by extension, cnidarian mesendoderm, gains traction.
Developmental observations have also been used to argue that cnidarians may have a cryptic and similarly specified bilateral symmetry inherited from a common ancestor with Bilateria64,65. If true, this implies that bilateral symmetry must also have been present in the stem placozoan lineage, although modern placozoans have not been observed to show any morphological or behavioural axis orthogonal to the plane of their bodies. It is interesting in this light to revisit the paleobiological suggestion that the iconic Ediacaran fossils of the genus Dickinsonia and related forms may be related to Placozoa66 on the basis of a broadly similar mode of external grazing, through the lower part of the animal, on microbial mats that were widespread on the late Neoproterozoic ocean floor. The fact that such fossil forms evince a pronounced longitudinal axis of symmetry may be reconciled to their proposed affiliation to the modern anaxial Placozoa if a history of bilateral symmetry was indeed present in the placozoan stem. However, the utility of feeding mode as a phylogenetic character is limited, and in many other respects (e.g., highly regulated isometric growth of Dickinsonia by terminal addition67,68), placozoans clearly diverge from these Precambrian forms. Regardless of the specific affiliation of Placozoa to any given fossil taxon, one clear paleobiological consequence of our results is that the inferred divergence time between Cnidaria (+Placozoa) and Bilateria is likely to be much earlier than currently recognized in molecular clock studies69–72, potentially pushing deep into the Cryogenian period, before the planet experienced one or more “Snowball Earth” glaciations, and adding several tens of millions of years for bilaterians to acquire the traits that define the most diverse major group of animal life on the planet.
Materials and Methods
Sampling, sequencing, and assembling reference genomes from Clade A placozoans
Haplotype H4 and H6 placozoans were collected from water tables at the Kewalo Marine Laboratory, University of Hawaii-Manoa, Honolulu, Hawaii in October 2016. Haplotype H11 placozoans were collected from the Mediterranean ‘Anthias’ show tank in the Palma de Mallorca Aquarium, Mallorca, Spain in June 2016. All placozoans were sampled by placing glass slides suspended freely or mounted in cut-open plastic slide holders into the tanks for 10 days10. Placozoans were identified under a dissection microscope and single individuals were transferred to 500 μl of RNAlater, stored as per manufacturer’s recommendations.
DNA was extracted from 3 individuals of haplotype H11 and 5 individuals of haplotype H6 using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). DNA and RNA from three haplotype H4 individuals were extracted using the AllPrep DNA/RNA Micro Kit (Qiagen), with both kits used according to manufacturer’s protocols.
Illumina library preparation and sequencing were performed by the Max Planck Genome Centre, Cologne, Germany. In brief, DNA/RNA quality was assessed with the Agilent 2100 Bioanalyzer (Agilent, Santa Clara, USA) and the genomic DNA was fragmented to an average fragment size of 500 bp. For the DNA samples, the concentration was increased (MinElute PCR purification kit; Qiagen, Hilden, Germany) and an Illumina-compatible library was prepared using the Ovation® Ultralow Library Systems kit (NuGEN, Leek, The Netherlands) according to the manufacturer’s protocol. For the haplotype H4 RNA samples, the Ovation RNA-seq System V2 (NuGEN, San Carlos, CA, USA) was used to synthesize cDNA and sequencing libraries were then generated with the DNA library prep kit for Illumina (BioLABS, Frankfurt am Main, Germany). All libraries were size selected by agarose gel electrophoresis, and the recovered fragments were quality-assessed and quantified by fluorometry. For each DNA library 14-75 million 100 bp or 150 bp paired-end reads were sequenced on Illumina HiSeq 2500 or 4000 machines (Illumina, San Diego, USA); for the haplotype H4 RNA libraries 32-37 million single 150 bp reads were obtained.
For assembly, adapters and low-quality reads were removed with bbduk (https://sourceforge.net/projects/bbmap/) with a minimum quality value of two and a minimum length of 36; single reads were excluded from the analysis. Each library was error corrected using BayesHammer73. A combined assembly of all libraries for each haplotype was performed using SPAdes 3.6.274. Haplotype H4 and H11 data were assembled from the full read set with standard parameters and kmers 21, 33, 55, 77, 99. The haplotype H6 data were preprocessed to remove all reads with an average kmer coverage <5 using bbnorm and then assembled with kmers 21, 33, 55 and 77.
Reads from each library were mapped back to the assembled scaffolds using bbmap (https://sourceforge.net/projects/bbmap/) with the option fast=t. Scaffolds were binned based on the mapped read data using MetaBAT75 with default settings and the ensemble binning option activated (switch -B 20). The Trichoplax host bins were evaluated using metawatt76 based on coding density and sequence similarity to the Trichoplax H1 reference assembly (NZ_ABGP00000000.1). The bin quality metrics were computed with BUSCO277 and QUAST78.
Predicting proteomes from transcriptome and genome assemblies
Predicted proteomes from species with published draft genome assemblies were downloaded from the NCBI Genome portal or Ensembl Metazoa in June 2017. For Clade A placozoans, host metagenomic bins were used directly for gene annotation. For the H6 and H11 representatives, annotation was entirely ab initio, performed with GeneMark-ES79; for the H4 representative, total RNA-seq libraries obtained from three separate isolates (SRA accessions XXXX, XXXX, and XXXX) were mapped to genomic contigs with STAR v2.5.3a80 under default settings; merged bam files were then used to annotate genomic contigs and derive predicted peptides with BRAKER v1.981 under default settings. Choanoflagellate proteome predictions (used in27) were provided as unpublished data from Dan Richter. Peptides from a Leucosolenia complicata transcriptome assembly were downloaded from compagen.org. Peptide predictions from Nemertoderma westbladi and Xenoturbella bocki (as used in59) were provided directly by the authors. The transcriptome assembly (raw reads unpublished) from Euplectella aspergillum was provided by the Satoh group and downloaded from http://marinegenomics.oist.jp/kairou/viewer/info?project_id=62.
Predicted peptides were derived from Trinity RNA-seq assemblies (multiple versions released 2012-2016) as described by Laumer et al.82 for the following sources/SRA accessions: Porifera: Petrosia ficiformis: SRR504688, Cliona varians: SRR1391011, Crella elegans: SRR648558, Corticium candelabrum: SRR504694-SRR499820-SRR499817, Spongilla lacustris: SRR1168575, Clathrina coriacea: SRR3417192, Sycon coactum: SRR504689-SRR504690, Sycon ciliatum: ERR466762, Ircinia fasciculata, Chondrilla caribensis (originally misidentified as Chondrilla nucula) and Pseudospongosorites suberitoides from (https://dataverse.harvard.edu/dataverse/spotranscriptomes); Cnidaria: Abylopsis tetragona: SRR871525, Stomolophus meleagris: SRR1168418, Craspedacusta sowerbyi: SRR923472, Gorgonia ventalina: SRR935083; Ctenophora: Vallicula multiformis: SRR786489, Pleurobrachia bachei: SRR777663, Beroe abyssicola: SRR777787; Bilateria: Limnognathia maerski: SRR2131287. All other peptide predictions were derived through transcriptome assembly as paired-end, unstranded libraries with Trinity v2.4.083, running with the --trimmomatic flag enabled (and all other parameters as default), with peptide extraction from assembled transcripts using TransDecoder v4.0.1 with default settings. For these species, no ad hoc isoform selection was performed: any redundant isoforms were removed during tree pruning in the orthologue determination pipeline (see below).
Orthologue identification and alignment
Predicted proteomes were grouped into top-level orthogroups with OrthoFinder v1.0.684, run as a 200-threaded job, directed to stop after orthogroup assignment, and print grouped, unaligned sequences as FASTA files with the ‘-os’ flag. A custom python script (‘renamer.py’) was used to rename all headers in each orthogroup FASTA file in the convention [taxon abbreviation] + ‘@’ + [sequence number as assigned by OrthoFinder SequenceIDs.txt file], and to select only those orthogroups with membership comprising at least one of all five major metazoan clades plus outgroups, of which exactly 4,300 of an initial 46,895 were retained. Scripts in the Phylogenomic Dataset Construction pipeline85 were used for successive data grooming stages as follows: Gene trees for top-level orthogroups were derived by calling the fasta_to_tree.py script as a job array, without bootstrap replicates; six very large orthogroups did not finish this process. In the same directory, the trim_tips.py, mask_tips_by_taxonID_transcripts.py, and cut_long_internal_branches.py scripts were called in succession, with ‘./ .tre 10 10’, ‘./ ./ y’, and ‘./ .mm 1 20 ./’ passed as arguments, respectively. The 4,267 subtrees generated through this process were concatenated into a single file and 1,419 orthologues were extracted with UPhO86. Orthologue alignment was performed using the MAFFT v7.271 ‘E-INS-i’ algorithm, and probabilistic masking scores were assigned with ZORRO87, removing all sites in each alignment with scores below 5 as described previously82. 31 orthologues with retained lengths less than 50 amino acids were discarded, leaving 1,388 well-aligned orthologues.
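The header convention, clade-coverage filter, and score-based masking described above can be sketched in Python as follows. This is an illustrative reconstruction and not the ‘renamer.py’ script or the ZORRO program themselves: all function names, the taxon abbreviations in the usage note, and the assumption that per-column scores are supplied as a simple list are hypothetical.

```python
# The five metazoan clades plus the choanoflagellate outgroup named in the text.
REQUIRED_GROUPS = frozenset(
    ["Bilateria", "Cnidaria", "Placozoa", "Porifera", "Ctenophora", "Choanoflagellata"]
)


def new_header(taxon_abbreviation, sequence_number):
    """Build a header of the form [taxon abbreviation]@[sequence number]."""
    return f"{taxon_abbreviation}@{sequence_number}"


def taxon_of(header):
    """Recover the taxon abbreviation from a renamed header."""
    return header.split("@", 1)[0]


def samples_all_groups(headers, group_of_taxon, required=REQUIRED_GROUPS):
    """True if the orthogroup holds at least one sequence from every required group."""
    present = {
        group_of_taxon[taxon_of(h)] for h in headers if taxon_of(h) in group_of_taxon
    }
    return required <= present


def mask_alignment(alignment, column_scores, min_score=5.0):
    """Drop alignment columns whose confidence score falls below min_score."""
    keep = [i for i, score in enumerate(column_scores) if score >= min_score]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def long_enough(alignment, min_length=50):
    """True if the masked alignment retains at least min_length columns."""
    return all(len(seq) >= min_length for seq in alignment.values())
```

For example, new_header("Tadh", "0_12") returns "Tadh@0_12", and an orthogroup would be kept only if samples_all_groups returns True for its headers; after masking, alignments for which long_enough returns False would be discarded.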
Matrix assembly
A full concatenation of all retained orthogroups was performed with the ‘geneStitcher.py’ script distributed with UPhO available at https://github.com/ballesterus/PhyloUtensils. However, such a matrix would be too large for tractably inferring a phylogeny under well-fitting mixture models such as CAT+GTR; therefore we used MARE v0.1.237 to extract an informative subset of genes using tree-likeness scores, running with ‘-t 100’ to retain all taxa and using ‘-d 1’ as a tuning parameter on alignment length. This yielded our 430-orthologue, 73,547 site matrix. Parallel matrices were constructed by selecting subsets of genes evincing strong phylogenetic signals, i.e. those whose gene trees had mean bootstrap scores with a threshold at 50 and 60%47; however, inferences performed on these matrices yielded trees judged broadly comparable to those from the matrix constructed with MARE (results not shown).
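As a rough illustration of the matrix-building logic, the Python sketch below concatenates per-gene alignments into a supermatrix, padding absent taxa with gaps, reports the proportion of gaps or missing data, and selects genes by mean bootstrap support. It is not the ‘geneStitcher.py’ script and does not reproduce the tree-likeness criterion used by MARE; the function names, the data layout, and the choice of an inclusive threshold are assumptions made for the example.

```python
def concatenate(gene_alignments, taxa):
    """Join per-gene alignments end to end; taxa absent from a gene get gaps."""
    parts = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        # All sequences within one alignment are assumed to share the same length.
        length = len(next(iter(alignment.values())))
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, "-" * length))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def missing_fraction(matrix):
    """Proportion of cells in the matrix that are gaps ('-') or missing ('?')."""
    total = sum(len(seq) for seq in matrix.values())
    missing = sum(seq.count("-") + seq.count("?") for seq in matrix.values())
    return missing / total if total else 0.0


def select_by_mean_bootstrap(support_by_gene, threshold):
    """Names of genes whose mean bootstrap support is at least the threshold."""
    selected = []
    for gene, supports in support_by_gene.items():
        if supports and sum(supports) / len(supports) >= threshold:
            selected.append(gene)
    return selected
```

With two toy genes, one sampled for taxa A and B and one for taxon A only, concatenate returns a matrix in which B is padded with gaps for the second gene, and missing_fraction gives the kind of summary figure quoted for the full matrix.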
Phylogenetic Inference
Individual ML gene trees were constructed on all 1,388 orthologues in IQ-tree v1.6beta, with ‘-m MFP -b 100’ passed as parameters to perform automatic model selection and 100 standard nonparametric bootstraps on each gene tree. ML inference on the concatenated matrix (Figure 1 – Supplemental Figure 1) was performed passing ‘-m CAT20+LG+FO+R4 -bb 1000’ as parameters to specify the mixture model and retain 1000 trees for ultrafast bootstrapping; the ‘-bnni’ flag was used as described for analyses incorporating NNI correction39. Bayesian inference under the CAT+GTR+Γ4 model was performed in PhyloBayes MPI v1.6j43 with 20 cores each dedicated to 4 separate chains, run for 2885-3222 generations with the ‘-dc’ flag applied to remove constant sites from the analysis, and using a starting tree derived from the FastTree2 program88. The two chains used to generate the posterior consensus tree summarized in Figure 1 converged on exactly the same tree in all MCMC samples after removing the first 2000 generations as burn-in. A posterior predictive simulation test of compositional heterogeneity was performed using one of these chains, also with 2000 generations of burn-in, in PhyloBayes MPI v1.7a. Analysis of a Dayhoff-6-states recoded matrix in CAT+GTR+Γ4 was performed with the serial PhyloBayes program v4.1c, with ‘-dc -recode dayhoff6’ passed as flags. Six chains were run for 1441-1995 generations; two chains showed a maximum bipartition discrepancy (maxdiff) of 0.042 after removing the first 1000 generations as burn-in (Figure 2). QuartetScores48 was used to measure internode certainty metrics including the reported EQP-IC, using the 430 gene trees from those orthologues used to derive the matrix as evaluation trees, and using the amino acid CAT+GTR+Γ4 tree as the reference to be annotated (Figure 1).
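The convergence criterion quoted above, the maximum bipartition discrepancy between chains after burn-in, can be illustrated with the simplified Python sketch below. It assumes that each sampled tree has already been reduced to a set of bipartitions (represented here as frozensets of taxon names), which is a hypothetical data layout; in practice this statistic is reported by the tools distributed with PhyloBayes, and this code is not a substitute for them.

```python
from collections import Counter


def bipartition_frequencies(samples, burnin):
    """Posterior frequency of each bipartition among post-burn-in samples."""
    kept = samples[burnin:]
    if not kept:
        return {}
    counts = Counter(bipartition for tree in kept for bipartition in tree)
    return {bipartition: n / len(kept) for bipartition, n in counts.items()}


def maxdiff(chains, burnin):
    """Largest between-chain difference in frequency over all bipartitions."""
    frequencies = [bipartition_frequencies(chain, burnin) for chain in chains]
    observed = set().union(*frequencies)
    return max(
        (
            max(f.get(bp, 0.0) for f in frequencies)
            - min(f.get(bp, 0.0) for f in frequencies)
            for bp in observed
        ),
        default=0.0,
    )
```

A value of 0 means the chains agree on the frequency of every sampled bipartition, as when all retained samples from both chains show the same tree; larger values indicate that at least one clade is supported unequally across chains.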
Competing Interests
The authors declare that they have no conflicting interests relating to this work.
Source data availability
SRA accession codes, where used, and all alternative sources for sequence data (e.g. individually hosted websites, personal communications), are listed above in the Materials and Methods section. A DataDryad accession is available at DOI: XXXXX, which makes available all scripts, orthogroups, multiple sequence alignments, phylogenetic program output, and raw host proteomes inputted to OrthoFinder. Metagenomic bins containing placozoan host contigs used to derive proteomes from H4, H6 and H11 isolates are also provided in this accession.
Author Contributions
CEL assembled most libraries, conducted all analyses starting from predicted peptides onwards, and wrote the initial draft. HGV collected clade A placozoan isolates from Majorca, maintained Hawaiian isolates prior to sequencing at the MPI for Marine Microbiology, submitted purified nucleic acids for amplification and sequencing, and assembled and provided binned metagenomic contigs. MH and VBP assisted with collection of the original Hawaiian placozoan isolates, and H4 and H6 samples used for sequencing were derived from clones originally established by MGH at the Kewalo Marine Labs. AR generated new transcriptomic data for many sponge taxa. CEL, JCM and GG conceptualized and initiated this work, and supervised throughout. All authors read and contributed to the final manuscript.
Acknowledgements
Nicole Dubilier (Max Planck Institute for Marine Microbiology) contributed resources that permitted the collection and assembly of draft Trichoplax genomes, which were amplified and sequenced at the Max Planck-Genome-Centre Cologne. Dan Richter (King lab) and Kanako Hisata (Satoh lab) provided access to unpublished transcriptomes and peptide predictions. The EMBL-EBI Systems Infrastructure team provided essential support on the EBI compute cluster.