Abstract
Standard models of sex chromosome evolution propose that recombination suppression leads to the degeneration of the heterogametic chromosome, as is seen for the Y chromosome in mammals and the W chromosome in most birds. Unlike other birds, palaeognaths (ratites and tinamous) possess large non-degenerate regions on their sex chromosomes (PARs or pseudoautosomal regions), despite sharing the same sex determination region as neognaths (all other birds). It remains unclear why the large PARs of palaeognaths are retained over more than 100 MY of evolution, and the impact of these large PARs on sex chromosome evolution. To address this puzzle, we analysed Z chromosome evolution and gene expression across 12 palaeognaths, several of whose genomes have recently been sequenced. We confirm at the genomic level that most palaeognaths (excepting some species of tinamous) retain large PARs. As in neognaths, we find that all palaeognaths have incomplete dosage compensation on the regions of the Z chromosome homologous to degenerated portions of the W (differentiated regions or DRs), but we find no evidence for enrichments of male-biased genes in PARs. We find limited evidence for increased evolutionary rates (faster-Z) either across the chromosome or in DRs for most palaeognaths with large PARs, but do recover signals of faster-Z evolution similar to neognaths in tinamou species with mostly degenerated W chromosomes (small PARs). Unexpectedly, in some species PAR-linked genes evolve faster on average than genes on autosomes. Increased TE density and longer introns in PARs of most palaeognaths compared to autosomes suggest that the efficacy of selection may be reduced in palaeognath PARs, contributing to the faster-Z evolution we observe. Our analysis shows that palaeognath Z chromosomes are atypical at the genomic level, but the evolutionary forces maintaining largely homomorphic sex chromosomes in these species remain elusive.
Introduction
Sex chromosomes are thought to evolve from autosomes that acquire a sex determination locus (Bull 1983). Subsequent suppression of recombination between the X and Y (or the Z and W) chromosomes leads to the evolutionary degeneration of the sex-limited (Y or W) chromosome. Theoretical models predict that suppression of recombination will be favored so that the sexually antagonistic alleles that are beneficial in the heterogametic sex can be linked genetically to the sex determination locus (Rice 1987; Ellegren 2011). Recombination suppression is thought to be initiated by inversions, which can occur multiple times in the course of sex chromosome evolution (Lahn and Page 1999; Bergero and Charlesworth 2009; Cortez et al. 2014; Zhou et al. 2014; Wright et al. 2016). Despite differences in their autosomal origins and heterogamety, eutherian mammals and neognathous birds followed similar but independent trajectories of sex chromosome evolution (Graves 2015; Bellott et al. 2017).
However, this model of sex chromosome evolution seems incompatible with patterns in many other vertebrate lineages. Henophidian snakes (boas) are thought to have ZW chromosomes that have remained homomorphic for about 100 MY (Vicoso, Emerson, et al. 2013), although a recent study suggests a transition from ZW to XY system may have occurred (Gamble et al. 2017). Many lineages in fish and non-avian reptiles also possess homomorphic sex chromosomes, in most cases because the sex chromosomes appear to be young due to frequent sex chromosome turnover (Bachtrog et al. 2014). In some species of frogs, homomorphic sex chromosomes appear to be maintained by occasional XY recombination in sex-reversed XY females (the ‘fountain of youth’ model), which is possible if recombination suppression is a consequence of phenotypic sex rather than genotype (Perrin 2009; Dufresnes et al. 2015; Rodrigues et al. 2018).
Palaeognathous birds (Palaeognathae), which include the paraphyletic and flightless ratites and the monophyletic tinamous, and comprise the sister group to all other birds, also retain largely or partially homomorphic sex chromosomes (de Boer 1980; Ansari et al. 1988; Ogawa et al. 1998; Nishida-Umehara et al. 1999; Pigozzi and Solari 1999; Stiglec et al. 2007; Tsuda et al. 2007; Janes et al. 2009; Pigozzi 2011), albeit with some exceptions (Zhou et al. 2014). These species share the same ancestral sex determination locus, DMRT1, with all other birds, suggesting its origin about 150 million years ago (Bergero and Charlesworth 2009; Yazdi and Ellegren 2014). Because recombination occurs in both males and females in birds, the ‘fountain of youth’ model is also not applicable in this case. Thus, the homomorphic sex chromosomes in palaeognaths must be old.
The reasons why palaeognath sex chromosomes have not degenerated are obscure, although two hypotheses have been proposed. Based on an excess of male-biased gene expression in the recombining pseudo-autosomal region, Vicoso and colleagues (Vicoso, Emerson, et al. 2013) suggested that sexual antagonism is resolved by sex-biased expression without recombination suppression and differentiation of Z and W sequences in emu. Alternatively, lack of dosage compensation, which in mammals and other species normalizes expression of genes on the hemizygous chromosome between the homogametic and heterogametic sex, has been proposed as a mechanism that could arrest the degeneration of the W chromosome due to selection to maintain dosage-sensitive genes (Adolfsson and Ellegren 2013). Although these hypotheses are compelling, they have only been tested in single-species studies and without high quality genomes. A broader study of palaeognathous birds is therefore needed for comprehensive understanding of the unusual evolution of their sex chromosomes.
Degeneration of sex-limited chromosomes (the W or the Y) leads to the homologous chromosome (the Z or the X) becoming hemizygous in the heterogametic sex. Numerous studies have shown that one common consequence of this hemizygosity is that genes on the X or Z chromosome typically evolve faster on average than genes on the autosomes (Charlesworth et al. 1987; Meisel and Connallon 2013). The general pattern of faster-X or faster-Z protein evolution has been seen in many taxa, including Drosophila (Charlesworth et al. 1987; Baines et al. 2008; Avila et al. 2014; Charlesworth et al. 2018), birds (Mank et al. 2007; Mank, Nam, et al. 2010), mammals (Torgerson 2003; Lu and Wu 2005; Kousathanas et al. 2014) and moths (Sackton et al. 2014). One primary explanation for this is that recessive beneficial mutations are immediately exposed to selection in the heterogametic sex, leading to more efficient positive selection (Charlesworth et al. 1987; Vicoso and Charlesworth 2006; Mank, Vicoso, et al. 2010). Alternatively, the degeneration of the Y or W chromosomes results in the reduction of the effective population size of the X or Z chromosomes relative to the autosomes (because there are 3 X/Z chromosomes for every 4 autosomes in a diploid population with equal sex ratios). This reduction in the effective population size can increase the rate of fixation of slightly deleterious mutations due to drift (Mank, Vicoso, et al. 2010; Mank, Nam, et al. 2010). In both scenarios, faster evolution of X- or Z-linked genes is expected.
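The 3:4 ratio quoted above follows from counting chromosome copies. As a worked illustration for a ZW system, let N_m and N_f be the numbers of breeding males (ZZ) and females (ZW); these symbols are introduced here for illustration only and do not appear in the original analysis.

```latex
\begin{align}
\text{Z copies} &= 2N_m + N_f, \qquad \text{autosomal copies} = 2N_m + 2N_f, \\
\frac{\text{Z copies}}{\text{autosomal copies}} &= \frac{2N_m + N_f}{2\,(N_m + N_f)} = \frac{3}{4}
\quad \text{when } N_m = N_f .
\end{align}
```

This copy-number ratio equals the ratio of effective population sizes only under idealized assumptions (an equal sex ratio and equal variance in reproductive success between the sexes); departures from those assumptions shift the ratio away from 3/4.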
The relative importance of these explanations varies across taxa. In both Drosophila and mammals, faster evolutionary rates of X-linked genes seem to be driven by more efficient positive selection for recessive beneficial alleles in males (Connallon 2007; Meisel and Connallon 2013). For female-heterogametic taxa, the evidence is mixed. In Lepidoptera there is evidence that faster-Z evolution is also driven by positive selection (Sackton et al. 2014) or is absent entirely (Rousselle et al. 2016), whereas in birds, increased fixation of slightly deleterious mutations due to reduced Ne is likely a major factor driving faster-Z evolution (Mank, Nam, et al. 2010; Wang et al. 2014; Wright et al. 2015).
The limited degeneration of the W chromosome in palaeognaths makes these species an ideal system to further test the causes of faster-Z evolution in birds. For many palaeognaths, a large proportion of the sex chromosomes retains homology and synteny between the Z and the W; these regions are referred to as pseudoautosomal regions (PARs) because they recombine in both sexes and are functionally not hemizygous in the heterogametic sex. In PARs, no effect of dominance is expected on evolutionary rates, and as the population size of the PAR is not different from that of autosomes, no increase in fixations of weakly deleterious mutations is expected. Therefore, neither the positive selection hypothesis nor the genetic drift hypothesis is expected to lead to differential evolutionary rates in the PAR compared to autosomes, although other selective forces such as sexually antagonistic selection may impact evolutionary rates in the PAR (Otto et al. 2011; Charlesworth et al. 2014).
With numerous new palaeognath genomes now available (Zhou et al. 2014; Le Duc et al. 2015; Zhang et al. 2015; Sackton et al. 2018), a reevaluation of sex chromosome evolution in palaeognaths is warranted. Here, we investigate faster-Z evolution, dosage compensation and sex-biased expression, to gain a better understanding of the slow evolution of sex chromosomes in ratites. Surprisingly, we did not find evidence for faster-Z evolution for most palaeognaths, even when analyzing only differentiated regions (DRs) that are functionally hemizygous in the heterogametic sex. Instead, we find limited evidence that PARs tend to evolve faster than autosomes. Indirect evidence from the accumulation of transposable elements and larger introns suggests reduced efficacy of selection in both PARs and DRs, potentially because of lower recombination rates compared to similarly sized autosomes.
Based on new and previously published RNA-seq data, we find a strong dosage effect on gene expression, suggesting substantially incomplete dosage compensation as in other birds (Itoh et al. 2010; Adolfsson and Ellegren 2013; Uebbing et al. 2013; Uebbing et al. 2015), but do not recover a previously reported excess of male-biased expression in the PAR (Vicoso, Kaiser, et al. 2013). Our results suggest that simple models of sex chromosome evolution probably cannot explain the evolutionary history of palaeognath sex chromosomes.
Results
Most palaeognaths have large pseudoautosomal regions
To identify Z-linked scaffolds from palaeognath genomes, we used nucmer (Kurtz et al. 2004) to first align the published ostrich Z chromosome (Zhang et al. 2015) to assembled emu scaffolds (Sackton et al. 2018), and then aligned additional palaeognaths (Fig. 1) to emu. We then ordered and oriented putatively Z-linked scaffolds in non-ostrich assemblies into pseudo-chromosomes using the ostrich Z chromosome as a reference (Fig. S1). Visualization of pseudo-chromosome alignments (Fig. S1) showed little evidence for inter-chromosomal translocations, as expected based on the high degree of synteny across birds (Ellegren 2010); an apparent 12Mb autosomal translocation onto the ostrich Z chromosome is a likely mis-assembly (Fig. S2).
We next annotated the pseudoautosomal region (PAR) and differentiated region (DR) of the Z chromosome in each species. The DR is thought to arise as a result of inversions on the Z or W chromosome that suppress recombination between the Z and the W, eventually leading to degeneration of the W-linked sequence inside the inversion and hemizygosity of the homologous Z-linked sequence (the DR). Outside the DR, in the PAR, ongoing recombination between the Z and the W chromosome maintains sequence homology. In the DR, reads arising from the W in females will not map to the homologous region of the Z (due to sequence divergence associated with W chromosome degeneration), while in the PAR reads from both the Z and the W will map to the Z chromosome. Thus, we expect coverage of sequencing reads mapped to the Z chromosome in the DR to be ½ that of the autosomes or PAR in females, logically similar to the approach used to annotate Y and W chromosomes in other species (Chen et al. 2012; Carvalho and Clark 2013; Tomaszkiewicz et al. 2017). We also annotated PAR/DR boundaries using gene expression data. If we assume that complete dosage compensation is absent, as it is in all other birds studied to date (Graves 2014), M/F expression ratios of genes on the Z with degenerated W-linked gametologs (in the DR) should be about twice that of genes with intact W-linked gametologs (in the PAR).
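The coverage logic described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the study: the function name, the windowed input format, and the 0.75 cutoff (halfway between the expected ratios of 0.5 for the DR and 1.0 for the PAR) are all assumptions made for the example.

```python
from statistics import median


def classify_z_windows(z_cov, autosome_cov, threshold=0.75):
    """Label each Z-chromosome window 'DR' or 'PAR' from female read depth.

    z_cov        -- mean female coverage per Z window, in chromosome order
    autosome_cov -- mean female coverage per autosomal window (baseline)
    threshold    -- hypothetical cutoff between the expected ratios
                    of 0.5 (DR, hemizygous) and 1.0 (PAR, two copies)
    """
    baseline = median(autosome_cov)  # robust estimate of diploid coverage
    labels = []
    for cov in z_cov:
        ratio = cov / baseline
        labels.append("DR" if ratio < threshold else "PAR")
    return labels


# Toy data: autosomes at ~30x; the first three Z windows at ~15x.
autosomes = [29.0, 31.0, 30.0, 30.5, 29.5]
z_windows = [15.2, 14.8, 16.0, 29.0, 30.4, 31.1]
print(classify_z_windows(z_windows, autosomes))
```

In real data, GC-driven coverage variation would argue for smoothing across windows before calling a boundary rather than labeling each window independently.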
For seven species with genomic sequencing data from females, either newly generated in this study (lesser rhea, thicket tinamou, Chilean tinamou) or previously published (emu, ostrich, cassowary, North Island brown kiwi, white-throated tinamou), we annotated PAR and DR regions using coverage (Fig. 1B, Fig. S3). While some variation in coverage attributable to differences in GC content is apparent, the coverage reduction in the DR region is robust (Fig. 1B). For three of these species, we also used newly generated (emu) or published (ostrich, Chilean tinamou) male and female RNA-seq data; using expression ratios to annotate DR/PAR boundaries produced results consistent with coverage analysis in all three of these species (Fig. 1B, Fig. S3, Fig. S4). We used expression ratios alone to demarcate the DR/PAR boundaries in little spotted kiwi and Okarito kiwi (Fig. S4), which we found to be in similar genomic locations in both species, and also in the same locations as the DR/PAR boundary position in North Island brown kiwi. For three species (greater rhea, elegant crested tinamou and great spotted kiwi) with neither female sequencing data nor expression data, we projected the DR/PAR boundary from a closely related species (lesser rhea, Chilean tinamou and little spotted kiwi respectively). Our results corroborate prior cytogenetic studies across palaeognaths and support a large PAR in all species except the Tinaminae (thicket tinamou and white-throated tinamou), which have small PARs and heteromorphic sex chromosomes. PAR sizes in non-Tinaminae palaeognaths range from 32.2 Mb (49% of Z chromosome, in elegant crested tinamou) to 59.3 Mb (73% of Z chromosome, in emu); in contrast, PAR sizes in the Tinaminae and in typical neognaths rarely exceed ~1 Mb (~1.3% of Z chromosome size) (Table S1).
Genes with male-biased expression are not overrepresented in palaeognath PARs
Several possible explanations for the maintenance of old, homomorphic sex chromosomes are related to gene dosage (Adolfsson and Ellegren 2013; Vicoso, Kaiser, et al. 2013). We analyzed RNA-seq data from males and females from five palaeognath species, including newly collected RNA-seq data from three tissues from emu (brain, gonad, and spleen; 3 biological replicates from each of males and females), as well as previously published RNA-seq data from Chilean tinamou (Sackton et al. 2018), ostrich (Adolfsson and Ellegren 2013), kiwi (Ramstad et al. 2016), and additional embryonic emu samples (Vicoso, Kaiser, et al. 2013). For each species we calculated expression levels for each gene with RSEM (Li and Dewey 2011) and DESeq2 (Love et al. 2014), and computed male/female ratios to assess the extent of dosage compensation, although we note that this measure does not always reflect retention of ancestral sex chromosome expression levels in the hemizygous sex (Gu and Walters 2017). Consistent with previous studies in birds (Graves 2014), we find no evidence for complete dosage compensation by this measure. Instead, we see evidence for partial compensation with M/F ratios ranging from 1.19 to 1.68 (Fig. 2A). The extent of dosage compensation seems to vary among species, but not among tissues within species (Fig. S5).
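As a rough illustration of how a male/female expression ratio summarizes dosage compensation, the sketch below takes the median M/F ratio over a set of genes. With no compensation, DR genes are expected near 2 and with complete compensation near 1. The function name, the dictionary inputs and the minimum-expression filter are assumptions for the example; the study itself derived expression values with RSEM and DESeq2.

```python
from statistics import median


def median_mf_ratio(male_expr, female_expr, min_expr=1.0):
    """Median male/female expression ratio across genes.

    male_expr, female_expr -- dicts mapping gene ID to mean normalized
                              expression in each sex
    min_expr               -- hypothetical filter that drops lowly expressed
                              genes, whose ratios are dominated by noise
    """
    ratios = [
        male_expr[gene] / female_expr[gene]
        for gene in male_expr
        if gene in female_expr
        and male_expr[gene] >= min_expr
        and female_expr[gene] >= min_expr
    ]
    # statistics.median raises StatisticsError if no gene passes the filter
    return median(ratios)


# Toy DR genes: g4 is filtered out because its expression is too low.
males = {"g1": 20.0, "g2": 30.0, "g3": 12.0, "g4": 0.2}
females = {"g1": 10.0, "g2": 20.0, "g3": 10.0, "g4": 0.1}
print(median_mf_ratio(males, females))
```

A median between 1 and 2, as in this toy case, is the pattern the text describes as partial compensation.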
Incomplete dosage compensation poses a challenge for detection of sex-biased genes: higher expression levels of DR-linked genes in males may be due to the incompleteness of dosage compensation rather than sex-biased expression per se. With substantially improved genome assemblies (and PAR/DR annotations) and data from a greater number of species, we reevaluated the observation that there is an excess of male-biased genes in the emu PAR. Previous work with a preliminary genome assembly (Vicoso, Kaiser, et al. 2013) showed evidence for an excess of male-biased genes in the emu PAR and argued that sexually antagonistic effects could be resolved via sex-biased expression instead of recombination suppression and W chromosome degeneration.
Using the PAR/DR annotation inferred in this study, we analyzed both the previously published RNA-seq data from emu embryos and new RNA-seq data from three biological replicates of three adult tissues from both sexes. We find that most emu Z-linked male-biased genes are located on the DR (Fig. 2C), and when DR genes are excluded, we no longer detect an excess of male-biased genes on the Z chromosome of emu (Fig. 2C). For PAR-linked genes, although there was a slight shift of expression toward male-bias in 42-day emu embryonic brain (Fig. 2B), only one gene was differentially expressed in males (Fig. 2C). This lack of genes with male-biased expression in the PAR is largely consistent across other palaeognaths with large PARs, including Chilean tinamou, ostrich and little spotted kiwi, with one exception in the Okarito brown kiwi (Fig. S6). Overall, we see little evidence for accumulation of male-biased genes in palaeognath PARs, and suggest that the lack of degeneration of the emu W chromosome, and likely other palaeognathous chromosomes, is probably not due to resolution of sexual antagonism through acquisition of sex-biased genes.
Large PARs are associated with lack of faster-Z evolution in palaeognaths
The unusually large PARs and the variation in PAR size make Palaeognathae a unique model to study faster-Z evolution. To test whether Z-linked genes evolve faster than autosomal genes, we computed branch-specific dN/dS ratios (the ratio of nonsynonymous substitution rate to synonymous substitution rate) using the PAML free-ratio model for protein coding genes (Yang 2007), based on previously published alignments (Sackton et al. 2018). Because macro-chromosomes and micro-chromosomes differ extensively in the rates of evolution in birds (Gossmann et al. 2014; Zhang et al. 2014) (Fig. S7), we include only the macro-chromosomes (chr1 to chr10) in our comparison, and further focus on only chromosome 4 (97 Mb in chicken) and chromosome 5 (63 Mb) to match the size of the Z chromosome, unless otherwise stated.
We included 23 neognaths and 12 palaeognaths in our analysis. Overall, Z-linked genes in neognaths (with few exceptions) have a significantly higher dN/dS ratio than autosomal (chr 4/5) genes, suggesting faster-Z evolution (Fig. 3). This result is consistent with a previous study involving 46 neognaths (Wang et al. 2014). In contrast to neognaths, the majority of palaeognaths, and all but two species with large PARs, had similar dN/dS ratios for autosomal and Z-linked genes, and thus did not show evidence for faster-Z evolution (Fig. 3).
The lack of a faster-Z effect for palaeognaths with large PARs is perhaps not surprising, since PAR-linked genes are not expected to evolve faster than autosomal genes under standard models of faster-Z evolution. We divided Z-linked genes into those with presumed intact W-linked gametologs (PAR genes) and those with degenerated W-linked gametologs (DR genes). Surprisingly, we see little evidence for faster-Z evolution in palaeognaths even for DR genes: only in cassowary, thicket tinamou and white-throated tinamou do DR genes show accelerated dN/dS and dN relative to autosomes (Fig. 4, Fig. S8). Thicket tinamou and white-throated tinamou possess small PARs typical of neognaths, and faster-Z has also been observed for white-throated tinamou in a previous study (Wang et al. 2014), so faster-DR in these species is expected.
The observation of faster-DR evolution in cassowary (p = 0.009, two-sided permutation test) suggests that faster-DR evolution may not be limited to species with extensive degeneration of the W chromosome (e.g., with small PARs). However, an important caveat is that the cassowary genome (alone among the large-PAR species) was derived from a female individual, which means that some W-linked sequence could have been assembled with the Z chromosome, especially for the region with recent degeneration. This would cause an artefactual increase in divergence rate.
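For readers unfamiliar with the test named above, a two-sided permutation test on per-gene dN/dS values can be sketched as follows. The source does not state which test statistic was used, so the absolute difference in group means here is an assumption, as are the function name, the number of permutations and the fixed seed.

```python
import random


def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    group_a, group_b -- per-gene values (e.g. dN/dS for DR and chr4/5 genes)
    n_perm           -- number of random relabelings (hypothetical default)
    seed             -- fixed so the sketch is reproducible
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign genes to the two groups
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)


# Toy example with clearly separated groups.
dr_genes = [0.5] * 20
autosomal_genes = [0.1] * 20
print(permutation_test(dr_genes, autosomal_genes, n_perm=2000))
```

Because labels are shuffled rather than a distribution assumed, the approach suits skewed quantities such as dN/dS; a difference in medians could be substituted for the mean without changing the structure.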
Unexpectedly, in four species of palaeognaths we find evidence that genes in the PAR evolve faster than autosomal genes on chromosomes of similar size (chr4/5), which is not predicted by either the positive selection or genetic drift hypothesis for faster-Z evolution (Fig. 4). The faster-PAR effect shows a lineage-specific pattern, particularly in tinamous where three of four species (white-throated tinamou, Chilean tinamou, elegant-crested tinamou) show faster evolution for PAR-linked genes, and all four species have higher dN in the PAR than autosomes, although not significantly so for the elegant-crested tinamou. The faster-PAR in white-throated tinamou is particularly unexpected because previous studies suggest that small PARs evolve slower in birds (Smeds et al. 2014). The small number of PAR-linked genes in white-throated tinamou (N=9) suggests some caution in interpreting this trend is warranted. The kiwis also show a trend towards faster-PAR evolution, though this is only statistically significant in Okarito brown kiwi (p = 0.010, two-sided permutation test) (Fig. 4). Interestingly, species with evidence for faster-PAR evolution also have suggestive evidence of relatively faster rates of W chromosome degeneration. In particular, tinamous have intermediate or small PARs (Fig. 1), suggesting that sex chromosomes may not be as stable in these species as in ratites. Similarly, while PARs in the little spotted kiwi and great spotted kiwi are large compared to neognaths, they are relatively shorter than in ostrich, emu and cassowary, suggesting additional degeneration of the W chromosomes. Moreover, in North Island brown kiwi, coverage for female reads suggests an on-going degeneration of the W chromosome (Fig. S3). However, further study will be needed to confirm this trend.
Evidence for reduced efficacy of selection on the Z chromosome
The signatures of higher dN and dN/dS we observe in the PARs of tinamous and some other species could be driven by increased fixation of weakly deleterious mutations, if the efficacy of selection is reduced in PARs despite homology with the non-degenerated portion of the W chromosome. One potential marker of the efficacy of selection is the density of transposable elements (TEs), which are thought to increase in frequency when the efficacy of selection is reduced (Rizzon et al. 2002; Lockton et al. 2008). We find that chromosome size, which is correlated with recombination rates in birds (Kawakami et al. 2014), shows a strong positive correlation with TE density (lowest in Okarito brown kiwi, r = 0.90; highest in white-throated tinamou, r = 0.98) (Fig. S9, Table S2). Extrapolating from autosomal data, we would expect PARs (smaller than 50Mb in all species) to have lower TE density than chr5 (~63Mb) or chr4 (~89Mb) if similar evolutionary forces are acting on them to purge TEs. Strikingly, we find that all palaeognaths with large PARs harbor significantly higher TE densities on the PAR than autosomes (Fig. 4B), which suggests reduced purging of TEs on PARs. For DRs, it is unsurprising that TE densities are much higher than in chr4/5 (Fig. 4B) since both reduced recombination rates (due to no recombination in females) and reduced Ne (due to hemizygosity of the DR in females) will reduce the efficacy of selection. In fact, TE densities of the DRs are also higher than those of all macro-chromosomes, as well as those of the PARs (Fig. 4B).
Intron size is probably also under selective constraint (Carvalho and Clark 1999), and in birds smaller introns are likely favored (Zhang and Edwards 2012; Zhang et al. 2014). If this is also the case in palaeognaths, an expansion of intron sizes could suggest reduced efficacy of selection. We compared the intron sizes among PARs, DRs and autosomes across all palaeognaths in our study. Like TE densities, intron sizes show strong positive correlation with chromosome size (lowest in Okarito brown kiwi, r = 0.74; highest in thicket tinamou, r = 0.91) (Fig. S9, Table S2). Except for white-throated tinamou and thicket tinamou, intron sizes of the PARs are larger than those of chr4/5 (p < 8.8e-10, Wilcoxon rank sum test, Fig. 4C).
The pattern of larger intron sizes in the PARs remains unchanged when all macro-chromosomes were included for comparison (Fig. S9). Similar to PARs, DRs also show larger intron sizes relative to chr4/5 (p < 0.00081, Wilcoxon rank sum test).
Finally, codon usage bias is often used as a proxy for the efficacy of selection and is predicted to be larger when selection is more efficient (Shields et al. 1988). To assess codon usage bias, we estimated effective number of codons (ENC) values, accounting for local nucleotide composition. ENC is lower when codon bias is stronger, and thus should increase with reduced efficacy of selection. As expected, ENC values showed a strong positive correlation with chromosome sizes (Table S2), and are higher for DR-linked genes in most species (although not rheas, the little spotted kiwi, or the Okarito brown kiwi) (Fig. S10). However, for PAR-linked genes, ENC does not suggest widespread reductions in the efficacy of selection: only cassowary and Chilean tinamou exhibited significantly higher ENC values in the PAR, although a trend of higher ENC values can be seen for most species (Fig. S10).
One possible cause of changes in the efficacy of selection in the absence of W chromosome degeneration is a reduction in the recombination rate of the PAR of some species with a large PAR, although a previous study on the collared flycatcher (a neognath species with a very small PAR) showed that the PAR has a high recombination rate (Smeds et al. 2014). Previous work (Bolivar et al. 2016) has shown that recombination rate is strongly positively correlated with GC content of synonymous third positions in codons (GC3s) in birds, so we used GC3s as a proxy for recombination rate in the absence of pedigree or population samples to estimate the rate directly. We find that GC3s are strongly negatively correlated with chromosome size in all palaeognaths (r = −0.78 to −0.91, p ≤ 0.0068) except for ostrich (r = −0.51, p = 0.11) (Fig. S9, Fig. S11, Table S2), similar to what was observed in mammals (Romiguier et al. 2010). Recombination rates are also negatively correlated with chromosome sizes in birds (Gossmann et al. 2014; Kawakami et al. 2014) and other organisms (Jensen-Seaman et al. 2004), suggesting that GC3s are at least a plausible proxy for recombination rate. In contrast to the results for collared flycatcher, GC3s of palaeognath PARs were significantly lower than those of chr4/5 (p < 2.23e-05, Wilcoxon rank sum test) (Fig. 5A, Fig. S9, Fig. S11), except for white-throated tinamou and thicket tinamou. Inclusion of the other macro-chromosomes does not change the pattern (p < 0.0034). Moreover, the distribution of GC3s along the PAR is more homogeneous than along chr4 or chr5, except for the chromosomal ends (Fig. S11).
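To make the GC3s proxy concrete, the sketch below computes the G/C fraction at third codon positions of a coding sequence. It is a simplified illustration under stated assumptions: the standard genetic code is assumed, the function name is hypothetical, and the exclusion set (Met, Trp and stop codons, which have no synonymous alternatives) follows the common definition of GC3s rather than any procedure described in the source.

```python
# Codons skipped because they have no synonymous alternative (Met, Trp)
# or are stop codons, assuming the standard genetic code.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """Fraction of G or C at third positions of informative codons.

    cds -- in-frame coding sequence; a trailing partial codon is ignored
    Returns None when no informative codon is present.
    """
    cds = cds.upper()
    gc = 0
    total = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        # Skip excluded codons and codons with ambiguous bases (e.g. N).
        if codon in EXCLUDED or any(base not in "ACGT" for base in codon):
            continue
        total += 1
        if codon[2] in "GC":
            gc += 1
    return gc / total if total else None


# ATG and TAA are skipped; GCC and GCG end in G/C, GCA does not.
print(gc3s("ATGGCCGCGGCATAA"))
```

Averaging such per-gene values by chromosome or region gives the kind of quantity compared between PARs and chr4/5 in the text.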
Overall, then, we find evidence from TE density and intron size that efficacy of selection may be reduced on the PAR in most large-PAR palaeognaths, potentially because of reductions in recombination rate (as suggested by reduced GC3s), although we note the signature from codon bias (ENC) is more ambiguous. If indeed recombination rate is reduced relative to a similarly sized autosome for most large-PAR species, that could explain why we see some evidence for faster-PAR evolution in palaeognaths.
Discussion
Old, homomorphic sex chromosomes have long been an evolutionary puzzle, as they defy our usual expectations about how sexually antagonistic selection drives recombination suppression of the Y (or W) chromosome and eventual degradation. A long-standing example of old, homomorphic sex chromosomes is found in the Palaeognathae, where previous cytogenetic and genomic studies have clearly demonstrated the persistence of largely homomorphic sex chromosomes. Our results extend previous studies, and confirm at the genomic level that all ratites and some tinamous have large, nondegenerate PARs, while in at least some Tinaminae degradation of the W chromosome has proceeded, resulting in typically small PARs.
Evolutionary forces acting on sex chromosomes
Several studies have reported evidence for faster-Z evolution in birds, probably driven largely by increased fixation of weakly deleterious mutations due to reduced Ne of the Z chromosome (Mank, Nam, et al. 2010; Wright et al. 2015). However, these studies have focused on neognaths, with fully differentiated sex chromosomes. Here, we show that palaeognath sex chromosomes, which mostly maintain large PARs, do not show consistent evidence for faster-Z evolution, while confirming the pervasive faster-Z effect in neognaths. Notably, the two species in our dataset that presumably share heteromorphic sex chromosomes derived independently from neognaths (white-throated tinamou and thicket tinamou) do show evidence for faster-Z evolution, and in particular faster evolution of DR genes. In contrast, palaeognaths with small DRs and large PARs do not tend to show evidence for faster-DR, even though hemizygosity effects should be apparent (the exception is cassowary, which may be an artifact due to W-linked sequence assembling as part of the Z).
A previous study on neognaths showed that the increased divergence rate of the Z is mainly contributed by recent strata, while the oldest stratum (S0) does not show a faster-Z effect (Wang et al. 2014). Neognaths and palaeognaths share the S0, and since their divergence only a small secondary stratum has evolved in palaeognaths (Zhou et al. 2014). In particular, the DR of palaeognaths without heteromorphic sex chromosomes is largely dominated by this shared S0 stratum. The absence of a faster-Z effect in palaeognath DRs where S0 dominates is therefore largely consistent with the results of the study on the neognath S0 stratum. A possible mechanism to explain this pattern is that, in S0, the reduced effective population size (increasing fixation of deleterious mutations) is balanced by the greater efficacy of selection in removing recessive mutations (due to hemizygosity). A recent study on ZW evolution in Maniola jurtina and Pyronia tithonus butterflies suggests a similar model, where purifying selection is acting on the hemizygous DR genes to remove deleterious mutations (Rousselle et al. 2016). While this model would account for the pattern we observe, it remains unclear why the shared S0 stratum should have a different balance of these forces than the rest of the DR in both neognaths and palaeognaths with large DRs. Nonetheless, the evolutionary rates of the DR genes in the older strata are probably the net results of genetic drift and purifying selection against deleterious mutation, with little contribution of positive selection for recessive beneficial mutations.
We also detect evidence for faster evolution of genes in the PAR for tinamous and some species of kiwi. Since the PAR is functionally homomorphic and recombines with the homologous region of the W chromosome, it is not clear why this effect should be observed in these species. However, a common feature of the PARs of tinamous and kiwis is that they are relatively shorter than PARs of other palaeognaths. This raises at least two possible explanations for the faster-PAR effect in tinamous and kiwis: 1) the differentiation of the sex chromosomes is more rapid compared with other palaeognaths, and at least some parts of the PARs may have recently stopped recombining and effectively become part of the DR, but too recently to be detected by the coverage method; or 2) the PARs are still recombining but at a lower rate, resulting in weaker efficacy of selection against deleterious mutations.
Efficacy of selection and recombination rate
Multiple lines of evidence suggest a possible reduction in the efficacy of selection in the PAR across all palaeognaths with a large PAR. Specifically, we find both an increase in TE density and an increase in intron size in PARs. In contrast, we do not find clear evidence for a reduction in the degree of codon bias in PARs. However, it is possible that genetic drift (Marais et al. 2001), GC-biased gene conversion (Galtier et al. 2018) and/or mutational bias (Szövényi et al. 2017) may also affect the codon bias, which may weaken the correlation between codon usage bias and the strength of natural selection.
It is unclear, however, why the efficacy of selection may be reduced in PARs. One possible cause is that the PARs may recombine at a lower rate than autosomes. This is a somewhat unexpected prediction because in most species PARs have higher recombination rates than autosomes (Otto et al. 2011). In birds, direct estimates of recombination rates of the PARs are available in both collared flycatcher and zebra finch, and in both species PARs recombine at much higher rates than most macrochromosomes (Smeds et al. 2014; Singhal et al. 2015). This is probably due to the need for at least one obligate crossover in female meiosis, combined with the small size of the PAR in both collared flycatcher and zebra finch.
In palaeognaths, where PARs are much larger, direct estimates of recombination rate from pedigree or genetic cross data are not available. Our observation that GC3s are significantly lower in large palaeognath PARs than in similarly sized autosomes is at least consistent with reduced recombination rates in these species. However, a recent study on greater rhea shows that the recombination rate of the PAR does not differ from that of similarly sized autosomes in females (del Priore and Pigozzi 2017), but this study did not examine males. Since recombination rates differ extensively between sexes (van Oers et al. 2014; Halldorsson et al. 2016; Bhérer et al. 2017), more data are needed to test whether the sex-averaged recombination rate of the PAR differs from that of autosomes. Additionally, a previous study of emu conducted prior to the availability of an emu genome assembly suggested that the PAR has a higher population recombination rate than autosomes (Janes et al. 2009). However, of the twenty-two loci in that study, seven appear to be incorrectly assigned to the sex chromosomes based on alignment to the emu genome assembly (Table S3), potentially complicating that conclusion. The relatively small size of that study, together with recently improved resources and a refined understanding of recombination rates across chromosome types, provides an opportunity for a new analysis. Further direct tests of recombination rate on ratite Z chromosomes are needed to resolve these discrepancies.
Sexual antagonism and sex chromosome degeneration
A major motivation for studying palaeognath sex chromosomes is that, unusually, many palaeognaths seem to maintain old, homomorphic sex chromosomes. Standard models of sex chromosome evolution, in which recombination suppression evolves in order to tightly link sexually antagonistic mutations to the sex determination locus, thus do not seem to be able to explain palaeognath sex chromosomes. Previous work has suggested two hypotheses to explain this discrepancy: (1) the lack of dosage compensation in birds prevents the degeneration of the W chromosome due to dosage sensitivity (Adolfsson and Ellegren 2013), or (2) sexually antagonistic effects are resolved by the evolution of male-biased expression (Vicoso, Kaiser, et al. 2013).
However, neither hypothesis seems to fully explain the slow degeneration of palaeognath sex chromosomes. Published RNA-seq expression data from both males and females from ostrich, Okarito brown kiwi, and little brown kiwi, as well as new RNA-seq data from emu and Chilean tinamou, suggest dosage compensation is partial in palaeognaths and consistent with what has been seen in neognaths. If the absence of complete dosage compensation is the reason for the arrested sex chromosome degeneration in palaeognaths, it is not clear why some palaeognaths (thicket tinamou and white-throated tinamou) and all neognaths have degenerated W chromosomes and small PARs. The other hypothesis, derived from a previous study on emu (Vicoso, Kaiser, et al. 2013), implies an excess of male-biased genes on the PAR as resolution of sexual antagonism. However, gene expression data from multiple tissues and stages of emu show that male-biased genes are only enriched on the DR (presumably attributable to incomplete dosage compensation), with very few present on the PAR. We find similar patterns in other species.
Classic views on the evolution of sex chromosomes argue that recombination suppression ultimately leads to the complete degeneration of the sex-limited chromosomes (Charlesworth et al. 2005; Bachtrog 2006). However, recent theoretical work suggests suppression of recombination is not always favored, and may require strong sexually antagonistic selection (Charlesworth et al. 2014) or other conditions (Otto 2014). Thus, there may be conditions which would have driven tight linkage of the sex-determining locus and sex-specific beneficial loci via the suppression of recombination in neognaths (Gorelick et al. 2016; Charlesworth 2017), but not in palaeognaths, although the exact model that could produce this pattern remains unclear (it would require, e.g., fewer sexually antagonistic mutations in palaeognaths).
Alternatively, the suppression of recombination between sex chromosomes may be unrelated to sexually antagonistic selection (Rodrigues et al. 2018), and non-adaptive. Using model simulations, Cavoto and colleagues (Cavoto et al. 2017) recently suggested that complete recombination suppression can sometimes be harmful to the heterogametic sex, and that sex chromosomes are not favorable locations for sexually antagonistic alleles in many lineages. An alternative evolutionary explanation for loss of recombination in the heterogametic sex is then needed. Perhaps the rapid evolution of the sex-limited chromosome plays a role in the expansion of the non-recombining region on the sex chromosome. For instance, once recombination ceases around the sex-determination locus, the W (and Y) chromosome rapidly accumulates TEs, particularly LTRs, and the spread of LTRs in the non-recombining region may in turn increase the chance of LTR-mediated chromosomal rearrangements, including inversions, leading to the suppression of recombination between the W and Z (or Y and X). Future study of the W chromosomes of palaeognaths and neognaths is needed to elucidate the role of the W in the evolution of avian sex chromosomes.
Methods
Identification of the Z chromosome, PARs and DRs
The repeat-masked sequence of the ostrich Z chromosome (chrZ) (Zhang et al. 2015) was used as a reference to identify the homologous Z-linked scaffolds in recently assembled palaeognath genomes (Sackton et al. 2018). We used the nucmer function from the MUMmer package (Kurtz et al. 2004) to first align the ostrich Z-linked scaffolds to the emu genome; an emu scaffold was defined as Z-linked if more than 50% of its sequence was aligned. The Z-linked scaffolds of emu were then used as the reference to infer the homologous Z-linked sequences in the other palaeognaths, because of the more contiguous assembly of the emu genome and its closer phylogenetic relationships; for this step, 60% alignment coverage was required. During this process, we found that a ~12Mb genomic region of ostrich chrZ (scf347, scf179, scf289, scf79, scf816 and a part of scf9) aligned to chicken autosomes. The two breakpoints can be aligned to a single scaffold of lesser rhea (scaffold_0) (Fig. S1), so we checked whether there could be a mis-assembly in ostrich by mapping the 10k and 20k mate-pair reads from ostrich to the ostrich assembly. We inspected the read alignments around the breakpoint and confirmed a likely mis-assembly (Fig. S2). The homologous sequences of this region were subsequently removed from palaeognathous Z-linked sequences. When a smaller ostrich scaffold showed discordant orientation and/or order, but its entire sequence was contained within longer scaffolds of other palaeognaths (Fig. S1), we manually changed the orientation and/or order of that scaffold for consistency. After correcting the orientations and orders of the ostrich chrZ scaffolds, a second round of nucmer alignment was performed to determine the chromosomal positions of the palaeognathous Z-linked scaffolds.
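The alignment-coverage criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (merged_length, is_z_linked) and the representation of alignments as half-open (start, end) intervals on the query scaffold are assumptions made for illustration, and parsing of nucmer output is not shown. Overlapping alignment blocks are merged so that each base is counted once.

```python
def merged_length(intervals):
    """Total bases covered by half-open (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent block: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def is_z_linked(intervals, scaffold_length, min_fraction=0.5):
    """Hypothetical helper: True if more than min_fraction of the scaffold is aligned."""
    return merged_length(intervals) / scaffold_length > min_fraction


# Toy example: three alignment blocks on a 300 bp scaffold, two of them overlapping.
blocks = [(0, 100), (50, 150), (200, 250)]
print(merged_length(blocks), is_z_linked(blocks, 300))
```

Under these assumptions, the 60% requirement used for the other palaeognaths would correspond to calling is_z_linked with min_fraction=0.6.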
One way to infer the boundary between the PAR (pseudoautosomal region) and DR (differentiated region) is to compare sequencing depth of female DNA along the chromosome. Because the DR does not recombine in females and the W-linked DR degenerates over time (and thus diverges from the Z-linked DR), the depth of sequencing reads from the Z-linked DR is generally expected to be half of that from the PAR or autosomes. This approach was applied to cassowary, whose sequence is derived from a female individual. For emu, female sequencing data were available from Vicoso et al. (Vicoso, Kaiser, et al. 2013). To facilitate the PAR annotation, we generated additional DNA-seq data from a female for each of lesser rhea, Chilean tinamou and thicket tinamou. Default parameters of BWA (v0.7.9) were used to map DNA reads to the repeat-masked genomes with the BWA-MEM algorithm (Li 2013), and mapping depth was calculated with SAMtools (v1.2) (Li et al. 2009). A fixed sliding window of 50kb was used to calculate average mapping depths along the scaffolds. Any windows containing less than 5kb were removed. Significant shifts of sequencing depth were annotated as the boundaries of the PARs and DRs.
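The depth-based logic can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used here: per-base depths are assumed to be available as a list with None at masked positions, the 5kb filter is interpreted as a minimum number of unmasked positions per window, and the cutoff of 0.75 times the autosomal depth used to separate PAR-like from DR-like windows is a hypothetical stand-in for the "significant shifts" criterion. All function names are invented for the example.

```python
from statistics import mean


def window_means(depths, window=50_000, min_sites=5_000):
    """Mean depth in fixed windows; depths holds per-base depth, None where masked.

    Windows with fewer than min_sites usable positions are dropped.
    Returns a list of (window_start, mean_depth).
    """
    out = []
    for start in range(0, len(depths), window):
        vals = [d for d in depths[start:start + window] if d is not None]
        if len(vals) >= min_sites:
            out.append((start, mean(vals)))
    return out


def label_windows(windows, autosomal_depth, cutoff=0.75):
    """Label each window PAR-like (about 1x autosomal depth) or DR-like (about 0.5x)."""
    return [(start, "PAR" if m / autosomal_depth >= cutoff else "DR")
            for start, m in windows]


def boundaries(labels):
    """Window starts at which the label changes, i.e. candidate PAR/DR boundaries."""
    return [labels[i][0] for i in range(1, len(labels))
            if labels[i][1] != labels[i - 1][1]]


# Toy scaffold with 10 bp windows: full depth, then half depth, then a mostly masked window.
toy = [30] * 20 + [15] * 20 + [None] * 8 + [15] * 2
wins = window_means(toy, window=10, min_sites=5)
labs = label_windows(wins, autosomal_depth=30)
print(labs, boundaries(labs))
```

In practice the resolution of the boundary is limited by the window size, so a boundary called this way is approximate to within one window.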
Another independent method for PAR annotation is based on differences in gene expression between males and females for PAR- and DR-linked genes. To reduce the effect of transcriptional noise and sex-biased expression, 20-gene windows were used to calculate the mean male-to-female ratios. The shifts of male-to-female expression ratios were used to annotate approximate PAR/DR boundaries. This method was applied to little spotted kiwi, Okarito brown kiwi, emu and Chilean tinamou. Given the small divergence between little spotted kiwi and great spotted kiwi, it is reasonable to infer that the latter should have a similar PAR size. Neither female reads nor RNA-seq reads are available for greater rhea and elegant crested tinamou, so the PAR/DR boundaries of lesser rhea and Chilean tinamou, respectively, were used to estimate their boundaries.
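A minimal sketch of the windowed expression ratio is given below. Several details are assumptions rather than reported choices: ratios are taken on a log2 scale with a pseudocount of 1 to avoid division by zero, windows are non-overlapping by default (the step parameter can be changed to slide them), and the function name is hypothetical. Genes are assumed to be ordered by position along the Z chromosome.

```python
import math


def windowed_log2_mf(male, female, window=20, step=20, pseudocount=1.0):
    """Mean log2(male/female) expression in windows of consecutive genes.

    male and female are expression values (e.g. TPM) for the same genes,
    ordered along the chromosome. The pseudocount is an assumption added here.
    """
    ratios = [math.log2((m + pseudocount) / (f + pseudocount))
              for m, f in zip(male, female)]
    return [sum(ratios[i:i + window]) / window
            for i in range(0, len(ratios) - window + 1, step)]


# Toy example with 2-gene windows: equal expression, then male expression doubled.
male = [3, 3, 7, 7]
female = [3, 3, 3, 3]
print(windowed_log2_mf(male, female, window=2, step=2))
```

With incomplete dosage compensation, windows in the DR would be expected to show a positive mean log2 ratio, whereas windows in the PAR would be expected to lie near zero.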
Comparison of genomic features
To estimate GC content at synonymous third codon positions (GC3s), codonW (http://codonw.sourceforge.net) was used with the option ‘-gc3s’. Exon density was calculated by dividing the total length of exons within each fixed 50kb window by the window size. Similarly, we summed the lengths of transposable elements (TEs, including LINE, SINE, LTR and DNA elements) based on RepeatMasker outputs (Kapusta et al., personal communication) to calculate TE density for 50kb windows. Intron sizes were calculated from gene annotations (GFF file). Codon usage bias was quantified by the effective number of codons (ENC) using ENCprime. We used intronic sequences to estimate background nucleotide frequencies to further reduce the effect of local GC content on codon usage estimates. Wilcoxon rank-sum tests were used to assess statistical significance.
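The window-based density calculation can be sketched as below. This is an illustration only: the function name is hypothetical, features (exons or TEs) are assumed to be given as non-overlapping half-open (start, end) intervals on a single scaffold, and, following the description above, each window's feature length is divided by the full window size even for a shorter final window.

```python
def feature_density(features, seq_length, window=50_000):
    """Fraction of each fixed window covered by features.

    features: non-overlapping half-open (start, end) intervals (assumption).
    A feature spanning a window edge contributes to each window it overlaps.
    """
    n_windows = (seq_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in features:
        w = start // window
        while w < n_windows and w * window < end:
            win_start, win_end = w * window, (w + 1) * window
            # Add only the part of the feature that falls inside this window.
            covered[w] += min(end, win_end) - max(start, win_start)
            w += 1
    return [c / window for c in covered]


# Toy example with 10 bp windows; the second feature straddles the window edge.
print(feature_density([(0, 5), (8, 12)], seq_length=20, window=10))
```

If annotations contain overlapping features (for example exons from alternative transcripts), they would need to be merged first so that bases are not counted twice.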
Divergence analyses
The estimates of synonymous and non-synonymous substitution numbers and sites were extracted from PAML (Yang 2007) outputs generated by free-ratio branch models, based on alignments produced by Sackton et al. (Sackton et al. 2018). For a given chromosome, the overall synonymous substitution rate (dS) was calculated as the ratio of the number of synonymous substitutions to the number of synonymous sites over the entire chromosome. Similarly, the chromosome-wide dN was calculated using the numbers of non-synonymous substitutions and sites over the entire chromosome (this is effectively a length-weighted average of individual gene values). The dN/dS values (ω) were calculated as the ratios of dN to dS. Confidence intervals for dN, dS and dN/dS were estimated using the R package ‘boot’ with 1000 bootstrap replicates. P-values were calculated from permutation tests with 1000 permutations.
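The pooled estimator and the resampling steps can be sketched in Python as follows. This is not the analysis code (which used the R package ‘boot’): the percentile confidence interval, the two-sided permutation test on the difference in ω, the add-one p-value correction, and all function names are assumptions made for illustration. Each gene is represented as a tuple (N substitutions, N sites, S substitutions, S sites).

```python
import random


def pooled_omega(genes):
    """Chromosome-wide dN/dS from summed substitutions and sites across genes."""
    dn = sum(g[0] for g in genes) / sum(g[1] for g in genes)
    ds = sum(g[2] for g in genes) / sum(g[3] for g in genes)
    return dn / ds


def bootstrap_ci(genes, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling genes with replacement."""
    rng = random.Random(seed)
    stats = sorted(pooled_omega(rng.choices(genes, k=len(genes)))
                   for _ in range(n_boot))
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]


def permutation_p(group_a, group_b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the difference in pooled omega."""
    rng = random.Random(seed)
    observed = pooled_omega(group_a) - pooled_omega(group_b)
    pooled = group_a + group_b
    hits = 0
    for _ in range(n_perm):
        # Shuffle chromosome labels and recompute the difference.
        rng.shuffle(pooled)
        diff = (pooled_omega(pooled[:len(group_a)])
                - pooled_omega(pooled[len(group_a):]))
        if abs(diff) >= abs(observed):
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: pooled dN = 6/400 and dS = 20/100, so omega = 0.075.
z_genes = [(2, 100, 10, 50), (4, 300, 10, 50)]
print(pooled_omega(z_genes), bootstrap_ci(z_genes, n_boot=200))
```

In the toy data the pooled dN (0.015) differs from the unweighted mean of per-gene dN values (about 0.0167), which illustrates why the pooled estimate behaves as a length-weighted average.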
Gene expression analyses
Three biological replicates of samples from emu brains, gonads and spleens of both adult sexes were collected from Songline Emu farm (specimen numbers: Museum of Comparative Zoology, Harvard University Cryo 6597-6608). For Chilean tinamou, RNA samples were collected from brains and gonads of both sexes of adults with one biological replicate (raw data from (Sackton et al. 2018), but re-analyzed here). RNA-seq reads for both sexes of ostrich brain and liver (Adolfsson and Ellegren 2013), emu embryonic brains of two stages (Vicoso, Kaiser, et al. 2013), and blood of little spotted kiwi and Okarito brown kiwi (Ramstad et al. 2016) were downloaded from NCBI.
For the newly generated samples (emu brains, gonads and spleens), RNA extraction was performed using the RNeasy Plus Mini kit (Qiagen). The quality of the total RNA was assessed using the RNA Nano kit (Agilent). Poly-A selection was conducted on the total RNA using the PrepX PolyA mRNA Isolation Kit (Takara). The mRNA was assessed using the RNA Pico kit (Agilent) and used to make transcriptome libraries using the PrepX RNA-Seq for Illumina Library Kit (Takara). The HS DNA kit (Agilent) was used to assess library quality. The libraries were quantified by qPCR (KAPA library quantification kit) and then sequenced on a NextSeq instrument (High Output 150 kit, PE 75 bp reads). Each library was sequenced to a depth of approximately 30M reads. The quality of the RNA-seq data was assessed using FastQC. Error correction was performed using Rcorrector; unfixable reads were removed. Adapters were removed using TrimGalore!. Reads from rRNAs were removed by mapping to the Silva rRNA database.
We used RSEM (v1.2.22) (Li and Dewey 2011) to quantify gene expression levels. RSEM used bowtie2 (v2.2.6) to map the raw RNA-seq reads to transcripts (based on a GTF file for each species), and default parameters were used for expression quantification. Gene-level TPM (Transcripts Per Million) values were used to represent normalized expression. Expected read counts from RSEM, rounded to integers, were used as inputs to DESeq2 (Love et al. 2014) for differential expression analysis between sexes. Genes were considered sex-biased at a 5% FDR cutoff.
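To make the FDR cutoff concrete, the following sketch applies the Benjamini-Hochberg adjustment to a list of raw p-values and selects those below 5%. It is an illustration in Python rather than the DESeq2 analysis itself; DESeq2 reports Benjamini-Hochberg adjusted p-values by default, but the function name and toy p-values here are invented for the example, and DESeq2's independent filtering step is not reproduced.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: four genes, of which only the first passes a 5% FDR cutoff.
raw = [0.01, 0.04, 0.03, 0.5]
adj = bh_adjust(raw)
sex_biased = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, sex_biased)
```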
Acknowledgements
We thank John Parsch, Beatriz Vicoso, and Qi Zhou for their useful comments. The computations in this paper were performed on the Odyssey cluster at Harvard University and supported by Harvard University Research Computing. This work was supported by NSF grant DEB-135343/EAR-1355292 to SVE. All raw data newly generated in this study (emu RNA-seq, and DNA-seq from lesser rhea, Chilean tinamou, and thicket tinamou) are available from NCBI at BioProjects XXXXXX and XXXXXX.