ABSTRACT
Interactions among divergent elements of transcriptional networks from different species can lead to misexpression in hybrids through regulatory incompatibilities, some with the potential to generate sterility. Genes with male-biased expression tend to be overrepresented among genes misexpressed in hybrid males. While the possible contribution of faster-male evolution to this misexpression has been explored, the role of the hemizygous X chromosome (i.e., the dominance theory for transcriptomes) remains yet to be determined. Here we study genome-wide patterns of gene expression in females and males of Drosophila yakuba and D. santomea and their hybrids. We used attached-X stocks to specifically test the dominance theory, and we uncovered a significant contribution of recessive alleles on the X chromosome to hybrid misexpression. Our analysis of gene expression patterns suggests that there is a contribution of weakly deleterious regulatory mutations to gene expression divergence in the sex towards which the expression is biased. In the opposite sex (e.g., genes with female-biased expression analyzed in male transcriptomes), we detect stronger selective constraints on gene expression divergence. Although genes with high degree of male-biased expression show a clear signal of faster-X evolution for gene expression divergence, we also detected slower-X evolution of gene expression in other gene classes (e.g. female-biased genes) that is mediated by significant decreases of cis- and trans-regulatory divergence. The distinct behavior of X-linked genes with high degree of male-biased expression is consistent with these genes experiencing a higher incidence of positively selected regulatory mutations than their autosomal counterparts. We propose that both dominance theory and faster-X evolution of gene expression may be major contributors to hybrid misexpression and possibly the large X-effect in these species.
INTRODUCTION
Population genetics models predict that, under certain conditions, X-linked loci are expected to show accelerated divergence relative to autosomal loci (i.e., faster-X evolution) (Charlesworthet al. 1987; Orr and Betancourt 2001; Betancourtet al. 2002; Vicoso and Charlesworth 2006; Vicoso and Charlesworth 2009a; Manket al. 2010b). Faster-X evolution can, in principle, result from the contributions to the rates of evolution of either beneficial or weakly deleterious mutations. The visibility to selection of new X-linked recessive beneficial mutations in hemizygous males significantly increases their chances of spreading and contributing to the rate of evolution. Weakly deleterious mutations that are partially dominant can also produce faster-X because the efficacy of natural selection tends to be slightly reduced on the X chromosome. This is because the effective population size of the X chromosome (NeX) is often reduced relative to that of autosomes (NeA) due to its hemizygosity in males. Theoretical models, however, show that the chances of observing faster-X evolution are very sensitive to the ratio NeX /NeA (Vicoso and Charlesworth 2009a; Manket al. 2010b) and any parameter that impacts this ratio, including demographic changes and differences in recombination rates between the X chromosome and autosomes among others (Langleyet al. 1988; Connallon 2007; Pool and Nielsen 2007; Vicoso and Charlesworth 2009b; Charlesworth 2012; Comeronet al. 2012; Avilaet al. 2015). For extreme values of NeX /NeA, faster-X evolution can even occur independently of the coefficient of dominance (Vicoso and Charlesworth 2009a; Manket al. 2010b).
Today there is ample evidence supporting faster-X evolution of protein-coding sequences in a variety of organisms, including XY and ZW systems (Thornton and Long 2002; Torgerson and Singh 2003; Countermanet al. 2004; Khaitovichet al. 2005; Lu and Wu 2005; Nielsenet al. 2005; Thornton and Long 2005; Musterset al. 2006; Torgerson and Singh 2006; Baines and Harr 2007; Begunet al. 2007; Manket al. 2007; Baineset al. 2008; Singhet al. 2008; Ellegren 2009; Manket al. 2010a; Grath and Parsch 2012; Meisel and Connallon 2013; Vicosoet al. 2013; Avilaet al. 2014; Garriganet al. 2014; Kousathanaset al. 2014; Sacktonet al. 2014; Veeramahet al. 2014; Avilaet al. 2015; Coolonet al. 2015; Llopart 2015; Larsonet al. 2016). Faster-X evolution has also been reported for gene expression divergence (Khaitovichet al. 2005; Brawandet al. 2011; Kayseriliet al. 2012; Llopart 2012; Meiselet al. 2012; Coolonet al. 2015; Deanet al. 2015) and intergenic DNA sequences (Huet al. 2013; Coolonet al. 2015; Llopart 2018), likely reflecting the effects of natural selection on the evolution of regulatory elements. In species where males are the heterogametic sex, faster-X is expected to be strongest in genes with male-specific fitness effects because these genes are always hemizygous when they are X-linked. A similar rationale applies to genes with female-specific fitness effects in ZW species (Betancourtet al. 2002; Vicoso and Charlesworth 2009a; Manket al. 2010b).
Regardless of the evolutionary forces driving faster-X (Z) evolution (positive selection or relaxed negative selection), its implications for the genetic basis of hybrid dysfunction are clear: the X chromosome will disproportionally contribute to between-species Dobzhansky-Muller incompatibilities (Dobzhansky 1937; Muller 1942; Coyne and Orr 1989). Even a slightly elevated substitution rate on the X chromosome will be amplified in hybrids, as the number of incompatibilities increases exponentially with divergence time (Orr 1995; Turelli and Orr 2000; Matuteet al. 2010; Moyle and Nakazato 2010; Guerreroet al. 2017). Mapping studies in a variety of species including mice, Lepidoptera, and Drosophila have shown that the factors having the largest effects on hybrid dysfunction, particularly hybrid male sterility, are X-linked (Coyne and Orr 1989; Coyneet al. 1991; Coyne 1992; Coyne and Orr 2004). Orr’s (1987) seminal study of the genetic basis of hybrid sterility in D. pseudoobscura-D. persimilis constitutes a classic example of this pattern known as the large X-effect. The large X-effect, together with Haldane’s rule (i.e., the preferential inviability or sterility of the heterogametic over the homogametic sex; Haldane 1922), underline the importance of sex chromosomes in speciation (Coyne and Orr 1989; Masly and Presgraves 2007; Presgraves 2008; Moyleet al. 2010).
The large X-effect can also be explained by the dominance theory (Turelli and Orr 1995; Orr and Turelli 1996; Turelli and Orr 2000), which posits that alleles causing hybrid dysfunction tend to act recessively in hybrids (Muller 1940;Muller 1942). As a result, hemizygous X chromosomes will express all incompatibilities (dominant and recessive) while heterozygous autosomes will only express the fairly dominant ones. In the context of gene regulation, the dominance theory, if confirmed, opens the possibility that a large fraction of the the abundant misexpression observed in hybrid males, but not in hybrid females, may be due to regulatory incompatibilities involving X-linked recessive mutations. Disruption of gene expression has been extensively documented in Drosophila hybrids and it is often the case that rapidly evolving genes enriched in male functions tend to be overrepresented among misexpressed genes in sterile hybrid males (Reiland and Noor 2002; Michalak and Noor 2003; Michalak and Noor 2004; Ranzet al. 2004; Landryet al. 2005; Noor 2005; Haerty and Singh 2006; Artieriet al. 2007; Barbash and Lorigan 2007; Moehringet al. 2007; Catron and Noor 2008; Sundararajan and Civetta 2011; Llopart 2012; Maheshwari and Barbash 2012; Satyakiet al. 2014; Weiet al. 2014; Gomes and Civetta 2015). While the potential link between faster-male theory (Wu and Davis 1993; Wuet al. 1996) and hybrid misexpression has been explored (Noor 2005; Haerty and Singh 2006; Ortiz-Barrientoset al. 2007; Civetta 2016), the dominance theory for hybrid transcriptomes remains untested. Here we examine genome-wide patterns of transcript abundance in females and males of Drosophila yakuba, D. santomea and their interspecific hybrids using RNAseq technology for genes with variable degrees of sex-biased expression. We took advantage of attached-X stocks developed in both species to test the dominance theory for transcriptomes.
MATERIALS AND METHODS
Fly stocks
We used the D. yakuba Taï18E2 and the D. santomea STO.4 stocks to obtain an overall view of gene expression variation between species and in F1 hybrid females and males. The D. yakuba Taï18E2 was part of the 12 Drosophila Genomes project, originated in the Taï rainforest (Taï National Park, Ivory Coast), and was inbred for at least 10 generations of single-pair brother/sister mating before whole-genome sequencing (Clarket al. 2007). The STO.4 stock of D. santomea was established from a single fertilized female collected in the Obo Natural Reserve on São Tomé (Lachaiseet al. 2000), and was inbred for 10 generations. To generate ‘unbalanced’ F1 hybrid females with two D. yakuba X chromosomes, D. yakuba females of the attached-X stock [C(1)RM y, wor females, + males] were crossed to D. santomea males of the attached-X stock [C(1)RM g females, + males] (Coyneet al. 2004); F1 hybrid females with two D. santomea X chromosomes were also generated from the reciprocal cross. All flies were raised on standard cornmeal-yeast-agar medium at 24°C with a 12 h dark-light cycle.
RNA isolation, library preparation, and sequencing
Females and males were collected under CO2 anesthesia 0-5 h post eclosion, aged for 19 h at 24°C, snap-frozen in liquid nitrogen, and stored at –80°C. For each genotype analyzed, total RNA was isolated from a pool of 15 females or 30 males using TRIzol Reagent and following the manufacturer’s recommendations (Invitrogen). Up to 15 μg of nucleic acid were treated with RNase-free DNase I (∼7 Kunitz units) in solution (‘RNase-free DNase set’; Qiagen) and cleaned- up using the RNeasy Mini Kit (Qiagen). We used ∼1 μg of total RNA to prepare libraries suitable for high throughput sequencing following the Illumina protocol (‘mRNA Sample Preparation Guide’; Cat. # RS-930-1001, Part # 1004898 Rev. D; Illumina). Briefly, poly-A containing RNA molecules were purified using poly-T oligo-attached magnetic beads. Following this purification, the mRNA was fragmented into small pieces using divalent cations under elevated temperature. The first-strand cDNA was copied using reverse transcriptase and random hexamers while the second cDNA strand was synthesized using DNA polymerase I and RNAse H. The fragmented cDNA went through the end repair process, addition of a single ‘A’ base, and ligation to custom-designed adapters. These include unique 7-nucleotide tags that allowed multiplexing of several samples prior to sequencing. The sequences and a detailed description of these custom-designed adapters can be found elsewhere (Comeronet al. 2012).
We run the ligation products on a 2% agarose gel, selected for ∼250bp-size, and recovered cDNA templates using the QIAquick Gel Extraction Kit (Qiagen). These cDNA templates were enriched with PCR to create the final cDNA libraries, which were validated using a 2100 Bioanalyzer (Agilent Technologies). For each genotype analyzed, we obtained two technical replicates using the same total RNA. Cluster generation and sequencing were carried out on HiSeq 2000/2500 instruments at the Iowa Institute of Human Genetics (IIHG; University of Iowa) or at the Iowa State DNA facility (Iowa State University) using a combination of multiple lanes per sample (single-end 100 cycles/read or 125 cycles/read). Sequence reads were sorted out according to individual tags, 3’ trimmed (-t 12 -l 30), and filtered based on quality (-q 12 and -p 75) using the FASTX-toolkit (0.0.14) (http://hannonlab.cshl.edu/fastx_toolkit/commandline.html). The average number of trimmed reads that passed quality filtering across samples was 51.5 and 31.3 millions for replicates 1 and 2, respectively. All analyses were carried out combining sequence reads from both technical replicates. All sequence reads have been archived at NCBI SRA BioProject PRJNA470895.
Gene expression analyses
To examine genome-wide transcript abundance, we first generated the D. yakuba reference gene set (14,687 genes) based on the genome sequence and annotation of coding sequences (CDS) available in FlyBase (dyak_r1.3_FB2011_08; http://flybase.org/) (Gramateset al. 2017). Genes on Arm U and random sequence scaffolds were not included. To allow reads covering 5’ and 3’ UTRs, we expanded the CDS definitions by 75 base pairs upstream and downstream of starting and stop codons, respectively. Exon and UTR sequences were extracted using tools available on Galaxy (https://usegalaxy.org/) (Afganet al. 2016) and assembled with custom scripts.
Trimmed and quality filtered reads were mapped to the D. yakuba reference gene set with Bowtie 2 (version 2.1.0) using default parameters(Langmead and Salzberg 2012). Bam files were sorted and indexed with SAMtools [version 0.1.18 (r982:295)] (Liet al. 2009), and the number of reads aligned to each of the genes was calculated using RSEM (version 1.3.0) (Li and Dewey 2011). We only considered for further analyses genes with at least 10 mapped reads and > 1 fpkm (Fragments Per Kilobase of transcript per Million mapped reads) in both species and in at least one sex.
To classify genes into different sex-biased expression categories that were consistent across species, we took advantage of a DESeq2 (version 1.16.1) multifactor design with two variables (∼species + sex) and used the effect size estimate (Log2FoldChange) as a measure of the degree of sex-biased expression (Loveet al. 2014). Genes showing a fold-change difference in expression between the two sexes smaller than 1.1 were classified as nonsex-biased. Genes showing statistically significant differences in expression between females and males and < 1 fpkm in one sex were considered to have sex-specific expression. Specifically, we detected 156 genes expressed in females but not in males and 1,660 genes showing the opposite pattern. Quantitative estimates of differences in gene expression between species, between parental species and hybrids, and between reciprocal interspecific hybrids, were also based on size effects from DESeq2. DESeq2 implements the Benjamini-Hochberg probability adjustment (padj) to correct for multiple testing (Benjamini and Hochberg 1995). Genes were considered differentially expressed if padj < 0.05 (FDR = 0.05). We excluded genes showing switches across species between sex-biased expression categories (e.g., genes with male-biased expression in D. yakuba but female-biased expression in D. santomea). We analyzed gene expression for a total of 11,959 genes.
Analyses of cis- and trans-regulatory variation
The detection of cis- and/or trans-regulatory variation is usually based on comparing the ratio of allelic expression in hybrids (i.e., allelic ratio) with the ratio of expression between species (i.e., species ratio) (Wittkoppet al. 2004; McManuset al. 2010; Coolonet al. 2014). With mRNAseq data, this requires the identification of parental stock-specific reads, which was based on whole-genome consensus sequences for the D. yakuba Taï18E2, the D. santomea STO.4, the D. yakuba attached-X, and the D. santomea attached-X stocks. The initial mapping steps for generating these new genome sequences were the same as those used for the analysis of transcript abundance (see above) except that, in this case, we used the D. yakuba reference genome sequence to align D. yakuba reads and a draft of the D. santomea genome sequence (Llopart 2015)to align D. santomea reads. SAMtools was used to (1) sort/index BAM files, (2) filter sites with mapping quality < 40, and (3) generate mpileup files with a maximum depth of 100,000 reads and a minimum base quality of 30 (Liet al. 2009). To further filter sites covered by fewer than 5 reads and to obtain fastq files of consensus sequences, we used the utility BCFtools (1.3.1) (Liet al. 2009). Fastq files for each of the four consensus sequences were converted into fasta format and heterozygous sites were randomly phased using Heng Li’s seqtk (https://github.com/lh3/seqtk). To minimize mapping biases across comparisons, regions containing ambiguous bases or missing base calls in at least one of the consensus sequences were also masked in all the other consensus sequences.
Based on comparisons between allelic and species ratios, we classified genes into six different regulatory evolution classes (cis only, trans only, cis + trans, cis × trans, compensatory, conserved) following McManus et al. (2010). For each type of hybrid analyzed, we identified stock-specific reads as those mapping with 0 mismatches to only one of the parental consensus sequences. The relative numbers of the two types of stock-specific reads in hybrids provide an estimate of the allelic ratio of gene expression, which was calculated as log2(read count allele 1/ read count allele 2). This ratio constitutes a quantitative measure of cis-regulatory divergence for every gene analyzed. To obtain an estimate of the species ratio of expression, we first constructed an in silico parental mixture by combining equal numbers of stock-specific reads from each of the two parental stocks used to generate the different hybrid types, as in Coolon et al. (2014). The species ratio was then calculated as log2(read count genotype 1/ read count genotype 2) (Coolonet al. 2014; Coolonet al. 2015). The difference between species and allelic ratios [log2(read count genotype 1/ read count genotype 2) - log2(read count allele 1/read count allele 2)] provides a quantitative measure of trans-regulatory divergence for each gene analyzed. Binomial exact tests were used to detect differences between read counts of allele 1 and allele 2, and also between read counts of genotype 1 and genotype 2. A χ2 test was applied to detect differences between allelic and species ratios. Sequential Bonferroni correction for multiple testing was used (FDR = 0.05). Only genes with at least 20 stock-specific mapped reads for the two types of alleles and genotypes were included in the analyses.
RESULTS
To understand the role of the X chromosome in hybrid gene expression, we examined transcript abundance in 1-day old virgin females and males of Drosophila yakuba, D. santomea and their interspecific hybrids. Our experimental design includes not only flies with normal karyotypes but also females who carry attached-X chromosomes (Coyneet al. 2004) (Figure 1). Attached-X hybrid females have two X chromosomes from the same species and a haploid set of autosomes from each parental species. They are as ‘unbalanced’ as hybrid males for the X/Autosome ratio and constitute an ideal genetic tool to study the effects of X-linked recessive factors. We obtained profiles of transcript abundance for 10,128 and 11,663 genes expressed in adult females and males, respectively. To investigate the effects of sexually dimorphic gene expression, we classified all genes into seven distinct categories based on degree of sex-biased expression (Figure 2). Importantly, this gradual classification allows for comparisons across studies regardless of statistical power. In addition and to capture extreme sex-biased effects, we also identified genes expressed in females but not in males (female-specific genes, FSGs) and genes showing the opposite pattern (male-specific genes, MSGs).
To investigate the relationship between sexually dimorphic expression and chromosome location in the D. yakuba-D. santomea system, we examined the genome-wide distribution of genes with variable degrees of sex-biased expression (Figure 2). The autosomal dot (fourth) chromosome reveals the most striking pattern, with 86% (68/79) of its genes being female-biased in expression. This represents a significant enrichment relative to the other autosomes [68/79 vs. 3,978/9,919; Fisher’s Exact Test (FET), P = 5.08×10-17], which we interpret as evidence of ancient feminization of the dot chromosome (Vicoso and Bachtrog 2013; Vicoso and Bachtrog 2015). The X chromosome shows a significant overrepresentation of FSGs relative to autosomes (40/1,951 vs. 116/9,919, FET P = 0.003) and a trend towards underrepresentation of MSGs (246/1,951 vs. 1,413/9,919, FET P = 0.058).
Gene expression divergence is a sexually dimorphic trait
To understand the molecular causes underlying hybrid misexpression, we first examined gene expression divergence using a quantitative measure of the difference in transcript abundance between species [i.e., log2(D. yakuba-to-D. santomea ratio of expression)]. These analyses uncovered three main patterns. First, gene expression divergence is positively correlated with degree of sex-biased expression [Spearman’s ρ = 0.10 (95% confidence intervals: 0.07–0.13) (P = 1.04×10-11) for genes with increasingly female-biased expression in females and ρ = 0.30 (95% CI: 0.27–0.32) (P = 3.47×10-111) for genes with increasingly male-biased expression in males] (Figure 3). Second, female transcriptomes show smaller overall gene expression divergence relative to male transcriptomes (0.28 vs. 0.30; Mann-Whitney Test (MWT) P = 6.75×10-12], which is consistent with faster-male evolution (Wu and Davis 1993; Wuet al. 1996) for gene expression (Meiklejohnet al. 2003; Parisiet al. 2003; Ranzet al. 2003; Haertyet al. 2007). Gene expression also evolves more rapidly in MSGs than in FSGs (0.58 vs. 0.41; MWT P = 0.007). Third, the comparative analysis of the same subset of genes in the female and male transcriptomes reveals that genes with sex-biased expression show high gene expression divergence in the sex toward which the expression is biased while transcript abundance tends to be conserved between species in the opposite sex; the trend is consistently observed across genes with different degrees of either female- or male-biased expression (Figure 3). As a result, some gene categories can show greater gene expression divergence when expressed in females than when expressed in males. This suggests that, in addition to the pervasive trend of faster-male evolution, there is a signal for rapid evolution in the female transcriptome as well. We suggest that gene expression divergence for genes with sex-biased expression behaves as a sexually dimorphic trait.
Weakly deleterious regulatory mutations contribute to gene expression divergence
To gain insight into the evolutionary forces overriding gene expression divergence, we analyzed transcript abundance on the dot chromosome. In Drosophila, the dot chromosome is a unique autosome because it does not experience meiotic recombination. Due to linked selection, the effective population size and the effectiveness of natural selection on the dot chromosome are severely reduced (Hill and Robertson 1966; Berryet al. 1991; Betancourt and Presgraves 2002; Comeron and Kreitman 2002; Jensenet al. 2002; Wanget al. 2002; Wanget al. 2004; Haddrillet al. 2007; Comeronet al. 2008; Betancourtet al. 2009; Arguelloet al. 2010). Because of the highly biased gene composition of the dot chromosome (Figure 2), we focused our analyses in genes with female-biased expression.
If weakly deleterious mutations are major contributors to gene expression divergence, we anticipate that gene expression on the dot chromosome will evolve faster than on other autosomes. This prediction is precisely observed but only in the female transcriptome: gene expression divergence for genes with female-biased expression on the dot chromosome is significantly higher than that on other autosomes (0.62 vs. 0.29; MWT P = 9.65×10-13). In male transcriptomes, by contrast, there is no evidence of increased gene expression divergence for female-biased genes on the dot chromosome (0.21 vs. 0.27; MWT P = 0.06), which indicates minimal effects of differences in effective population size and thus suggest selectively strong functional constraints. The presence of only eight genes with male-biased expression on the dot chromosome precludes us from formally testing whether genes with male-biased expression, when expressed in males, show any evidence of a contribution of weakly deleterious mutations to gene expression divergence. The comparison of female and male transcriptomes, therefore, suggests that gene expression divergence is not inherently high in genes with sex-biased expression, and that there is a major contribution of weakly deleterious regulatory mutations but only in the sex toward which expression is biased.
Faster-X evolution for gene expression divergence in highly male-biased genes
We also examined gene expression divergence in the context of the comparison between X-linked and autosomal genes. The female transcriptome shows an overall signal in support of slower-X evolution of gene expression (0.25 vs. 0.28, MWT P = 4.81×10-5), although for some gene categories the signal is weak or not significant (Figure 4A). The male transcriptome also shows slower-X evolution for gene expression divergence when all genes are considered as a single class (0.28 vs. 0.31, MWT P = 0.002). Despite this overall trend, we found significant evidence for faster-X evolution of gene expression in genes with high degree of male-biased expression (Figure 4B). Faster-X is particularly pronounced in MSGs (0.81 vs. 0.54, MWT P = 4.13×10-13). X-linked MSGs also show the greatest gene expression divergence of all gene classes in the male transcriptome, including the fastest evolving genes on the dot chromosome (0.81 vs. 0.68, MWT P = 0.017).
To gain insight into the genetic basis (cis- or trans-regulatory) of gene expression divergence, we examined stock-specific expression following Coolon et al. (2015) (Wittkoppet al. 2004; McManuset al. 2010; Coolonet al. 2014) (see also ‘Materials and Methods’). We found strong positive correlations between gene expression divergence and the magnitude of both cis- and trans-regulatory divergence (Table 1). These correlations imply that any increase in gene expression divergence is likely to be the result of contributions from cis- and trans-regulatory variation. For example, the overall trend of slower-X evolution for gene expression divergence in females is accompanied by significantly greater cis- (0.26 vs. 0.28; MWT z = 2.82, P = 0.005) and trans-regulatory (0.19 vs. 0.23; MWT z = 5.04, P = 4.8×10-7) divergence for autosomal than for X-linked genes. Likewise autosomal genes in the male transcriptome show increased cis- (0.30 vs. 0.28, MWT z = 3.22, P = 0.0012) and trans-regulatory (0.24 vs. 0.23, MWT z = 3.36, P = 0.0008) divergence relative to the same genes when analyzed in the female transcriptome, as expected from faster-male evolution. As a result, when we examined the different categories of genes with sexually dimorphic expression, we detected that the fraction of genes with significant cis-regulatory divergence increases with gene expression divergence; the same was observed for trans-regulatory divergence.
Recessive regulatory incompatibilities involving the X chromosome contribute to hybrid misexpression
To test the dominance theory (Muller 1940; Muller 1942) for hybrid transcriptomes, we compared gene expression profiles of hybrids and parental species using a quantitative measure of the degree of misexpression [i.e., ∼log2(parent-to-hybrid ratio of expression)]. It is important to emphasize that the degree of misexpression of each hybrid genotype was always assessed by comparison with the corresponding parental stocks. For example, the degree of misexpression of attached-X hybrid females was determined based on the comparison to females of the D. yakuba and D. santomea attached-X parental stocks combined. In this way, we took into account the putative effects of the Y chromosome on the transcriptome profiles of attached-X females. To make the female and male transcriptomes comparable, we examined misexpression of autosomal genes only. We found that the hybrid female transcriptome shows a significantly lower degree of misexpression than the hybrid male transcriptome (0.21 vs. 0.32; MWT P = 1.7×10-158 for hybrids from the cross between D. yakuba TaïE2 females and D. santomea STO.4 males; 0.20 vs. 0.34; MWT P < 1×10-308 for hybrids from the reciprocal cross).
The observation that genes in hybrid males are more prone to misexpression than genes in hybrid females can potentially be attributed to the hemizygosity of the X chromosome in males, to faster-male evolution, or a combination of both. To evaluate the contribution of recessive factors on the X chromosome to hybrid misexpression, we examined transcript abundance for autosomal genes in attached-X hybrid females (Figure 5 for hybrids from the cross between D. yakuba TaïE2 females and D. santomea STO.4 males). Reminiscent of males, attached-X hybrid females show a significantly greater degree of misexpression than normal hybrid females (0.29 vs. 0. 21; MWT P = 8.7×10-71), with FSGs showing a particularly extreme pattern (Figure 6). The increased misexpression in attached-X hybrid females relative to normal hybrid females is also observed in genes with nonsex-biased expression. These results suggest that there is a significant fraction of recessive regulatory incompatibilities involving the X chromosome that are masked in normal hybrid females but are fully uncovered in attached-X hybrid females and in hybrid males. The overall degree of misexpression in attached-X hybrid females, however, is not as high as in hybrid males, which suggests that additional factors like faster-male evolution are necessary to fully explain misexpression in hybrid males (0.29 vs. 0. 32; MWT P = 4.0×10-23).
To investigate the impact that genetic variation linked to the X (and/or Y) chromosome may have on gene expression patterns of autosomes, we examined differences in transcript abundance in interspecific hybrids from reciprocal crosses [i.e., ∼log2(HFY-to-HFS ratio of expression); Figure 1]. Our analysis showed that autosomal genes are more differentially expressed between the two reciprocal hybrid males than between the two attached-X hybrid females (0.20 vs. 0.15, MWT P = 3.28×10-80), even though these females are as ‘unbalanced’ for the X-to-autosome ratio as hybrid males. Many of these gene expression differences do not reflect cytoplasmic effects, for the median expression difference between reciprocal hybrid females with normal karyotypes is only 0.10. Remarkably, autosomal MSGs are particularly susceptible to genetic variation linked to sex chromosomes: the median difference in expression between reciprocal hybrid males is almost 8-fold increased relative to the median difference observed for FSGs between reciprocal attached-X hybrid females (1.55 vs. 0.20, MWT P = 1.35×10-42).
DISCUSSION
Our study provides a comprehensive view of transcriptome evolution in adult females and males of the D. yakuba-D. santomea system, its molecular mechanisms, and its consequences on patterns of gene expression in interspecific hybrids. Our emphasis lies on the contributions of the X chromosome. By taking advantage of attached-X stocks, we have been able to test the dominance theory for hybrid transcriptomes. We have also assessed the effects of linkage to the X chromosome on gene expression divergence, which we showed to be influenced by additional factors such as degree of sex-biased expression.
We found that both the female and male transcriptomes show evidence of accelerated gene expression divergence driven by genes with sex-biased expression in the D. yakuba-D. santomea system. Similar trends have been reported in other systems and have been attributed to relaxed selective constraints due to reduced pleiotropic effects, low levels of expression, and/or a considerable fraction of positive selection, especially in genes with male-biased expression (Meiklejohnet al. 2003; Parisiet al. 2003; Ranzet al. 2003; Zhanget al. 2004; Khaitovichet al. 2005; Proschelet al. 2006; Ellegren and Parsch 2007; Haertyet al. 2007; Voolstraet al. 2007; Zhanget al. 2007; Jiang and Machado 2009; Meisel 2011; Assiset al. 2012; Grath and Parsch 2012; Parsch and Ellegren 2013; Harrisonet al. 2015; Grath and Parsch 2016). Our results also emphasize the importance of studying gene expression divergence in both sexes and reveal the dual nature of transcriptome evolution. While gene expression divergence in genes with sex-biased expression is consistently high in the sex toward which the expression is biased, in the opposite sex transcript abundance appears to be more conserved between species. This low divergence, we propose, reflects strong selective constraints on tissues and pathways shared by both sexes. In contrast, we suggest that the high expression divergence observed for male-biased genes when expressed in males or female-biased genes when expressed in females may be the result of additional sex-specific regulatory components that can potentially evolve by positive selection or under relaxed functional constraints. Our analysis of gene expression divergence on the highly female-biased dot chromosome suggests that weakly deleterious regulatory mutations are major contributors to gene expression divergence of genes with sex-biased expression but only in the sex toward which the expression is biased. Our results also suggest that sexes adopt different expression strategies for their specialized function (Gibiliscoet al. 2016), possibly through sexually dimorphic trans-regulatory environments that often interact with cis-regulatory variation differently in females than in males (Coolonet al. 2013; Meiklejohnet al. 2014).
When we analyzed all the genes expressed in adults as a single class, we found evidence of slower-X evolution of gene expression in both the female the male transcriptomes. The molecular mechanism underlying this increased gene expression divergence in autosomal genes involves contributions from both cis- and trans-regulatory variation, as proposed for species of the D. melanogaster subgroup to explain faster-X evolution of gene expression (Coolonet al. 2015). The observations of slower-X evolution can be explained by a significant contribution of weakly deleterious mutations to gene expression divergence, with more effective purifying selection on the X chromosome. Alternatively, our findings could be the result of differences in mutation rates between the X chromosome and autosomes. In D. melanogaster the mutation rate in the male germline is slightly higher than that in the female germline, a difference that can potentially lead to slower-X evolution (Miyataet al. 1987; Bachtrog 2008; Keightleyet al. 2009). There is also a third explanation: in very closely related species ancestral polymorphism represents a non-negligible fraction of divergence and, therefore, slower-X evolution may just reflect the difference in ancestral NeX and NeA, with NeX < NeA. Consistent with this idea, the genome-wide analysis of putatively neutral sites [bases 8–30 of small introns (< 65 bp);(Llopart 2018)] in the D. yakuba-D. santomea system, indicates that nucleotide heterozygosity (Watterson 1975) is significantly smaller on the X chromosome than on autosomes for a D. yakuba population from Nairobi, Kenya, (0.015 vs. 0.019; Mann-Whitney test P = 0.02) (Rogerset al. 2014; Rogerset al. 2015). Similarly, a multilocus study of D. yakuba and D. santomea populations from São Tomé reported that the NeX /NeA was close to the theoretical expectation of 0.75 (Llopartet al. 2005). As such, a tendency towards slower-X is expected to be the default observation in many closely related species.
Studies of gene expression divergence in several Drosophila species show that the chances of observing faster-X (or slower-X) evolution in transcriptome data depend on species, sex, developmental stage, and tissue under study (Khaitovichet al. 2005; Brawandet al. 2011; Kayseriliet al. 2012; Llopart 2012; Meiselet al. 2012; Meisel and Connallon 2013; Coolonet al. 2015; Deanet al. 2015). We suggest that this variation may be the result of a combination of biological and experimental factors. Among the biological factors, the NeX /NeA has been shown by population genetics models to have a critical impact on whether faster-X evolution is expected (Vicoso and Charlesworth 2009a; Manket al. 2010b). This ratio refers to the long-term, historical NeX and NeA, which may not necessarily reflect the estimate based on extant variation. The number of breeding females and males, differences between the two sexes in the variance in reproductive success (i.e., mating system), demographic events, or differences in recombination rates between the X chromosome and autosomes (Langleyet al. 1988; Connallon 2007; Pool and Nielsen 2007; Vicoso and Charlesworth 2009b; Charlesworth 2012; Comeronet al. 2012; Avilaet al. 2015; Comeron 2017) are some of the additional biological factors that can be species specific and considerably affect NeX/NeA (Vicoso and Charlesworth 2009a). It is also important to recognize that factors associated with experimental design, like the specific stocks used in comparisons of closely related species, the subset of genes under study, or statistical cutoffs may also play a critical role in our ability to detect faster-X evolution of gene expression.
Of all the genes with sexually dimorphic expression analyzed in this study, X-linked MSGs are the most distinct class. They show transcriptional faster-X and their rampant gene expression divergence is even greater than that of the rapidly evolving female-biased genes on the dot chromosome. Based on neutral polymorphism surveys, there is a 5-10 fold reduction in effective population size and, therefore, in the scaled-population selection coefficient on the dot chromosome relative to the X and autosomes (Berryet al. 1991; Jensenet al. 2002; Wanget al. 2002; Wanget al. 2004; Betancourtet al. 2009; Arguelloet al. 2010). As a result, we suggest that it is unlikely that the faster-X gene expression evolution of MSGs is due to relaxed selection against deleterious mutations on the X chromosome. The overall finding of slower-X evolution for male transcriptomes also militates against this possibility. Instead, we propose that the faster-X observed for these genes in the D. yakuba-D. santomea system is the result of a higher incidence of beneficial regulatory mutations, perhaps associated with sexual selection and/or sexual conflict (Parsch and Ellegren 2013).
Our examination of gene expression in attached-X hybrid females shows that misexpression increases when the X chromosome is made homozygous. It is important to emphasize that this is not necessarily an inevitable result because, by carrying two different X chromosomes, normal hybrid females are expected to suffer twice as many possible hybrid incompatibilities involving the X chromosome as hybrid males (Orr 1993; Orr and Turelli 1996). It is also important to point out that none of the misexpression observed in attached-X hybrid females in the D. yakuba-D. santomea system is associated with sterility because these females are mostly fertile. Our results, however, suggest that there is a substantial fraction of recessive mutations that have the potential to generate regulatory incompatibilities involving the X chromosome. They open the possibility that the hemizygosity of the X chromosome in males may significantly contribute to misexpression and ultimately lead to hybrid male sterility. Thus, regulatory incompatibilities in hybrid transcriptomes appear to conform to the dominance theory in the D. yakuba-D. santomea system (Muller 1940; Muller 1942).
Both dominance theory and faster-X evolution have been proposed as forces that can explain the disproportionally large contribution of the X chromosome to hybrid dysfunction, particularly hybrid male sterility (i.e., the large X-effect; COYNE AND ORR 1989) (Turelli and Orr 1995; Orr and Turelli 1996; Turelli and Orr 2000; Masly and Presgraves 2007; Presgraves 2008; Moyleet al. 2010). As the rate of evolution of X-linked loci exceeds that of autosomal loci, the large X-effect becomes more likely (Coyne and Orr 1989; Naveira 2003). Similarly, if alleles decreasing fitness in hybrids tend to act recessively (i.e., dominance theory), it is easier to explain the large X-effect, even when the rates of evolution for the X chromosome and autosomes are not different (Turelli and Orr 1995; Orr and Turelli 1996; Turelli and Orr 2000). While either force alone can generate the large X-effect, in species like D. yakuba and D. santomea the two forces pull together in the same direction for genes with a high degree of male-biased expression to easily yield the observed large X-effect. These arguments echo those put forward by Coyne and Orr (2004) to explain the relative contributions of dominance theory and faster-male evolution to Haldane’s rule.
In the context of gene expression, our results suggest that recessive X-linked mutations and faster-X evolution of MSGs can potentially play a major role in hybrid male misexpression and perhaps the large X-effect. Our results indicate that genetic variation linked to sex chromosomes substantially impacts the expression levels of autosomal MSGs. Although we have not formally evaluated regulatory interactions, these observations open the possibility that regulatory incompatibilities between X-linked and autosomal factors may be facilitating the large X-effect on hybrid male sterility.