Abstract
The Drosophila Y chromosome is gene poor and mainly consists of silenced, repetitive DNA. Nonetheless, the Y influences expression of hundreds of genes genome-wide, possibly by sequestering key components of the heterochromatin machinery away from other positions in the genome. To directly test the influence of the Y chromosome on the genome-wide chromatin landscape, we assayed the genomic distribution of histone modifications associated with gene activation (H3K4me3), or heterochromatin (H3K9me2 and H3K9me3) in fruit flies with varying sex chromosome complements (X0, XY and XYY males; XX and XXY females). Consistent with the general deficiency of active chromatin modifications on the Y, we find that Y gene dose has little influence on the genomic distribution of H3K4me3. In contrast, both the presence and the number of Y chromosomes strongly influence genomewide enrichment patterns of repressive chromatin modifications. Highly repetitive regions such as the pericentromeres, the dot, and the Y chromosome (if present) are enriched for heterochromatic modifications in wildtype males and females, and even more strongly in X0 flies. In contrast, the additional Y chromosome in XYY males and XXY females diminishes the heterochromatic signal in these normally silenced, repeat-rich regions, which is accompanied by an increase in expression of Y-linked repeats. We find hundreds of genes that are expressed differentially between individuals with aberrant sex chromosome karyotypes, many of which also show sex-biased expression in wildtype Drosophila. Thus, Y chromosomes influence heterochromatin integrity genome-wide, and differences in the chromatin landscape of males and females may also contribute to sex-biased gene expression and sexual dimorphisms.
Author summary
The Drosophila Y chromosome is gene poor and mainly consists of epigenetically silenced, repetitive junk DNA, yet the Y influences expression of thousands of genes genome-wide. Here we study the genome-wide chromatin landscape in flies with varying sex chromosome complements (X0, XY and XYY males; XX and XXY females), and show that the Y indirectly influences the epigenetic landscapes between sexes, by sequestering repressive chromatin marks. Differences in the sex-specific epigenome can have broad functional consequences: Increased amounts of repetitive DNA result in insufficient silencing of transposable elements and their remobilization in XXY and XYY flies. Hundreds of genes change their expression patterns in flies with different sex chromosome karyotypes, thereby contributing to sex-biased gene expression patterns and sexual dimorphism.
Introduction
The Drosophila Y is a degenerated, heterochromatic chromosome with only a few functional genes, primarily specialized in male reproductive function [1-4]. However, the D. melanogaster Y is about 40Mb in size and accounts for 20% of the male haploid genome [1, 5] (Figure 1A). Most of the Y chromosome is composed of repetitive satellite DNA, transposable elements (TEs), and Y-linked rDNA blocks [6], and it is transcriptionally silenced through heterochromatin formation [7]. Despite its small number of genes, natural variation on the Y chromosome is associated with variation in several traits, including male fitness [8] and position effect variegation (PEV; the ability of spreading heterochromatin to induce partial silencing of reporter genes in some cells, resulting in mosaic expression patterns [9]). More recently, it was found that natural variation on the Y has substantial effects on regulation of hundreds of protein-coding genes genome-wide [10-13].
The molecular basis of this phenotypic variation is unclear. Single-nucleotide polymorphism in protein-coding genes is low on the Y chromosome [14, 15], and it has been proposed that structural variation involving repetitive DNA is responsible for the observed phenotypic effects of different Y chromosomes [16]. Specifically, most of the highly repetitive Y chromosome is enriched for heterochromatic proteins and repressive histone modifications, and the Y may act as a ‘heterochromatin sink’ by sequestering core components of the heterochromatin machinery, thereby limiting the ability to silence other repetitive regions of the genome [16, 17]. Under the heterochromatin sink model, Y chromosomes vary in their ability to sequester heterochromatin components because they differ in the total amount or sequence content of their repetitive sequences. Protein-coding genes from the D. melanogaster Y chromosome are only expressed in germ cells of males, but the effects on global gene expression by different Y chromosomes also occur in XXY females and somatic cells of XY males [11-13]. This observation is consistent with the heterochromatin sink model, where the Y chromosome exerts its effect indirectly by depleting or redistributing chromatin regulators across the genome. However, previous studies have relied on reporter loci to assess the effect of the Y chromosome on heterochromatin formation, and the global chromatin landscapes of individuals with different amounts of heterochromatic sequence have not yet been directly examined. Here, we test the hypothesis that the Y chromosome acts to modulate heterochromatin integrity and gene expression genome-wide by contrasting the chromatin landscapes and expression profiles of X0 and XYY males and XXY females to that of wildtype D. melanogaster.
Results
Fly strains
To compare the chromatin landscape between Drosophila that differ in their sex chromosome karyotype and their amount of repetitive DNA, we crossed the wildtype Canton-S stock to D. melanogaster stock number 2549 from the Bloomington Stock Center, which carries either a compound reversed metacentric X chromosome (C(1)RM) or a hetero-compound chromosome in which the X chromosome is inserted between the two arms of the Y chromosome (C(1;Y)) (Figure 1B). We selected X0 males that contained a maternally transmitted X chromosome (as do wildtype males), and XXY females that contain a wildtype Y chromosome (rather than the C(1;Y) chromosome; see Figure 1B). Note that the resulting flies are not completely isogenic (and it is impossible to create completely isogenic flies using this crossing scheme), but different comparisons contrast flies with identical autosomal backgrounds. In particular, our wildtype males and females share the same autosomal genotype (Canton-S), and our X0 males and XXY females both have one autosomal complement from Canton-S, and one from the 2549 stock. XYY males inherit 75% of autosomal genes from strain 2549. We used flow cytometry to estimate the genome sizes of the 5 karyotypes with different sex chromosome configurations. Using estimates of diploid euchromatic genome sizes of 231 Mb for individuals with two X chromosomes and 208 Mb for individuals with one X chromosome, we then estimated the amount of heterochromatic sequence in each karyotype. As expected, we found a gradient of heterochromatic sequence content for the 5 karyotypes, with X0 males (~111 Mb) < XX females (~124 Mb) < XY males (~151 Mb) < XXY females (~161 Mb) < XYY males (~181 Mb) (Table S1, Figure 1A).
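The heterochromatin estimate described above amounts to subtracting the euchromatic genome size from the total genome size measured by flow cytometry. The following Python lines are a minimal sketch of that subtraction, not the authors' code. The function name and the example total of 359 Mb are hypothetical; the total was chosen only so that the result matches the ~151 Mb reported for XY males, and the sketch assumes the flow cytometry estimate has been expressed as a diploid genome size.

```python
# Diploid euchromatic genome sizes quoted in the text (Mb),
# keyed by the number of X chromosomes in the karyotype.
EUCHROMATIN_MB = {1: 208.0, 2: 231.0}


def heterochromatin_mb(diploid_genome_mb: float, n_x: int) -> float:
    """Heterochromatic content = total diploid size - euchromatic size."""
    if n_x not in EUCHROMATIN_MB:
        raise ValueError("n_x must be 1 or 2")
    return diploid_genome_mb - EUCHROMATIN_MB[n_x]


# Hypothetical total size for an XY male (assumption, not a measured value).
print(heterochromatin_mb(359.0, n_x=1))
```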
Quantification of histone modifications
We aged all flies for 8 days, and carried out chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq) on head and thorax tissue using commercial antibodies against three post-translational histone modifications (H3K4me3, H3K9me2, H3K9me3). We employed a previously described normalization strategy to compare the genomic distribution and relative levels of chromatin marks across flies with different karyotypes. Specifically, we ‘spiked in’ a fixed amount of chromatin from female 3rd instar Drosophila miranda to each D. melanogaster chromatin sample prior to ChIP and sequencing. D. miranda chromatin served as an internal standard for the immunoprecipitation experiment (Table S2), and the relative recovery of D. melanogaster ChIP signal vs. D. miranda ChIP signal, normalized by their respective input counts, was used to quantify the relative abundance of the chromatin mark in D. melanogaster (see Methods for details). Note that this normalization strategy uses input coverage to account for differences in ploidy levels of sex chromosomes among the different karyotypes investigated and is agnostic to the total genome size of the sample (Figure S1). D. miranda is sufficiently diverged from D. melanogaster for sequencing reads to be unambiguously assigned to the correct species: even in repetitive regions, <4% of the reads cross-mapped between species, and these regions were excluded from the analysis. Signal for H3K4me3 is highly correlated across samples, and H3K9me2 and H3K9me3 correlate well with each other for all samples. Replicate ChIP data without a D. miranda chromatin spike are also highly correlated (Table S3, see Methods for details). We used the total normalized number of D. melanogaster reads to compare the genome-wide distribution of chromatin modifications in flies with different sex chromosome karyotypes.
Figure 2 shows the genomic distribution of the active H3K4me3 chromatin mark for the various karyotypes, and Figures 3 and 4 show genomic distributions for the repressive H3K9me2 and H3K9me3 marks, respectively.
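The spike-in normalization described above can be illustrated with a short calculation: the recovery of D. melanogaster ChIP reads relative to input is divided by the corresponding recovery for the D. miranda spike-in. The Python sketch below shows one plausible form of that ratio. The function name, argument names, and read counts are hypothetical, and the exact procedure used in the study is the one given in its Methods.

```python
def spike_in_normalized_signal(mel_chip, mel_input, mir_chip, mir_input):
    """Relative recovery of D. melanogaster ChIP signal vs. the D. miranda spike-in.

    Each argument is a read count. Dividing ChIP by input within a species
    accounts for how much chromatin of that species entered the experiment.
    """
    if min(mel_input, mir_chip, mir_input) <= 0:
        raise ValueError("input counts and spike-in ChIP count must be positive")
    mel_recovery = mel_chip / mel_input  # D. melanogaster ChIP over input
    mir_recovery = mir_chip / mir_input  # D. miranda (spike-in) ChIP over input
    return mel_recovery / mir_recovery


# Hypothetical read counts, for illustration only.
signal = spike_in_normalized_signal(
    mel_chip=8_000_000, mel_input=10_000_000,
    mir_chip=1_000_000, mir_input=2_000_000,
)
print(round(signal, 3))
```

Because both recoveries are ratios to input, a sample with more DNA from one species (for example, an extra Y chromosome) does not by itself inflate the normalized signal.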
The genomic distribution of active chromatin is similar in flies with different karyotypes
The histone modification H3K4me3 primarily associates with active genes [18, 19] and is highly underrepresented in repeat-rich regions, including the Y chromosome; we thus expect that its relative abundance and genomic distribution are little influenced by the dose of Y chromosomes. Indeed, we find that H3K4me3 peaks are primarily located along the euchromatic chromosome arms, and are highly deficient in pericentromeric regions and along the Y chromosome (Figure 2). Genomic enrichment patterns of H3K4me3 are similar across sexes and flies with varying numbers of Y chromosomes, both in the relative position of peaks and in the absolute magnitude of signal across samples (Figure 2). This confirms our expectation that Y dose should not dramatically influence the distribution of active chromatin marks, and also suggests that our normalization procedure is accurate in quantifying relative abundance of histone modifications across samples. Western blots confirm our inferences based on ChIP-seq, i.e. that H3K4me3 signal is similar across flies with different karyotypes (Figure S2).
Heterochromatic histone modifications in wildtype flies
We investigated the genomic distribution of two histone marks that are associated with heterochromatin formation, H3K9me2 and H3K9me3 [19]. If the Y chromosome indeed acts as a sink for components of the heterochromatin machinery, we expect global differences in the enrichment patterns of heterochromatic histone modifications across strains with different numbers of Y chromosomes, or more generally, across flies with different amounts of repetitive DNA (see Figure 1A). In wildtype Drosophila, heterochromatin is highly enriched in pericentromeric regions, the small dot chromosome, and along the entire length of the Y chromosome [1, 6, 20]. Note that the D. melanogaster Y chromosome is estimated to be about 40Mb (i.e. 20% of the haploid male genome; [1, 5]), but its assembled size is only 3.7Mb; thus our genome mapping analysis will underestimate the extent of heterochromatic histone modifications that are associated with the repetitive Y chromosome.
Overall, we find that levels of heterochromatin enrichment are similar for the H3K9me2 and H3K9me3 marks, but differ between flies with varying amounts of repetitive DNA (Figure 3, 4). The male-specific Y chromosome is highly enriched for both of these repressive histone modifications in wildtype males, and we find that wildtype females have slightly higher levels of H3K9me2/3 enrichment than males in their pericentromeric regions, and on the dot chromosome, relative to euchromatic background levels (Figure 3, 4). Moreover, the heterochromatin/euchromatin boundary is slightly less clearly discernible from H3K9me2/3 enrichment patterns for males relative to females (Figure 5, Figure S3-S5). Western blots suggest that males harbor slightly more H3K9me2/3 compared to females (Figure S2). Thus, we find strong enrichment of the heterochromatic histone modifications on the Y and their relative deficiency at pericentromeric regions on autosomes and the X in wildtype males relative to females, despite similar amounts of overall H3K9me2/3. This observation is consistent with the hypothesis that the repeat-rich Y chromosome acts as a sink for components of the heterochromatic machinery, resulting in a relative paucity of heterochromatic histone modifications elsewhere in the genome. However, despite quantitative differences in levels of heterochromatic histone modifications, overall patterns of H3K9me2/3 enrichment are similar between sexes.
Heterochromatic histone modifications in X0, XXY & XYY flies
To investigate the Y chromosome’s role in the genome-wide distribution and enrichment for heterochromatic components, we studied histone modification profiles from female flies containing a Y chromosome (XXY females), and males with either zero or two Y chromosomes (X0 vs. XYY males). Female Drosophila that contain a wildtype Y chromosome show clear enrichment for both heterochromatic histone modifications on the Y chromosome, but an overall reduction in levels of H3K9me2/3 relative to wildtype females, both at pericentromeric regions and along the dot (Figure 3, 4). The genomic distribution of H3K9me2/3 in XXY females is consistent with the model of the Y chromosome acting as a sink for components of the heterochromatin machinery, sequestering heterochromatic proteins to the Y chromosome and diluting them away from autosomal and X-linked targets. XXY females also show fewer heterochromatic histone modifications at pericentromeric regions and the dot relative to wildtype XY males (Figure 3, 4). This is consistent with the higher repeat content of XXY flies compared to XY flies (due to the large heterochromatic block on the additional X), which contributes to the heterochromatin sink effect. This suggests that the effect of the Y chromosome on heterochromatin distribution is not a unique property of the Y but instead a result of a large amount of any additional repetitive sequence. XYY males harbor the highest amount of repetitive DNA and show severely decreased levels of H3K9me2/3 enrichment along repeat-rich, normally heterochromatic regions, including their Y chromosomes, pericentromeric regions, and along the dot, relative to levels found in other karyotypes investigated (Figure 3, 4).
X0 males, on the other hand, have the lowest repeat content of all flies, and show the strongest enrichment of heterochromatic histone modifications at pericentromeric regions and along the dot chromosome (Figure 3, 4). Enrichment levels of H3K9me2/3 at repetitive regions (pericentromere and the dot) relative to euchromatic background levels in X0 males are well above those of wildtype males and also wildtype females (or XXY females, which have the same autosomal background as X0 flies; Figure 3, 4). Together, our data provide clear evidence that Y chromosomes, and repetitive DNA in general, affect heterochromatin formation genome-wide, consistent with a model of the Y chromosome or other large blocks of repetitive sequences acting as heterochromatin sinks, possibly by redistributing heterochromatin components across the genome.
The depletion of heterochromatic histone modifications from pericentromeric regions causes the euchromatin/heterochromatin boundaries to be significantly diluted in XXY and XYY individuals (Figure 5, Figure S3, S4). X0 males, in contrast, show spreading of their pericentromeric heterochromatin into chromosome arms that are normally euchromatic in wildtype flies, which is consistent with previous studies that found enhanced position effect variegation in X0 males (Figure 5, Figure S3, S4; [21, 22]).
Overall, we see that increasing the amount of repetitive DNA by changing the dose of both sex chromosomes corresponds with a decrease in the signal of heterochromatic histone modifications at pericentromeric regions and along the dot chromosome. This is consistent with a model of stoichiometric balance between protein components involved in the formation of heterochromatin and the amount of repetitive DNA sequences within a genome. Together, ChIP-seq profiles of histone modifications in wildtype flies, X0 and XYY males, and XXY females support the hypothesis that the Y chromosome acts as a heterochromatin sink in Drosophila.
Sex chromosome dose and gene expression
Polymorphic Y chromosomes affect expression of hundreds of autosomal and X-linked genes in D. melanogaster, a phenomenon known as Y-linked regulatory variation (YRV) [10-13]. To test if genes that respond to YRV are also expressed differentially in flies with different sex chromosome configurations, we collected replicate RNA-seq data from heads for wildtype males and females, as well as from X0, XXY and XYY flies. As noted above, protein-coding Y-linked genes in Drosophila are only expressed in the male germ line and thus cannot directly contribute to differences in expression profiles in head samples among flies with different numbers of Y chromosomes. Overall, we find that hundreds of genes show differential expression among flies with different sex chromosome karyotypes (Figure 6A). GO analysis revealed that differentially expressed genes tend to be enriched for functions associated with reproductive processes (Table S4), and are not simply clustered around pericentromeric regions (Figure S5, S6). Genes that are expressed most differently between X0 and XY males, and XX and XXY females, show a significantly greater difference in H3K9me2 signal compared to all genes, while these genes have a significantly smaller difference in H3K4me3 signal compared to all genes (Figure S7). This is consistent with the hypothesis that the Y chromosome re-distributes heterochromatin components genome-wide, and can thereby influence the expression of hundreds of genes.
We used a consensus set of 678 genes that were classified as susceptible to YRV [23], and found that these genes were generally expressed more differently between different sex chromosome karyotypes compared to random genes (Figure 6A). This suggests that a similar mechanism underlies both YRV and gene expression differences in flies with different sex chromosome configurations. Genes that are genetically defined to either suppress or enhance silencing in assays for PEV in D. melanogaster, i.e. Su(var) and E(var) genes [7], are expressed at similar levels in flies with different karyotypes (Figure 6). This is consistent with our Western blots, which reveal no consistent differences in H3K9me2/3 levels among flies with different sex chromosome configurations (Figure S2).
Interestingly, genes susceptible to YRV are more likely to be differentially expressed between wildtype sexes, and genes that are differentially expressed between males and females in head tissue tend to also be differentially expressed between X0 and XY males, or XX and XXY females (p<1e-6, permutation test, Figure 6B, Figure S8). In particular, 160 genes are shared among the top 10% of differentially expressed genes in all three comparisons (wildtype XX females vs. XY males, X0 vs. XY males, and XX vs. XXY females), whereas we expect only 9 by chance. This suggests that a substantial fraction of sex-biased expression in somatic tissues may simply be an indirect consequence of the absence or presence of the Y, i.e. the sink effect of the Y chromosome may contribute to sex-biased expression patterns in D. melanogaster.
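To make the chance expectation concrete: if three gene lists each contain the top 10% of genes and are independent, the expected three-way overlap is N x 0.1^3, which gives about 9 genes if roughly 9,000 genes were tested (that gene count is an assumption chosen to reproduce the expectation quoted above, not a number stated in the text). The Python sketch below computes this expectation and a simple permutation p-value for an observed overlap. It is an illustrative stand-in and may differ from the permutation scheme actually used in the study; all function and parameter names are hypothetical.

```python
import random


def expected_overlap(n_genes, fraction, n_sets):
    """Expected size of the intersection of n_sets independent random subsets,
    each containing `fraction` of n_genes."""
    return n_genes * fraction ** n_sets


def permutation_overlap_pvalue(gene_sets, n_genes, n_perm=10000, seed=0):
    """Empirical p-value for the observed intersection size of several gene sets.

    gene_sets: list of sets of gene indices in range(n_genes).
    Each permutation redraws every set at random, keeping its size.
    """
    rng = random.Random(seed)
    observed = len(set.intersection(*gene_sets))
    universe = range(n_genes)
    hits = 0
    for _ in range(n_perm):
        shuffled = [set(rng.sample(universe, len(s))) for s in gene_sets]
        if len(set.intersection(*shuffled)) >= observed:
            hits += 1
    # Add-one correction so the p-value is never reported as exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical gene count (assumption): about 9 genes expected by chance.
print(round(expected_overlap(9000, 0.1, 3), 2))
```

With an add-one correction, the smallest p-value obtainable from n_perm permutations is 1/(n_perm + 1), so a bound such as p<1e-6 requires on the order of a million permutations.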
Repeat reactivation in XXY and XYY flies
Heterochromatin is established during early embryogenesis and leads to the transcriptional silencing of repetitive DNA and transposable elements (TEs) [7]. We used our RNA-seq data to assess whether changes in chromatin structure due to Y chromosome dose are associated with changes in gene expression patterns of repetitive elements. We first used consensus sequences of known TEs annotated by FlyBase (flybase.org), and found that overall repeat content correlated negatively with H3K9me2/3 enrichment at TEs: X0 flies had the highest level of H3K9me2/3 enrichment across TE families, followed by XX and XY wildtype flies, with XXY and XYY flies having the lowest amount of heterochromatin marks at their TEs (p<0.01 for each comparison; Figure 7A, Figure S9A; note that these estimates are corrected for differences in copy numbers between repeats, by looking at the enrichment of H3K9me2/3 over input for each karyotype). Despite dramatic differences in overall levels of repressive histone marks across repeat families, levels of expression for the various TEs are very similar between karyotypes (p>0.05, Figure 7A, Figure S10). A subset of TEs show an increase in expression in XYY males compared to other samples, including at least 5 retroviral elements (1731, 297, Max element, mdg1, and mdg3, Figure S10). Increased expression of these repeats appears to be driven in part by an increased copy number in the XYY male genome; if we correct for genomic copy number, we find that only three of these repeats (1731, 297, and Max element) are expressed more highly in XYY males compared to the other karyotypes (Figure S11). Thus, despite global differences in heterochromatin formation associated with repeats across karyotypes, this does not manifest itself in a global de-repression of TEs, but seems to instead involve de-repression of just a subset of TE families.
Most of the Y chromosome has not yet been assembled [24], including its repetitive elements, and we were interested in whether expression of Y-linked repeats would be particularly sensitive to Y chromosome dosage. We thus used a de novo approach to identify male-specific Y-linked repeats that does not rely on a genome assembly, but instead uses k-mer abundances from next-generation sequencing reads to produce a repeat library [25]. We then mapped male and female genomic reads from the Canton-S strain back to our de novo assembled repeat library, in order to infer Y-linkage for repeats that were only covered by male genomic reads (Figure S12, S13). Male-specific repeats are highly enriched for H3K9me2/3 in wildtype males, and transcriptionally silenced (Figure 7B). However, while Y-linked repeats show similar enrichment for the H3K9me3 mark in all karyotypes (Figure S9B), XXY females and XYY males are highly deficient for H3K9me2 at Y-linked repeats and expression of Y-linked repeats is de-repressed relative to wildtype males (Figure 7B, Figure S14). If we account for differences in copy number of the Y-linked repeats, we still find that Y-linked repeats are expressed more highly in XXY females and XYY males compared to wildtype males (Figure S15). Thus, consistent with the ChIP-seq data that showed low levels of heterochromatic histone modifications (especially H3K9me2) along the Y of XXY females or the two Y chromosomes of XYY males, relative to wildtype males, our gene expression data demonstrate that Y-linked repeats become transcriptionally activated in female flies that normally do not have a Y chromosome, or male flies with double the dose of Y-linked repeats, and this is not simply a consequence of an increased copy number of Y-linked repeats.
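The inference of Y-linkage described above (repeats covered by male but not female genomic reads) can be sketched as a simple coverage filter. The Python example below is only an illustration of the idea. The function name, the coverage dictionaries, and both thresholds are hypothetical; a small tolerance for female coverage is an assumption added to allow for mapping noise and is not taken from the text.

```python
def male_specific_repeats(male_cov, female_cov, min_male=1.0, max_female_ratio=0.05):
    """Return names of repeats covered in males but (nearly) absent in females.

    male_cov, female_cov: dict mapping repeat name -> coverage normalized
    by library size. Thresholds are illustrative assumptions.
    """
    y_linked = []
    for name, m in male_cov.items():
        f = female_cov.get(name, 0.0)  # a repeat missing in females has zero coverage
        if m >= min_male and f <= max_female_ratio * m:
            y_linked.append(name)
    return sorted(y_linked)


# Hypothetical normalized coverage values for three repeats.
male = {"repeat_A": 40.0, "repeat_B": 35.0, "repeat_C": 0.2}
female = {"repeat_A": 0.5, "repeat_B": 70.0}
print(male_specific_repeats(male, female))
```

In this toy input, only repeat_A passes: repeat_B is well covered in females and repeat_C falls below the minimum male coverage.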
Discussion
Dosage effects of chromatin components and repetitive DNA
Many eukaryotic genomes contain large amounts of selfish, repetitive DNA, and transcriptional silencing of repeats through heterochromatin formation is one way to alleviate the deleterious effects of repetitive DNA [7]. Studies of PEV in D. melanogaster have yielded important insights into the biology of heterochromatin [26-28], and frequently found dose-dependent effects of chromatin proteins and trans-activating factors [29]. For example, depletion of HP1, an important protein involved in both the recruitment and maintenance of heterochromatic histone modifications, suppresses variegation [30] (i.e. it results in less heterochromatin formation and thus less suppression at a reporter gene), whereas increased dosage of HP1 enhances variegation [31] (i.e. it increases silencing through increased heterochromatin formation). In D. melanogaster, the Y chromosome is a potent suppressor of variegation, i.e. it induces less heterochromatin at a reporter gene [9], and D. melanogaster males with different Y chromosomes in otherwise identical genetic backgrounds vary in their propensity to silence a heterochromatin-sensitive reporter gene in PEV assays [12]. Highly repetitive Y chromosomes are thought to sequester heterochromatic factors that are present in only limited amounts [32], and different Y chromosomes vary in their repeat content and thus the extent to which they sequester those heterochromatin components, thereby influencing PEV.
In our study, we directly demonstrate that the Y chromosome, and repeat-rich DNA in general, can act to globally affect heterochromatin formation in D. melanogaster. We find that increasing the amount of repetitive DNA generally decreases the amount of H3K9me2/3 enrichment at repeat-rich regions, such as pericentromeres, the dot, or the Y chromosome. Individuals with the lowest repeat content (X0 males in our experiment) show the highest enrichment of H3K9me2/3 in repeat-rich regions, and the pericentromeric heterochromatin on the autosomes of X0 flies clearly extends into genomic regions that are normally euchromatic in wildtype D. melanogaster. Wildtype females show slightly higher H3K9me2/3 levels at their pericentromeric regions and the dot chromosome and a slightly sharper euchromatin/ heterochromatin boundary at autosomes compared to wildtype males. Indeed, females generally show a higher degree of silencing in assays for PEV, suggesting that normally euchromatic regions are more prone to acquire a heterochromatic conformation in females [33, 34].
XYY males and XXY females, on the other hand, show a dramatic reduction of H3K9me2/3 enrichment at repeat-rich regions, and the boundaries between the heterochromatic pericentromere and the euchromatic chromosome arms become blurry. Thus, this dosage sensitivity of H3K9me2/3 enrichment in repetitive regions suggests that there is a stoichiometric balance among protein components and total repeat content of the genome to maintain proper heterochromatic silencing.
Functional consequences of the Y chromosome’s global effects on heterochromatin
Analyses of gene expression profiles suggest that global changes in heterochromatic histone modifications can have broad functional consequences for the organism. Specifically, we show that hundreds of genes are differentially expressed in individuals that differ in their sex chromosome karyotype, and genes that are susceptible to YRV are more prone to be differentially expressed in individuals with different sex chromosome complements. We find that increasing the amount of repetitive DNA leads to a decrease in heterochromatic histone modification signal at TEs. XYY males and XXY females have low levels of H3K9me2 signal in TEs, and especially so in male-specific repeats, and we show that this deficiency of heterochromatin is associated with a de-repression of Y-linked repeats that we detect as an increase in expression levels of these repeats. Thus, while fruit flies have efficient mechanisms in place to silence wildtype levels of repetitive DNA, a large increase in the amount of repetitive sequences, caused by introducing additional Y chromosomes, limits the organism’s ability to form heterochromatin and those additional repeats apparently cannot be efficiently silenced.
Whole-genome sequencing studies can provide information on the genome size by estimating the amount of euchromatic DNA, but cannot reliably estimate the amount of repetitive, heterochromatic sequences. Cytogenetic studies suggest that individuals within a population can differ greatly in how much repetitive heterochromatic DNA they contain. The size of the pericentromeric heterochromatic block on the D. melanogaster X chromosome, for example, varies by about 2-fold among strains [35], and dramatic variation in size and morphology of the Y chromosome has been reported in natural populations of D. pseudoobscura [36]. Moreover, haploid genome size estimates of different D. melanogaster strains using flow cytometry differ by almost 100Mb, and the vast majority of this variation is thought to result from differences in repetitive heterochromatin [37]. Similarly, a recent bioinformatics analysis that identified and quantified simple sequence repeats from whole genome sequences also found a 2.5-fold difference in their abundance between D. melanogaster strains [38]. Thus, natural variation in repetitive DNA among individuals may in fact span a wider range than that across sex chromosome karyotypes investigated here. This implies that repetitive DNA might serve as an important determinant of global chromatin dynamics in natural populations, and may be an important modifier of the differential expression of genes and TEs between individuals.
Heterochromatin/euchromatin balance between sexes
Males contain a Y chromosome that is highly repetitive and heterochromatic, and which may shift the genome-wide heterochromatin/ euchromatin balance between the sexes [39]. In particular, if the Y chromosome sequesters proteins required for heterochromatin formation, males may be more sensitive to perturbations of the balance between repetitive sequence content and heterochromatic protein components, and might have lower levels of heterochromatin-like features in the rest of their genome, as compared to females. Indeed, RNAi knockdown of the heterochromatin protein HP1 preferentially reduces male viability [40], and the presence of Y-linked heterochromatin is thought to underlie this differential sensitivity. Female Drosophila are also more tolerant of heat shock, survive heat-induced knock-down better, and become sterile at higher temperatures than males [41], and it is possible that differences in the chromatin landscape may contribute to sex-specific differences in heat stress response. As mentioned, female flies show stronger silencing in assays for PEV [33, 34], consistent with having more heterochromatin protein components relative to repetitive sequences, which can then spread into reporter genes more readily.
Many recent studies have shown that a large portion of the transcriptome in animals is sex-biased [42, 43]. Sex-biased expression patterns are typically seen as an adaptation that forms the basis of sexually dimorphic phenotypes [44]. In Drosophila, most sex-biased expression patterns are due to differences in expression in sex-specific tissues (i.e. gonads; [45, 46]); however, hundreds of genes also show differential expression in shared, somatic tissues [45, 46]. Interestingly, we find that many of the genes that differ in expression between males and females (in head) are also differentially expressed between XY and X0 males, or XX and XXY females. This suggests that not sex per se, but the absence or presence of the Y chromosome is responsible for many of the differences in expression patterns between sexes. Thus, while sex-biased expression is normally interpreted as a sex-specific adaptation to optimize expression levels of genes in males and females, it is also possible that sex-biased expression patterns are simply an indirect consequence of males having to silence a large repetitive Y chromosome, thereby changing the chromatin structure genome-wide as compared to females.
Materials & Methods
Drosophila strains
Fly strains were obtained from the Bloomington Stock Center. The following strains were used: Canton-S and 2549 (C(1;Y),y1cv1v1B/0 & C(1)RM,y1v1/0). The crossing scheme used to obtain X0 and XYY males and XXY females is depicted in Figure 1B. For chromatin and gene expression analyses, flies were grown in incubators at 25°C, 32% relative humidity, and 12 h of light. Newly emerged adults were collected and aged for 8 days under the same rearing conditions before they were flash-frozen in liquid nitrogen and stored at -80°C.
Genome size estimation
We estimated the genome size of the 5 karyotypes of interest using flow cytometry methods similar to those described in [47]. Briefly, samples were prepared by using a 2mL Dounce to homogenize one head each from an internal control (D. virilis female, 1C=328 Mb) and one of the 5 karyotypes in Galbraith buffer (44mM magnesium chloride, 34mM sodium citrate, 0.1% (v/v) Triton X-100, 20mM MOPS, 1mg/mL RNAse I, pH 7.2). After homogenizing samples with 15-20 strokes, samples were filtered using a nylon mesh filter, and incubated on ice for 45 minutes in 25 ug/mL propidium iodide. Using a BD Biosciences LSR II flow cytometer, we measured 10,000 cells for each unknown and internal control sample. We ran samples at 10-60 events per second at a voltage setting of 473 using a PE laser at 488 nm. Fluorescence for each D. melanogaster karyotype was measured using the FACSDiva 6.2 software and recorded as the mode of the sample’s fluorescent peak interval. We calculated the genome size of each of the 5 karyotypes by multiplying the known genome size of D. virilis (328 Mb) by the ratio of the propidium iodide fluorescence in the unknown karyotype to that in the D. virilis control.
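The final calculation can be written as a short sketch. The function and variable names below are illustrative and not taken from our analysis scripts; only the 328 Mb control size and the ratio itself come from the text above.

```python
# Known 1C genome size of the D. virilis internal control (from the text).
D_VIRILIS_GENOME_MB = 328.0


def estimate_genome_size_mb(sample_peak_mode, control_peak_mode,
                            control_size_mb=D_VIRILIS_GENOME_MB):
    """Scale the control genome size by the ratio of propidium iodide
    fluorescence (mode of each peak) in the unknown sample to the control."""
    if control_peak_mode <= 0:
        raise ValueError("control fluorescence must be positive")
    return control_size_mb * sample_peak_mode / control_peak_mode


# Made-up fluorescence values, for illustration only: a sample peak at half
# the control fluorescence gives half the control genome size.
example_size = estimate_genome_size_mb(50.0, 100.0)
```

Because the control and the unknown sample are homogenized and stained together, the ratio of the two peak modes is independent of tube-to-tube differences in staining.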
Western blotting
We performed Western blots from acid-extracted histones, probing for H3K9me2, H3K9me3, H3K4me3, and total H3. Briefly, approximately 30 flies of each karyotype were dissected on dry ice to remove the abdomen. The resulting heads and thoraces were ground in PBS plus 10mM sodium butyrate, and were acid-extracted overnight at 4°C. Samples were then run on a 4-12% gradient bis-tris gel and transferred to a nitrocellulose membrane using Invitrogen’s iBlot Dry Transfer Device. After blocking with 5% milk in PBS, we incubated membranes overnight with either 1:1000 H3K9me2 antibody (Abcam ab1220), 1:2000 H3K9me3 antibody (Abcam ab8898), 1:2000 H3K4me3 antibody (Abcam ab8580), or 1:2000 H3 antibody (Abcam ab1791) in Hikari Signal Enhancer (Nacalai 02272). We then incubated membranes with 1:2500 secondary antibody (Licor 68070 and 32213), imaged bands on a Licor Odyssey CLx Imager, and quantified intensity using ImageJ.
Chromatin-IP and sequencing
We performed ChIP-seq experiments using a standard protocol adapted from [48]. Briefly, approximately 2 mL of adult flash-frozen flies were dissected on dry ice, and heads and thoraces were used to fix and isolate chromatin. Following chromatin isolation, we spiked in 60uL of chromatin prepared from female Drosophila miranda larvae (approximately 1ug of chromatin). We then performed immunoprecipitation using 4uL of the following antibodies: H3K9me2 (Abcam ab1220), H3K9me3 (Abcam ab8898), and H3K4me3 (Abcam ab8580).
After reversing the cross-links and isolating DNA, we constructed sequencing libraries using the BIOO NextFlex sequencing kit. Sequencing was performed at the Vincent J. Coates Genomic Sequencing Laboratory at UC Berkeley, supported by NIH S10 Instrumentation Grants S10RR029668 and S10RR027303. We performed 50bp single-read sequencing for our input and H3K4me3 libraries, and 100bp paired-end sequencing for the H3K9me2 and H3K9me3 libraries, due to their higher repeat content.
For H3K4me3, Pearson correlation values between the 5 karyotypes are very high, and the magnitude of difference between the samples is low (Table S2). For the two heterochromatin marks, Pearson correlation values between the two marks were generally high for all samples, and overlap of the top 40% of 5kb windows was similarly high for all samples (Table S2). Additionally, we obtained replicates for H3K9me3 for all samples except XX female, which has extremely high correlation values between H3K9me2 and H3K9me3. The unspiked replicate data for H3K9me3 correlate well with the D. miranda chromatin spike data that were used for the bulk of our analyses (Table S2).
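The two comparisons summarized in Table S2 can be illustrated with a minimal sketch. It assumes each sample is represented as a list of per-window ChIP signal values in the same window order, and it assumes that "overlap" means the fraction of top-ranked windows shared between two samples; both are our illustrative assumptions, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of window signals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def top_fraction_overlap(x, y, fraction=0.4):
    """Fraction of the top `fraction` of windows (ranked by signal) that two
    samples share. Assumed definition of overlap, for illustration only."""
    k = int(len(x) * fraction)
    if k == 0:
        raise ValueError("too few windows for the requested fraction")
    top_x = set(sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:k])
    top_y = set(sorted(range(len(y)), key=lambda i: y[i], reverse=True)[:k])
    return len(top_x & top_y) / k
```

For example, two samples whose signals are proportional to each other across all windows would have a correlation of 1 and share all of their top-ranked windows.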
RNA extraction and RNA-seq
We collected mated males and females of the various karyotypes, aged them for 8 days, and dissected and pooled 5 heads from each karyotype. We then extracted RNA and prepared stranded total RNA-seq libraries using Illumina’s TruSeq Stranded Total RNA Library Prep kit with Ribo-Zero ribosomal RNA reduction chemistry, which depletes the highly abundant ribosomal RNA transcripts (Illumina RS-122-2201). We performed 50bp single-read sequencing for all total RNA libraries at the Vincent J. Coates Genomic Sequencing Laboratory at UC Berkeley.
Mapping of sequencing reads, and data normalization
For all D. melanogaster alignments, we used Release 6 of the genome assembly and annotation [24]. For all ChIP-seq datasets, we used Bowtie2 [49] to map reads to the genome, using the parameters “-D 15 -R 2 -N 0 -L 22 -i S,1,0.50 --no-1mm-upfront”, which allowed us to reduce cross-mapping to the D. miranda genome to approximately 2.5% of 50bp reads, and 1% of 100bp paired-end reads. We also mapped all ChIP-seq datasets to the D. miranda genome assembly [50] to calculate the proportion of each library that originated from the spiked-in D. miranda chromatin versus the D. melanogaster sample.
To calculate ChIP signal, we first calculated the coverage across 5kb windows for both the ChIP and the input, and then normalized by the total library size, including reads that map to both D. melanogaster and the D. miranda spike. We then calculated the ratio of ChIP coverage to input coverage, and normalized by the ratio of D. melanogaster reads to D. miranda reads in the ChIP library, and then by the ratio of D. melanogaster reads to D. miranda reads in the input, to account for differences in the ratio of sample to spike present before immunoprecipitation. Note that this normalization procedure accounts for differences in ploidy as well as genome size by using a ratio of ChIP coverage to input coverage (see Figure S1).
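The normalization steps can be sketched as follows. This is one possible reading of the procedure rather than the exact script used: in particular, the direction of the two spike corrections (multiplying by the D. melanogaster to D. miranda read ratio of the ChIP library and dividing by that of the input) is an assumption, as are all function and argument names.

```python
def spike_normalized_signal(chip_cov, input_cov,
                            chip_mel_reads, chip_mir_reads,
                            input_mel_reads, input_mir_reads):
    """Per-window ChIP signal with library-size and spike-in normalization.

    chip_cov / input_cov: coverage per 5kb window (same window order).
    *_mel_reads / *_mir_reads: reads mapping to D. melanogaster and to the
    D. miranda spike in each library.
    """
    # Library size includes reads from both the sample and the spike.
    chip_total = chip_mel_reads + chip_mir_reads
    input_total = input_mel_reads + input_mir_reads

    # ASSUMPTION: scale by the sample:spike ratio after IP, relative to the
    # sample:spike ratio present before IP (taken from the input library).
    spike_factor = ((chip_mel_reads / chip_mir_reads)
                    / (input_mel_reads / input_mir_reads))

    signal = []
    for chip, inp in zip(chip_cov, input_cov):
        if inp == 0:
            # No input coverage: the ratio is undefined for this window.
            signal.append(float("nan"))
            continue
        ratio = (chip / chip_total) / (inp / input_total)
        signal.append(ratio * spike_factor)
    return signal
```

Dividing ChIP coverage by input coverage within each window is the step that cancels differences in copy number between karyotypes, since both libraries derive from the same chromatin preparation.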
Gene expression analysis
We first mapped RNA-seq reads to the ribosomal DNA scaffold in the Release 6 version of the genome, and removed all reads that mapped to this scaffold, as differences in rRNA expression are likely to be technical artifacts from the total RNA library preparation. We then mapped the remaining reads to the Release 6 version of the D. melanogaster genome using TopHat2 [51] with default parameters. We then used Cufflinks and Cuffnorm to calculate normalized FPKMs for all samples. GO analysis was performed with GOrilla on ranked lists of differentially expressed genes [52].
Repeat libraries
We used two approaches to quantify expression of repeats. Our first approach was based on consensus sequences of known repetitive elements that were included in the Release 6 version of the D. melanogaster genome and are available on FlyBase. These included consensus sequences for 125 TEs. We also added the consensus sequences of three known satellite sequences (Dodeca, Responder, and 359) to include larger non-TE repetitive sequences in our repeat analyses.
We were particularly interested in mis-regulation of the Y chromosome, which is poorly assembled. We therefore assembled repetitive elements de novo from male and female genomic DNA reads using RepARK [25], setting a manual threshold for abundant kmers of 5 times the average genome coverage, which corresponds to a repetitive sequence occurring at least 5 times in the genome. To identify male-specific repeats, we mapped male and female genomic reads back to our de novo assembled repeats, and identified repeats that had high coverage in males and either no coverage or significantly lower coverage in females (Figure S9). After filtering in this way, we obtained 101 male-specific repeats comprising 13.7kb of sequence, with a median repeat size of 101bp.
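The filtering logic can be illustrated with a minimal sketch. The numerical cutoffs for "high coverage in males" and "significantly lower coverage in females" are not stated above, so `min_male_cov` and `max_female_to_male` are hypothetical parameters, as are the function names; only the 5-fold k-mer threshold comes from the text.

```python
def kmer_abundance_threshold(avg_genome_coverage, fold=5):
    """K-mer count above which a k-mer is treated as repetitive:
    `fold` times the average genome coverage (fold=5 in the text)."""
    return fold * avg_genome_coverage


def male_specific_repeats(male_cov, female_cov,
                          min_male_cov, max_female_to_male):
    """Return names of repeats with high male coverage and no or much lower
    female coverage.

    male_cov / female_cov: dicts mapping repeat name -> normalized coverage.
    min_male_cov, max_female_to_male: HYPOTHETICAL cutoffs for illustration.
    """
    selected = []
    for name, m_cov in male_cov.items():
        f_cov = female_cov.get(name, 0.0)
        if m_cov < min_male_cov:
            continue  # not abundant enough in males
        if f_cov / m_cov <= max_female_to_male:
            selected.append(name)
    return sorted(selected)
```

In practice the cutoffs would be chosen by inspecting the distribution of male versus female coverage across all assembled repeats (Figure S9).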