Abstract
Homologous chromosomes colocalize to regulate gene expression in processes including genomic imprinting and X-inactivation, but the mechanisms driving these interactions are poorly understood. In Drosophila, homologous chromosomes pair throughout development, promoting an interchromosomal gene regulatory mechanism called transvection. Despite over a century of study, the molecular features that drive chromosome-wide pairing and transvection are unknown. Here, we find that the ability to pair with a homologous sequence is not a general feature of all loci, but is specific to “button” loci interspersed across the genome. Buttons are characterized by topologically associated domains (TADs), which drive pairing with their endogenous loci from multiple genomic locations. Using a button spanning the spineless gene as a paradigm, we find that pairing is necessary but not sufficient for transvection. spineless pairing and transvection are cell-type-specific, suggesting that local buttoning and unbuttoning regulates transvection efficiency between cell types. Together, our data support a model in which button loci bring homologous chromosomes together to facilitate cell-type-specific interchromosomal gene regulation.
Introduction
Chromosomes are organized in a complex manner in the nucleus. In higher organisms, they localize to distinct territories (1). Regions of chromosomes interact to form compartments, which are segregated based on gene expression states (2). Chromosomes are further organized into TADs, regions of self-association that are hypothesized to isolate genes into regulatory domains and ensure their activation by the correct cis-regulatory elements (2). TADs vary in size from ∼100 kb in Drosophila melanogaster to ∼1 Mb in mammals (2, 3). Disruptions of nuclear organization, such as alteration of TAD structure and localization of genes to incorrect nuclear compartments, have major effects on gene expression (4-7). However, it is unclear how elements within the genome interact between chromosomes to organize chromatin and regulate gene expression.
One key aspect of nuclear architecture involves the tight colocalization, or “pairing,” of homologous chromosomes to facilitate regulatory interactions between different alleles of the same gene (8). In Drosophila melanogaster, homologous chromosomes are paired throughout interphase in somatic cells (9). This stable pairing provides an excellent paradigm to study the mechanisms driving interactions between chromosomes.
Despite over a century of study, it is poorly understood how homologous chromosomes come into close physical proximity. Two main models have been proposed. In the “zipper” model, all regions of the genome have an equal ability to pair based on sequence homology, and chromosome pairing begins at the centromere and proceeds distally to the telomeres (Fig. 1A) (10). The “button” model proposes that regions of high pairing affinity are interspersed along the chromosome arms and come together through a random walk to initiate pairing (Fig. 1A) (11-13). A handful of DNA transgenes are known to drive pairing with their endogenous loci at a relatively low frequency (14-17), but the general sequence and structural features that contribute to stable, chromosome-wide pairing are unknown.
Pairing of homologous chromosomes is required for a gene-regulatory mechanism known as transvection, in which two different mutant alleles interact between chromosomes to rescue gene expression (Fig. 1B) (10). Transvection has been described for a number of Drosophila genes (18). Generally, the efficiency of transvection decreases in the presence of chromosome rearrangements, which are assumed to disrupt chromosome pairing (10, 18). However, it is unclear if the same DNA elements are required for both homologous chromosome pairing and transvection and whether pairing and transvection are mechanistically separable.
Homologous chromosome pairing occurs more strongly in some cell types than in others (11-13). Similarly, transvection efficiency varies widely between cell types (19-21). However, a direct link between the level of pairing in a given cell type and the strength of transvection in that cell type has not been established.
Here, we develop a method to screen for DNA elements that pair and find that regions interspersed across the genome drive pairing, supporting the button hypothesis. A subset of TADs are associated with buttons and can drive pairing from different positions in the genome. By testing mutant alleles and transgenes of the spineless gene, we find that pairing and transvection are mechanistically separable and cell-type-specific. Together, our data suggest that buttons drive homologous chromosome pairing, promoting cell-type-specific interchromosomal gene regulation.
Results
“Button” loci are interspersed along chromosome arms
To distinguish between the zipper and button models of pairing, we tested whether transgenes inserted on different chromosomes were sufficient to drive pairing with their endogenous loci. We hypothesized that if chromosomes come together through a zipper mechanism, all transgenes would drive pairing, whereas if chromosomes are buttoned together, only a subset of transgenes would drive pairing.
We first screened a set of ∼80-110 kb transgenes tiling a ∼1 Mb region on chromosome 3R (Fig. 1G). We inserted individual transgenes into a site on chromosome 2L (site 1; Fig. 1C) and visualized their nuclear position using Oligopaints DNA FISH (22). As the endogenous and transgenic sequences were identical, we distinguished between them by labeling the sequence neighboring the endogenous locus with red probes and the sequence neighboring the transgene insertion site with green probes (Fig. 1C). We examined pairing in post-mitotic larval photoreceptors to avoid disruptions caused by cell division.
Only a subset (5/17) of transgenes drove pairing between chromosomes 2L and 3R, bringing the average distances between the red and green FISH signals significantly closer than in a control with no transgene (Fig. 1C-E, G-H; Fig. S1A). The red and green signals did not completely overlap, likely because they did not directly label the paired sites (Fig. S2A-C). For the remaining 12/17 transgenes, the distances between the red and green signals were not significantly different from the no transgene control, indicating that they did not drive pairing (Fig. 1C-D, F-H; Fig. S1B). Thus, remarkably, a subset of transgenes overcame endogenous nuclear architecture to drive pairing between non-homologous chromosomes, supporting the button model.
The pairing observed between transgenes on chromosome 2L and endogenous sequences on chromosome 3R could be affected by the transgene insertion site. To test the position independence of button pairing, we inserted Transgene E onto chromosome 3L (site 3; Fig. S3A) and found that it paired with its homologous endogenous locus on chromosome 3R (Fig. S3B-D), showing that buttons can drive pairing from different sites in the genome.
Our data suggest that pairing initiates through a button mechanism, in which specific loci interspersed along chromosomes drive homologous chromosomes together.
TADs are features of buttons
Only a subset of transgenes drove pairing, suggesting that unique chromatin structures within these transgenes might enable button function. We examined 14 publicly available Hi-C datasets to determine the relationship between buttons and topologically associated domains (TADs), genomic regions of self-association. We defined TADs using directionality indices, which measure the bias of a genomic region towards upstream or downstream interactions along the chromosome (23). TADs on a directionality index are read from the beginning of a positive peak, which indicates downstream interactions, to the end of a negative peak, which indicates upstream interactions (Fig. 2A; Fig. S4A; Fig. S5A-E; Fig. S6A-E).
We found that 60% of transgenes that drove pairing (“pairers”) encompassed a complete TAD, including both TAD boundaries, compared to only 8% of transgenes that did not drive pairing (“non-pairers”) (Fig. 1H; Fig. S4A; Fig. S7A-B). In a striking example, Transgenes E and F overlapped significantly, but only Transgene E, which contained a full TAD, drove pairing (Fig. 2A; Fig. S1A-B). Together, these data suggest that specific TADs contribute to button function.
To test the hypothesis that TADs are a feature of buttons, we examined additional transgenes spanning regions on chromosomes X, 2L, 2R, and 3R. Six of these transgenes encompassed entire TADs, while the remaining four did not span TADs (Fig. 2F; Fig. S1E; Fig. S5A-E; Fig. S6A-E; Fig. S7A-B). Based on the availability of Oligopaints probes, we used an alternative FISH strategy for a subset of these transgenes, in which the identical transgene and endogenous sequences were labeled with the same red fluorescent probes (Fig. 2B). Half of these transgenes drove pairing with their endogenous loci (Fig. 2C-F; Fig. S1C-E). All of the pairers spanned a TAD, whereas only one of five non-pairers spanned a TAD (Fig. 2F; Fig. S1E; Fig. S5A-E; Fig. S6A-E; Fig. S7A-B), further supporting the importance of TADs for button activity.
In total, for all transgenes tested in Fig. 1H, Fig. 2F, and Fig. S1E, 80% of pairers spanned a TAD (8/10), while only 12% of non-pairers spanned a TAD (2/17) (Fig. 2G), indicating that specific TADs contribute to button activity and drive pairing.
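As a quick arithmetic check, the pooled percentages can be recomputed from the per-screen counts. The first-screen counts of 3/5 pairers and 1/12 non-pairers are inferred here from the stated 60% and 8% and should be read as assumptions, as should the function and variable names.

```python
# (transgenes spanning a TAD, transgenes tested) for each of the two screens.
# First-screen counts are inferred from the stated 60% and 8% (assumption).
pairers = [(3, 5), (5, 5)]
non_pairers = [(1, 12), (1, 5)]

def pooled(counts):
    """Sum hits and totals across screens; return (hits, total, rounded percent)."""
    hits = sum(h for h, _ in counts)
    total = sum(t for _, t in counts)
    return hits, total, round(100 * hits / total)

pairer_summary = pooled(pairers)          # (8, 10, 80)
non_pairer_summary = pooled(non_pairers)  # (2, 17, 12)
```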
The ∼80-110 kb size limitation of publicly available transgenes prevented testing larger TADs for pairing with our transgene assay. Transgenes that covered only parts of a large TAD on chromosome 3R did not drive pairing (Fig. S8A). To test this large TAD for pairing, we utilized a 460-kb duplication of chromosome 3R onto chromosome 2R (Fig. S8B), which encompassed the entire TAD (Fig. S8A). We found that the duplication drove pairing with its homologous endogenous site (Fig. S8C-E), further supporting a role for TADs in pairing.
Because TAD boundaries are often enriched for insulator protein binding sites (3), we hypothesized that pairers might contain a higher number of insulator protein binding sites than non-pairers. We examined modENCODE ChIP data and found that pairers were enriched for insulator binding sites (Fig. 2H), consistent with the conclusion that TADs contribute to button function to drive homologous chromosome pairing.
One prediction of the button model for chromosome pairing is that the content of a transgene (i.e. TADs), not the length of DNA homology, determines pairing. We found no relationship between transgene length and ability to drive pairing (Fig. 2I). Indeed, transgenes of near identical lengths had different pairing abilities (Fig. 2I), further indicating that buttons have distinct features beyond DNA sequence homology.
To identify additional genomic elements that contribute to pairing, we further examined modENCODE ChIP data and found that activating H3K4me3 marks positively correlated with pairing (Fig. 2J). As pairing was not associated with Polycomb Group (PcG) binding sites, repressing epigenetic marks, or non-coding RNAs (ncRNAs) (Fig. S9A-F), we hypothesized that active transcription plays a role in pairing. To test this possibility, we performed RNA-seq on larval eye discs, the same tissue we used in our pairing experiments. We found that pairing positively correlated with gene expression (Fig. 2K), suggesting that gene activity, in addition to TADs, is a feature of buttons that drives pairing.
Together, our data indicate that buttons, characterized by TADs and gene activity, drive pairing of homologous chromosomes.
Pairing and transvection occur despite chromosomal rearrangements
We next interrogated the relationship between button pairing and the gene regulatory process of transvection. Chromosomal rearrangements have been shown to disrupt pairing of genes located near rearrangement breakpoints (10, 18). However, we observed pairing of ∼100 kb transgenes with their endogenous loci, suggesting that intact homologous chromosomes are not required for pairing and that pairing tolerates nearby breakpoints. We therefore reexamined how rearrangements affect pairing, focusing on a button defined by a TAD spanning the spineless (ss) locus (“ss button”; Fig. 3A).
To assess the effects of local rearrangements on ss button pairing, we examined a naturally occurring chromosomal inversion with a breakpoint in the gene immediately upstream of ss (ssinversion) and a duplication with a breakpoint immediately downstream of ss (Fig. 3E). Both ssinversion and the duplication paired with endogenous ss (Fig. S8C-E; Fig. S10A-B), showing that ss button pairing occurs despite chromosomal rearrangements. Consistent with these findings, pairing also occurred at the ss locus in flies with balancer chromosomes containing numerous large inversions and rearrangements (Fig. S10F-J). Thus, in some cases, pairing occurs despite chromosomal rearrangements, consistent with the button model.
Pairing is required for the genetic phenomenon of transvection, in which DNA elements on a mutant allele of a gene act between chromosomes to rescue expression of a different mutant allele (Fig. 1B). In cases where chromosomal rearrangements perturb pairing, transvection is also disrupted (10, 18). Since chromosomal rearrangements did not ablate pairing at the ss button, we hypothesized that transvection would occur at the ss locus in these genetic conditions.
In the fly eye, Ss is normally expressed in ∼70% of R7 photoreceptors to activate expression of Rhodopsin 4 (Rh4) and repress Rhodopsin 3 (Rh3; Fig. 3B-D). Ss is absent in the remaining 30% of R7s, allowing Rh3 expression (Fig. 3B-D) (24). Regulatory mutations in the ss gene cause decreases or increases in the ratio of SsON: SsOFF cells. When two ss alleles with different ratios are heterozygous, transvection between chromosomes (also known as Interchromosomal Communication) determines the final ratio of SsON: SsOFF R7s (25). Thus, the SsON: SsOFF ratio is a phenotype that allows for quantitative assessment of transvection. Throughout our ss transvection experiments, we evaluated Rh3 and Rh4 expression, as they faithfully report Ss expression in R7s (i.e. SsON = Rh4; SsOFF = Rh3). We previously observed transvection at the ss locus for the duplication and balancer chromosome alleles (25). We similarly observed transvection at the ss locus for the ssinversion allele (Fig. S10C-E). Together, these data suggested that buttons can drive pairing and transvection despite chromosomal rearrangements.
Pairing is necessary but not sufficient for transvection
As chromosomal rearrangements did not impair ss pairing or transvection, we further investigated the relationship between pairing and transvection using ss transgenes. Both Transgene S and Transgene T are expressed in 100% of R7s (Fig. S11A-J) because they lack a silencer DNA element, but do not produce functional Ss protein because they lack critical coding exons (Fig. 3E) (25). Transgene T differs from Transgene S in that it lacks 6 kb at its 5’ end (Fig. 3E). We predicted that if Transgenes S and T performed transvection, they would upregulate expression of endogenous ss.
When inserted onto chromosomes 2L or 3L (sites 1 and 3), Transgenes S and T did not drive pairing with the endogenous ss locus on chromosome 3R (Fig. 3F-G, I, K; Fig. S12A-C, E, Q). At these sites, Transgenes S and T did not upregulate ss expression, indicating that they could not perform transvection when unpaired (Fig. 3G-L; Fig. S12A-F).
We next wondered whether Transgenes S and T could perform transvection if we mimicked pairing by forcing them into close physical proximity with endogenous ss. We performed a FISH screen to identify genomic sites that naturally loop to endogenous ss (Fig. 3M) and identified three such sites, located 4.8 Mb upstream of ss, 0.4 Mb upstream of ss, and 4.6 Mb downstream of ss (sites 2, 4, and 5; Fig. 3N-O; Fig. S12G-H, M-N, Q).
When we inserted Transgene S at these sites, it was forced into close proximity with endogenous ss (Fig. 3Q; Fig. S12I, O, Q) and upregulated Ss (Rh4) into nearly 100% of R7s (Fig. 3P, R; Fig. S12J, P) (25). Thus, natural chromosome looping can force loci into proximity and, like pairing, facilitate transvection. In contrast, when we forced Transgene T into close proximity with endogenous ss, it did not upregulate Ss (Rh4) expression, indicating that it could not perform transvection even when paired (Fig. 3P, S-T; Fig. S12K-L, Q). Thus, pairing is necessary but not sufficient for transvection.
We compared the DNA sequences of Transgene T, which does not perform transvection, to Transgene S, the duplication, and the ssinversion, which perform transvection. An upstream region of ∼1.6 kb is present in Transgene S, the duplication, and the ssinversion, but missing from Transgene T, suggesting that this region contains a critical element for transvection (Fig. 3E). ModENCODE ChIP-seq data showed that this region was bound by the Drosophila insulator proteins CTCF, BEAF, Mod(Mdg4), and Cp190. Additionally, this DNA sequence performed P-element homing (25), an indicator of insulator activity. Together, these data suggested that the DNA element required for transvection is an insulator.
To further test whether this insulator was required for transvection, we examined Transgene E, which drove pairing and contained the complete ss locus, except for the insulator element (Fig. 1H; Fig. 3E; Fig. S1A; Fig. S3B-D). We utilized genetic backgrounds in which Transgene E was the only source of Ss protein, so that any changes in Ss (Rh4) expression would indicate transvection effects on Transgene E. As a control, we examined Transgene E expression when the endogenous ss locus was hemizygous for a protein null allele (ssprotein null) that did not perform transvection (Fig. S13A-B). In this background, ss on Transgene E was expressed in 52% of R7s (Fig. S13A-B). We next tested Transgene E for transvection with a high-frequency protein null allele (sshigh freq null), which can perform transvection to increase ss expression (Fig. S10D) (25). When the endogenous ss locus was hemizygous for the sshigh freq null, we observed no increase in Transgene E expression, indicating that it did not perform transvection (51% Ss (Rh4); Fig. S13A, C). Moreover, Transgene E did not perform transvection in other genetic conditions (Fig. S13D-E). Thus, Transgene E paired with the endogenous ss locus but failed to perform transvection. These data show that an insulator is required for transvection but not for pairing, indicating that transvection and pairing are mechanistically separable.
ss pairing and transvection are cell-type-specific
It is poorly understood how pairing impacts transvection in a cell-type-specific manner. We propose two models: constitutive and cell-type-specific buttoning. In the constitutive model, all buttons drive pairing in all cell types, and differences in transvection would occur due to variation in transcription factor binding or chromatin state between cell types (Fig. 4A). In the cell-type-specific model, different buttons drive pairing in each cell type, bringing different regions into physical proximity to control transvection efficiency (Fig. 4A).
We tested these models by investigating pairing and transvection of ss in two different tissues. In addition to its role in R7 photoreceptors, ss is required for the development of the arista, a structure on the antenna (Fig. 4C-D) (24, 26). Transgene E, which contains the ss button, drove pairing in the eye (Fig. 1H; Fig. 4B; Fig. S1A; Fig. S3B-D) but not the antenna from two different insertion sites (sites 1 and 3; Fig. 4B; Fig. S14A-H), suggesting that button pairing is cell-type-specific.
As pairing is required for transvection and the ss button pairs in a cell-type-specific manner, we hypothesized that transvection at the ss locus is cell-type-specific. To test this hypothesis, we examined an allele of ss that specifically affects arista development (ssarista 1) (Fig. 4G-J; Fig. S15A-F). In flies transheterozygous for ssarista 1 and a ss deficiency (ssdef), aristae were transformed into legs (i.e. aristapedia) (Fig. 4G-H; Fig. S15A, C). Aristapedia was also observed for ssprotein null / flies (Fig. 4E-F). In the eye, ssprotein null performed transvection to rescue ss expression (Fig. S16A-D). However, the aristapedia mutant phenotype persisted in ssarista 1 / ssprotein null flies (Fig. 4I-J; Fig. S15D, F), suggesting that, unlike in the eye, transvection does not rescue ss expression in the arista. Cell-type-specific transvection of the ss gene in the eye but not the arista was also observed in other genetic conditions (Fig. S15G-L; Fig. S17A-L).
As ss button pairing and transvection are cell-type-specific and pairing is required for transvection, our data support the cell-type-specific model, in which local buttoning and unbuttoning occur in a cell-type-specific manner to determine transvection efficiency (Fig. 4A).
Discussion
Despite the discovery of homologous chromosome pairing in flies over 100 years ago (9), the mechanisms that facilitate pairing have remained unclear. We find that the ability to pair with a homologous sequence is not a general feature of all loci, but is specific to a subset of loci (buttons) interspersed across the genome. Specific TADs drive button activity and can pair from multiple locations in the genome. Individual TADs may take on unique chromatin conformations or bind unique combinations of proteins to create nuclear microcompartments that enable homologous TAD association and pairing (Fig. 2L). As gene activity is also a feature of buttons, the mechanisms that promote specific enhancer-promoter interactions on the same chromosome may also act between chromosomes to pair active regions together (Fig. 2L). Additional small DNA elements may also facilitate pairing (14-17). Complementary work from Erceg, AlHaj Abed, & Goloborodko, et al. (27) and AlHaj Abed, Erceg, & Goloborodko, et al. (28) using Hi-C also reveals variable levels of pairing across the genome, with implications for genome function.
Our data indicate that pairing and transvection are mechanistically separable: TADs and gene activity facilitate pairing, while an insulator element facilitates transvection to the endogenous spineless locus. Consistent with our findings using endogenous alleles, an insulator is required for transvection but not pairing between transgenes containing the snail enhancer and the eve promoter (29).
We find that the ss locus drives pairing and performs transvection in the eye but not in the antenna. Our results support a model in which different buttons drive pairing in different cell types. In this model, local buttoning or unbuttoning at a specific gene determines its transvection efficiency in a given cell type. Variation in levels of pairing or transvection across cell types has been observed for a number of loci (12, 13, 21), suggesting that differences in pairing between cell types may be a general mechanism regulating gene expression.
The mechanisms driving chromosome pairing and transvection have remained a mystery of fly genetics since their initial discoveries by Nettie Stevens and Ed Lewis (9, 10). Our results provide strong support for the button model of pairing initiation and offer the first evidence of a general feature, specialized TADs, that drives homologous chromosomes together. Furthermore, we find that pairing is necessary but not sufficient for transvection and that distinct elements are required for these processes. Both pairing and transvection are cell-type-specific, suggesting that tighter pairing in a given cell type enables more efficient transvection in that cell type. Our findings suggest a general mechanism that drives homologous chromosome pairing and interchromosomal gene regulation across organisms to facilitate processes including X-inactivation and imprinting.
Materials and Methods
Drosophila lines
Flies were raised on standard cornmeal-molasses-agar medium and grown at 25°C.
Constructs were inserted via PhiC31 integration at the following landing sites:
Oligopaints probe libraries
Antibodies
Antibodies and dilutions were as follows: mouse anti-Lamin B (DSHB ADL67.10 and ADL84.12), 1:100; rabbit anti-GFP (Invitrogen), 1:500; rabbit anti-Rh4 (gift from C. Zuker, Columbia University), 1:50; mouse anti-Rh3 (gift from S. Britt, University of Texas at Austin), 1:50; mouse anti-Prospero (DSHB MR1A), 1:10; rat anti-Elav (DSHB 7E8A10), 1:50; guinea pig anti-Ss (gift from Y.N. Jan, University of California, San Francisco), 1:500. All secondary antibodies (Molecular Probes) were Alexa Fluor-conjugated and used at a dilution of 1:400.
Antibody staining (pupal and adult eyes)
Dissections were performed as described in references (9, 12-14). Eyes were dissected and fixed at room temperature for 15 minutes in 4% formaldehyde diluted in 1X PBX (PBS+0.3% Triton-X), then washed three times in 1X PBX. Eyes were incubated overnight at room temperature in primary antibody diluted in 1X PBX, then washed three times in 1X PBX and incubated in PBX at room temperature for ≥3 hours. Secondary antibody diluted in 1X PBX was added and incubated overnight at room temperature. Eyes were then washed three times in 1X PBX and incubated in PBX at room temperature for ≥2 hours. Adult eyes were mounted in SlowFade Gold (Invitrogen), and pupal eyes were mounted in Vectashield (Vector Laboratories, Inc.). Images were acquired on a Zeiss LSM700 confocal microscope.
The adult eye dissection protocol was used for Fig. 3H, J, L, P, R, T; Fig. S10C-E; Fig. S12D, F, J, L, P; Fig. S13B-C, E; Fig. S15B, E, H, K; Fig. S16B, D; and Fig. S17B, E, H, K. The pupal dissection protocol was used for Fig. 3C and Fig. S11B-J.
Oligopaints probe design
Probes for DNA FISH were designed using the Oligopaints technique (15, 16). Target sequences were run through the bioinformatics pipeline available at http://genetics.med.harvard.edu/oligopaints/ to identify sets of 42-bp (for old ss 90K probes) or 50-bp (for all other probes) optimized probe sequences (i.e. “libraries”) tiled across the DNA sequence of interest. Five 19-bp barcoding primers (gene F and R, universal (univ) F and R, and either sublibrary (sub) F or random (rando) R) were appended to the 5’ and 3’ ends of each probe sequence (Fig. S18A-B). To ensure that all probes were the same length, an additional 8-bp random sequence was added to the 3’ end of the old ss 90K probes. The gene F and R primers allowed PCR amplification of a probe library of interest out of the total oligo pool, and the univ F and R primers allowed conjugation of fluorophores, generation of single-stranded DNA probes, and PCR addition of secondary sequences to amplify probe signal. The ss 50-kb left and right extension libraries had a sub F primer between the gene and universal forward primers to allow PCR amplification of probes targeting a specific sub-region of the locus of interest (Fig. S18A). All other probe libraries had a rando R primer appended at the 3’ end to maintain a constant sequence length between all probes (Fig. S18B).
Barcoding primer sequences were taken from a set of 240,000 randomly generated, orthogonal 25-bp sequences (17) and run through a custom script to select 19-bp sequences with ≤15-bp homology to the Drosophila genome. Primers were appended to probe sequences using the orderFile.py script available at http://genetics.med.harvard.edu/oligopaints/. Completed probe libraries were synthesized as custom oligo pools by Custom Array, Inc. (Bothell, WA), and fluorescent FISH probes were generated as described in references (15, 16).
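The length bookkeeping described above can be checked with a short sketch. It assumes that the full oligo length is simply the probe sequence plus any random padding plus five 19-bp primers; the function and constant names are hypothetical and are not taken from the Oligopaints scripts.

```python
PRIMER_LEN_BP = 19      # length of each barcoding primer (from the text)
PRIMERS_PER_OLIGO = 5   # gene F/R, univ F/R, and sub F or rando R (from the text)

def oligo_length(probe_len_bp, pad_len_bp=0):
    """Total oligo length, assuming simple concatenation of all parts."""
    return probe_len_bp + pad_len_bp + PRIMERS_PER_OLIGO * PRIMER_LEN_BP

old_ss_90k = oligo_length(42, pad_len_bp=8)  # 42-bp probe plus 8-bp random pad
all_others = oligo_length(50)                # 50-bp probe, no pad
```

Under this assumption both designs come out at 145 nt, which is consistent with the stated purpose of the 8-bp padding.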
DNA FISH
DNA FISH was performed using modified versions of the protocols described in references (15, 16). Between 20 and 50 eye-antennal discs attached to mouth hooks from third instar larvae were collected on ice and fixed in 129 μL ultrapure water, 20 μL 10X PBS, 1 μL Tergitol NP-40, 600 μL heptane, and 50 μL fresh 16% formaldehyde. Tubes containing the fixative and eye discs were shaken vigorously by hand, then fixed for 10 minutes at room temperature with nutation. Eye discs were then given three quick washes in 1X PBX, followed by three five-minute washes in PBX at room temperature with nutation. Eye discs were then removed from the mouth hooks and blocked for 1 hour in 1X PBX+1% BSA at room temperature with nutation. They were then incubated in primary antibody diluted in 1X PBX overnight at 4°C with nutation. Next, eye discs were washed three times in 1X PBX for 20 minutes and incubated in secondary antibody diluted in 1X PBX for two hours at room temperature with nutation. Eye discs were then washed two times for 20 minutes in 1X PBX, followed by a 20-minute wash in 1X PBS. Next, discs were given one 10-minute wash in 20% formamide+2X SSCT (2X SSC+0.001% Tween-20), one 10-minute wash in 40% formamide+2X SSCT, and two 10-minute washes in 50% formamide+2X SSCT. Discs were then predenatured by incubating for four hours at 37°C, three minutes at 92°C, and 20 minutes at 60°C. Primary probes were added in 45 μL hybridization buffer consisting of 50% formamide+2X SSCT+2% dextran sulfate (w/v) + 1 μL RNase A. All probes were added at a concentration of ≥5 pmol fluorophore/μL. For FISH experiments in which a single probe was used, 4 μL of probe was added. For FISH experiments in which two probes were used, 2 μL of each probe was added. After addition of probes, eye discs were incubated at 91°C for three minutes and at 37°C for 16-20 hours with shaking. Eye discs were then washed for 1 hour at 37°C with shaking in 50% formamide+2X SSCT.
1 μL of each secondary probe was added at a concentration of 100 pmol/μL in 50 μL of 50% formamide+2X SSCT. Secondary probes were hybridized for 1 hour at 37°C with shaking. Eye discs were then washed twice for 30 minutes in 50% formamide+2X SSCT at 37°C with shaking, followed by three 10-minute washes at room temperature in 20% formamide+2X SSCT, 2X SSCT, and 2X SSC with nutation. Discs were mounted in SlowFade Gold immediately after the final 2X SSC wash, and imaged using a Zeiss LSM700 confocal microscope.
Generation of CRISPR lines
CRISPR lines were generated as described in references (18-21). For both ssenh del and ssupstream del, sense and antisense DNA oligos for the forward and reverse strands of four gRNAs were designed to generate BbsI restriction site overhangs. The oligos were annealed and cloned into the pCFD3 cloning vector (Addgene, Cambridge, MA). A single-stranded DNA homology bridge was generated with 60-bp homologous regions flanking each side of the predicted cleavage site and an EcoRI (for ssenh del) or NaeI (for ssupstream del) restriction site to aid in genotyping. The gRNA constructs (125 ng/μL) and homologous bridge oligo (100 ng/μL) were injected into Drosophila embryos (BestGene, Inc., Chino Hills, CA). Single males were crossed with a balancer stock (yw; +; TM2/TM6B), and F1 female progeny were screened for the insertion via PCR, restriction digest, and sequencing. Single F1 males whose siblings were positive for the deletion were crossed to the balancer stock (yw; +; TM2/TM6B), and the F2 progeny were screened for the deletion via PCR, restriction digest, and sequencing. Deletion-positive flies from multiple founders were used to establish independent stable stocks.
The following oligos were used for the ssenh del CRISPR:
The following oligos were used for the ssupstream del CRISPR:
Scanning electron microscopy
Adult Drosophila heads were removed and immediately mounted on a pin stub without fixation or sputtering. Heads were imaged at high vacuum at a voltage of 1.5 kV. All SEM was performed on a FEI Quanta ESEM 200 scanning electron microscope. SEM was used for Fig. 4D, F, H, J; Fig. S15C, F, I, L; and Fig. S17C, F, I, L.
Pairing quantifications
All quantifications were performed in 3D on z-stacks with a slice thickness of 0.2 μm. Quantifications were performed manually using Fiji (22, 23). To chart the z position of each FISH dot, a line was drawn through the dot and the Plot Profile tool was used to assess the stack in which the dot was brightest. To determine the x-y distance between the two FISH dots, a line was drawn from the center of one dot to the center of the other dot and the length of the line was measured with the Plot Profile tool. The distance between the FISH dots was then calculated in 3D. A total of 50 nuclei from three eye discs were quantified for each genotype (i.e. N=3, n=50).
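The 3D distance calculation can be sketched as follows. The text does not give the formula, so this assumes a Euclidean distance in which the z offset is the difference in brightest-slice index multiplied by the 0.2 μm slice thickness; the function and parameter names are hypothetical.

```python
import math

Z_STEP_UM = 0.2  # slice thickness stated in the text

def distance_3d(xy_distance_um, z_slice_a, z_slice_b, z_step_um=Z_STEP_UM):
    """Euclidean 3D distance between two FISH dots (assumed formula).

    xy_distance_um: in-plane distance measured between the dot centers.
    z_slice_a, z_slice_b: index of the slice in which each dot is brightest.
    """
    dz_um = abs(z_slice_a - z_slice_b) * z_step_um
    return math.hypot(xy_distance_um, dz_um)
```

For example, two dots measured 0.3 μm apart in x-y whose brightest slices are two slices apart would be about 0.5 μm apart in 3D under this assumption.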
For experiments in which the transgene and endogenous site were both labeled with red fluorescent probes, FISH puncta ≤0.4 μm apart could not be distinguished as separate and were assigned a distance of 0.4 μm. For all controls in Fig. 2F, green probes labeling the transgene insertion site were pseudocolored red and data were quantified in the same way as experiments in which the transgene and endogenous site were both labeled with red probes. Thus, 3L-X control data in Fig. 2F are the same as in Fig. S1E, but the data were re-quantified with the green probes pseudocolored red. Similarly, 2L-3R control data in Fig. 2F are the same as in Fig. 1H, S10J, and S12Q (site 1 control), but the data were re-quantified with the green probes pseudocolored red.
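The distance calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure (quantification was manual in Fiji). It assumes the z separation is the difference in brightest-slice index multiplied by the 0.2 μm slice thickness and that the x-y and z components are combined by the Pythagorean theorem. The function name and its parameters are hypothetical.

```python
import math

Z_STEP_UM = 0.2          # slice thickness stated in the text
MIN_RESOLVABLE_UM = 0.4  # floor used when both probes are the same color


def dot_distance_3d(xy_um, slice_a, slice_b, same_color=False):
    """Return the 3D distance (in micrometers) between two FISH dots.

    xy_um      -- measured x-y distance between dot centers, in micrometers
    slice_a/b  -- index of the z-slice in which each dot is brightest
    same_color -- True when both sites carry the same-color probe, in which
                  case distances at or below 0.4 um are reported as 0.4 um
    """
    # Assumption: z separation = slice index difference x slice thickness.
    dz_um = abs(slice_a - slice_b) * Z_STEP_UM
    distance = math.hypot(xy_um, dz_um)
    if same_color and distance <= MIN_RESOLVABLE_UM:
        return MIN_RESOLVABLE_UM
    return distance
```

For example, two dots 0.3 μm apart in x-y whose brightest slices are two slices apart are separated by 0.4 μm in z, giving a 3D distance of 0.5 μm.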
Adult eye quantifications
The frequencies of Rh4- and Rh3-expressing R7s were scored manually for at least eight eyes per genotype. R7s co-expressing Rh3 and Rh4 were scored as Rh4-positive. At least 100 R7s were scored for each eye. For Fig. S17E, H, and K, only the ventral half of each eye was scored.
Hi-C mapping and TAD calling
Directionality index scores were calculated across 15-kb windows, stepping every 5 kb, by finding the log2 transform of the difference in the ratios of downstream versus upstream summed observed over expected interactions ranging from 15 kb to 100 kb in size. The expected value of a bin was defined as the sum of the product of fragment corrections for each valid fragment pair with both interaction fragments falling within the bin.
Directionality indices were generated using 14 published Hi-C datasets (24-27):
TADs were read from the beginning of a positive directionality index peak to the end of a negative directionality index peak. Parameters for calling a TAD were as follows: 1) The positive peak must have a signal of ≥0.8; 2) The negative peak must have a signal of ≤-0.8; and 3) The TAD must be present in at least two datasets. Any transgene covering ≥95% of a TAD was considered to span a TAD.
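The last two criteria above (presence in at least two datasets and ≥95% coverage by the transgene) can be expressed as a short check. This is a hedged sketch under stated assumptions: coordinates are treated as half-open intervals in base pairs, peak-threshold filtering (≥0.8 and ≤-0.8) is assumed to have been applied already, and the function names are hypothetical.

```python
def covered_fraction(transgene, tad):
    """Fraction of the TAD covered by the transgene.

    Both arguments are (start, end) tuples in bp on the same chromosome,
    treated as half-open intervals (an assumption for this sketch).
    """
    t_start, t_end = transgene
    d_start, d_end = tad
    overlap = max(0, min(t_end, d_end) - max(t_start, d_start))
    return overlap / (d_end - d_start)


def spans_tad(transgene, tad, n_datasets_with_tad,
              min_fraction=0.95, min_datasets=2):
    """True if the TAD is supported by enough datasets and the transgene
    covers at least min_fraction of it (thresholds taken from the text)."""
    if n_datasets_with_tad < min_datasets:
        return False
    return covered_fraction(transgene, tad) >= min_fraction
```

With these definitions, a transgene covering 90 kb of a 100-kb TAD would not be counted as spanning it, whereas one covering 98 kb would, provided the TAD was called in at least two datasets.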
mRNA sequencing and analysis
RNA-seq was performed on three biological replicates, each consisting of 30 third instar larval eye discs. Eye discs were dissected in 1X PBS, separated from the mouth hooks and antennal discs, and placed directly into 300 μL of Trizol. RNA was purified using a Zymo Direct-zol RNA MicroPrep kit (catalog number R2062). mRNA libraries were prepared using an Illumina TruSeq Stranded mRNA LT Sample Prep Kit (catalog number RS-122-2101). Sequencing was performed using an Illumina NextSeq 500 (75 bp, paired end). Sequencing returned an average of 23,048,349 reads per replicate.
The following pipeline was used for mRNA-sequencing analysis: 1) FASTQ sequencing datasets were assessed for quality using FastQC; 2) Pseudoalignment with the Drosophila dm6 transcriptome and read quantifications were performed using kallisto (28); 3) Transcript abundance files generated by kallisto were joined to a file containing the genomic coordinates of all Drosophila mRNA transcripts (dmel-all-r6.20.gtf, available from FlyBase); 4) The joined transcript coordinate file was compared to a file containing the coordinates of all tested transgenes using the bedtools intersect tool (http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html). The output file contained a list of all of the active genes per transgene.
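Step 4 of the pipeline can be illustrated with a small pure-Python stand-in for the interval comparison. This sketch is not the bedtools implementation. It assumes BED-style half-open coordinates and at least 1 bp of overlap. The `min_tpm` cutoff for calling a gene "active" is a hypothetical parameter, since the text does not state the threshold used, and the function and field names are likewise illustrative.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least 1 bp."""
    return a_start < b_end and b_start < a_end


def active_genes_per_transgene(transcripts, transgenes, min_tpm=0.0):
    """List expressed genes overlapping each transgene.

    transcripts -- iterable of (chrom, start, end, gene, tpm) tuples, i.e.
                   kallisto abundances joined to transcript coordinates
    transgenes  -- iterable of (chrom, start, end, name) tuples
    min_tpm     -- hypothetical expression cutoff (not given in the text)
    """
    result = {}
    for chrom, start, end, name in transgenes:
        genes = set()
        for t_chrom, t_start, t_end, gene, tpm in transcripts:
            if (t_chrom == chrom and tpm > min_tpm
                    and overlaps(start, end, t_start, t_end)):
                genes.add(gene)
        result[name] = sorted(genes)
    return result
```

Multiple transcripts of the same gene collapse to a single entry, so the output is a list of genes rather than transcripts for each transgene.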
Assessment of chromatin marks and ncRNA, Polycomb Group Complex, and insulator density
NcRNA content of transgenes was assessed manually using the GBrowse tool on FlyBase. tRNAs, miRNAs, snoRNAs, and lncRNAs were included in the analysis of ncRNA content.
Transgenes were evaluated for insulator binding sites, Polycomb Group Complex binding sites, and the presence of chromatin marks using publicly available modENCODE ChIP-seq datasets. The following ChIP-seq datasets were used for this analysis:
For each protein or chromatin mark, .bed files containing the genomic coordinates of all ChIP peaks in each modENCODE dataset were downloaded and merged into one file using the bedtools merge tool (http://bedtools.readthedocs.io/en/latest/content/tools/merge.html). The merged file was compared to a .bed file containing the genomic coordinates of all transgenes using the bedtools intersect tool (http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html). This pipeline output the number of protein or chromatin mark ChIP peaks contained in each transgene. The numbers of ChIP peaks for BEAF-32, Su(Hw), CTCF, Cp190, Mod(Mdg4), and GAF were added together to calculate the total number of insulator binding sites per transgene in Fig. 2H.
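The merge-then-count logic of this pipeline can be sketched in plain Python. This is an illustrative approximation of what bedtools merge and bedtools intersect do with default settings (overlapping or book-ended peaks are merged, and a peak is counted if it shares at least 1 bp with the transgene), not a replacement for those tools. Function names and the toy data layout are assumptions.

```python
def merge_peaks(peaks):
    """Merge overlapping or book-ended peaks pooled from several datasets.

    peaks -- iterable of (chrom, start, end) in half-open BED coordinates
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)  # extend the current merged peak
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def count_peaks(transgene, merged_peaks):
    """Number of merged peaks sharing at least 1 bp with the transgene."""
    chrom, start, end = transgene
    return sum(1 for c, s, e in merged_peaks
               if c == chrom and s < end and start < e)


def total_insulator_sites(transgene, peaks_by_factor):
    """Sum per-factor peak counts, as done for the insulator total.

    peaks_by_factor -- dict mapping a factor name to its pooled peak list
    """
    return sum(count_peaks(transgene, merge_peaks(p))
               for p in peaks_by_factor.values())
```

Merging before counting prevents the same binding site from being counted once per dataset when several datasets report overlapping peaks for one factor.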
Statistical analysis
All datasets were tested for a Gaussian distribution using a D’Agostino and Pearson omnibus normality test and a Shapiro-Wilk normality test. If either test indicated a non-Gaussian distribution, datasets were tested for statistical significance using a Wilcoxon rank-sum test (for single comparisons) or a one-way ANOVA on ranks with Dunn’s multiple comparisons test (for multiple comparisons). If both the D’Agostino and Pearson and the Shapiro-Wilk tests indicated a Gaussian distribution, datasets were tested for statistical significance using an unpaired t-test with Welch’s correction (for single comparisons) or an ordinary one-way ANOVA with Dunnett’s multiple comparisons test (for multiple comparisons).
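The decision rule above can be written out as a small function. This sketch only encodes the choice of test and performs no statistics itself. The significance cutoff `alpha=0.05` for the normality tests is an assumption, as the text does not state it, and the function name and input layout are hypothetical.

```python
def choose_test(normality_p_values, multiple_comparisons, alpha=0.05):
    """Pick the significance test implied by the normality results.

    normality_p_values   -- list of (dagostino_pearson_p, shapiro_wilk_p)
                            pairs, one pair per dataset being compared
    multiple_comparisons -- True when more than one comparison is made
    alpha                -- assumed cutoff for rejecting normality
    """
    # Data are treated as Gaussian only if both tests pass for every dataset.
    gaussian = all(p >= alpha for pair in normality_p_values for p in pair)
    if gaussian:
        if multiple_comparisons:
            return ("ordinary one-way ANOVA with "
                    "Dunnett's multiple comparisons test")
        return "unpaired t-test with Welch's correction"
    if multiple_comparisons:
        return "one-way ANOVA on ranks with Dunn's multiple comparisons test"
    return "Wilcoxon rank-sum test"
```

For instance, if one dataset in a two-group comparison fails the Shapiro-Wilk test, the function returns the Wilcoxon rank-sum test even when the D'Agostino and Pearson test passes.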