Abstract
The Drosophila genome contains >13,000 protein coding genes, the majority of which remain poorly investigated. Important reasons include the lack of antibodies or reporter constructs to visualise these proteins. Here we present a genome-wide fosmid library of ≈10,000 GFP-tagged clones, comprising tagged genes and most of their regulatory information. For 880 tagged proteins we have created transgenic lines and for a total of 207 lines we have assessed protein expression and localisation in ovaries, embryos, pupae or adults by stainings and live imaging approaches. Importantly, we can visualise many proteins at endogenous expression levels and find a large fraction of them localising to subcellular compartments. Using complementation tests we demonstrate that two-thirds of the tagged proteins are fully functional. Moreover, our clones also enable interaction proteomics from developing pupae and adult flies. Taken together, this resource will enable systematic analysis of protein expression and localisation in various cellular and developmental contexts.
Impact statement
We provide a large-scale transgenic resource, which enables live imaging, subcellular localisation and interaction proteomics of selected gene products at all stages of Drosophila development.
Introduction
With the complete sequencing of the Drosophila genome (Adams et al., 2000) genome-wide approaches have been increasingly complementing the traditional single gene, single mutant studies. This is exemplified by the generation of a genome-wide transgenic RNAi library (Dietzl et al., 2007) to systematically assess gene function in the fly or by the documentation of the entire developmental transcriptome during all stages of the fly’s life cycle by mRNA sequencing (Graveley et al., 2010). Furthermore, expression patterns were collected for many genes during Drosophila embryogenesis by systematic mRNA in situ hybrisation studies in different tissues (Hammonds et al., 2013; Tomancak et al., 2002; 2007). Particularly for transcription factors (TFs) these studies revealed complex and dynamic mRNA expression patterns in multiple primordia and organs during development (Hammonds et al., 2013), supposedly driven by specific, modular enhancer elements (Kvon et al., 2014). Furthermore, many mRNAs are not only dynamically expressed but also subcellularly localised during Drosophila oogenesis (Jambor et al., 2015) and early embryogenesis (Lécuyer et al., 2007). Together, these large-scale studies at the RNA level suggest that the activity of many genes is highly regulated in different tissues during development. Since the gene function is mediated by the encoded protein(s), the majority of proteins should display particular expression and subcellular localisation patterns that correlate with their function.
However, a lack of specific antibodies or live visualisation probes thus far hampered the systematic survey of protein expression and localisation patterns in various developmental and physiological contexts in Drosophila. Specific antibodies are only available for about 450 Drosophila proteins (Nagarkar-Jaiswal et al., 2015), and the versatile epitope-tagged UAS-based overexpression collection that recently became available (Bischof et al., 2013) is not suited to study protein distribution at endogenous expression levels. Collections of knock-in constructs are either limited to specific types of proteins (Dunst et al., 2015) or rely on inherently random genetic approaches, such as the large-scale protein-trap screens or the recently developed MiMIC (Minos Mediated Insertion Cassette) technology (Venken et al., 2011). The classical protein-trap screens are biased for highly expressed genes, and altogether recovered protein traps in 514 genes (Buszczak et al., 2007; Lowe et al., 2014; Morin et al., 2001; Quiñones-Coello et al., 2007). The very large scale MiMIC screen isolated insertions in the coding region of 1,862 genes, 200 of which have been converted into GFP-traps available to the community (Nagarkar-Jaiswal et al., 2015). Both approaches rely on transposons to mediate cassette insertion and require integration into an intron surrounded by coding exons for successful protein tagging. Thus, about 3,000 proteins, whose ORF is encoded within a single exon, cannot be tagged by these approaches. Together, this creates a significant bias towards trapping a particular subset of the more than 13,000 protein coding genes in the fly genome.
Hence, the Drosophila community would profit from a resource that enables truly systematic protein visualisation at all developmental stages for all protein coding genes, while preserving the endogenous expression pattern of the tagged protein. One strategy to generate a comprehensive resource of tagged proteins is to tag large genomic clones by recombineering approaches in bacteria and transfer the resulting tagged clones into animals as third copy reporter allele as was previously done in C. elegans (Sarov et al., 2012). In Drosophila it is possible to insert this tagged copy of the gene as a transgene at a defined position into the fly genome (Venken et al., 2006). The third copy reporter allele approach was used successfully with large genomic BAC or fosmid clones derived from the fly genome. It has been shown that these transgenes recapitulate the endogenous expression pattern of the gene in flies and thus likely provide a tagged functional copy of the gene (Ejsmont et al., 2009; Venken et al., 2009).
Here, we introduce a comprehensive genome-wide library of almost 10,000 C-terminally tagged proteins within genomic fosmid constructs. For 880 constructs, covering 826 different genes we generated transgenic lines, 765 of which had not been tagged by previous genetic trapping projects. Rescue experiments showed that two thirds of the tagged proteins are functional. We characterised the localisation pattern for more than 200 tagged proteins at various developmental stages from ovaries to adults by immunohistochemistry and by live imaging. This identified valuable markers for various tissues and subcellular compartments, many of which are detectable in vivo by live imaging. Together, this shows the wide range of possible applications and the potential impact this publically available resource will have on Drosophila research and beyond.
Results
Our goal was to generate a comprehensive resource that allows the investigation of protein localisation and physical interactions for any fly protein of interest through a robust, generic tagging pipeline in bacteria, which is followed by a large-scale transgenesis approach (Figure 1). We based our strategy on a Drosophila melanogaster FlyFos library of genomic fosmid clones, with an average size of 36 kb, which covers most Drosophila genes (Ejsmont et al., 2009). Our two-step tagging strategy first inserts a generic ‘pre-tag’ at the C-terminus of the protein, which is then replaced by any tag of choice at the second tagging step, for example with a superfolder-GFP (sGFP) tag to generate the sGFP TransgeneOme clone library. These tagged clones are injected into fly embryos to generate transgenic fly-TransgeneOme (fTRG) lines, which can be used for multiple in vivo applications. (Figure 1).
sGFP TransgeneOme – a genome-wide tagged FlyFos clone library
We aimed to tag all protein coding genes at the C-terminus of the protein, because a large number of regulatory elements reside within or overlap with the start of genes, including alternative promoters, enhancer elements, nucleosome positioning sequences, etc. These are more likely to be affected by a tag insertion directly after the start codon. Signal sequences would also be compromised by an inserted tag after the start codon. Additionally, the C-termini in the gene models are generally better supported by experimental data than the N-termini due to an historical bias for 3’ end sequencing of ESTs. Thus, C-terminal tagging is more likely to result in a functional tagged protein than N-terminal tagging, although we are aware of the fact that some proteins will be likely inactivated by addition of a tag to the C-terminus. Moreover, only about 1,400 protein coding genes contain alternative C-termini, resulting in all protein isoforms labelled by C-terminal tagging in almost 90 % of all protein coding genes.
In a series of pilot experiments we tested the functionality of several tagging cassettes with specific properties on a number of proteins (Figure 2 Supplement 1, Table 1). For the genome-wide resource we applied a two-step tagging strategy, whereby we first inserted a non-functional ‘pre-tagging’ cassette consisting of a simple bacterial selection marker, which is flanked with linker sequences present in all of our tagging cassettes. This strategy enables a very efficient replacement of the pre-tag by any tag of interest by homologous recombination mediated cassette exchange in bacteria. As fluorescent proteins and affinity tags with improved properties are continuously being developed, specific clones or the entire resource can be easily re-fitted to any new tagging cassette (Figure 1). For the genome-scale resource we selected a tagging cassette suitable for protein localisation and complex purification studies, consisting of the 2xTY1 tag as a flexible linker, the superfolder GFP coding sequence (Pédelacq et al., 2005), the V5 tag, followed by specific protease cleavage sites (for the PreScission- and TEV-proteases), the biotin ligase recognition peptide (BLRP) tag allowing for specific in vivo or in vitro biotinylation (Deal and Henikoff, 2010; Vernes, 2014), and the 3xFLAG tag (Figure 2 Supplement 1).
Of the 13,937 protein coding genes in the dmel5.43 genome assembly, 11,787 genes (84.6 %) were covered by a suitable fosmid from the original FlyFos library (Ejsmont et al., 2009), extending at least 2.5 kb upstream and 2.5 kb downstream of the annotated gene model. For picking clones, designing oligonucleotides for recombineering and for tracking all steps of the transgene engineering process, as well as for providing access to all construct sequences and validation data we used the previously developed TransgeneOme database (Sarov et al., 2012).
For high-throughput tagging of the Drosophila FlyFos clones, we developed an improved version or our previously applied high-throughput, 96-well format liquid culture recombineering pipeline (Ejsmont et al., 2011; Sarov et al., 2012). The high efficiency of recombineering in E. coli allowed for multi-step DNA engineering in 96-well format liquid cultures with single clone selection only at the last step. The specific pipeline consists of five steps (Figure 2A). First, the pRedFlp helper plasmid containing all genes required for homologous recombination as well as the Flippase-recombinase (under L-rhamnose and tetracyclin control, respectively) was introduced into E. coli by electroporation. Second, the ‘pre-tagging’ cassette containing only a bacterial antibiotic resistance gene was inserted by homologous recombination with gene-specific homology arms of 50 base pairs. Third, the sGFP-V5-BLRP tagging cassette, including an FRT-flanked selection and counter-selection cassette, was inserted to replace the ‘pre-tagging’ cassette. Since the linker sequences in the ‘pretagging’ cassette are identical to the tagging cassette, the tagging cassette was simply excised from a plasmid by restriction digest and no PCR amplification was required. This strongly reduced the risk of PCR induced mutations in the tagging cassette. Fourth, the selection marker was excised by the induction of Flippase expression. Fifth, the helper plasmid was removed by suppression of its temperature sensitive replication at 37 °C (Meacock and Cohen, 1980) and single clones were isolated from each well, by plating on selective solid agar plates.
All five steps of the engineering pipeline were highly efficient (between 95.8 and 99.7 %), resulting in an overall efficiency of 93.6 % or 10,995 growing cultures (Figure 2A). To validate the sequence of the engineered clones we developed a new next-generation-sequencing (NGS)-based approach (Figure 2B). In short, we pooled single clones from all 96-well plates into 8 rows and 12 columns pools and deep sequenced the barcoded and pooled mate pair libraries using the Illumina platform. The mate pair strategy allowed us to map the otherwise common tag coding sequence to a specific clone in the library and thus to verify the integrity of the tagging cassette insertion in the clones with single nucleotide resolution (see Material and Methods for details). When applied to the final sGFP TransgeneOme collection we detected no mutations for 9,580 constructs (87.1 %). 8,005 (72.8 %) of these clones had complete sequence coverage in the tag and thus represent the most reliable subset of the tagged library (Figure 2B). For 1,417 of the clones (12.8 %) one or more differences to the expected sequences were detected. The most common differences were point mutations, which cluster almost exclusively to the homology regions in the oligonucleotides used to insert the ‘pre-tagging’ cassette. This is suggestive of errors in the oligonucleotide synthesis but could also reflect polymorphisms in the genomic sequence of the homology arms. Another subset of point mutations clustered around the junctions between the homology arms and the rest of the tagging cassette, indicating an imprecise resolving of the homology exchange reaction in small subset of clones (Figure 2C). Finally, a small group of clones (165) still contained an unflipped selection cassette. The NGS results were confirmed by Sanger sequencing of the entire tag coding sequence for a subset of constructs (Supplementary Table 1).
Taken together, the sGFP TransgeneOme and our pilot tagging experiments resulted in 10,711 validated tagged clones, representing 9,993 different Drosophila genes. (Table 1, Supplementary Table 1). Moreover, the ‘pre-tagged’ TransgeneOme library is a versatile resource for generating fosmid clones with arbitrary tags at the C-terminus of the gene models.
Fly TransgeneOme (fTRG) – a collection of flies with tagged fosmids
We next established a pipeline to systematically transform the tagged TransgeneOme clones into flies. To efficiently generate fly transgenic lines we injected the tagged fosmid constructs into a recipient stock carrying the attP landing site VK00033 located at 65B on the third chromosome using a transgenic nanos-ΦC31 source (Venken et al., 2006). For some genes positioned on the third chromosome we injected into VK00002 located on the second chromosome at 28E to simplify genetic rescue experiments. In total, we have thus far generated lines for 880 tagged constructs representing 826 different genes (Table 1). These genes were partially chosen based on results of a public survey amongst the Drosophila community to identify genes for which there is the strongest demand for a tagged genomic transgenic line. 765 (87 %) of the newly tagged genes have not been covered by the previous protein-trap projects (Supplementary Table 2), hence, these should be particularly useful for the fly community. From our pilot tagging experiments, we made 51 lines for the 2xTY1-sGFP-3xFLAG tag and 30 lines for the 2xTY1-T2A-sGFPnls-FLAG transcriptional reporter. The majority of the lines (799) were generated with the versatile 2xTY1-sGFP-V5-Pre-TEV-BLRP-3xFLAG tag, used for the genome-wide resource (Figure 2 Supplement 1, Table 1). The collection of fly lines is called ‘tagged FlyFos TransgeneOme’ (fTRG) and all 880 fTRG lines have been deposited at the VDRC stock centre for ordering (http://stockcenter.vdrc.at).
To assess whether the tagged fosmids in our transgenic library are functional, we have chosen a set of 46 well-characterised genes, mutants of which result in strong developmental phenotypes. For most cases, we tested null or strong hypomorphic alleles for rescue of the respective phenotypes (embryonic lethality, female sterility, flightlessness etc.) with the tagged fosmid lines. More than two-thirds of the lines (31 of 46), including tagged lines of babo, dlg1, dl, fat, Ilk, LanB1, numb, osk, rhea, sax, smo and yki completely rescued the mutant phenotypes (Figure 3, Supplementary Table 3), demonstrating that the majority of the tagged fosmids are functional. Our rescue test set is biased towards important developmental regulators; 10 of the 15 genes that did not show a complete rescue are transcription factors with multiple essential roles during development, such as esg, eya, odd, sna and salm. Thus, their expression is likely regulated by complex cis-regulatory regions that may not be entirely covered by the available fosmid clone; for example wing-disc enhancers are located more than 80 kb away from the start of the salm gene (de Celis, 1999). Hence, we expect that a typical gene, which is embedded within many other genes in the middle of the fosmid clone, is more likely to be functional. Together, these data suggest that both the genome-wide tagged construct library and the transgenic fTRG library provide functional reagents that are able to substitute endogenous protein function.
Expression of fTRG lines in the ovary
To demonstrate the broad application spectrum of our TransgeneOme library we analysed tagged protein expression and subcellular localisation in multiple tissues at various developmental stages. Germline expression in flies differs substantially from somatic expression, requiring particular basal promoters and often specialised 3’UTRs (Ni et al., 2011; Rørth, 1998). Therefore, we used ovaries to test the fTRG library and probed the expression of 115 randomly selected lines in germline cells versus somatic cells during oogenesis (Figure 4A). From the 115 lines 91 (79 %) showed detectable expression during oogenesis, with 45 lines being expressed in both, germ cells and the somatic epithelial cells (Figure 4B, C and Supplementary Table 4). 76 (66 %) fTRG lines showed interesting expression patterns restricted to subsets of cells or to a subcellular compartment (Figure 4B - D). For example, Tan-GFP is expressed in germline stem cells only, whereas the ECM protein Pericardin (Prc-GFP) is concentrated around the neighbouring cap cells and the transcription factor Delilah (Dei-GFP) is specifically localised to the nuclei of somatic stem cells, which will give rise to the epithelial cells surrounding each egg chamber (Figure 4A, C). In early egg chambers Reph-GFP is expressed in germ cells only, whereas the ECM protein Viking (Vkg-GFP) specifically surrounds all the somatic epithelial cells. Interestingly, the transcription factor Auracan (Ara-GFP) is only expressed in posterior follicle cells, whereas the putative retinal transporter CG5958 is only detectable in the squamous epithelial cells surrounding the nurse cells (Figure 4C).
We further investigated the subcellular localisation of the tagged proteins, which revealed a localisation for the RNA helicase l(2)35Df to all nuclei, whereas the predicted C2H2-Zn-finger transcription factor Crooked legs (Crol-GFP) is restricted to the nuclei of the epithelial cells (Figure 4D). Interestingly, Corolla-GFP is exclusively localised to the oocyte nucleus in early egg chambers. This is consistent with the function of Corolla at the synaptonemal complex attaching homologous chromosomes during early meiosis (Collins et al., 2014). In contrast, the uncharacterised homeobox transcription factor E5 (E5-GFP) is largely restricted to the nuclei of anterior and posterior epithelial cells (Figure 4D). Apart from nuclear patterns, we found a significant number of cortical localisations, including the well characterised Crumbs (Crb-GFP) (Bulgakova and Knust, 2009) and the PDZ-domain containing Big bang (Bbg-GFP) (Bonnay and Cohen-Berros, 2013) at the apical cortex of the epithelial cells, the Na+/K+ transporter subunit Nervana 2 (Nrv2-GFP) at the lateral epithelial membrane, and the EGF-signalling regulator Star (S-GFP) as well as the TGF-β receptor Saxophone (Sax-GFP) localised to the cortex or membrane of the germ cells (Figure 4D and Supplementary Table 4). Furthermore, we find a perinuclear enrichment for the uncharacterised predicted NAD binding protein CG8768, and oocyte enrichments for the Tom22 homolog Maggie (Mge-GFP) (Vaskova et al., 2000), glycosyltransferase Wollknäuel (Wol-GFP) (Haecker et al., 2008) and the TGF-α homolog Gurken (Grk-GFP), the latter with its well established concentration around the oocyte nucleus (Neuman-Silberberg and Schüpbach, 1993) (Figure 4D).
To test if genes expressed from FlyFos system also undergo normal post-transcriptional regulation we analysed the osk-GFP line, which was recently used as a label for germ granules (Trcek et al., 2015). osk mRNA is transcribed from early stages of oogenesis onwards in the nurse cell nuclei and specifically transported to the oocyte, where it localises to the posterior pole (Ephrussi et al., 1991; Kim-Ha et al., 1991). Only after the mRNA is localised, it is translated from stage 9 onwards (Kim-Ha et al., 1995). Indeed, fosmid derived osk-GFP mRNA localizes normally during all stages of oogenesis and its translation is repressed during mRNA transport, as Osk-GFP can only be detected at the posterior pole from stage 9 onwards (Figure 4 Supplement 1A, B). osk-GFP also fully rescues all aspects of an osk null allele (Figure 3B, C). Additionally, we discovered a post-transcriptional regulation for corolla. corolla-GFP mRNA is localised to the oocyte at stage 6 and Corolla-GFP protein is transported into the oocyte nucleus. However, despite the presence of the corolla-GFP mRNA at stage 8, Corolla protein is undetectable, suggesting either a translational block of the RNA or targeted degradation of the protein (Figure 4 Supplement 1C - F). Taken together, these expression and protein localisation data recapitulate known patterns accurately and identify various unknown protein localisations in various cell types during oogenesis, and thus emphasise the value of the fly TransgeneOme resource.
Live in toto imaging of fTRG lines during embryogenesis
For many genes, the expression patterns at the mRNA level are particularly well characterised during Drosophila embryogenesis (Hammonds et al., 2013; Tomancak et al., 2002; 2007). However, in situ hybridisation techniques on fixed tissues do not visualise dynamics of expression over time and thus do not allow tracking of the expressing cells during development. As our tagging approach enables live imaging at endogenous expression levels we set out to test if in toto imaging using the SPIM (Selective Plane Illumination Microscopy) technology (Huisken et al., 2004) can be applied to the fly TransgeneOme lines. We pre-screened a small subset of lines (Supplementary Table 5) and selected the Na+/K+ transporter subunit Nrv2, as it shows high expression levels, for long-term time-lapse live imaging with a multi-view dual-side SPIM (Huisken and Stainier, 2007). During embryogenesis Nrv2 expression was reported in neurons (Sun et al., 1999) and glial cells (Younossi-Hartenstein et al., 2002). Interestingly, we find that Nrv2-GFP is already expressed from stage 11 onwards in most likely all cell types, where it localises to the plasma membrane, similar to the localisation in ovaries (Figure 4D). The expression level increases during stage 15 in all cells, with a particularly strong increase in the developing central nervous system (CNS) labelling the CNS and motor neuron membranes (Figure 5, Supplementary Movie 11). These live in toto expression data are consistent with expression data of a recently isolated GFP trap in nrv2 (Lowe et al., 2014), thus validating our methodology.
We wanted to extend our approach beyond highly expressed structural genes towards transcription factors that enable to follow cell lineages in the embryo. For this purpose we crossed the fTRG line of the homeobox transcription factor gooseberry (Gsb-GFP) to H2A-mRFPruby, which labels all nuclei, and recorded a two-colour multi-view dual-side SPIM movie. We find that Gsb-GFP is expressed in the presumptive neuroectoderm of the head region, labelling segmentally reiterated stripe-like domains at stage 10 (Figure 5 Supplement 1A, B, Supplementary Movie 2) as was described from fixed images (Gutjahr et al., 1993). Focusing on the deuterocerebral domain, we could reconstruct the delamination of two neuroblasts, which up-regulate Gsb-GFP while initiating their asymmetric divisions (Figure 5 Supplement 1C - F). It was possible to individually follow their neural progeny. Gsb-GFP expression also allowed us to directly follow the gradual down-regulation of Gsb-GFP in ectodermal cells that remained at the head surface after neuroblast delamination. As opposed to the neuroblasts, these cells, which give rise to epidermis, did not divide at all, or underwent only one further division (Figure 5 Supplement 1G - J, Supplementary Movie 2).
gsb is in part required for gooseberry-neuro (gsb-n; also called gsb-d) expression (He and Noll, 2013), which marks a defined set of brain neuroblasts in the embryo, with three gsb-n positive neuroblasts in the deuterocerebral domain (Urbach and Technau, 2003). We assume that these cells correspond to the three Gsb-GFP expressing neuroblasts visible in our live 4D stack. Notably expression in the Gsb-n-GFP fTRG line becomes detectable only at the end of germ-band extension (end of stage 11) in the developing CNS, where it lasts until stage 17 (Figure 5 Supplement 2, Supplementary Movie 3). Again this is consistent with published immuno-histochemistry data (Gutjahr et al., 1993; He and Noll, 2013). We conclude that our fly TransgeneOme library can be used for live in toto imaging, even for transcription factors expressed at endogenous levels. This will be of significant importance for ongoing efforts linking the transcription factor expression patterns of embryonic neuroblasts to the morphologically defined lineages that structure the larval and adult Drosophila brain (Hartenstein et al., 2015; Lovick et al., 2013; Pereanu, 2006).
Expression of fTRG lines in the adult thorax
Cells in the embryo are generally rather small and thus not ideally suited to document subcellular protein localisation patterns. Thus, we decided to apply our TransgeneOme library to the adult thorax that contains some very large cells. We scored expression in the large indirect flight muscles (IFMs), in leg muscles, in visceral muscles surrounding the gut, in the gut epithelium, the tendon epithelium, the trachea and the ventral nerve cord including the motor neurons. In total, we found detectable expression in at least one tissue for 101 of 121 (83.5 %) analysed fTRG lines, thus creating a large number of valuable markers for cell types and subcellular structures (Supplementary Tables 6, 7).
The large IFMs are fibrillar muscles, which have a distinct transcriptional program resulting in their distinct morphology (Schönbauer et al., 2011). This is recapitulated by the expression of Act88F-GFP, which localises to the thin filaments of IFMs only (Figure 6A - C), whereas Mlp84B-GFP is not in IFMs but at the peripheral Z-discs of leg and visceral muscles only (Figure 6D - F), similar to the published localisation in larval muscle (Clark et al., 2007). We find various dotty patterns indicating localisation to intracellular vesicles; a particularly prominent example is Tango1-GFP in the midgut epithelium (Figure 6G, H, Supplementary Table 6). Tango1 regulates protein secretion in S2 cells, where it localises to the Golgi apparatus upon over-expression (Bard et al., 2006), suggesting that the pattern described here is correct. We find Par6 with the expected apical localisation in the proventriculus epithelium and in trachea (Figure 6I, J), whereas we identified a surprising pattern for the TRP channel Painless (Tracey et al., 2003). Pain-GFP is not only highly expressed in motor neurons (Figure 6K) but also in particular cells in the gut epithelium and most surprisingly, in the tendon cells to which the IFMs attach (Figure 6L, M). At this point we can only speculate that Pain might be involved in mechanical stretch-sensing in these cells. We have also tagged various ECM components, with LamininB1 (LanB1-GFP), LamininA (LanA-GFP) and BM40-SPARC resulting in the most prominent expression patterns. All three ensheath most adult tissues, particularly the muscles (Figure 6N, O, Figure 6 Supplement 1A, B, E, F). Interestingly, LanA-GFP and LanB1-GFP also surround the fine tracheal branches that penetrate into the IFMs, whereas BM40-SPARC is only detected around the large tracheal stalk and the motor neurons (Figure 6P, Figure 6 Supplement 1C, D, G, H). Finally, we also detected prominent neuromuscular junction (NMJ) markers; the IκB homolog Cactus shows a distinct pattern on leg muscles, visceral muscles and IFMs, the latter we could confirm by co-staining with the neuronal marker Futsch (Figure 6 Supplement 1I - M). Interestingly, such a NMJ pattern for Cactus and its binding partner Dorsal has been shown in larval body muscle by antibody stainings (Bolatto et al., 2003). Together, these results suggest that our fly TransgeneOme library provides a rich resource for tissue-specific markers in the adult fly that can routinely be used to visualise subcellular compartments in various tissues.
To further validate the advantages of our TransgeneOme lines to label subcellular structures we imaged the large IFMs of the same 121 lines at high resolution. We found various markers for the thick filaments (e.g. the myosin associated protein Flightin, Fln-GFP) (Vigoreaux et al., 1993), for the myofibrils (e.g. the protein kinase Fray-GFP), the M-lines (e.g. the titin related protein Unc-89/Obscurin-GFP) (Katzemich et al., 2012), the Z-discs (e.g. CG31772-GFP) and the muscle attachment sites (e.g. Integrin-linked-kinase, Ilk-GFP). Furthermore, we identified markers for the T-tubules (e.g. Dlg1-GFP), for different vesicular compartments (e.g. the TGFβ receptor Baboon-GFP) and for mitochondria (CG12118) within the IFMs (Figure 7, Supplementary Tables 6, 7). Additionally, we documented the nuclear localisation in IFMs and leg muscles for a variety of fTRG proteins, including the uncharacterised homeodomain protein CG11617 and the C2H2 Zinc-fingers CG12391 and CG17912 (Figure 7 Supplement 1A - C, E - G); both of the latter result in flightless animals when knocked-down by muscle-specific RNAi (Schnorrer et al., 2010) suggesting that these genes play an essential role for IFM morphogenesis or function. Interestingly, the well characterised C2H2 Zinc-finger protein Hunchback (Hb) is only localised to leg muscle nuclei, but absent from IFMs suggesting a leg muscle-specific function of Hb (Figure 7 Supplement 1G, H).
However, differences between muscle types are not only controlled transcriptionally but also by alternative splicing (Oas et al., 2014; Spletter and Schnorrer, 2014; Spletter et al., 2015). To investigate if our tagging approach can be used to generate isoform-specific lines, we have chosen two prominent muscle genes, mhc and rhea (the fly Talin), both of which have predicted isoforms with different C-termini (Figure 7 Supplement 2A, H). Interestingly, we found that Mhc-isoforms K, L, M are expressed in IFMs and all leg muscles, however the predicted Mhc-isoforms A, F, G, B, S, V with the distal STOP codon are selectively expressed in visceral muscle and in a subset of leg muscles, however absent in adult IFMs (Figure 7 Supplement 2B - G). Even more surprisingly, while the long ‘conventional’ rhea (Talin) isoforms B, E, F, G show the expected localisation to muscle attachment sites in IFMs and leg muscles (Weitkunat et al., 2014), the short Talin-isoforms C and D do not localise to muscle attachment sites, but are selectively concentrated at costamers of leg muscles (Figure 7 Supplement 2I - N). Hence, our TransgeneOme library is ideally suited to label subcellular compartments and protein complexes, and in some cases can even distinguish between closely related protein isoforms.
Expression of the fTRG lines in the living pupal thorax
An attractive application of the fly TransgeneOme library is live in vivo imaging. In the past, we had established live imaging of developing flight muscles in the pupal thorax using overexpressed marker proteins (Weitkunat et al., 2014). Here, we wanted to test, if live imaging of proteins at endogenous expression levels is also possible in the thick pupal thorax. We selected six fTRG lines for well established genes and indeed could detect expression and subcellular localisation for all of them using a spinning-disc confocal microscope either at the level of the pupal epidermis or below the epidermis, in the developing flight muscles, or both (Figure 8). The adducin-like Hts-GFP labels the cytoplasm of fusing myoblasts from 10 to 20 h APF (after puparium formation) and the developing SOPs (sensory organ precursors) with a particular prominent concentration in developing neurons and their axons (Figure 8B - E). In contrast, Dlg1-GFP localises to cell-cell-junctions of the pupal epidermis and to a network of internal membranes in the developing IFMs (Figure 8F - I) that may resemble developing T-Tubules, for which Dlg1 is a well established marker (Razzaq et al., 2001). Interestingly, the long isoforms of Talin-GFP (rhea isoforms B, E, F, G) are largely in the cytoplasm and at the cortex of the epidermal cells, with a marked enrichment in the developing SOPs at 10 to 20 h APF (Figure 8J, K). Further, Talin-GFP is strongly concentrated at muscle attachment sites of developing IFMs from 24 h onwards (Figure 8L, M) consistent with antibody stainings of IFMs (Weitkunat et al., 2014).
The dynamics of the extracellular matrix is little described thus far as very few live markers existed. Hence, we tested our LamininB1 fosmid and found that LanB1-GFP is readily detectable within the developing basement-membrane basal to the epidermal cells of the pupal thorax at 10h APF (Figure 8N). It also labels the assembling basement-membrane around the developing IFMs from 16 to 30h APF without a particularly obvious concentration at the muscle attachment sites (Figure 8O - Q). To specifically visualise the developing IFMs we chose Actin88F, which is specifically expressed in IFMs and a few leg muscles (Nongthomba et al., 2001). We find that the Act88F-GFP fTRG line indeed very strongly labels the IFMs from about 18 h APF but is also expressed in the developing pupal epidermis again with an enrichment in the forming SOPs from 10 to 20 h APF (Figure 8R - U). The latter is not surprising as Act88F-lacZ reporter has been shown to also label the developing wing epithelium (Nongthomba et al., 2001), again suggesting that our fTRG line recapitulates the endogenous expression pattern. We finally tested the βTub60D fTRG-line, as βTub60D was reported to label the myoblasts and developing myotubes in embryonic and adult muscles (Fernandes et al., 2005; Leiss et al., 1988; Schnorrer et al., 2007). Indeed, we detect βTub60D-GFP in fusing myoblasts and the developing IFMs, with particularly prominent label of the microtubule bundles at 24h APF (Figure 8V - Y). In addition, βTub60D-GFP also strongly marks the developing hairs of the sensory organs of the pupal epidermis (Figure 8X, see also Supplementary Movie 6).
In order to test, if the fly TransgeneOme lines and the sGFP-tag are indeed suited for long-term live imaging we chose Act88F-GFP and βTub60D-GFP and imaged the developing IFMs for more than 19 h with a two-photon microscope using an established protocol for over-expressed markers (Weitkunat and Schnorrer, 2014). For both proteins we can detect strongly increasing expression after 18h APF in the developing IFMs, with Act88F-GFP being restricted to the myotubes and the developing myofibrillar bundles (Supplementary Movie 4, Figure 8 Supplement 1A - F) whereas βTub60D-GFP also labels the fusing myoblasts and is largely incorporated into prominent microtubule bundles (Supplementary Movie 6, Figure 8 Supplement 1L - Q).
As photo bleaching was no serious problem in these long movies we also recorded movies at higher time and spatial resolution. We labelled the developing IFMs with Act88F-GFP and the myoblasts with a him-GAL4, UAS-palm-Cherry and acquired a 3D stack every two minutes using a spinning disc-confocal. This enabled us to visualise single myoblast fusion events in developing IFMs of an intact pupa (Supplementary Movie 5, Figure 8 Supplement 1G - K). The six dorsal longitudinally oriented IFMs develop from three larval template muscles to which myoblasts fuse to induce their splitting into six myotubes (Fernandes et al., 1991). Using high resolution imaging of βTub60D-GFP we find that most myoblasts fuse in the middle of the developing myotube during myotube splitting, with prominent microtubules bundles located at the peripheral cortex of the splitting myotube (Supplementary Movie 7, Figure 8 Supplement 1R). These prominent microtubules bundles are then relocated throughout the entire developing myotube (Supplementary Movie 7, Figure 8 Supplement 1S - V). Taken together, these live imaging data suggest that many of the fTRG lines will be well suited for high resolution live imaging of dynamic subcellular protein localisation patterns in developing Drosophila organs. This will strongly expand the set of live markers available for research in flies.
FlyFos library as bait for proteomics
For the proper composition, localisation and in vivo function of most protein complexes the expression levels of the individual components are critical (Rørth et al., 1998; Tseng and Hariharan, 2002). Hence, the TransgeneOme library would be an ideal experimental set-up to purify protein complexes from different developmental stages using endogenous expression levels of the bait protein. In principle, all the small affinity tags (TY1, V5, FLAG) (Figure 2 Supplement 1) can be used for complex purifications. The presence of precision and TEV cleavage sites even allow two-step purifications. For proof of principle experiments, we selected four tagged proteins as baits: Ilk, Dlg1, Talin and LanB1, and analysed two different developmental stages. In each case we homogenised hundred 24 to 48 h pupae and hundred adult flies per experiment and mixed the cleared lysate with a GFP antibody matrix to perform single step affinity enrichment and mass-spec analysis modified from the QUBIC protocol (Hein et al., 2015; Hubner et al., 2010; Keilhauer et al., 2014). Each affinity-enrichment was performed in triplicate and intensity profiles of all identified proteins were quantified in a label-free format by running all 30 purifications consecutively on the same Orbitrap mass-spectrometer and analysing the data with the MaxQuant software suite (Cox and Mann, 2008; Cox et al., 2014) (Supplementary Table 8). Interestingly, enriching Ilk-GFP from both developing pupae and adult flies recovered the entire Ilk, PINCH, Parvin, RSU-1 complex (Figure 9), which had previously been purified in vitro from Drosophila S2 cells (Kadrmas et al., 2004) and mammalian cells (Dougherty et al., 2005; Tu et al., 2001) giving us confidence in our methodology. We also successfully enriched Talin-GFP from pupae or adults, however did not identify an obvious strong and specific binding partner (Figure 9, Supplementary Table 8). In contrast, we identified Mesh as a novel interactor of Dlg1 from pupae and adult flies. Mesh colocalises with Dlg1 at septate junctions of the embryonic Drosophila midgut, however a molecular interaction of both proteins was not established (Izumi et al., 2012). Finally, we purified the laminin complex by pulling on LanB1, which recovered LanB2 and LanA roughly stoichiometrically, both from pupae and adult flies, as had been found in cell culture experiments (Fessler et al., 1987), showing that extracellular matrix complexes can also be purified from in vivo samples with our methodology. In summary, these data demonstrate that interaction proteomics with the fly TransgeneOme library can confirm known interaction partners and discover novel in vivo complex members, making the system attractive for a variety of biochemical applications.
Discussion
The TransgeneOme resource presented here adds new powerful component into the arsenal of tools available to the Drosophila research community. It complements the genetic resources for gene disruption and localisation (Buszczak et al., 2007; Lowe et al., 2014; Morin et al., 2001; Nagarkar-Jaiswal et al., 2015; Quiñones-Coello et al., 2007; St Johnston, 2012; Venken and Bellen, 2012) with a comprehensive genomescale library that does not suffer the biases of random mutagenesis. Analogously to the powerful MiMIC system (Nagarkar-Jaiswal et al., 2015) the TransgeneOme resource is versatile and can be adapted to the developments in tag chemistry and to various specialised applications. Although the resource is designed to study behaviour of proteins, it can for example be easily converted into a toolkit for live imaging of mRNAs. By designing a tagging cassette with an array of MS2 binding sites (Forrest and Gavis, 2003) the existing ‘pre-tagged’ TransgeneOme can be converted into an MS2-tagged TransgeneOme by a simple liquid culture recombineering step in bacteria (Figure 1).
However, any new TransgeneOme has to be transformed into flies and this process still represents a significant bottleneck. We present here an optimised protocol for transgenesis of fosmid size clones into Drosophila melanogaster. It took three years and four dedicated technicians to generate the 880 fly lines presented in this study. Although, the systematic transgenesis is a continuing process in our laboratories, the value of the TransgeneOme collection is highlighted by the fact that any specific set of genomically tagged gene clones is now available. These can be efficiently transformed by in house transgenesis of Drosophila labs around the world using the optimised protocol presented here. For that purpose we will not only deposit the sGFP TransgeneOme but also the ‘pre-tagged’ TransgeneOme collection at Source Biosciences from which any subset of clones can be conveniently ordered (http://www.sourcebioscience.com/).
One caveat of designed expression reporters is the necessity to place the tag into a defined position within the gene model. We chose to generate our ‘pre-tagged’ collection at the most commonly used C-terminus predicted by the gene model, thus labelling most isoforms. In a few cases a tag at the C-terminus will inactivate the protein, however such a reagent can still be useful for visualising the protein. This has been demonstrated for a number of sarcomeric GFP-traps, some of which lead to lethality when homozygous, yet result in interesting localisation patterns when heterozygous (Buszczak et al., 2007; Morin et al., 2001 and F.S., unpublished observation). For particular genes, it will be useful to tag differential protein isoforms, which in some cases can be done by tagging alternative C-termini, as shown here for Mhc and rhea. However, tagging a particular isoform requires a very informed construct design, which cannot easily be automated at the genome-scale.
Genome engineering is experiencing a tremendous growth with the introduction of CRISPR/Cas technology and it will be only a matter of time before a larger collection of precisely engineered fusion proteins at endogenous loci will become available in flies. However to date, such examples are still limited to a few genes (Baena-Lopez et al., 2013; Gratz et al., 2014; Port et al., 2014; Zhang et al., 2014), which had been carefully picked and were individually manipulated with custom-designed, gene-specific tools. It remains to be tested which proportion of such engineered loci will be fully functional and thus potentially superior to the fTRG collection. Having a transgenic third allele copy, as is the case in our TransgeneOme collection, might even be advantageous, if the tagging interferes with protein function, because the TransgeneOme lines still retain two wild-type endogenous gene copies. In some cases, addition of GFP might destabilise the protein, regardless of Nor C-terminal fusion, as recently shown for the Engrailed protein (Sokolovski et al., 2015). However, our ability to detect the protein product in the vast majority of our tagged lines argues that this could be a relatively rare, gene specific phenomenon. Nevertheless, caution should be taken with respect to protein turnover dynamics of any tagged protein.
Together, the FlyFos library, the fly TransgeneOme library and the fTRG collection of strains, enable genome-scale examination of expression and localisation of proteins comparable with the high-throughput mRNA in situ screens (Tomancak et al. 2002, Tomancak et al. 2007). Our data for tagged Oscar protein show that fosmid reporters can in principle recapitulate all aspects of gene expression regulation at transcriptional and post-transcriptional levels. It will be particularly interesting to combine the spatial expression data of mRNAs with that of proteins. Since many transcripts show subcellular localisation in various developmental contexts (Jambor et al., 2015; Lécuyer et al., 2007), the question arises whether RNA localisation generally precedes localised protein activity. Systematic examination of protein patterns expressed from localised transcripts in systems such as the ovary will provide a genome-scale overview of the extent and functional role of translational control. At the tissue level, the patterns of mRNA expression may be different from the patterns of protein expression, for example due to translational repression in some cells or tissue specific regulation of protein stability, as shown here for the Corolla protein. The combined mRNA and protein expression patterns may therefore uncover a hidden complexity in overall gene activity regulation and the fTRG lines will help to reveal these combinatorial patterns in a systematic manner.
The fTRG lines faithfully recapitulate gene expression patterns in ovaries, embryos, larvae, pupae and adults suggesting that they can be used to visualise proteins in every tissue during the life cycle of the fly. This includes adult tissues such as the flight or leg muscles, which thus far had not been subjected to systematic protein expression and localisation studies. However, due to their size and the conservation of the contractile apparatus these tissues are particularly attractive to study with this new resource. In general, antibody or FISH stainings with a single standard anti-tag reagent are easier to optimise, compared to antibody stainings or mRNA in situ with gene specific antibodies/probes. This simplicity makes it possible to explore the expression of the available genes across multiple tissues, as has been done for the rab collection (Dunst et al., 2015). Such an approach is orthogonal to the collections of expression data generated thus far, in which many genes were examined systematically but only at particular stages or in certain tissues, i.e. embryos or ovaries (Jambor et al., 2015; Lécuyer et al., 2007; Tomancak et al., 2007). We are confident that the analysis of regularly studied as well as less explored Drosophila tissues will be stimulated by the fTRG collection.
When protein expression levels are sufficiently high, the fusion proteins can be visualised by live imaging approaches in intact animals. It is difficult to estimate the absolute expression levels required for live visualisation, as this depends on the imaging conditions, the accessibility and transparency of the tissue and importantly on the observed protein pattern. A strongly localised protein can result in a very bright local signal, such as Talin or Ilk at the muscle attachment sites, compared to a protein homogenously distributed throughout the entire cell. In particular for tissues, such as the adult legs, antennae or the adult fat body, which are difficult to dissect and stain without losing tissue integrity, these live markers should be enormously beneficial.
One important limitation for examining the pattern of protein expression is the accessibility of the tissue of interest for imaging. We have shown that light sheet microscopy can be used to image the dynamics of tagged protein expression throughout embryogenesis. We further demonstrated that two-photon microscopy can be applied to study protein dynamics during muscle morphogenesis in developing pupae. Other confocal or light sheet based imaging paradigms could be adapted for in toto imaging of living or fixed and cleared specimen from other life cycle stages. Establishing standardized protocols for preparation, staining and imaging of Drosophila stages, isolated tissues and organs will be necessary to realise the full potential of the fTRG collection.
Protein interaction data in fly are available from a number of studies (Formstecher et al., 2005; Giot et al., 2003; Guruharsha et al., 2011). These results were generated using yeast-two-hybrid, or overexpression in a tissue culture system, followed by affinity-purification and mass spectrometric analysis. Despite high-throughput, these approaches face the problem that the interacting proteins might not be present at the same place within a cell, or not even co-expressed in a developing organism. This is circumvented by affinity purifications of endogenously expressed proteins, which thus far at genome-scale was only reported from yeast (Gavin et al., 2002; Ho et al., 2002; Krogan et al., 2006). In higher organisms, BAC-based systems, which are closely related to our fosmid approach, elegantly solved these issues, as shown by a recent human interactome study (Hein et al., 2015).
The collection of transgenic flies covers currently only about 10 % of the available tagged fosmids. Expanding the collection to include most genes of the genome and importantly characterising the transgenes-encoded proteins by imaging in various biological contexts is best achieved by spreading the clones and transgenic lines amongst the community of researchers using Drosophila as a model system. Therefore, all transgenic lines are available from the VDRC stock collection. Despite the expanding CRISPR-based genome engineering technologies, the fTRG collection will continue to be an important resource for the fly community, in particular, if the full functionality of certain fTRG lines has been demonstrated, as we did here for a selection of important developmental regulators. As with many genome-scale resources it is typically easier to produce them than to fully characterise and exploit their potential. Comprehensive generation of thousands of transgenes and their thorough analysis takes time; it took us 4 years to assemble the collection presented here. Development of protocols and techniques to image these collections of tagged lines and assembling open access databases to share the data needs to continue and will eventually become useful also for the characterisation of resources whose production began only recently.
Wangler, Yamamoto and Bellen convincingly argued that the Drosophila system remains an indispensable model for translational research because many essential fly genes are homologs of Mendelian disease genes in humans (Wangler et al., 2015). Yet, even after decades of research on fruit flies only about 2000 of the estimated 5000 lethal mutations have been investigated. Resources like ours will therefore provide essential functional information about gene expression and localisation in Drosophila tissues that can serve as a starting point for the mechanistic understanding of human pathologies and their eventual cures.
Materials and Methods
TransgeneOme clone engineering
Fosmids were engineered as described previously (Ejsmont et al., 2009; 2011), except for the inclusion of the ‘pre-tagging’ step in the genome-wide TransgeneOme set. All tagging cassettes were generated from synthetic DNA and cloned into R6K carrying plasmids, which require the presence of the pir gene product for replication (Metcalf et al., 1996). The pir gene is not present in the FlyFos library host strain, thereby ensuring near-complete lack of background resistance in the absence of the correct homologous recombination event.
Details of the recombineering steps are as follows (Figure 2A): Step 1. The E. coli cells containing a FlyFos clone covering the gene locus of interest are transformed with the pRedFlp plasmid, containing the genes necessary for the homologous recombination and the Flp recombinase under independently inducible promoters. Step 2. Next, a ‘pre-tagging’ cassette carrying an antibiotic resistance gene (NatR, nourseothricin resistance) surrounded by regions of homology to all specific tagging cassettes (Figure 2 Supplement 1) and flanked by gene specific homology arms is electroporated as linear DNA fragment produced by PCR. By combination of induced (L-rhamnose) pRedFlp homologous recombination enzyme action and strong selection with a cocktail of three antibiotics (one to maintain the fosmid (chloramphenicol, Cm), one to maintain the pRedFlp (hygromycin, Hgr) and nourseothricin (Ntc) to select for the inserted fragment) the electroporated linear ‘pretagging’ fragment becomes inserted in front of the STOP codon of the gene of interest. Step 3. The ‘pre-tagging’ cassette is exchanged for a cassette of the chosen tag coding sequence including an FRT flanked selection / counter selection marker (rpsL-neo). This cassette is now universally targeting the homologous sequences shared by the tagging and pre-tagging cassettes and is produced in bulk by restriction enzyme mediated excision from a plasmid. Note that in this way, no PCR induced mutations can be introduced at this step. Step 4. Upon Flp induction (with anhydrotetracycline) the rpsL-neo cassette is excised, leaving a single FRT site, positioned in frame with the tag coding sequence. Step 5. Finally, the recombineering plasmid is removed from the cells containing the engineered fosmids by inhibition of its temperature sensitive origin of replication and release from Hgr selection. The cells are plated on a selective chloramphenicol agar plate from which a single colony is picked and further validated.
NGS-based validation of the TransgeneOme clones
For NGS-based resource validation single colonies for each TransgeneOme clone were picked into 96-well plates and individual wells of all 96-well plates were pooled into 8 rows and 12 columns pools. Fosmid DNA was isolated from these pools, and mate pair fragment libraries were prepared and sequenced on an Illumina platform. First, adapters and low quality sequences were trimmed with Trimmomatic0.32. (parameters: ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36). Second, in order to detect un-flipped fosmid sequences (where the FLP-mediated excision of selection cassette failed), the read pairs were mapped with Bowtie2 (Langmead and Salzberg, 2012) against the unflipped tag sequence and the genome. If any read of the mate of the pair mapped to the un-flipped sequence while the second mate mapped to the genome consistent with the estimated mate pair insert size of 3000 bp ± 1000 bp, the fosmid was flagged a unflipped and was not further analysed. Third, in order to identify mutations in the tag and in the immediate genomic surrounding (± 1000 bp), the NGS reads were mapped against the fosmid references that included the flipped tag. The Bowtie2 was set to report only hits where both reads of the pairs map concordantly to the insert size in the tag and in the genome (parameters: -I 2200 -X 3700 --rf --no-discordant --no-unal --no-mixed). This mapping was further filtered by deleting PCR duplicated read pairs with samtools1.1 rmdup (Li et al., 2009). Mutations were identified by utilising SNP calling implemented in FreeBayes (Garrison and Marth, 2012) using the standard filters and vcffilter to eliminate reported SNPs with scores < 20. Finally, in the last step, the information of the row and column pools were compared and summarized using a custom C-program that read the results of the SNP calling and the Bowtie mappings and counted the coverage for each read pair anchored in tag sequence with at least 20 bp. To correct for random PCR or sequencing errors the reported SNPs were compared for the row and column pools of each fosmid and SNPs occurring in both pools with coverage of 3 or more reads were considered as real.
Drosophila stocks and genetic rescue experiments
Fly stocks were maintained using standard culture conditions. All crosses were grown at 25°C unless otherwise noted. Most of the fly mutant or deficiency strains for the rescue experiments were obtained from the Bloomington Drosophila Stock Center and if located on X- or 2nd chromosome crossed together with the respective fTRG line. If the mutant gene was located on the 3rd chromosome it was recombined with the fTRG line. Rescue was generally tested in trans-heterozygotes as indicated in Supplemental Table 3. The rescue for 6 genes (bam, fat, mask, rap, RhoGEF2 and yki) was done by others, who communicated or published the results (Supplementary Table 3). For rescue of flightlessness a standard flight test was used (Schnorrer et al., 2010).
Generation of transgenic FlyFos (fTRG) lines
Most TransgeneOme fosmid clones were injected into the y[1], w[*], P{nos-phiC31\int.NLS}X; PBac{y+-attP-3B}VK00033 (BL-32542). This stock has white eyes and no fluorescent eye markers, which would interfere with screening for the red fluorescent eye marker used in the FlyFos clones (Ejsmont et al., 2009). A few fosmid clones were also injected into y[1], w[*], P{nos-phiC31\int.NLS}X; PBac{y+-attP-3B}VK00002, with the attP site located on the 2nd chromosome. The osk-GFP fosmid was injected into attP40.
Detailed injection protocol
A) Bacterial culture of fosmid clones: 1. Inoculate 2 ml LB-medium plus chloramphenicol (Cm 25 μg / ml) with fosmid clone and grow over night at 37 °C. 2. Dilute to 10 ml (9 ml LB-medium + Cm and 1 ml bacterial culture) and add 10 μl 10 % arabinose (final concentration 0.01 %) to induce the fosmid to high copy number. 3. Grow at 37 °C for 5 h and collect the pellet by 10 min centrifugation at 6000 rpm. Pellet can be stored at -20 °C.
B) Preparation of fosmid DNA: Use the HiPure Plasmid Miniprep Kit from Invitrogen (order number: K2100-03) according to the supplied protocol (MAN0003643) with following modifications: before starting: pre-warm the elution buffer (E4) to 50 °C; step 4: incubate the lysate for 4 min at room temperature; step 5: incubate 4 min on ice before centrifuging at 4 °C for 10 min; step 8: add 850 μl elution buffer (pre-warmed to 50 °C) to the column; step 9: add 595 μl isopropanol to the elution tube, centrifuge 20 min at 4 °C; wash pellet with 800 μl 70 % ethanol; centrifuge for 2 min; step 12: air dry the pellet for 4 min. Add 20 μl EB-buffer (Qiagen) to the pellet and leave at 4 °C overnight to dissolve without pipetting to avoid shearing of the DNA. Do not freeze the DNA. Adjust the concentration to 250 ng / μl and centrifuge 5 min at full speed before injections. Do not inject DNA older than one week.
C) Embryo injections: Collect young embryos (0 - 30 min) on an agar plate, bleach away the chorion, wash and collect the embryos on a cellulose filter (Whatman 10409814). Align the embryos, transfer them to a glued slide and dry them with silica gel for 10 - 15 min (Roth T199.2). Cover the embryos with Voltalef 10S oil (Lehmann & Voss) and inject the prepared fosmid DNA using a FemtoJet set-up (Eppendorf 5247). The injected DNA should be visible within the embryo. Incubate the injected embryos for 48 h at 18 °C in a wet chamber and collect the hatched larvae with a brush. Cross the surviving mosaic adults individually to y, w males or virgins.
Immuno-stainings
Ovaries: sGFP-protein detection in egg-chambers was done as previously described (Dunst et al., 2015). Detection of the oskar-GFP mRNA was performed with a gfp-antisense probe (Jambor et al., 2014) and co-staining of osk mRNA and Osk protein was done as previously described (Jambor et al., 2011) using a gfp-antisense probe and a rabbit anti-GFP antibody (1:1000, ThermoFisher).
Adult thoraces: Antibody stainings of adult thoraces, including flight, leg and visceral muscles, were done essentially as described for adult IFMs (Weitkunat and Schnorrer, 2014). Briefly, thoraces from young adult males were fixed for 15 min in relaxing solution (20 mM phosphate buffer, pH 7.0; 5 mM MgCl2; 5 mM EGTA, 5 mM ATP, 4 % PFA) + 0.5 % Triton X-100, cut sagittally with a sharp microtome blade and blocked for 1 hour at room temperature with 3 % normal goat serum in PBS-0.5% Triton X-100. Samples were stained with primary antibodies overnight at 4 °C (rabbit anti-GFP 1:2000 Amsbio; mouse anti-Futsch 1:100, Hybridoma Bank) washed and incubated with secondary antibodies coupled to Alexa dyes and rhodamine-phalloidin or phalloidin-Alexa-660 (all from Molecular Probes). After washing, samples were mounted in Vectashield containing DAPI. Images were acquired with a Zeiss LSM 780 confocal microscope and processed with Fiji (Schindelin et al., 2012) and Photoshop (Adobe).
Live imaging
SPIM imaging of embryos: De-chorionated embryos of the appropriate age were embedded in 1 % low melting point agarose and mounted into a glass capillary. Fluorescent microspheres (FY050 Estapor microspheres, Merck Millipore; 1:4000) were included in the embedding medium for multi-view registration. The embryos were imaged using the Zeiss Lightsheet Z.1 with a Zeiss 20x/1.0 water-immersion Plan Apochromat objective lens with 0.8x zoom at 25 °C using 488 nm laser set at 4 mW. Five views were imaged using dual-sided illumination with Zeiss 10x/0.2 illumination lenses. A mean fusion was applied to fuse both illumination sides after acquisition using the ZEN software (Zeiss). The views were acquired at 72° angles with a stack size of 130 μm and a step size of 1.5 μm. Exposure time were 30 ms per slice. Each slice consists of 1920 × 1200 pixels with a pixel size of 0.29 μm and a bit depth of 16 bits. The light sheet thickness was 4 μm at the center of the field of view. The embryos were imaged from onset of expression of the fosmids transgenes (determined empirically) until late embryogenesis with a time resolution of 15 min. Multi-view processing of the dataset was carried out using the Fiji plugin for multiview reconstruction (Preibisch et al., 2009; Schmied et al., 2014), which was executed on a high performance computing cluster (Schmied et al., 2015). The multi-view reconstruction was followed by multi-view deconvolution (Preibisch et al., 2014), for which the images were down sampled by a factor of two. Movies were extracted via the Fiji plugin BigDataViewer (Pietzsch et al., 2015).
The Gsb-GFP fTRG line was crossed with the H2Av-mRFPruby line (Fischer et al., 2004; Preibisch et al., 2014), the embryos of this cross were imaged using a 40x/1.0 water immersion Plan Apochromat lens from Zeiss with 1x zoom at 25 °C at 17.5 mW of the 488 nm laser and 4 mW of the 561 nm laser. A single angle with dual sided illumination was imaged. The stack size was 82.15 μm with a step size of 0.53 μm. Exposure time was 30 ms per slice. Each slice consisted of 1920 × 1920 pixels with a pixel size of 120 nm and a bit depth of 16-bit. The light sheet thickness was 3.21 μm at the center of the field of view. The embryos were imaged from early blastoderm onwards until late embryogenesis focusing on the head with a time resolution of 7 min.
Imaging of pupae: Staging and live imaging of the pupae were performed at 27 °C. Live imaging of pupae at the appropriate stage was done as described previously (Weitkunat and Schnorrer, 2014). Briefly, the staged pupa was cleaned with a brush and a small observation window was cut into the pupal case with sharp forceps. The pupa was mounted on a custom-made slide and the opening was covered with a small drop of 50 % glycerol and a cover slip. Z-stacks of either single time points or longterm time lapse movies were acquired using either a spinning disc confocal microscope (Zeiss, Visitron) or a two-photon microscope (LaVision), both equipped with heated stages.
Proteomics
Per sample about hundred pupae or adult flies were snap-frozen in liquid nitrogen and ground to a powder. The powder was re-suspended and further processed as described in the quantitative BAC-GFP interactomics protocol (Hubner et al., 2010). In brief, 800 μl of lysate per sample were cleared by centrifugation. The cleared lysate was mixed with magnetic beads pre-coupled to anti-GFP antibodies and run over magnetic micro-columns (both Miltenyi Biotec). Columns were washed, and samples subjected to in-column tryptic digestion for 30 min. Eluates were collected and digestion continued overnight, followed by desalting and storage on StageTips. Eluted peptides were analysed with an Orbitrap mass spectrometer (Thermo Fisher). Raw data were analysed in MaxQuant version 1.4.3.22 (Cox and Mann, 2008) using the MaxLFQ algorithm for label-free quantification (Cox et al., 2014). Interacting proteins were identified by the similarity of their intensity profiles to the respective baits (Keilhauer et al., 2014). Heat maps were plotted in the Perseus module of the MaxQuant software suite.
Author contributions
D.S., S.H., E.V. and M.S performed the liquid culture recombineering experiments. M.S., S.J., P.K. and S.S. analysed the NGS data and contributed to the TransgeneOme database.
B. S., N.P., K.F. and F.S performed the transgenesis of about half the fTRG lines. V.K.J.V., R.T.K., K.A., M.R. and K.V performed the transgenesis of the other half of the fTRG lines.
C. B. performed most of the genetic rescue experiments.
H.J. performed and analysed all experiments in ovaries.
C.S. collected and analysed SPIM in toto images of embryos.
C.B. and F.S. performed and analysed all imaging data in pupae and adults.
M.Y.H and M.M. performed the proteomics analysis and analysed the data.
V.H. contributed analysis of Gsb-GFP cell tracking.
R.K.E. constructed the tagging cassettes.
I.R.S.F. performed initial proof of principle tagging experiments.
P.T., F.S., M.S. and E.K. initiated the collaborative project and obtained dedicated funding.
P.T. and F.S. conceived and coordinated the study and co-wrote the manuscript with input from the other authors.
Competing interests
The authors declare that no competing interests exist.
Supplementary Movie 1
Multi-view SPIM movie of a stage 12 Nrv2-GFP expressing embryo. A stack was acquired every 15 minutes, lateral, dorsal, ventral and transverse views of the same time points are displayed. From stage 11 onwards Nrv2-GFP is present ubiquitously in the plasma membrane. Later its expression increases in the CNS, particularly in the neuropil and the motor neurons. Movie plays with 7 frames per second. Time is given in hh:mm. Scale bar indicates 50 μm.
Supplementary Movie 2
Lateral head section from a SPIM movie of a stage 10 Gsb-GFP (green, white in the top movie), Histone-2A-mRFPruby (red) embryo. A stack was acquired every 7 minutes. The segmentally re-iterated stripe-like gsb expression domain in the head neuroectoderm is visible. Later, gsb is expressed in ganglion mother cells and nerve cells that are the progeny of gsb expressing neuroblasts. Movie plays with 7 frames per second. Time is given in hh:mm. Scale bar indicates 50 μm.
Supplementary Movie 3
Ventral view of a SPIM movie of a stage 6 Gsb-n-GFP embryo. A stack was acquired every 15 minutes. Gsb-n-GFP is only detectable at the end of germ-band extension. During germ-band retraction it is expressed in characteristic L-shaped expression domains in the hemi-segments of the trunk. In the late stage embryo Gsb-n-GFP is present in the neurons of the shortening ventral nerve cord. Movie plays with 7 frames per second. Time is given in hh:mm. Scale bar indicates 50 μm.
Supplementary Movie 4
Z-projection of a two-photon movie of an about 14h APF pupa expressing Act88F-GFP. A stack was acquired every 20 min for 19 h. Expression of Act88F-GFP increases in the indirect flight muscles dramatically, thus contrast was reduced several times in course of the movie to avoid over-exposure. Movie plays with 5 frames per second. Time is given in hh:mm.
Supplementary Movie 5
Single plane of a spinning disc confocal movie of an about 14 h APF old pupa expressing Act88F-GFP (green) in the flight muscle myotubes and him-GAL4; UAS-palm-Cherry in the myoblasts. An image stack was acquired every two minutes. Note the newly fused myoblasts acquired the GFP label within a single time interval (highlighted by green arrows). Movie plays with 5 frames per second. Time is given in minutes.
Supplementary Movie 6
Z-projection of a two-photon movie of an about 14h APF pupa expressing βTub60D-GFP. A stack was acquired every 20 min for 25 h. Note the high expression of βTub60D-GFP in fusing myoblasts and the thick microtubules bundles in the developing flight muscles. Hair cells of the developing sensory organs also show strong expression, however move out of the Z-stack over time. Movie plays with 5 frames per second. Time is given in hh:mm.
Supplementary Movie 7
Single plane of a two-photon movie of an about 16 h APF old pupa expressing βTub60D-GFP in myoblasts and the forming flight muscle myotubes. An image stack was acquired every two minutes for more than 3 h. Note that single myoblasts can be followed during fusion. Most myoblasts fuse in the center of the myotube, which gradually splits into two myotubes. Movie plays with 5 frames per second. Time is given in hh:mm.
Acknowledgements
Stocks obtained from the Bloomington Drosophila Stock Center (NIH P40OD018537) were used in this study. We thank Anne Ephrussi for sharing of fly lines. We thank Franziska Friedrich for drawing the ovariole scheme used for Figure 4A. We thank Andreas Dahl from the Deep Sequencing Group at CRTD/BIOTECH, Dresden for the NGS library preparation and sequencing. We also thank light microscopy facility and the computer department for assistance with imaging and data processing. We are grateful to Sandra Lemke, Aynur Kaya-Copur and Xu Zhang for help with the genetic rescue experiments and to Caroline Sonsteby for assistance during some of the pupal imaging experiments. We thank the entire Schnorrer lab for helpful comments on this manuscript. We are particularly grateful to Reinhard Fässler and Herbert Jäckle for continuous support of this work. This work was funded by the EMBO Young Investigator Program (F. S.) the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant 310939 (F. S.), the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant 260746 (P.T.), European Commission GENCODYS (P.T. and H.J.), Human Frontier Science Program (HFSP) RGY0093/2012 (P.T. and C. S.) and the Max Planck Society (P.T., F.S., M.S., M.M. and E.K.).
Footnotes
↵1 Examine raw data in BigDataViewer (Fiji -> Plugins -> BigDataViewer -> Browse BigDataServer and http://bds.mpi-cbg.de:8087