Colias butterflies (the “clouded sulphurs”) often occur in mixed populations where females exhibit two color morphs, yellow/orange or white. White females, known as the Alba morph, reallocate resources from the synthesis of costly colored pigments to reproductive and somatic development 1. Due to this tradeoff, Alba females develop faster and have higher fecundity than orange females 2. However, orange females, which have instead invested in pigments, are preferred by males, who in turn provide a nutrient rich spermatophore during mating 2,3,4. Thus the wing color morphs represent alternative life history strategies (ALHS) that are female-limited, wherein tradeoffs due to divergent resource investment result in distinct phenotypes with associated fitness consequences. Here we map the genetic basis of Alba in Colias crocea to a transposable element insertion downstream of the Colias homolog of BarH-1. To investigate the phenotypic effects of this insertion we use CRISPR/Cas9 to validate BarH-1’s functional role in the wing color switch and antibody staining to confirm expression differences in the scale building cells of pupal wings. We then use scanning electron microscopy to determine that BarH-1 expression in the wings causes a reduction in pigment granules within wing scales, and thereby gives rise to the white color. Finally, lipid and transcriptome analyses reveal additional physiological differences that arise due to Alba, suggesting pleiotropic effects beyond wing color. Together these findings provide the first well documented mechanism for a female ALHS and support an alternative view of color polymorphism as indicative of pleiotropic effects with life history consequences.
Evolutionary theory predicts that positive selection will remove variation from natural populations, as genotypes with the highest fitness go to fixation 5. However, across diverse taxa ALHS are maintained within populations at intermediate frequencies due to balancing selection 6. Modelling and mechanistic insights have advanced our understanding of ALHS evolution and maintenance (e.g. negative frequency-dependent selection) 7. However, the majority of studies, and consequently our insights, are biased toward male strategies that are morphologically dramatic (e.g. ruff 8,9 and side-blotched lizards 10). Whether this bias reflects true differences in the frequency of alternative strategies between the sexes or is simply an artifact is unknown 11. As trade-offs and selection regimes are often sex specific, the lack of female insights severely limits our understanding of the mechanisms, maintenance, evolution, and co-evolution of alternative strategies in general 11. Yet despite calls for further investigation 11, a well documented mechanism for a female limited ALHS has yet to be identified. Here we identify one such mechanism in the butterfly Colias crocea (Pieridae).
Approximately a third of the nearly 90 species within the butterfly genus Colias exhibit a female-limited ALHS known as Alba 12. The switch between strategies is controlled by a single, autosomal locus that causes Alba females to reallocate guanosine triphosphate (GTP), amounting to several percent of their nitrogen budget, from the synthesis of pteridine pigments to other areas of development 1. Consequently, Alba females have white wings, while non-Alba females are orange/yellow (Fig. 1A). As a result of this trade-off, Alba females gain fitness advantages over orange females due to faster pupal development, a larger fat body, and significantly more mature eggs at eclosion 2. Despite these developmental advantages and the dominance of the Alba allele, females remain polymorphic due to tradeoffs in abiotic and biotic factors 2,13–15. For example, Alba’s development rate advantage holds only in cold temperatures. In addition, as a result of density-dependent interference competition with other white Pierid species and of sexual selection, males preferentially mate with orange females 2,3,13,14. The mating bias likely has significant fitness costs for Alba because males transfer essential nutrients during mating and multiply mated females have more offspring over their lifetime 4,16. Field studies confirm that Alba frequency and fitness increase in species that inhabit cold and nutrient poor habitats, where the occurrence of other white Pierid butterflies is low, while in warm environments with nutrient rich host plants and a high co-occurrence with other white species, orange females exhibit increased fitness and frequency 3,14.
Using a de novo reference genome for C. crocea that we generated via Illumina and PacBio sequencing (Extended Data Table 1), and three rounds of bulk segregant analyses (BSA) using whole genome sequencing of a female and two male informative crosses for Alba, we mapped the Alba locus to a ∼3.7 Mbp region (Extended Data Fig.1, & Supplementary Information). Then, with whole genome re-sequencing data from 15 Alba and 15 orange females from diverse population backgrounds, a SNP association study was able to fine map the Alba locus to a single ∼430 kb contig that fell within the ∼3.7 Mbp BSA locus (Fig. 1B) (Supplementary Information). The majority of SNPs significantly associated with Alba (n=70 of 72) were within or flanking a Jockey-like transposable element (TE) (Fig. 1B & 1C). We determined the TE insertion is unique to the Alba morph in C. crocea by assembling the orange and Alba haplotypes for this region, then quantifying differences in read depth between morphs within and flanking the insertion, and comparing the region to other butterfly genomes (Danaus plexippus & Heliconius melpomene) (Extended Data Figs.2 & 3). Additionally we validated the presence and absence of the insertion, respectively, across 82 wild females, 25 Alba and 57 orange (Extended Data Fig.4).
The Alba specific TE insertion was located ∼30 kb upstream of a DEAD-box helicase, and ∼6 kb downstream of BarH-1, a homeobox transcription factor (Fig. 1C). BarH-1 was an intriguing find as its knockout in Drosophila melanogaster causes a dramatic decrease in pigment granules within the eye, changing eye color from red (wild type) to white 17. To validate the functional role of BarH-1 in the Alba phenotype we generated CRISPR/Cas9-mediated deletions of exons 1 and 2 in a mosaic knockout approach (Extended Data Fig.5 & Supplementary Information). BarH-1 deletions gave rise to a mosaic lack of pigmentation in the eyes of males and females of both morphs, consistent with BarH-1’s expected role in insect eye development (Fig. 1D). Additionally, on the dorsal side of the wings, females with an Alba genotype exhibited a white/orange color mosaic, while males and orange females displayed no wing KO phenotypes, despite those individuals exhibiting mosaic phenotypes in the eye (Fig. 1E).
To further investigate the role of BarH-1 in developing wing scales, we used antibody staining for BarH-1 on wings from 2 day old pupae of orange and Alba females of C. crocea, as well as Vanessa cardui. The BarH-1 protein is highly expressed in the scale building cells of both species (Fig. 2), suggesting a previously undescribed role of BarH-1 in the developing wing scales of butterflies. Comparison between orange and Alba females of C. crocea further documents Alba as a gain of BarH-1 function, as scale building cells in the developing wing show a BarH-1 expression pattern that is Alba limited (Fig. 2).
In butterflies both pigments and scale morphology can affect wing color 18, and while Alba females exhibit large reductions in colored pteridine pigments compared to orange 1, whether morphs differed in wing scale morphology was unknown. Using scanning electron microscopy we found Alba scales exhibited significantly fewer pigment granules, the structures that store pteridine pigments in Pierid butterflies 19, compared to orange (t5.97 = 2.93, p = 0.03), suggesting reduced granule formation as the basis of the Alba color change (Fig. 3 A & B). As expected, the number of pigment granules was also significantly reduced in the white regions of the CRISPR/Cas9 BarH-1 KO individuals (t5.45 = 10.78, p < 0.001) (Fig. 3C & D), demonstrating that BarH-1 is affecting pigment granule formation to give rise to Alba. To further test whether reduction in pigment granule number alone was sufficient for the orange/white color change, we chemically removed the pigment granules from the wing of an orange C. crocea female, resulting in formerly orange regions turning white, likely due to the scattering of light from remaining non-lamellar microstructures on the wing (Fig. 3 E) 20. Thus, the white wings of Alba C. crocea (Fig. 1A & 3A) differ from other white Pierid species, as the latter exhibit abundant pigment granules in their scales (Fig. 3F and Extended Data Fig.6), documenting that there are multiple routes to white wing color in Pieridae.
We next tested whether the physiological tradeoffs of Alba reported for New World species 1,2, which were discussed in the introduction, were also seen in C. crocea, an Old World species, as shared tradeoffs would suggest the Alba mechanism is conserved genus wide. To compare abdominal lipid stores between morphs, we conducted high performance thin layer chromatography on 2 day old adult females reared under two temperature treatments (Hot: 27°C vs. Cold: 15°C during pupal development). We found Alba females had larger abdominal lipid stores than orange in both temperature treatments, though the difference was only significant in the cold treatment (cold: n=32, t29.12 = 3.42, P = 0.002, hot: n=25, t22.71 = 0.67, P = 0.51) (Fig. 4A), consistent with the known effects of temperature on Alba fitness in New World Colias species 2.
We then conducted RNASeq on pupal wing and abdomen tissue, at the time of pteridine synthesis (i.e. when allocation tradeoffs are realized) to identify genes that exhibited differential expression between morphs (Fig. 4B,C, Supplementary Information). We found that vitellogenin 1, which encodes an egg yolk precursor protein synthesized in the fat bodies of insects 21, was significantly upregulated within Alba abdomen tissue (log fold change [log FC] of 4.8) (Fig. 4B). Additionally, consistent with previous reports of GTP reallocation in Alba females 1, RIM, a Rab GTPase effector 22, was one of the most highly differentially expressed (DE) genes in both tissues (logFC increase in Alba of 3.4 in the abdomen and 5.1 in the wings) (Fig. 4B,C). RIM acts as a molecular switch by converting guanosine diphosphate to GTP, thereby activating its associated Rab GTPase, which is in turn involved in synaptic vesicle exocytosis and secretory pathways 23. These results are consistent with previous qualitative findings of Alba females in the North American species C. eurytheme (Alba females have a larger fat body, emerge from the pupa with significantly more mature eggs, and reallocate GTP from pigment synthesis to somatic development 1,2). Our findings thus quantitatively demonstrate that the trade-offs associated with the Alba ALHS are likely consistent across the Colias genus, suggesting that Alba may be due to the same genetic mechanism and corroborating previous work that proposed Alba is homologous across Colias 12.
Gene set enrichment analyses identified 85 functional categories that were significantly enriched in the abdomen tissue of Alba females (Extended Data Fig.7), notably downregulation of ‘positive regulation of GTPase activity’ (adjusted p value < 0.0001), ‘regulation of Notch signalling pathway’ (adjusted p value = 0.03), and ‘canonical Wnt signalling’ (adjusted p value < 0.01). While the Wnt pathway is known to regulate wing patterns in several butterfly species 24,25, these findings are curious as they are observed in abdomen tissue rather than wing, suggesting potential unexplored pleiotropic effects of these pathways outside of the wing. In wing tissue, 35 functional categories were significantly enriched and downregulated in Alba including ‘regulation of transcription’ (adjusted p value < 0.0001) and ‘positive regulation of GTPase activity’ (adjusted p value < 0.0001), while ‘protein catabolic process’ (adjusted p value < 0.0001) was upregulated (Supplementary Information). BarH-1 was not DE between morphs in our RNASeq data, suggesting that morph specific expression differences are temporal and likely occur earlier in development. Further functional studies of candidate genes are needed to better understand their mechanistic roles in the trade-off.
Here we report that the genetic basis of a female-limited ALHS arises from the co-option of the homeobox transcription factor BarH-1, primarily known for its role in the morphogenesis of the insect eye, neurons and leg segments 26. We document that BarH-1 has a similar function in eye morphogenesis of butterflies, and also find it is expressed during wing scale development in butterflies from the families Pieridae and Nymphalidae, which last shared a common ancestor over 70 million years ago. This novel finding suggests a conserved function of BarH-1 in scale morphogenesis that warrants further study and suggests a parsimonious route to BarH-1’s gain of function in the ALHS Alba phenotype, with co-option from a role in wing scale development rather than its previously described functions. BarH-1’s well characterized role in determining cell fate through gene repression 27 suggests it is involved in the repression of pigment granule formation, providing an explanation for the Alba allele being dominant and a gain of function that results in the absence of a phenotype (i.e. orange wing color). To what extent BarH-1 has an active pleiotropic role in other tissues or developmental stages remains to be determined, as the extensive physiological responses we document could easily arise from a simple reallocation following the absence of pigment granule formation. Given the emergence of “toolkit” genes for butterfly wing patterning, wherein specific genes have been found to be repeatedly involved in wing color variation across distant species (e.g. cortex 28), determining to what extent BarH-1 is involved in other wing phenotypes and ALHS is of interest, especially given the pleiotropic effects on life history documented here. Finally, our results and others (e.g. side-blotched lizards 29 & damselflies 30) suggest that investigating to what extent ALHS are associated with color variation in other systems is warranted, especially in cases where such variation is female limited.
Author Contributions
AW conducted butterfly rearings and lab work, analysed the data, and wrote the manuscript with CWW and input from the coauthors. MWP, KT, CWW, and AW conducted the CRISPR/Cas9 knockout experiment. AW and KT conducted the electron microscopy. MWP conducted antibody staining. RN and JH assisted with bioinformatics. PL and RK conducted HPTLC and PL and AW analyzed the data. AW, CS, CWW and OB conducted fieldwork. MC conducted lab work. CWW supervised the work at all stages.
Methods
For detailed methods, including all bioinformatic commands, please see the supplementary information.
Data availability
SRA reference numbers for the genome and sequencing data will be included upon acceptance.
Genome assembly
An orange female and a male carrying Alba (offspring from wild caught butterflies, Catalonia, Spain) were mated in the lab. DNA from an Alba female offspring of this cross was extracted. Quality and quantity were assessed using a Nanodrop 8000 spectrophotometer (Thermo Scientific) and a Qubit 2.0 fluorometer (dsDNA BR, Invitrogen). A paired-end library with a 180 bp insert size (101 bp reads) was prepared (TruSeq PCR free) and sequenced on an Illumina HiSeq 4000 at the Beijing Genomics Institute (Shenzhen, China). A Nextera mate-pair library with a 3 kb insert size was prepared and sequenced on an Illumina HiSeq 2500 (125 bp reads) at the Science for Life Laboratory (Stockholm, Sweden). Raw data were cleaned and high quality reads were used as input for the AllPaths-LG (v. 50960) 31 assembly pipeline. High molecular weight DNA was extracted from two more Alba females from the above mentioned cross (i.e. full siblings). Equal amounts of DNA from each individual were pooled and sent to the Science for Life Laboratory (Stockholm, Sweden) for PacBio sequencing on 24 SMRT cells (∼17 GB of data was produced). A Falcon (v. 0.4.2) 32 assembly was generated by the Science for Life Laboratory. We then used Metassembler (v. 1.5) 33 to merge our AllPathsLG and Falcon assemblies, using the AllPathsLG assembly as the primary assembly.
Bulk segregant analyses (BSA)
Female informative cross: The female informative cross data and mapping protocol described in Woronik and Wheat, 2017 34 were applied to the high quality reference genome to identify the contigs that made up the Alba chromosome.
Male Informative Cross (MIC) I: DNA was extracted from a wild caught orange mother (Catalonia, Spain) and 26 of her Alba and 24 of her orange female offspring. DNA quality and quantity of each individual were assessed via a Nanodrop 8000 spectrophotometer (Thermo Scientific, MA, USA) and a Qubit 2.0 Fluorometer (dsDNA BR; Invitrogen, Carlsbad, CA, USA) before pooling equal amounts of high-quality DNA from Alba and orange offspring into two pools, respectively. The library preparation (TruSeq PCR-free) and Illumina sequencing (101 bp PE HiSeq2500) were performed at the Beijing Genomics Institute (Shenzhen, China). Raw reads were cleaned and then mapped to the reference genome using NextGenMap v0.4.10 (-i 0.09) 35. SAMTOOLS v1.2 36 was used to filter (view -f 3 -q 20), sort and index the bam files and generate mpileup files for the two pools and the orange mother. Popoolation2 37 was used to calculate the allele frequency difference between Alba and orange pools. SNP sites were filtered in R 38 for a read depth ≥ 30 and ≤ 300, a bi-allelic state, and a minimum minor allele frequency of 3. The orange mother mpileup was similarly analyzed using Popoolation 39 (read depth ≥ 5 and ≤ 30), but the major and minor allele frequencies were calculated in R 38 by dividing the major and minor allele counts by the read depth at each site, respectively. A SNP site was considered a MIC I Alba SNP when it met the following expectations: 1) homozygous in the orange mother, 2) homozygous in the orange pool, 3) the allele frequency difference in the Alba pool compared to the orange was 0.45-0.55.
MIC II: A male carrying Alba was mated with an orange female in the lab at Stockholm University. DNA was prepared as described above for 26 Alba and 28 orange female offspring, resulting in two DNA pools. Library preparation (TruSeq PCR-free) and Illumina sequencing (150 bp paired-end reads with 350 bp insert, HiSeqX) were performed at Science for Life Laboratory (Stockholm, Sweden). The same mapping and SNP calling pipeline used on the MIC I was applied. A site was considered an Alba SNP if 1) it was homozygous in the orange pool and 2) the allele frequency difference in the Alba pool compared to the orange was 0.45-0.55.
A contig was considered Alba associated if it had ≥ 3 Alba SNPs in all crosses. Nineteen Alba associated contigs were identified. They total ∼3.7 Mbp and are considered the Alba BSA locus.
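The male informative cross criteria above can be illustrated with a short Python sketch. This is not the authors' pipeline (which used Popoolation2 and R); the function names `is_alba_snp` and `alba_contigs`, the input layout, and the strict definition of "homozygous" as an allele frequency of exactly 0 or 1 (adjustable through `homozygous_tol`) are assumptions made for illustration. The female informative cross followed a separate published protocol and is not modelled here.

```python
from collections import Counter


def is_alba_snp(orange_pool_freq, alba_pool_freq, mother_freq=None,
                homozygous_tol=0.0, diff_range=(0.45, 0.55)):
    """Check one site against the MIC criteria described in the text.

    All frequencies refer to the same allele and lie in [0, 1].
    mother_freq is only supplied for MIC I, where the orange mother
    was sequenced; pass None for MIC II.
    homozygous_tol is a hypothetical tolerance, not from the source.
    """
    def homozygous(freq):
        return freq <= homozygous_tol or freq >= 1.0 - homozygous_tol

    if mother_freq is not None and not homozygous(mother_freq):
        return False
    if not homozygous(orange_pool_freq):
        return False
    diff = abs(alba_pool_freq - orange_pool_freq)
    return diff_range[0] <= diff <= diff_range[1]


def alba_contigs(sites_by_cross, min_snps=3):
    """Return contigs carrying >= min_snps Alba SNPs in every cross.

    sites_by_cross maps a cross name to a list of tuples
    (contig, orange_pool_freq, alba_pool_freq, mother_freq_or_None).
    """
    per_cross = []
    for sites in sites_by_cross.values():
        counts = Counter(
            contig for contig, orange, alba, mother in sites
            if is_alba_snp(orange, alba, mother)
        )
        per_cross.append({c for c, n in counts.items() if n >= min_snps})
    return set.intersection(*per_cross) if per_cross else set()
```

The expected frequency difference of about 0.5 follows from the cross design: Alba daughters of a heterozygous father and a homozygous orange mother are all heterozygous at sites linked to Alba, while orange daughters are homozygous.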
Genome wide association study
DNA for genome re-sequencing was extracted from 15 Alba and 15 orange females from diverse population backgrounds (Catalonia, Spain and Capri, Italy). High quality DNA was prepared using Illumina TruSeq and sequenced at the Science for Life Laboratory (Stockholm, Sweden) (150 bp paired-end reads HiSeqX). Cleaned reads were mapped to the annotated reference genome using NextGenMap v0.4.10 (-i 0.6 ‐X 2000) 35. Bam files were filtered and sorted using SAMTOOLS v1.2 (view ‐f 3 ‐q 20) 36. A VCF file was generated using SAMTOOLS v1.2 (-t DP ‐t SP ‐Q 15) 36 and bcftools v.1.2 (-Ov ‐m) 36. VCFtools (v0.1.13) 40 was used to call SNP sites with no more than 50% missing data, an average read depth between 15-50 across individuals, and a minimum SNP quality of 30. An association analysis was performed with PLINK (v1.07) 41 and a Benjamini & Hochberg step-up FDR control was applied. SNPs with FDR <0.05 were considered Alba SNPs. We conducted this analysis both genome wide and only within the BSA locus. Both analyses fine mapped the Alba locus to the same genomic region.
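For readers unfamiliar with the Benjamini & Hochberg step-up procedure mentioned above, the following is a minimal Python sketch of how adjusted p values can be computed from a list of raw association p values. It is illustrative only: the study used PLINK for this step, and the function name `benjamini_hochberg` is a hypothetical helper.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values, in the input order.

    Each raw p value is scaled by m / rank (rank 1 = smallest p), and
    monotonicity is enforced by taking a running minimum from the
    largest rank downwards (the "step-up" part of the procedure).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_sites(pvalues, fdr=0.05):
    """Indices of sites whose adjusted p value is below the FDR threshold."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q < fdr]
```

With the threshold used in the text, `significant_sites(pvalues, fdr=0.05)` would return the positions of the SNPs treated as Alba SNPs.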
Antibody Generation and Staining
A Rabbit-anti-Bar antibody was generated against the full length sequence of the Vanessa cardui Bar homolog: MTVQRDERDARAPRTRFMITDILDAAPRDLSAHRDSDSDRSATDSPGVKDDSDDVSSKSCGG DASGLAKKQRKARTAFTDHQLQTLEKSFERQKYLSVQDRMELAAKLGLTDTQVKTWYQNRRT KWKRQTAVGLELLAEAGNYAAFQRLYGGYWAGVPAYPAQPAPAAADLYYRQAAATAAAAASA SANTLQKPLPYRLYPGAPLGGVPPLGLGLPGPSAHLGSLGAPGLGALGYYAQARRTPSPDVDP GSPAPPPRSPREPSIEQRSDDEDDDETIHV. Protein was generated by GenScript (Piscataway, NJ) and purified to >80% purity. DNA sequences to produce this protein were codon-optimized for bacterial expression and made via gene synthesis. GenScript injected the resultant protein into host animals, collected serum for testing, and affinity purified the product using additional target protein bound to a column. Antibody staining was performed as described previously for Drosophila and butterfly tissues 42. Staged pupal wings and retinas were dissected and fixed between 24-72 hours post-pupation. The Rabbit-anti-Bar antibody was used at 1:100, followed by secondary antibody staining with AlexaFluor-555-anti-Rabbit secondaries at 1:500 and counterstaining with DAPI. Images were captured using standard confocal microscopy on a Leica SP5.
CRISPR/Cas9 knockouts
The guide-RNA (gRNA) sequences were generated using the protocol described in Perry et al. 2016. Viable Cas9 target-sites were located by manually looking for PAM-sites (NGG) in the exon region of BarH-1. Uniqueness of the target regions was confirmed using an NCBI nucleotide blast (ver. 2.5.0+ using blastn-short flag and filtering for an e-value of 0.01) against the C. crocea reference genome. gRNA constructs were ordered from Integrated DNA Technologies (Coralville, Iowa, USA) as DNA (gBlocks). Full gRNA constructs had the following configuration: an M13F region, a spacer sequence, a T7-promoter sequence, the target specific sequence, a Cas9 binding sequence, and finally a P505 sequence. Upon delivery, gBlocks were amplified by PCR to generate the template for single guide RNA (sgRNA) transcription. For each gBlock, four 50 μl reactions were conducted using the M13F and P505 primers and Platinum Taq (Invitrogen cat. 10966-034). The four reactions were then combined and purified in a Qiagen Minelute spin column (cat. 28004, Venlo, Netherlands). The resulting template was transcribed using the Lucigen AmpliScribe T7-flash Transcription Kit from Epicentre/Illumina (cat. ASF3507, Madison, WI, USA) followed by purification via ammonium acetate precipitation. Products were resuspended with Qiagen buffer EB, concentrations were quantified by Qubit and further diluted to 1000 ng/μl. They were then mixed with Cas9-NLS protein (PNA Bio, Newbury Park, CA, USA) and diluted to a final concentration of 125-250 ng/μl. C. crocea females (n > 40) from Aiguamolls de l’Empordà, Spain were captured and kept in morph-specific flight cages in the lab at Stockholm University where they oviposited on alfalfa (Medicago sativa). Eggs were collected between 1-7 h post-laying and sterilized in 7% benzalkonium chloride for ∼5 minutes before injection. Injections were at a concentration of either 125 or 250 ng/μl and conducted using an M-152 Narishige micromanipulator (Narishige International Limited, London, UK) with a 50 ml glass needle syringe, with injection pressure applied by hand via a syringe fitting.
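The manual search for PAM sites described above can be approximated programmatically. The Python sketch below scans both strands of a sequence for NGG motifs and reports the bases immediately 5' of each one as a candidate target. It is a hedged illustration, not the procedure used in this study: the 20 nt protospacer length is a common convention for Cas9 and is an assumption here, and `find_targets` is a hypothetical helper name. Uniqueness of each candidate would still need to be checked against the genome, as the text describes.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_targets(seq, protospacer_len=20):
    """List candidate Cas9 targets followed by an NGG PAM on either strand.

    Returns tuples (strand, start, protospacer, pam). For the "-" strand,
    start is a 0-based coordinate on the reverse-complemented sequence.
    protospacer_len=20 is an assumed default, not taken from the source.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # i is the first base of the PAM; it needs a full protospacer 5' of it
        for i in range(protospacer_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                hits.append((strand, i - protospacer_len,
                             s[i - protospacer_len:i], pam))
    return hits
```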
CRISPR/Cas9 validation
To validate the mutation, Cas9 cut sites were PCR-amplified and a ∼370 bp region, centered on the intended cut site, was sequenced using Illumina MiSeq 300 bp paired-end sequencing. Primers were designed using Primer3 (http://biotools.umassmed.edu/bioapps/primer3_www.cgi). DNA was isolated from KO-individuals using the KingFisher Cell and Tissue DNA Kit from ThermoFisher Scientific (N11997) and the robotic Kingfisher Duo Prime purification system. DNA quality and quantity were assessed via a Nanodrop 8000 spectrophotometer (Thermo Scientific, MA, USA) and a Qubit 2.0 Fluorometer (dsDNA BR; Invitrogen, Carlsbad, CA, USA). Aliquots were then taken and diluted to 1 ng/μl before amplifying the region over the cleavage-site. Sequences were amplified and ligated with Illumina adapters and indexes in a two-step process following the protocol provided by Science for Life Laboratories (Stockholm, Sweden) and Illumina. First, we amplified the ∼370 bp long sequence around the cut sites and attached the first Illumina adapter, onto which we later attached Illumina handles and indexes using a second round of PCR (Accustart II PCR Supermix from Quanta Bio [Beverly, MA, USA], settings 94 C x 2 min followed by 40 cycles of 94 C x 30 sec + 60 C x 15 sec + 68 C x 1 min followed by 68 C x 5 min). PCR products were purified using Qiagen Qiaquick (Cat. 28104). Concentration and quality of the product were assessed via Nanodrop and gel electrophoresis. DNA was diluted to ∼0.5 ng/μl and then the unique double indices were attached by the second round of PCR (same protocol as above). The final PCR products were purified again using Qiaquick spin columns and concentration and size were assessed using a Qubit fluorometer and gel electrophoresis. All samples were then mixed at equal molarity and sent for sequencing at Science for Life Laboratories (Stockholm, Sweden). Sequences were aligned to their respective fragments (area surrounding cut site) using SNAP (ver. 1.0beta18) 43, and identical reads were clustered using the collapser utility in Fastx-Toolkit. Sequences containing deletions were extracted and the most abundant sequences containing deletions were selected for confirmation of deletion in the expected region.
Electron Microscopy
To quantify pigment granule differences between Alba and orange individuals, pieces of the forewing were mounted on aluminum pin stubs (6 mm length) with the dorsal side upwards. Samples were coated in gold for 80 seconds using an Agar sputter coater and imaged under 5 kV acceleration voltage, high vacuum, and ETD detection using a scanning electron microscope (Quanta Feg 650, FEI, Hillsboro, Oregon, USA). To quantify pigment granules within the photos we selected images taken at the same magnification and drew three randomly placed 4 μm² squares on each image. We counted the number of pigment granules within each square and took the average, then conducted a two sample t-test in R. To quantify pigment granule differences between KO and wild type regions in our CRISPR KO mosaic individuals, a biopsy punch 2 mm in diameter was used to cut out one piece containing mostly white scales and one piece containing mostly orange scales. These pieces were first photographed using a Leica EZ4HD stereo microscope to allow us to confirm the color of each scale once it was covered with gold sputter. Five white and five orange scales were then selected, the granules within a 4 μm² square were counted on each of those scales, and a two sample t-test was conducted in R.
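The comparison described above can be sketched in Python as follows. The fractional degrees of freedom reported in the main text (e.g. t5.97) are consistent with Welch's unequal-variance t-test, which is the default behaviour of `t.test` in R; that the authors used this default is an inference, not something stated in the methods. The helper names `mean_granules_per_image` and `welch_t` are hypothetical, and no real granule counts are reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def mean_granules_per_image(square_counts):
    """Average the granule counts of the squares drawn on one image."""
    return mean(square_counts)


def welch_t(group_a, group_b):
    """Welch's two-sample t statistic and Welch-Satterthwaite df.

    Each group is a list of per-image (or per-scale) granule values
    and needs at least two observations with non-zero variance overall.
    """
    na, nb = len(group_a), len(group_b)
    # squared standard error contributed by each group
    se2_a = variance(group_a) / na
    se2_b = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, df
```

The p value then follows from the t distribution with `df` degrees of freedom, which a statistics package would supply.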
Lipid Analysis
Wild caught C. crocea Alba females (Catalonia, Spain) oviposited in the lab. Eggs were moved into individual rearing cups and split between two temperature treatments (hot: 27 °C and 16 hour day length during larval and pupal development, cold: reared at 22 °C with a 16 hour day length during larval development and 15 °C with a 16 hour day length during pupal development). Once pupated, individuals were checked a minimum of every 12 hours. Upon eclosion adults were stored at 4 °C until the next day to provide time for meconium excretion. Butterflies were not allowed to feed before dissection. Body weight was taken using a Sauter RE1614 scale before dissection. Total lipids were extracted using the Folch method 44 according to the procedures outlined in Woronik et al. 2018 13. HPTLC was conducted as described in Woronik et al. 2018 13. In brief, 5 μl of the sample lipid extract was applied on a silica plate with a Camag Automatic TLC Sampler 4 (Camag, Muttenz, Switzerland). After the silica plate developed it was scanned with a Camag TLC plate scanner 3 at 254 nm using a deuterium lamp with a slit dimension of 6 × 0.45 mm and analyzed with the Win-CATS 1.1.3.0 software. Peaks representing the four major neutral lipid classes (diacylglycerols, triacylglycerols, cholesterol and cholesterol esters) were identified by comparing their retention times against known standards. Then the peak areas were integrated and the amount of lipid within each class was calculated using the formula: $\mathrm{pmol}_{\mathrm{sample}} = (\mathrm{Area}_{\mathrm{sample}} / \mathrm{Area}_{\mathrm{standard}}) \times \mathrm{pmol}_{\mathrm{standard}}$. The total lipid content (nmol per abdomen) was calculated as a sum of pmol contents of all neutral lipid classes. For the statistical analyses this value was regressed against abdomen weight, and the standardized residuals (i.e. mass-corrected storage lipid amount) were subsequently used as the dependent variable.
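The quantification and mass correction described above can be sketched in Python. This is an illustration under stated assumptions rather than the authors' analysis: the regression is taken to be ordinary least squares of total lipid on abdomen weight, and "standardized residuals" is taken to mean residuals divided by the residual standard error (without a leverage correction). The function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pmol_in_sample(area_sample, area_standard, pmol_standard):
    """Amount of one lipid class from its peak area relative to a standard."""
    return (area_sample / area_standard) * pmol_standard


def total_neutral_lipid(class_amounts):
    """Sum the amounts of the neutral lipid classes for one individual."""
    return sum(class_amounts)


def standardized_residuals(abdomen_weights, total_lipids):
    """Mass-corrected lipid values: residuals of lipid ~ weight, scaled.

    Assumes ordinary least squares and scaling by the residual standard
    error sqrt(SSE / (n - 2)); both are assumptions of this sketch.
    """
    n = len(abdomen_weights)
    if n < 3 or n != len(total_lipids):
        raise ValueError("need at least three paired observations")
    mx, my = mean(abdomen_weights), mean(total_lipids)
    sxx = sum((x - mx) ** 2 for x in abdomen_weights)
    if sxx == 0:
        raise ValueError("abdomen weights must not all be identical")
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(abdomen_weights, total_lipids))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(abdomen_weights, total_lipids)]
    rse = sqrt(sum(r * r for r in residuals) / (n - 2))
    if rse == 0:
        raise ValueError("perfect fit: residuals cannot be standardized")
    return [r / rse for r in residuals]
```

A positive standardized residual then indicates an individual with more storage lipid than expected for its abdomen weight.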
Transcriptome assembly and differential expression analysis
Offspring from a wild caught Alba female from Catalonia, Spain were reared at Stockholm University. When larvae reached the fifth instar they were checked at least every six hours and the pupation time of each individual was recorded. Tissue was collected between 82% and 92% of pupal development. Pupae were dissected in PBS solution, and the abdomen and wings were flash frozen in liquid nitrogen and stored at -80 °C. RNA was extracted from the abdomen and wing tissues using Trizol. RNA quality and quantity were assessed using a Nanodrop 8000 spectrophotometer (Thermo Scientific) and an Experion electrophoresis machine using the manufacturer protocol (Bio-Rad, Hercules, CA). Library preparation (strand-specific TruSeq RNA libraries using poly-A selection) and sequencing (101 bp PE HiSeq2500, high output mode) were performed at the Science for Life Laboratories (Stockholm, Sweden). In total 16 libraries were sequenced (4 orange and 4 Alba individuals, wings and abdomen from each individual). Raw data were cleaned and reads from all libraries were used in a de novo transcriptome assembly (Trinity version trinityrnaseq_r2013_08_14 with default parameters) 45. To reduce the redundancy among contigs and produce a biologically valid transcript set, the tr2aacds pipeline from the EvidentialGene software package 46 was run on the raw Trinity assembly. The sixteen RNA-Seq libraries were mapped to the resulting transcriptome using NextGenMap v0.4.10 (-i 0.09) 35. SAMTOOLS v1.2 36 was then used to filter (view -f 3 -q 20), sort and index the sixteen bam files. SAMTOOLS v1.2 36 idxstats was then used to calculate the read counts per gene for each of the sorted bam files. These counts were then joined in a CSV file using an in-house pipeline and csvjoin. A differential expression analysis was conducted in EdgeR 47. A Benjamini Hochberg correction was applied to the raw p values to correct for false discovery rate and differentially expressed genes were called (adjusted p value < 0.05). Babelomics (version 4.2) 48 was used to conduct a gene set enrichment analysis (Fatiscan, two tailed Fisher’s exact test). Revigo 49 was used to cluster significantly enriched GO terms by semantic similarity (default settings, C = 0.7). The GO term clusters were named and assigned p-values based on the most significant GO term in the cluster.
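As a point of reference for the enrichment step, the Python sketch below implements a two-tailed Fisher's exact test on a single 2x2 table (genes in or out of a functional category, by genes in or out of a list of interest). It only illustrates the statistical test named above; it does not reproduce how Babelomics partitions the ranked gene list, and `fisher_exact_two_tailed` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Example layout (an assumption of this sketch):
        a = genes in the list and in the category
        b = genes in the list, not in the category
        c = genes outside the list, in the category
        d = genes outside the list, not in the category
    The p value sums the hypergeometric probabilities of all tables
    with the same margins that are no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # small relative tolerance so ties with the observed table are included
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, total)
```

The resulting p values, one per functional category, would then be corrected for multiple testing before categories are called enriched.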
Acknowledgements
We would like to thank Lovisa Wennerström, Elishia Harji, Jofre Carnicer, and Christina Hansen Wheat for help with fieldwork. We thank Marianne Ahlbom for assistance with the SEM. Finally we would like to thank Christen Bossu, Naomi Keehnen, and Peter Pruisscher for helpful comments on the manuscript. We thank the Department of Zoology at Stockholm University, the Swedish Research Council 2012–3715, the Academy of Finland 131155, the Knut and Alice Wallenberg Foundation 2012.0058 and the Erik Philip-Sörensens foundation for funding.