Abstract
Coral reefs are suffering a major decline due to the environmental constraints imposed by climate change. Over the last 20 years, three major coral bleaching events occurred concomitantly with anomalous heat waves, provoking a severe loss of coral worldwide. Recent works, however, reported reefs at which recurrent bleaching events resulted in increased resistance to thermal stress, suggesting that adaptation might have occurred. The conservation strategies for preserving the reefs, as they are conceived now, cannot cope with global climatic shifts. In this regard, researchers advocated the set-up of a preservation framework to reinforce coral adaptive potential. The main obstacle to this implementation is that studies on coral adaptation are usually hard to generalize at the scale of a reef system. In this work, we combined seascape genomics and connectivity analyses to characterize the adaptive potential of a flagship coral species of the Ryukyu Archipelago (Japan). By associating genotype frequencies with descriptors of historical environmental conditions, we discovered six genomic regions hosting polymorphisms that might promote resistance against thermal stress. Remarkably, annotations of genes in these regions are consistent with molecular roles in heat responses. Furthermore, we compared genetic distances between reefs with their spatial separation according to sea currents to predict connectivity patterns across the region. The cross-talk between the results of these analyses portrayed the adaptive potential of this population: we were able to identify reefs carrying potential adaptive traits and to understand how they disperse to neighbouring reefs. This information was summarized by objective, quantifiable and mappable indices covering the whole region that can be extremely useful for prioritization in conservation plans.
This framework is transferable to any coral species and to any reef system and therefore represents a valuable tool for empowering preservation efforts dedicated to the protection of coral reefs in warming oceans.
Introduction
Coral reefs are suffering a severe decline due to the effects of climate change (Hughes et al., 2017). Loss of reef is already showing catastrophic consequences for marine wildlife depending on these structures (Pratchett, Thompson, Hoey, Cowman, & Wilson, 2018) and disastrous aftermaths are expected for human economies as well (Moberg & Folke, 1999). One of the major threats to the persistence of this ecosystem is coral bleaching (Bellwood, Hughes, Folke, & Nyström, 2004): a physiological response induced by environmental stress that causes hard-skeleton corals, the cornerstone of the reef, to separate from the symbiotic microbial algae living within their cells (Mydlarz, McGinty, & Harvell, 2010). These algae, called Symbiodiniaceae (LaJeunesse et al., 2018), are crucial for the sustenance of the host and therefore a persistent bleached state leads to coral death (Mydlarz et al., 2010). Three mass bleaching events occurred over the last 20 years, concomitantly with anomalously high sea water temperatures, which is why thermal stress is now considered the main driver of this phenomenon (Hughes et al., 2017). These events struck worldwide and led to local losses of coral cover of up to 50% (Hughes et al., 2018, 2017). Besides, coral bleaching was found to be associated with other stressors linked to human activity, such as ocean acidification, water eutrophication, sedimentation and overfishing (Anthony, Kline, Diaz-Pulido, Dove, & Hoegh-Guldberg, 2008; Ateweberhan et al., 2013; Maina, Venus, McClanahan, & Ateweberhan, 2008).
Conservation efforts facing the plague of coral bleaching aim to restore reefs that underwent severe losses and to limit the impact of future damages (Baums, 2008; Bellwood et al., 2004; Young, Schopmeyer, & Lirman, 2012). For these purposes, two tools are nowadays usually used: the establishment of marine protected areas (MPAs) and the set-up of coral nurseries (Baums, 2008; Bellwood et al., 2004; Young et al., 2012). MPAs are designed zones in which human access and activities are severely restricted in order to alleviate the effects of local anthropogenic stressors (Lester et al., 2009). Coral nurseries are underwater gardens of transplanted colonies that can then be employed to restore damaged reefs (Baums, 2008; Young et al., 2012). In both cases, researchers advocated the use of design systems accounting for demographic connectivity so that the position of a conservation measure can promote resistance and resilience even on neighbouring sites (Baums, 2008; Krueck et al., 2017; Lukoschek, Riginos, & van Oppen, 2016; Palumbi, 2003; Shanks, Grantham, & Carr, 2003). Even though beneficial effects of this kind of conservation policies have been observed worldwide (Cinner et al., 2016; Rodgers et al., 2017; Selig & Bruno, 2010), these solutions do not confer resistance against the temperature oscillations associated with the last mass bleaching events (Baums, 2008; Hughes et al., 2017). Coral reefs that experienced past thermal stress were found to be more resistant to subsequent heat waves (Hughes et al., 2019; Krueger et al., 2017; Penin, Vidal-Dupiol, & Adjeroud, 2013; Thompson & van Woesik, 2009) but to date this information is neglected in conservation actions (Baums, 2008; Maina, McClanahan, Venus, Ateweberhan, & Madin, 2011). 
There is an urgent need to understand whether these observations are due to evolutionary processes and, if so, to figure out how the underlying adaptive potential could be included in predictions of climate change responses and in conservation plans (Baums, 2008; Logan, Dunne, Eakin, & Donner, 2014; Maina et al., 2011; van Oppen, Oliver, Putnam, & Gates, 2015).
To this end, seascape genomics tools are likely to play an important role. Seascape genomics is the marine counterpart of landscape genomics, a branch of population genomics that investigates adaptive potential in field-based experiments (Joost et al., 2007; Balkenhol et al., 2017; Rellstab, Gugerli, Eckert, Hancock, & Holderegger, 2015). Samples are collected across a landscape and then genotyped using next-generation-sequencing techniques, providing thousands of genetic variants (Rellstab et al., 2015). Simultaneously, the environmental characterization of the study area is carried out mainly based on remote-sensing data describing specific local climatic conditions (Rellstab et al., 2015). Genomic and environmental information are then combined in order to detect genetic polymorphisms associated with peculiar conditions (i.e., potentially adaptive traits; Rellstab et al., 2015). This approach has already been applied to many land species and is increasingly used to analyse marine animals in what is referred to as seascape genomics (exhaustively reviewed in Riginos, Crandall, Liggins, Bongaerts, & Treml, 2016). As far as we know, no seascape genomics experiment has been applied to reef corals yet. In fact, adaptation of these species has been mostly studied by performing transplantation assays followed by conditioning in aquaria, a time- and resource-demanding approach that often focuses on a couple of contrasting reefs (Howells, Berkelmans, van Oppen, Willis, & Bay, 2013; Krueger et al., 2017; Palumbi, Barshis, Traylor-Knowles, & Bay, 2014; Sampayo et al., 2016; Ziegler, Seneca, Yum, Palumbi, & Voolstra, 2017). Genotype-environment association studies have also been conducted on corals but they focused on a limited number of markers (Lundgren, Vera, Peplow, Manel, & van Oppen, 2013) or on a restricted number of locations (Bay & Palumbi, 2014; Thomas, Kennington, Evans, Kendrick, & Stat, 2017).
Unlike these works, a seascape genomics approach should cover ecologically meaningful spatial scales, making it possible to distinguish the pressures of different climatic conditions and to account for confounding effects of demographic processes (Balkenhol et al., 2017). Consequently, the main (adaptive) and collateral (connectivity) information generated by this method provides a valuable contribution to preservation strategies (Baums, 2008; Krueck et al., 2017; Palumbi, 2003).
In this work, we applied a seascape genomics framework to detect coral reefs carrying potential adaptive traits to show how conservation policies could implement such findings. Our study focuses on Acropora digitifera of the Ryukyu Archipelago in Japan (Fig. 1), an emblematic species of the Indo-Pacific and flagship organism for studies on corals genomics (Shinzato et al., 2011). We first analysed the convergence between genomic and environmental information i) to detect loci potentially conferring a selective advantage and ii) to develop a model describing connectivity patterns. Next, we took advantage of these findings to indicate which reefs were more likely to be carrying adaptive traits and to evaluate how well such sites are interconnected with the rest of the reef system. Finally, we propose an approach to implement the results obtained in conservation planning. In general, our work provides tools at the interface of conservation genomics and environmental sciences likely to empower preservation strategies for coral reefs.
Materials and methods
The input genomics data used in this paper come from a published work by Shinzato et al. (2015) describing the population structure of A. digitifera of the Ryukyu Archipelago. Our framework is structured in two axes of analysis and prediction, one focusing on the presence of adaptive traits (seascape genomics) and the other on connectivity (Fig. 2). The seascape genomics analysis (Fig. 2A) crosses genomics data with environmental information at sampling sites to discover potentially adaptive genotypes. The model describing this relationship is then used to predict, at the scale of the whole study area, the probability of presence of adaptive genotypes (Fig. 2B). In the connectivity study (Fig. 2C), we built a model describing how sea-distances between sites (computed from remote sensing data) correspond to genetic-distances. This model is then used to predict connectivity at the scale of the whole study area (Fig. 2D). Finally, the predictions on where the adaptive traits are more likely to appear and on how the reef system is interconnected make it possible to assess the adaptive potential across the whole study area (Fig. 2E).
Genomic dataset
The genomic data used come from a publicly available dataset consisting of 155 georeferenced colonies of A. digitifera from 12 sampling locations (13±5 colonies per site) of the Ryukyu Archipelago in Japan (Fig. 1; Bioproject Accession PRJDB4188). These samples were sequenced using a Whole-Genome Sequencing approach in the scope of a population genomics study. Details on how samples were collected and processed for genomic analysis can be found in Shinzato et al. (2015).
Genomics data were processed using the Genome Analysis Toolkit (GATK; McKenna et al., 2010) framework following the recommended pipeline with modifications necessary to cope with the absence of reliable databases of known variants for this species (Van der Auwera et al., 2013). In short, the A. digitifera reference genome (v. 1.1, GenBank accession: GCA_000222465.2; Shinzato et al., 2011) was indexed using bwa (v. 0.7.5a, Li & Durbin, 2009), samtools (v. 1.9, Li et al., 2009) and picardtools (v. 1.95, http://broadinstitute.github.io/picard) and raw sequencing reads were aligned using the bwa mem algorithm. The resulting alignments were sorted, marked for duplicate reads, modified for read-group headers and indexed using picard-tools. Next, each alignment underwent an independent variant discovery using the GATK HaplotypeCaller tool (using the ERC mode and setting the --minPruning flag to 10) and genotypes were then jointly called by the GATK GenotypeGVCFs tool in random batches of 18 samples to match our computational power (18 CPUs). The variant-calling matrices of the different batches were then joined and filtered in order to keep only bi-allelic Single Nucleotide Polymorphisms (SNPs) using the GATK CombineVariants and SelectVariants tools, respectively. This resulted in a raw genotype matrix counting ~1.2 M SNPs. Subsequently, we used the GATK VariantAnnotator tool to annotate variants for Quality-by-depth and filtered for this value (<2), read coverage (>5 and <100 within a sample), minor allele frequency (>0.05), major genotype frequency (<0.95) and missing rate of both individuals and SNPs (<0.1) using the GATK VariantFiltration tool and custom scripts in the R environment (v. 3.5.1, R Core Team, 2016). This pipeline produced the filtered genotype matrix consisting of 136 individuals and 15,516 SNPs.
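The frequency-based filters listed above can be illustrated with a minimal sketch. It is not the GATK/R pipeline used in the study: the function name `snp_passes`, the 0/1/2 genotype coding and the use of `None` for missing calls are assumptions made for illustration, and only the SNP-level thresholds (minor allele frequency, major genotype frequency, missing rate) are covered.

```python
from collections import Counter

def snp_passes(genotypes, min_maf=0.05, max_mgf=0.95, max_missing=0.1):
    """Return True if one SNP passes the three frequency filters.

    genotypes: per-individual alternate-allele counts (0, 1 or 2),
    with None marking a missing call (hypothetical coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    # Alternate allele frequency among the called diploid genotypes.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    # Frequency of the most common genotype class.
    mgf = max(Counter(called).values()) / len(called)
    return missing_rate < max_missing and maf > min_maf and mgf < max_mgf
```

Applying `snp_passes` to every column of a genotype matrix would give the set of retained SNPs; the individual-level missing-rate filter would be applied in the same way along rows.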
Environmental data
Eleven georeferenced datasets describing atmospheric and seawater conditions were retrieved from publicly available resources (EU Copernicus Marine Service, 2017; NASA, 2016; National Oceanic and Atmospheric Administration, 2017; Tab S1). These datasets, derived from remote sensing, provided monthly or daily variables measured across the whole study area, with a spatial resolution ranging from 25 km to 4 km (Tab S1). We processed these variables in the R environment using the raster package (v. 2.8, Hijmans, 2016) to compute monthly averages and standard deviations for the whole studied period. Furthermore, Sea Surface Temperature (SST) measurements were used to compute a Degree Heat Week (DHW) frequency index, representing the % of days during which the bi-weekly accumulated heat stress exceeded 4 °C (Liu, Strong, & Skirving, 2003; Logan et al., 2014). SST and sea surface salinity records were combined to produce estimates of seawater pH (Covington & Whitfield, 1988), dissolved inorganic carbon (Loukos, Vivier, Murphy, Harrison, & Le Quéré, 2000) and alkalinity (Lee et al., 2006). Bathymetry data (Ryan et al., 2009) were used to retrieve the depth at sampling locations. Finally, population density data (CIESIN Columbia University, 2010) were averaged in a 50 km buffer area to produce a surrogate variable for anthropogenic pressure (Welle, Small, Doney, & Azevedo, 2017).
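One possible reading of the DHW frequency index is sketched below. The source does not give the exact formula, so the baseline temperature `baseline`, the 14-day trailing window and the summation of daily exceedances above that baseline are all assumptions made for illustration; the function name `dhw_frequency` is hypothetical.

```python
def dhw_frequency(sst, baseline, window=14, limit=4.0):
    """Percentage of days whose trailing accumulated heat stress exceeds `limit`.

    sst: daily sea surface temperatures (degrees C).
    baseline: temperature above which heat stress accumulates (assumed).
    window: length in days of the trailing accumulation window (assumed).
    """
    # Daily heat stress: only temperatures above the baseline contribute.
    excess = [max(0.0, t - baseline) for t in sst]
    hits = 0
    for day in range(len(sst)):
        start = max(0, day - window + 1)
        if sum(excess[start:day + 1]) > limit:
            hits += 1
    return 100.0 * hits / len(sst)
```

In this sketch a pixel that never exceeds the baseline scores 0, and the index grows with both the intensity and the duration of warm spells.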
We used the geographic coordinates associated with each sample to characterize the environmental conditions using the QGIS point sampling tool (v. 2.18.25, QGIS development team, 2009). For the predictive steps of our study, which required working at the scale of the whole reef system (Fig. 2C), we retrieved the shapes of the reefs of the region (UNEP-WCMC, WorldFish-Center, WRI, & TNC, 2010) and transferred them onto a regular grid (cell size of 5×5 km) using QGIS. In this case, we also included reefs from the neighbouring regions (Taiwan and Philippines, Fig. S7) to avoid border-effects in computations. Environmental conditions were assigned to each reef-cell using the average function of the QGIS zonal statistics tool.
Seascape genomics
We performed the genotype-environment association analysis using the logistic regression method implemented within the SamBada software (v. 0.7; Stucki et al., 2017), customized to speed up the computation of multivariate models in the Python environment (v. 2.7; Python Software Foundation, 2018) using the pandas (v. 0.23.4; McKinney, 2010) and statsmodels (v. 0.9; Seabold & Perktold, 2010) libraries. The SamBada approach allows proxy variables for genetic structure to be included in the analysis in order to avoid a possible confounding effect (pattern of neutral genetic variation mimicking signals of adaptation to the local environment). Here we performed a discriminant analysis of principal components (DAPC) on the SNP genotype matrix using the R package adegenet (v. 2.1.1; Jombart, 2008). This procedure highlighted a main separation between two groups of samples along the first discriminant function (Fig. S1). The latter was therefore used as a co-variable in association models. We then ran the pipeline for the discovery of adaptive signals described hereafter and summarized in a schematic view in Fig. S2. In order to speed up calculations, the 315 environmental variables were grouped into 29 clusters of highly correlated (|R|>0.7) descriptors in the R environment (Fig. S2A). For each of these groups, one variable was randomly selected to build logistic models against the three genotypes of each SNP (Fig. S2B). In total, the SamBada instance evaluated 1,349,892 association-models (29 environmental groups vs. 3 × 15,516 SNPs; Fig. S2C, S2D) that were subsequently analysed in the R environment. For each association-model related to the same environmental variable, p-values of G-scores (G) and Wald-scores (W) were corrected for multiple testing using the R qvalue package (v. 2.14, Storey, 2003). Association-models scoring q<0.001 for both statistics were deemed significant (Fig. S2E).
If a SNP was found in more than one significant association, only the best model (according to the value of G) was kept. For each significant association retained, we then calculated all the models built with the involved genotype vs. the other environmental variables from the same cluster of correlated descriptors and we looked for the best association model (according to G; Fig. S2F). Finally, we visualized the logistic regression of each significant association-model using the R popbio library (v. 2.4.4; Stubben & Milligan, 2007) and manually checked the annotations around significant SNPs on the NCBI genome browser (NCBI, 1988). We set the size of the search window to ± 250 kb since the gene(s) potentially linked to a mutation can lie up to hundreds of kb away (Brodie, Azaria, & Ofran, 2016; Visel, Rubin, & Pennacchio, 2009). This window size was chosen because it corresponds approximately to the scaffold N50 statistic of the reference genome (i.e. half of the genome is contained within scaffolds of this size or longer).
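The bookkeeping of this pipeline (number of models, double significance threshold, one best model per SNP) can be sketched as follows. This is an illustration only, not the SamBada or R code used in the study: the dictionary keys `snp`, `G`, `qG` and `qW` and the function name `best_significant_models` are hypothetical.

```python
# Number of association-models: 29 environmental groups x 3 genotypes x 15,516 SNPs.
N_ENV_GROUPS, N_GENOTYPES, N_SNPS = 29, 3, 15516
N_MODELS = N_ENV_GROUPS * N_GENOTYPES * N_SNPS  # 1,349,892, as reported in the text

def best_significant_models(models, q_max=0.001):
    """Keep, for each SNP, the significant model with the highest G-score.

    models: iterable of dicts with hypothetical keys
      'snp' (SNP identifier), 'G' (G-score),
      'qG' and 'qW' (q-values of the G and Wald tests).
    """
    best = {}
    for model in models:
        # A model is significant only if both tests pass the threshold.
        if model["qG"] < q_max and model["qW"] < q_max:
            current = best.get(model["snp"])
            if current is None or model["G"] > current["G"]:
                best[model["snp"]] = model
    return best
```

Requiring both q-values to pass makes the filter more conservative than either test alone, which is the intent stated in the text.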
Probability of Presence of Adaptive Genotypes
The results of the seascape genomics analysis were then used to predict the probability of presence of adaptive traits at the scale of the whole Ryukyu Archipelago (Fig. 2B). Indeed, for each significant association between a genotype and an environmental variable, the SamBada approach provides parameters of a logistic regression which links the probability of occurrence of the genotype with the value of the environmental variable (Stucki et al., 2017). Those logistic models can therefore be used to estimate the probability of presence of the genetic variant for any value of the environmental variable (Joost, 2006; Rochat et al., 2016). We assumed that potentially adaptive traits could reach any reef of the region since several studies reported high gene flow between distant islands of the Archipelago (Nakajima, Nishikawa, Iguchi, & Sakai, 2010; Nishikawa, 2008; Shinzato et al., 2015). Finally, these probabilities were combined in an average probability (the arithmetic mean of the five probabilities) of carrying adaptive traits (Pa) at each reef of the Ryukyu Archipelago.
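The prediction step can be sketched as below, assuming for simplicity a univariate logistic model per genotype (intercept `beta0`, slope `beta1`); the models in the study also include a population-structure co-variable, which this illustration leaves out. Function names and the parameter layout are hypothetical.

```python
import math

def genotype_probability(beta0, beta1, env_value):
    """Logistic probability of presence of a genotype at a given environmental value."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * env_value)))

def average_adaptive_probability(models, env_value):
    """Arithmetic mean (Pa) of the probabilities predicted by several models.

    models: list of (beta0, beta1) tuples, one per retained association-model.
    env_value: value of the environmental variable (e.g. DHW frequency) at a reef.
    """
    probs = [genotype_probability(b0, b1, env_value) for b0, b1 in models]
    return sum(probs) / len(probs)
```

Evaluating `average_adaptive_probability` at the environmental value of every reef-cell would yield a map of Pa over the study area.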
Sea current data
Daily records of sea surface current were retrieved from publicly available databases (zonal and meridional surface velocities from the global-reanalysis-phy-001-030 product; EU Copernicus Marine Service, 2017) and used to compute the direction and speed of currents in the R environment using the raster library. These day-by-day calculations were then accumulated to retrieve, for each pixel of the study area (~0.083°), the cumulated speed (the conductance) toward each of the eight neighbouring pixels. This information was used to calculate dispersal costs (the inverse of conductance) and was summarized in a transition matrix in the format of the R gdistance package (v. 1.2, van Etten, 2018). For the connectivity analysis (Fig. 2C), the transition matrix was used to calculate sea-distances (i.e. the least-cost path) between sampling sites of the genotyped colonies. For the connectivity predictions (Fig. 2D), we calculated the sea-distances between all the reefs of the study area, whose coordinates were the centroids of the reef-cells computed as described in the environmental variables section. Importantly, for every pair of reefs (for instance reef1 and reef2) two sea-distances were computed, one per direction (i.e. from reef1 to reef2 and from reef2 to reef1).
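Why two sea-distances arise for each pair of reefs can be illustrated with a small directed least-cost computation. This is a sketch using Dijkstra's algorithm on a hypothetical conductance dictionary, not the gdistance implementation used in the study (which may apply further corrections); the function name `sea_distance` and the data layout are assumptions.

```python
import heapq

def sea_distance(conductance, source, target):
    """Least-cost distance from source to target on a directed grid graph.

    conductance: dict mapping each cell to {neighbour: cumulated speed toward it}.
    The cost of a step is the inverse of its conductance, so moving with the
    current is cheap and moving against it is expensive.
    """
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == target:
            return d
        if d > dist.get(cell, float("inf")):
            continue  # stale queue entry
        for neighbour, c in conductance.get(cell, {}).items():
            if c <= 0:
                continue  # no flow in this direction: step not allowed
            nd = d + 1.0 / c
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return float("inf")
```

Because conductances differ by direction, `sea_distance(g, "reef1", "reef2")` and `sea_distance(g, "reef2", "reef1")` generally return different values.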
Connectivity analysis
Genomic data were processed by means of a principal component analysis (PCA; Fig. S3) on the filtered genotype matrix. Prior to PCA, in order to cope with missing genotypes in the SNP matrix, we applied the random imputation method of the LEA R package (v.2.4.0; Frichot & François, 2015). The values of the principal components (PCs) were used to calculate the Euclidean distances between each pair of samples that were subsequently averaged by site in order to obtain Genetic Distances (GDs) between reefs. Next, we built a linear model (hereafter referred to as the connectivity model) to estimate GDs from log-transformed sea-distances and ran a t-test to check for the statistical significance of the relationship. Since we wanted to consider the genetic distances within reef (i.e. sea-distance = 0) we added 0.005 to all the sea-distances prior to the log-transformation to avoid obtaining undefined values (i.e. the logarithm of zero). As a comparison, we also built a connectivity model using log-transformed aerial-distances (i.e. Euclidean distance of coordinates) as independent variable while maintaining GD as response variable. We then compared the models based on sea- and aerial-distances using the coefficients of determination (R²) and the Akaike information criterion (AIC; Bozdogan, 1987).
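The connectivity model can be sketched as an ordinary least-squares fit of GD on the log-transformed, offset sea-distance. The use of the natural logarithm and the function names below are assumptions for illustration; the study fitted its model in R.

```python
import math

def fit_connectivity_model(sea_distances, genetic_distances, offset=0.005):
    """Fit GD = intercept + slope * log(sea_distance + offset) by least squares.

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(d + offset) for d in sea_distances]
    y = list(genetic_distances)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot

def predict_gd(intercept, slope, sea_distance, offset=0.005):
    """Predict the genetic distance for a given sea-distance."""
    return intercept + slope * math.log(sea_distance + offset)
```

The offset keeps within-reef pairs (sea-distance of zero) in the fit, and `predict_gd` is the step reused later to translate sea-distances between any two reefs into GDs.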
Connectivity Predictions
We then used the connectivity model to predict GDs between any pair of reefs of the Ryukyu Archipelago. This was done by translating sea-distances into GDs using the connectivity model trained during the connectivity analysis. Importantly, since there were two sea-distances for every pair of reefs (i.e. from reef1 to reef2 and vice versa), two GDs were computed as well.
Based on these predictions, we were able to calculate, for each reef-cell, two indices (Fig. S4):
– outbound connectivity (OC; Fig. S4B): the % of cells reachable from the focal cell within a GD-threshold (tGD).
– inbound connectivity (IC; Fig. S4C): % of cells that can reach the focal cell within a tGD.
Since these indices were conceived to display contrasts in connectivity among reefs of the Ryukyu Archipelago, we looked for the tGD value maximizing their variances (Fig. S5). The values corresponding to the 25%-quantile and the 60%-quantile of the GD distribution were those boosting the differences in both connectivity indices. To avoid the artificial bias of border effects, we preferred working at a more local scale and set tGD at the 25%-quantile of the GD distribution (i.e. tGD=79.861).
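The two indices can be sketched from a directed matrix of predicted GDs. The matrix layout (`gd[i][j]` is the GD from reef i to reef j), the exclusion of the focal cell from the count, the use of "less than or equal" for the threshold and the function name are assumptions for illustration; values are returned as fractions rather than percentages.

```python
def connectivity_indices(gd, t_gd):
    """Outbound (OC) and inbound (IC) connectivity for every reef-cell.

    gd: square matrix (list of lists), gd[i][j] = predicted GD from cell i to cell j.
    t_gd: GD threshold below which two cells are considered connected.
    Returns two lists (oc, ic) of fractions, one value per cell.
    """
    n = len(gd)
    oc, ic = [], []
    for focal in range(n):
        others = [k for k in range(n) if k != focal]
        # Cells reachable from the focal cell (row of the matrix).
        out_hits = sum(1 for k in others if gd[focal][k] <= t_gd)
        # Cells that can reach the focal cell (column of the matrix).
        in_hits = sum(1 for k in others if gd[k][focal] <= t_gd)
        oc.append(out_hits / len(others))
        ic.append(in_hits / len(others))
    return oc, ic
```

Because the GD matrix is asymmetric, a cell can score high on one index and low on the other, which is the contrast the two indices are meant to expose.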
Evaluation of the adaptive potential
The adaptive potential was evaluated by combining the predictions on the presence of adaptive traits and on connectivity patterns across the reefs of the Ryukyu Archipelago (Fig. 2E). The previously computed average probability of carrying adaptive traits (Pa) was used to identify the putatively adapted reefs. These are the reefs approaching the maximal values of Pa (Pa>0.6). We then used the GDs calculated during the connectivity analysis to retrieve, for each reef-cell, the GD from the closest putatively adapted reef. This results in an index describing the adaptive potential of every reef of the study area: the distance from the closest adapted reef.
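This index can be sketched as a minimum over the putatively adapted reefs. The matrix layout (`gd[i][j]` is the GD from reef i to reef j), the choice of the direction from the adapted reef toward the focal reef, and the function name are assumptions made for illustration.

```python
def distance_from_closest_adapted_reef(gd, pa, pa_min=0.6):
    """For each reef-cell, GD from the closest putatively adapted reef.

    gd: square matrix, gd[i][j] = predicted GD from cell i to cell j.
    pa: average probability of carrying adaptive traits, one value per cell.
    pa_min: threshold above which a reef is considered putatively adapted.
    """
    adapted = [i for i, p in enumerate(pa) if p > pa_min]
    return [
        # Smallest GD from any adapted reef to the focal cell;
        # infinity if no reef passes the threshold.
        min((gd[src][dst] for src in adapted), default=float("inf"))
        for dst in range(len(pa))
    ]
```

Low values flag reefs that are genetically close to a potential source of adaptive genotypes, while high values flag reefs that are unlikely to receive them.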
Data Access and Code Availability
All the data and code used in this article are publicly available on Zenodo (DOI: 10.5281/zenodo.2643454).
Results
Seascape Genomics
We detected 15 significant genotype-environment association-models (A1-A15; qG and qW<0.001; Tab. 1), referring to genetic markers from 6 distinct scaffolds of the A. digitifera reference genome. Among them, one association was related to SST variation in April, one to changes in sea current velocity in May and 13 to DHW frequency (or to a highly correlated variable such as variation in cloud fraction in August).
The SNP in the significant association-model A1 is related to SST standard deviation in April and lies within a genomic window (± 250 kb) counting 18 predicted genes, 7 of which are annotated with a known function (Box S1). Association-models A2 to A5 include four SNPs lying at two opposite sides of a long scaffold (~1.3 Mb). The SNPs located on one side (involved in association-models A2 and A3) are associated with DHW frequency and are surrounded by 17 predicted genes (of which 12 have a putative encoded protein with known function; Box S2) while the SNPs located on the other side (association-models A4 and A5) are related to sea current velocity variance in May and are surrounded by twelve putative genes (Box S4). All the remaining association-models (A6 to A15) involved only DHW. Association-model A6 concerns a SNP surrounded by 14 putative genes, but only one is annotated, the related predicted protein being JNK1/MAPK8-associated membrane protein-like (NCBI accession: LOC107334815; Box S5). In association-model A7, the SNP of interest lies within a window showing eleven predicted genes (only two with a known match by similarity; Box S6). Association-models A8 to A14 refer to seven SNPs present in the same scaffold (~195 kb long). This region contains five putative genes, of which two are annotated by similarity (Box S6). Finally, in association-model A15, the SNP associated with DHW frequency is located on a ~75 kb scaffold that contains only one putative gene predicted as ATP-dependent DNA helicase RecQ-like (LOC107328703; Box S7).
Probability of Presence of Adaptive Genotypes
The significant association-models of the seascape genomics analysis were then used as a starting point for predicting the probability of presence of adaptive genotypes at the scale of the whole Ryukyu Archipelago (Fig. 2B). These genotypes can be theoretically associated with any environmental descriptor but, since most of the putative adaptive genotypes were found to be associated with DHW frequency (Tab. 1), we designated this variable as the environmental pressure of interest (association models A3, A6, A7, A9 and A15, Tab. 1). The average of these probabilities (Pa) ranges from 0.40 to 0.62 (Fig. S7). Potentially adapted reefs (i.e. reefs with Pa>0.6) are located in the islands of Okinawa and Miyako, in the remote Japanese islands east of the Ryukyu Archipelago and in the northern part of the Philippines (Fig. S7).
Connectivity modelling
The two dispersal models respectively based on sea- and aerial-distances resulted in a significant association between genetic-distance (GD) and spatial separation (p<10⁻⁹, Fig. S6). We found that the model based on sea-distances better explained the variation in GD (R²=0.57) than the aerial-distances model (R²=0.45). The AIC is lower for the model based on sea-distances (AIC=315) than for the aerial-distances model (AIC=343). The sea-distances dispersal model was therefore used for the following inbound and outbound connectivity predictions.
The reefs around the islands of Tokara were those with the highest values of Inbound Connectivity index (IC=0.49±0.01; Fig. 3a). The reefs in the central islands of Ryukyu Archipelago (Okinawa and Amami) showed higher values of the IC index (IC=0.41±0.03; Fig. 3a) as compared to those located at the edges of the reef system (Yaeyama in South, Osumi in the North) which showed the lowest values (IC=0.27±0.03). As regards the Outbound Connectivity index, reefs located in the southern part (Yaeyama and Miyako) are those predicted with the highest values (OC=0.42±0.03; Fig. 3b) while those in the northern part (Osumi and Tokara) display the lowest ones (OC=0.20±0.13).
Evaluation of the adaptive potential
Reefs in the northern islands of the Archipelago (Osumi, Tokara and Amami) and in the southern islands of Yaeyama were predicted to be the genetically farthest (GD=79.4±0.3 and GD=79.0±0.1, respectively) from reefs potentially adapted against thermal stress (Pa>0.6; Fig. 4). Conversely, the southern islands of Miyako and those around Okinawa were the closest ones, despite a considerable variance (GD=73.6±3.1 and GD=76.4±2.8, respectively).
Discussion
Adaptation to thermal stress
Thermal stress is expected to be one of the major threats to coral reef survival and the quest for adaptive traits is becoming of paramount importance (Baums, 2008; Logan et al., 2014; Maina et al., 2011). In this work, the seascape genomics analysis of A. digitifera of the Ryukyu Archipelago revealed the presence of 6 genomic regions hosting genetic variants that might confer a selective advantage against thermal stress (Tab. 1). None of these SNPs directly lay within a coding sequence of a putative gene, but this is rarely the case for causative mutations (Brodie et al., 2016). In fact, genetic variants in intergenic regions that play a modulatory action on the expression of neighbouring genes are more frequent and can influence loci at 1-2 Mb of distance (Brodie et al., 2016). The fragmentation of the reference genome forced us to limit our search window to ±250 kb around each SNP and still we found several annotations corroborating a response to heat stress.
The SNP in association-model A6 is related to “JNK1/MAPK8-associated membrane protein”, the closest gene with a predicted function (Box S4). Mitogen-activated protein kinases (MAPKs) are proteins known to be involved in cellular responses to stress in a wide range of taxa (Neupane, Nepal, Benson, Macarthur, & Piya, 2013) and the c-Jun N-terminal kinase (JNK) was experimentally shown to be activated under thermal stress in the coral Stylophora pistillata (Courtial et al., 2017). Of the 26,060 predicted genes in the A. digitifera genome (Shinzato et al., 2011), 35 are annotated as “mitogen-activated protein kinase” and only one refers to the “JNK1/MAPK8-associated membrane protein”. Another annotation supports these findings and concerns gene-models predicted as “ATP-dependent DNA helicase Q” and “RecQ” (A1, A7-15; Box S1, Box S5-7). These are proteins involved in the DNA repair mechanism responding to UV-light damage in prokaryotes (Courcelle & Hanawalt, 1999) and for which light-stress driven effects were observed in eukaryotic cells as well (Sharma, Doherty, & Brosh, 2006). Annotations related to these proteins are not uncommon (43 in the reference genome) but it is remarkable to find them around significant SNPs from four distinct genomic regions. A third situation occurs with genes assigned to cytochrome P450 (A2, A3; Box S2), a large family of proteins involved in cellular responses to various types of stresses including heat shocks (Rosic, Pernice, Dunn, Dove, & Hoegh-Guldberg, 2010; Sang, Ma, Qiu, Zhu, & Lei, 2012; Yampolsky et al., 2014) and for which differential patterns of gene expression were observed in Acropora hyacinthus under distinct thermal regimes (Palumbi et al., 2014). We detected 4 annotations related to these genes in the proximity of one region with significant SNPs (models A2 and A3) but the relatively high frequency of this gene family (62 occurrences across the genome) calls for a cautious interpretation.
Seascape/landscape genomics studies are susceptible to high false discovery rates, especially when neutral genetic structure is not accounted for (Rellstab et al., 2015). To take this element into account, we processed data following a conservative pipeline and used models explicitly integrating demographic processes. We also set a restrictive threshold to filter significant association-models (i.e. q<0.001 in two distinct tests). Most of the significant association-models found were linked to DHW, most probably because this variable represents one of the main constraints on coral survival (Hughes et al., 2017). It is also possible that the initial application of a sampling scheme adapted to seascape genomics (unlike the one used by Shinzato et al. 2015, who did not consider environmental variability) would have increased sensitivity to other types of adaptive signals (Riginos et al., 2016).
All the genotypes we found associated to thermal stress also displayed annotations suggesting a functional role in cellular heat responses. Nonetheless, annotations analysis is only a first step in the validation of the association-models detected. Ideally, further assays as reciprocal transplantations (Palumbi et al., 2014), experimental conditioning (Krueger et al., 2017) and molecular analysis (Courtial et al., 2017) could ascertain and help quantifying the link between each genotype and the putative selective advantage it might confer. Nevertheless, the high frequency of SNPs associated to helicase RecQ genes family together with their location on 4 distinct scaffolds is strongly unlikely. This suggests that the maintenance of ability to open the double strand of DNA for transcription, replication and repair is a key point for A. Digitifera in Ryukyu archipelago to resist on the pressure applied by DHW.
Patterns of connectivity
Understanding connectivity patterns across a reef system is essential for planning effective management strategies. This is usually done by quantifying the dispersal trajectories of larvae (Storlazzi, van Ormondt, Chen, & Elias, 2017) or by studying population structure using genetic data (e.g. Lukoschek et al., 2016). The first approach has the main drawback of being difficult and expensive because it requires a priori knowledge of the dispersal of the studied species (McCook et al., 2009). In population genetics studies, on the other hand, sampling often occurs at only a few locations and the results are therefore hard to generalize to a whole seascape (Beger et al., 2014). The method we applied combines these two approaches. First, genetic data were used to calculate genetic distances between the sampled locations. These distances were then used to train dispersal models, which could in turn predict genetic distances at the scale of the whole reef system.
In model training, we found that sea-distances were clearly a better indicator of spatial separation than aerial-distances (Fig. S6). This is probably because coral dispersal is driven by water flow (Paris-Limouzy, 2011), which is strongly asymmetrical in this region (north-east oriented) because of the Kuroshio current (Nishikawa, 2008). By applying this dispersal model to the sea-distances between all reefs of the region, we were able to evaluate the patterns of connectivity. Reefs in the extreme north (Osumi) face both inbound and outbound isolation (Fig. 3), a situation that might lead to inbreeding depression (Keller & Waller, 2002). Reefs in the extreme south-west of the system (Yaeyama) also appear endangered: despite their proximity to Miyako, the strong north-eastward directionality of sea currents hinders the arrival of recruits from the rest of the Archipelago.
The dispersal framework we propose here is potentially applicable to any coral species for which geo-referenced genetic data are available, and the associated indices can be extremely valuable in management planning (Beger et al., 2014). However, a substantial part of GD variation remains unexplained by our models, and connectivity models developed at finer geographical and temporal scales might better describe the situation (Storlazzi et al., 2017).
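The two-step procedure described above (train on sampled pairs, then predict for all reef pairs) can be sketched as follows. This is only an illustration. It assumes a simple least-squares line relating genetic distance to sea-distance, which is not necessarily the model form used in this study, and all values, units and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Step 1: train on pairs of sampled locations (hypothetical values).
sampled_sea_distance = [50.0, 120.0, 300.0, 410.0]
sampled_gd = [0.010, 0.018, 0.041, 0.052]
slope, intercept = fit_line(sampled_sea_distance, sampled_gd)

# Step 2: predict GD for any directed pair of reefs. Sea-distances depend on
# the direction of travel, so each (origin, destination) pair has its own value.
sea_distance = {
    ("Miyako", "Okinawa"): 280.0,  # hypothetical, travelling with the current
    ("Okinawa", "Miyako"): 650.0,  # hypothetical, travelling against it
}
predicted_gd = {pair: intercept + slope * d for pair, d in sea_distance.items()}
```

Keeping the pairs directed is what allows inbound and outbound connectivity to differ for the same reef.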
Conservation and Adaptive Potential
Conservation strategies to preserve coral reefs need to account for, and even reinforce, the adaptive potential to resist thermal stress (Baums, 2008; Logan et al., 2014; Maina et al., 2008; van Oppen et al., 2015). To be integrated in prioritization programs, adaptive potential needs to be objective and quantifiable (OECD, 2017). Previous works suggested using climatic records (Logan et al., 2014; Maina et al., 2011), but this approach neglects how organisms respond to an environmental constraint. The use of genetic data has also been advocated, but findings from population genetics analyses are often hard to extrapolate to large scales and therefore to include in conservation plans (Beger et al., 2014). Furthermore, the established guidelines (e.g. Beger et al., 2014) discuss how to reinforce conservation planning with genetic data (i.e. a few loci), which are less informative about adaptive processes than genomic data. Here we showed how a combination of seascape genomics and connectivity analysis makes it possible to assess the adaptive potential at the scale of a reef system.
In the specific case of the Ryukyu Archipelago, we found that reefs from Miyako and Okinawa islands are the most likely to carry adaptive traits against thermal stress (Fig. 4). Previous works indeed reported that adapted colonies from Okinawa might have resisted the 1998 and 2001 bleaching events (Van Woesik, Irikawa, & Loya, 2004). In contrast, reefs in the northern part of the Archipelago (Amami, Tokara and Osumi) did not experience the same thermal stress in the past and are therefore not predicted to show the same adaptive capacities (Fig. 4).
The indicators of adaptive potential and connectivity produced in this work provide valuable descriptors of the state of the reef. Furthermore, they can be used to predict the expected impact of distinct preservation measures and can therefore support spatial planning in conservation (Box 1). It is unknown whether the connectivity patterns and the adaptive signals observed in A. digitifera are representative of those of other taxa. In this regard, we advocate that future research should focus on a multi-species implementation of the methods proposed here.
Adaptive potential in conservation scenarios
In this box we show how information on adaptive potential can be used to compare different conservation scenarios. We focus on two typical preservation initiatives, marine protected areas (MPAs) and coral nurseries (CNs). For each of them, we compare how setting up the measure at two different locations affects the adaptive potential of the reef system.
The starting point for the predictions is the map showing the distribution of the probability of presence of adaptive genotypes (Fig. S7). We modify the values in this map to imitate the effects of preservation strategies. In the case of MPAs, we maintain potentially adaptive reefs (Pa>0.6) only at locations where the protected area is established. The underlying idea is to emulate the way MPAs reduce coral loss due to local stressors (Selig & Bruno, 2010). In the CN scenarios, potentially adaptive reefs (Pa>0.6) are added where absent. Here we follow the good practices for this kind of measure, which suggest transplanting adapted colonies to sites lacking them (Baums, 2008). After modifying the values of Pa, we re-calculate the genetic distance (GD) from adaptive sites (Fig. 4). This is the variable that we use to describe adaptive potential. We then compare differences in this value (ΔGD) between different conservation scenarios to investigate beneficial and negative effects.
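The scenario comparison described above can be sketched in a few lines. The sketch rests on several assumptions that are not stated in this box. The GD from adaptive sites is taken as the minimum predicted GD from any reef with Pa>0.6. Removed reefs are set to Pa=0 and added ones to Pa=1. ΔGD is taken as the mean over reefs of GD under scenario 1 minus GD under scenario 2, so that a positive value favours scenario 2. All names and values are hypothetical.

```python
ADAPTIVE_THRESHOLD = 0.6  # Pa cut-off quoted in the text


def apply_mpa(pa, protected):
    """MPA scenario: adaptive reefs (Pa > 0.6) persist only inside the MPA."""
    return {
        reef: p if (reef in protected or p <= ADAPTIVE_THRESHOLD) else 0.0
        for reef, p in pa.items()
    }


def apply_nursery(pa, nursery_reefs):
    """CN scenario: adaptive colonies are added where they are absent."""
    return {
        reef: 1.0 if (reef in nursery_reefs and p <= ADAPTIVE_THRESHOLD) else p
        for reef, p in pa.items()
    }


def gd_from_adaptive(pa, gd):
    """For each reef, the smallest predicted GD from any adaptive reef.

    gd[origin][destination] holds directed predicted genetic distances.
    """
    sources = [reef for reef, p in pa.items() if p > ADAPTIVE_THRESHOLD]
    if not sources:
        raise ValueError("no potentially adaptive reef in this scenario")
    return {target: min(gd[s][target] for s in sources) for target in pa}


def delta_gd(pa_1, pa_2, gd):
    """Mean over reefs of GD(scenario 1) minus GD(scenario 2)."""
    gd_1 = gd_from_adaptive(pa_1, gd)
    gd_2 = gd_from_adaptive(pa_2, gd)
    return sum(gd_1[reef] - gd_2[reef] for reef in gd_1) / len(gd_1)


# Hypothetical three-reef system with asymmetric predicted GDs.
pa = {"A": 0.8, "B": 0.7, "C": 0.2}
gd = {
    "A": {"A": 0.0, "B": 0.1, "C": 0.2},
    "B": {"A": 0.3, "B": 0.0, "C": 0.1},
    "C": {"A": 0.5, "B": 0.3, "C": 0.0},
}
mpa_difference = delta_gd(apply_mpa(pa, {"A"}), apply_mpa(pa, {"B"}), gd)
```

Under this sign convention, a negative `mpa_difference` would mean that protecting reef A leaves the system closer to adaptive sites than protecting reef B.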
Marine Protected Areas
We compare two MPAs, one established in the Miyako-Yaeyama area (MPA1) and one in Okinawa (MPA2), both covering the same surface (~15’00 km2). The overall difference in GD (ΔGD=+0.17) indicates that MPA2 globally reduces GDs from potentially adapted reefs more than MPA1. This is probably because MPA2 includes more reefs that are putatively adaptive (Fig. S7) and is better connected to the reef-dense area around Amami. MPA1, on the other hand, is far from this reef-dense area and shows beneficial effects only in Yaeyama.
Coral Nurseries
Two CN locations were compared: Amami (CN1) and Tokara (CN2). The overall difference in GD (ΔGD=-0.28) suggests a more beneficial effect on adaptive potential under CN1 than under CN2. In fact, a coral nursery in Amami (CN1) could provide adaptive genetic material to a larger number of reefs than one established in Tokara (CN2), because of its higher outbound-connectivity potential (Fig. 3b).
Conclusions
This study highlights the value of the seascape genomics approach in supporting the conservation of corals. We applied this approach to a flagship coral species of the Ryukyu Archipelago and identified genetic variants that might underpin adaptation to thermal stress. Coupling this information with a genetic analysis of connectivity enabled the evaluation of the adaptive potential at the scale of the whole study area. The outputs of this analysis are quantitative indices likely to support objective prioritization of reefs in conservation plans. This framework is transferable to any coral species in any seascape and therefore constitutes a useful conservation tool for evaluating and taking into account the genomic adaptive potential of coral reefs worldwide.
Acknowledgements
The authors received no specific financial support for the research, authorship, and/or publication of this article. The authors have no conflict of interest to declare.