Abstract
Genetic markers associated with variance of quantitative traits are considered promising candidates for follow-up including interaction analyses. However, as in studies of main effects, the X-chromosome is routinely excluded from ‘whole-genome’ scans due to analytical challenges. Specifically, as males carry only one copy of the X-chromosome, the inherent sex-genotype dependency could bias the trait-genotype association, through sexual dimorphism in quantitative traits with sex-specific means or variances. Here we investigate phenotypic variance heterogeneity associated with SNPs on the X-chromosome and propose robust strategies. Among those, a generalized Levene’s test, adjusting for sex and sex-genotype interaction effects, has adequate power and remains robust to sexual dimorphism. An alternative sex-stratified approach via Fisher’s method is the most robust at the cost of slightly reduced power. We applied both methods to an Estonian study of gene expression quantitative trait loci (eQTL; n=841), and two complex trait studies of height, hip and waist circumferences, and body mass index (BMI) collected on Caucasians in the UK Biobank (UKB; n=132,968) and the Multi-Ethnic Study of Atherosclerosis (MESA; n=2,073). Consistent with previous eQTL findings on mean, we found some but not conclusive evidence for cis regulators being enriched for variance association. Individual SNP rs148191803 was X-chromosome-wide significant for waist circumference (p=2.4E-6) and suggestive for BMI (p=1.2E-5) in UKB but not in MESA. However, a permutation study based on MESA showed a trait-specific polygenic model whereby multiple X-chromosome loci collectively influence variance of height (λGC=1.14, p<1/100), calling for the development of methods to examine broad-sense heritability by incorporating variance loci and quantifying their sex-specific contributions.
Introduction
Several recent reports have examined autosomal genetic loci contributing to phenotypic variance (as opposed to mean) for a wide range of complex traits1–3, and corresponding methodology development remains an active area of research4–11. One possible reason for such phenotypic variance and SNP genotype association, or variance heterogeneity, is that genotype-stratified variances of a trait differ in the presence of gene-gene (GxG) or gene-environment (GxE) interactions; both are referred to as GxE hereinafter. For example, rs1358030 (SORCS1) was shown to interact with treatment type affecting HbA1c levels in Type 1 Diabetes subjects12. Indeed, in a proof-of-principle study where the treatment information was intentionally masked, the SNP was then demonstrated to be associated with variance of HbA1c10. Conversely, because direct GxE modeling may not be feasible in an initial whole-genome scan, the question was then raised as to whether SNPs having effects on the variance of a trait make good candidates for follow-up interaction testing3. For instance, rs7202116 (FTO as the nearest gene) was significantly associated with variance of BMI2, and at the same locus, rs1121980 (FTO) showed evidence for a statistical interaction with physical activity influencing the mean of BMI13,14; it is worth noting that un-modeled interaction induces variance heterogeneity, but the causes of variance heterogeneity are multifaceted8,10,15–18. In practice, although it is possible that an interacting SNP has a stronger effect on variance than on mean, as in the case of rs12753193 (LEPR) interacting with BMI in the prediction of CRP levels in the absence of a detectable main effect1, a more powerful approach to selecting association candidates is to jointly evaluate their mean and variance effects8,10,19,20.
Despite enthusiasm to discover SNPs with variance effects and the availability of statistical tests, variance heterogeneity has not been formally explored for SNPs on the X-chromosome (XCHR). As in the conventional ‘genome-wide’ (mean) association studies21, the reluctance to include XCHR is due to analytical challenges21,22. They range from technical difficulties in genotype calling to statistical complexities in imputation and association (e.g. model uncertainty involving random or skewed X-inactivation23–26 and sex as a potential confounder). Solutions to overcome some of these challenges have been provided, but all in the context of genetic association analysis of main effects25,27–30.
Here we focus on understanding the impact of the inherent sex-genotype dependency on variance heterogeneity association analysis, and when the trait of interest has sex-specific mean or variance values for males and females. In practice, sexual dimorphism is consistently observed. For example, based on the UK Biobank (UKB)31 and Multi-Ethnic Study of Atherosclerosis32 (MESA) data, height displays a sex-specific difference in mean, hip circumference differs in variance, while body mass index (BMI) and waist circumference contrast in both mean and variance between males and females (Figure 1). These empirical patterns of sexual dimorphism vary according to the underlying physiology of the trait, which might or might not be related to genes. Thus, association analyses of phenotypic mean or variance with XCHR SNPs could be biased if these potential sex-specific main or variance effects were not appropriately accounted for.
For an autosomal SNP, evaluating differences in phenotypic variance across the three genotype groups can be readily achieved by the classical Levene’s test for variance heterogeneity33. SNPs with significant variance association p-values are then selected as likely candidates for follow-up interaction studies. However, the same strategy to prioritize SNPs on XCHR can be problematic, because sex-specific mean and variance differences could create spurious variance heterogeneity unrelated to the putative GxE interactions. Thus, the correct formulation of variance test is dependent on a proper formulation of sex effect with respect to both mean and variance.
In this paper, we explicitly model the possible sources of confounding related to sex, and propose two general testing strategies that strike a balance between power and robustness. Using extensive simulations, we demonstrate the danger of directly applying autosomal methods to XCHR that would otherwise be suitable for testing variance heterogeneity, and we conclude that special consideration for sex-genotype dependence must be made for XCHR to maintain correct type I error rates. Application studies include identifying SNPs associated with variances of height, BMI, hip and waist circumference using the UK Biobank and MESA data, as well as detecting loci associated with variance of expression quantitative traits using data from the Estonian Genome Center at the University of Tartu (EGCUT) cohort34–36.
Material and Methods
Notation and model setup
Of interest is a quantitative trait Y assumed to be (approximately) normally distributed or to have been inverse-normal transformed to resemble a normal distribution. Without loss of generality, consider the following linear model for the ‘true’ association relationship between Y and a SNP,
Y = β0 + βG G + βS S + βE E + βGS GS + βGE GE + βSE SE + βGSE GSE + ε, (1)
where G denotes the SNP genotype (coded additively with respect to the number of copies of the minor allele B as 0, 1 and 2 for bb, Bb and BB as in convention37), S is the sex indicator variable (female = 0 and male = 1), E ~ N(0, 1) is a standardized continuous covariate following the classical G-E independence assumption38, and the error term ε ~ N(0, 1) is independent of G, S and E. The minor allele frequency (MAF) of G is assumed to be the same for males and females; sex-specific MAF affects the naïve methods and we will return to this point in the Discussion section.
Under these assumptions, it is possible to identify autosomal SNPs potentially involved in GxE or higher-order interactions, without having to measure E directly, through detecting phenotypic variance associated with G via the working model of Y ~ G. Note that the analytical context here is that direct GxE (or GxG) modeling may not be possible (e.g. E may not be known or measured precisely) or desirable (e.g. due to computational or multiple hypothesis testing concerns for whole-genome GxG scans). To see the rationale behind the working model, with the additional assumption of independence between E and S conditional on G, one can show that the conditional variance of Y on G is
Var(Y|G=g) = Var{E(Y|S, G=g)} + E{Var(Y|S, G=g)} (2)
= (βS + βGS g)² Var(S|G=g) + (βE + βGE g)² + [2(βE + βGE g)(βSE + βGSE g) + (βSE + βGSE g)²] E(S|G=g) + 1, (3)
where the outer variance and expectation in (2) are taken over the distribution of S given G=g. Since sex S is independent of G for an autosomal SNP, Pr(S|G = g) is constant across g = 0, 1 and 2, and so are E(S|G=g) and Var(S|G=g). Thus, if βGS = βGE = βGSE = 0, expression (2) can be reduced to a constant with respect to G:
Var(Y|G=g) = βS² Var(S) + βE² + (2βEβSE + βSE²) E(S) + 1. (4)
Conversely, variation in Var(Y|G=g) across G suggests that at least some of the (un-modeled) interaction terms involving G (i.e. βGS, βGE and βGSE) are non-zero. This was precisely the motivation behind the original idea1 of using Levene’s test to identify variance heterogeneity induced by the underlying but un-modeled GxE interaction.
X-chromosome (XCHR) specific challenges for variance tests
The same approach to draw similar conclusions for XCHR SNPs, however, is questionable, because Pr(S|G = g) is no longer constant in G and expression (3) cannot be further reduced to (4). For example, under the X-inactivation coding of 0, 1 and 2 for the bb, Bb and BB genotypes in females and 0 and 2 for the b and B genotypes in males, the G=1 group contains only females. Similarly, under the no X-inactivation coding of 0, 1 and 2 for females and 0 and 1 for males, the G=2 group then contains only females. Thus, omitting the sex indicator S from the covariates can bias the conclusion through sexual dimorphism as seen in Figure 1.
Consider the simplest case of no interaction effects at all (βGS = βGE = βSE = βGSE = 0) and no environmental main effect (βE = 0), but with a sex main effect (βS ≠ 0, i.e. the sex-stratified phenotypic means differ between males and females). Expression (3) is then reduced to
Var(Y|G=g) = βS² Var(S|G=g) + 1.
Thus, in the absence of any interactions that involve G, there is a spurious phenotypic variance heterogeneity across levels of G through a non-zero sex main effect (βS), or through a sex-environment interaction effect (βSE) if present as in equation (3). Severity of the confounding depends on the discrepancy between the two sex-stratified trait distributions (real data in Figure 1 and conceptual data in Figures 2 A-D), as well as on the strength of correlation between sex and the observed genotype, which in turn depends on the MAF and proportions of males and females in a sample (details in Supplemental Data).
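The size of this spurious heterogeneity can be illustrated numerically. The sketch below is not part of the original analysis: it assumes Hardy-Weinberg proportions in females, equal allele frequency in both sexes and unit error variance, and the function names `sex_given_genotype` and `null_variance` are hypothetical. It computes Pr(male | G = g) under either genotype coding and the implied Var(Y | G = g) when only a sex main effect is present.

```python
def sex_given_genotype(maf, prop_male=0.5, x_inactivation=False):
    """Return {g: Pr(male | G = g)} for an X-chromosome SNP.

    Assumes Hardy-Weinberg proportions in females and the same allele
    frequency in both sexes (assumptions of this sketch).
    """
    q = maf
    female = {0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q ** 2}
    # Males carry one allele: coded 0/1 without X-inactivation, 0/2 with it.
    male = {0: 1 - q, (2 if x_inactivation else 1): q}
    out = {}
    for g in (0, 1, 2):
        m = prop_male * male.get(g, 0.0)
        f = (1 - prop_male) * female[g]
        out[g] = m / (m + f)
    return out


def null_variance(p_male, beta_s, error_var=1.0):
    """Var(Y | G = g) when a sex main effect is the only non-zero effect."""
    return beta_s ** 2 * p_male * (1 - p_male) + error_var


# MAF = 0.2, equal numbers of males and females, beta_S = 0.5.
p = sex_given_genotype(maf=0.2)
variances = {g: null_variance(p[g], beta_s=0.5) for g in p}
for g in (0, 1, 2):
    print(g, round(p[g], 4), round(variances[g], 4))
```

With these inputs the G=2 group contains no males under the no X-inactivation coding, so its variance equals the error variance, while the other two groups carry an extra βS² Var(S|G=g) term even though the genotype has no effect.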
To avoid spurious variance heterogeneity signals, alternative approaches are needed to quantify variance differences induced by GxE or higher order interactions involving G. To this end, it is important to appropriately define the null hypothesis of variance homogeneity that corresponds to an absence of phenotypic variance associated with genotype while allowing for variance (and mean) to differ between males and females (Figures 2 A-D).
X-chromosome (XCHR) variance heterogeneity tests
Here we consider various analytical strategies to assess phenotypic variance associated with genotypes of XCHR SNPs, including naïve methods that directly apply the original Levene’s test to different genotype groups, and alternative approaches that utilize a generalized Levene’s test derived from a two-stage regression framework20,39.
Naïve methods: apply Levene’s test to three or five genotype groups
The original Levene’s test for variance heterogeneity treats an autosomal genotype G as a categorical variable33,39 and examines any variance difference in trait Y amongst the three possible genotype groups. A direct application to XCHR, however, is problematic: because sex S is inherently correlated with XCHR G, any potential correlation between S and Y (e.g. as observed in human height) would create the classic case of confounding. To see this, consider the null situation where G is not associated with variance of Y as in the top panel of Figure 2. Assuming the X-inactivation coding of G is used (the same conclusion holds for the no X-inactivation coding), the Bb group contains only females and its variance is that of the orange curve in the figure. In contrast, the other two groups (bb+b and BB+B) contain both males and females, and their respective variance values involve both the orange and blue curves, depending on the sex-specific means (μm and μf) and variances as well as the proportion of males in each group. Thus, in the presence of sexual dimorphism (in mean as in Figure 2B, in variance as in Figure 2C, or in both as in Figure 2D), there would be spurious variance heterogeneity resulting in increased false positive rates (also confirmed by empirical results).
As an alternative, one may be tempted to treat each genotype and sex combination as one group, resulting in a total of five groups. Indeed, this five-group strategy does not induce spurious association in the presence of a sex-specific mean effect (μf ≠ μm as in Figure 2B). However, it is not difficult to see that the problem remains when there is a sex-specific variance effect (as in Figures 2C or 2D).
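The contrast between the three-group and five-group groupings under a sex-specific mean can be sketched as follows. This is an illustration only: it uses the median-centred variant of Levene's statistic, the no X-inactivation coding, and a deliberately exaggerated sex effect (βS = 2, larger than the values used in the simulation study) so that the pattern is visible in a single replicate. The helper names `levene_w`, `simulate` and `split` are hypothetical.

```python
import random
from statistics import fmean, median


def levene_w(groups):
    """Median-centred Levene statistic W for a list of samples."""
    devs = []
    for g in groups:
        centre = median(g)
        devs.append([abs(y - centre) for y in g])
    n_total = sum(len(d) for d in devs)
    k = len(devs)
    means = [fmean(d) for d in devs]
    grand = sum(sum(d) for d in devs) / n_total
    between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, means))
    within = sum((x - m) ** 2 for d, m in zip(devs, means) for x in d)
    return (n_total - k) / (k - 1) * between / within


def simulate(n_per_sex, maf, beta_s, rng):
    """Null data: sex shifts the mean, genotype has no effect at all."""
    rows = []
    for sex in (0, 1):  # 0 = female, 1 = male
        copies = 2 if sex == 0 else 1  # no X-inactivation coding
        for _ in range(n_per_sex):
            g = sum(rng.random() < maf for _ in range(copies))
            y = beta_s * sex + rng.gauss(0.0, 1.0)
            rows.append((g, sex, y))
    return rows


def split(rows, key):
    """Group trait values by key(genotype, sex)."""
    groups = {}
    for g, sex, y in rows:
        groups.setdefault(key(g, sex), []).append(y)
    return list(groups.values())


rng = random.Random(2024)
rows = simulate(n_per_sex=20000, maf=0.2, beta_s=2.0, rng=rng)
w3 = levene_w(split(rows, lambda g, s: g))       # three genotype groups
w5 = levene_w(split(rows, lambda g, s: (g, s)))  # five sex-genotype groups
print(round(w3, 1), round(w5, 2))
```

Under the usual reference distributions, W is compared with F(k−1, N−k). In this mean-only setting the three-group statistic is expected to be far larger than the five-group one, which matches the argument above; the five-group statistic would in turn be inflated if the sexes also differed in variance.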
Fisher’s method: combine sex-stratified Levene’s test
Sex-stratified analysis provides a practical strategy whereby variance heterogeneity is assessed separately in males (two-group Levene’s test) and females (three-group Levene’s test). Fisher’s method can then be used to combine the two p-values. Note that Levene’s test statistic is asymptotically distributed without an apparent ‘direction of effect’, so the traditional meta-analysis that combines the weighted (directional) Z-values for testing mean effect is not applicable here. Though sex-stratified analysis does not allow direct GxS modeling, it is robust to various forms of sexual dimorphism as seen in Figures 2B-D, and it does not require knowledge of the X-inactivation status.
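As a minimal sketch of the combination step, Fisher's statistic for two independent p-values is −2(ln p_female + ln p_male), which is chi-square distributed with 4 d.f. under the null, and that distribution has a closed-form tail probability. The function name `fisher_combine` is hypothetical; the two inputs are assumed to be the sex-stratified Levene p-values.

```python
import math


def fisher_combine(p_female, p_male):
    """Combine two independent p-values with Fisher's method."""
    if not (0 < p_female <= 1 and 0 < p_male <= 1):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * (math.log(p_female) + math.log(p_male))
    # Survival function of a chi-square with 4 d.f.: exp(-x/2) * (1 + x/2).
    p_combined = math.exp(-stat / 2.0) * (1.0 + stat / 2.0)
    return stat, p_combined


stat, p_combined = fisher_combine(0.05, 0.05)
print(round(stat, 3), round(p_combined, 4))
```

Because only the product of the two p-values enters the statistic, the combination carries no direction of effect, which is why it suits Levene-type tests.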
Model-based generalized Levene’s test: account for sex-specific mean and variance effects via two-stage regression models
Since the null hypothesis is defined in terms of phenotypic variance heterogeneity induced by (un-modeled) GxE interactions while allowing for sexual dimorphism (Figures 2B-D), a preferred method should explicitly account for the effect of sex on the phenotype of interest. To this end, we consider the generalized Levene’s test, which established a flexible two-stage regression framework20,39. In essence, stage one regresses Y on G and obtains the absolute residual d (i.e. the absolute deviation between the observed and model-fitted Y values). Stage two regresses d on G again, and testing the slope was shown to be equivalent to evaluating variance heterogeneity in Y associated with G because the expectation of d depends linearly on the variance of Y20,39.
The generalized Levene’s test has been used to study autosomal SNPs with more complex data structures including genotype group uncertainty (e.g. imputed SNPs) or sample dependency (e.g. correlated family members)20. For XCHR analysis, the implementation requires additional care. For example, it is not immediately clear if S (or GxS) should be included in both stages. For a comprehensive evaluation, we consider all combinations of the following two-stage models:
Stage One: Mean models,
M1: Y = β0 + βG G + ε,
M2: Y = β0 + βG G + βS S + ε,
M3: Y = β0 + βG G + βS S + βGS GS + ε;
Stage Two: Variance models,
V1: d = γ0 + γG G + e,
V2: d = γ0 + γG G + γS S + e,
V3: d = γ0 + γG G + γS S + γGS GS + e.
The models in stage one are only used to calculate residuals, using either the traditional ordinary least squares (OLS) or the recommended least absolute deviations (LAD); LAD is more robust to data with asymmetric distributions or low genotype counts in a specific group20,40. The goal of this stage is to remove any Mean effects associated with the covariates included in the model (i.e. G, S or GxS), thus denoted as M1, M2 or M3.
The test for Variance heterogeneity is carried out in stage two (V1, V2 or V3), by testing H0: γG = 0 or H0: γG = γGS = 0 via the standard regression F-test, where the model is fitted using OLS for independent samples or generalized least squares for dependent samples.
The model-based regression approach includes a total of nine M+V two-stage models, and V3 also allows a two degrees of freedom (d.f.) test (Table S1). Based on the earlier discussion, it is expected that mean modeling strategies omitting S (i.e. M1) would be sensitive to sex-specific mean effect (e.g. Figures 2B or 2D). Meanwhile, variance testing strategies omitting S (i.e. V1) are anticipated to be sensitive to sex-specific variance effect (e.g. Figures 2C or 2D). For completeness of our empirical validation, we first examined all 12 testing strategies in simulation studies then focused on the robust approaches in applications.
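The two-stage procedure with mean model M3 and the two d.f. test in variance model V3 (M3V3.2) can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' implementation: stage one uses OLS rather than the recommended LAD, samples are treated as independent, and the helper names `ols`, `m3v3_2df` and `simulate` are hypothetical. The p-value uses the closed-form tail of the F(2, m) distribution, (1 + 2F/m)^(−m/2).

```python
import random


def ols(columns, y):
    """Least-squares fit via normal equations; returns (coefs, residuals)."""
    k, n = len(columns), len(y)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in columns]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [x - f * z for x, z in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    resid = [y[t] - sum(b * c[t] for b, c in zip(beta, columns))
             for t in range(n)]
    return beta, resid


def m3v3_2df(y, g, s):
    """Generalized Levene test: mean model M3, variance model V3,
    testing gamma_G = gamma_GS = 0 with a 2 d.f. F-test."""
    n = len(y)
    one = [1.0] * n
    gs = [gi * si for gi, si in zip(g, s)]
    _, stage_one = ols([one, g, s, gs], y)   # stage one: Y ~ G + S + GxS
    d = [abs(r) for r in stage_one]          # absolute residuals
    _, full = ols([one, g, s, gs], d)        # stage two: d ~ G + S + GxS
    _, reduced = ols([one, s], d)            # null model: d ~ S
    rss1 = sum(r * r for r in full)
    rss0 = sum(r * r for r in reduced)
    df2 = n - 4
    f_stat = max((rss0 - rss1) / 2.0, 0.0) / (rss1 / df2)
    # Exact survival function of F(2, df2).
    p_value = (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)
    return f_stat, p_value


def simulate(n_per_sex, maf, scale_by_g, rng):
    """Toy data with sex-specific mean and SD; scale_by_g > 0 adds a
    genotype effect on the SD (illustrative values only)."""
    y, g, s = [], [], []
    for sex in (0, 1):  # 0 = female, 1 = male
        copies = 2 if sex == 0 else 1  # no X-inactivation coding
        for _ in range(n_per_sex):
            gi = sum(rng.random() < maf for _ in range(copies))
            sd = (1.0 + sex) * (1.0 + scale_by_g * gi)
            y.append(2.0 * sex + rng.gauss(0.0, sd))
            g.append(float(gi))
            s.append(float(sex))
    return y, g, s


rng = random.Random(7)
f_null, p_null = m3v3_2df(*simulate(2000, 0.2, 0.0, rng))
f_alt, p_alt = m3v3_2df(*simulate(2000, 0.2, 0.5, rng))
print(p_null, p_alt)
```

Dropping `s` and `gs` from the stage-one call gives M1, and dropping them from both stage-two calls gives V1, which is how the strategies expected to be sensitive to sexual dimorphism could be compared against this one.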
Simulation studies
A sample of 2,500 females and 2,500 males was simulated, and the MAF was fixed at 0.2; other sample sizes and MAFs led to qualitatively similar results. Note that although G could be coded assuming X-inactivation or no X-inactivation, the two types of coding are generally highly correlated, leading to similar association results29. Thus, for a more focused study here, the genotype was simulated assuming no X-inactivation. The number of simulated replicates was 10,000 so that estimates of the empirical Type I Error (T1E) rates within ±0.5% of the nominal rate of 5% were considered satisfactory.
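For context on the ±0.5% band, the Monte Carlo standard error of an empirical T1E rate estimated from R independent replicates at nominal level α follows from the binomial variance. The text does not state how the band was chosen, so the comparison below is an interpretation rather than the authors' stated rationale.

```latex
\mathrm{SE}(\hat{\alpha})
  = \sqrt{\frac{\alpha(1-\alpha)}{R}}
  = \sqrt{\frac{0.05 \times 0.95}{10{,}000}}
  \approx 0.0022,
\qquad
\frac{0.005}{0.0022} \approx 2.3 .
```

Under this reading, the ±0.5% band spans roughly 2.3 standard errors on either side of the nominal 5% rate.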
A joint mean and variance test can be more powerful than testing for variance heterogeneity alone, but the power of the joint test depends on the individual components10. Therefore, here we focus on comparing the different variance-testing strategies as outlined above, recommending the most robust yet powerful method that is also suitable for the joint location-scale test.
Simulations for T1E evaluation - design I based on model (1)
The genotype-phenotype relationship was generated according to model (1), where the environmental variable E ~ N(0, 1) was used in generating observed phenotypic values but assumed not to be available for the actual association analysis. The null scenarios were defined by the absence of interaction effects for GxE and GxSxE, so the quantitative trait for each null scenario was generated assuming βGE = βGSE = 0 in model (1). A SNP could have a G main effect, but it does not affect the phenotypic variance of interest, which is induced by the un-modeled βGE and βGSE in the working model, so βG = 0 without loss of generality. Note that the naïve variance methods could also pick up a non-zero GxS interaction effect if βGS ≠ 0, but βGS itself can in fact be directly tested because sex information is routinely collected (or robustly inferred from the available genotype data). Thus, βGS is not related to the variance heterogeneity of interest here and was set to zero. For the remaining parameters, without loss of generality, β0 = 0, βE = 0 or 0.5, βS = 0 or 0.5, and βSE = 0, −0.25 or 0.25, giving a total of 12 scenarios. They roughly fall into four categories, corresponding to the four conceptual sex-stratified distributions as shown in Figures 2A-D. For example, sexual dimorphism was introduced via βS and βSE, where a non-zero βS allows for a sex-specific mean effect (Figures 2B and 2D) while a non-zero βSE allows for a sex-specific variance effect (Figures 2C and 2D). Note that both βS and βSE are independent of the genotype-specific variance effect to be identified, which is absent in the null cases.
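The mapping from the 12 null scenarios to the four conceptual panels can be checked with a short enumeration. This sketch assumes the null form of model (1) with unit error variance, under which the mean of Y within sex s is β0 + βS·s and the variance is (βE + βSE·s)² + 1; the function names and the A to D labels (following the panels of Figure 2) are illustrative.

```python
from collections import Counter
from itertools import product


def sex_specific_moments(beta_s, beta_e, beta_se, beta_0=0.0):
    """Mean and variance of Y within each sex under the null of model (1),
    with E ~ N(0, 1) and unit error variance (assumptions of this sketch)."""
    moments = {}
    for sex in (0, 1):  # 0 = female, 1 = male
        mean = beta_0 + beta_s * sex
        variance = (beta_e + beta_se * sex) ** 2 + 1.0
        moments[sex] = (mean, variance)
    return moments


def classify(beta_s, beta_e, beta_se):
    """Label a scenario by the kind of sexual dimorphism it produces."""
    m = sex_specific_moments(beta_s, beta_e, beta_se)
    mean_differs = abs(m[0][0] - m[1][0]) > 1e-12
    var_differs = abs(m[0][1] - m[1][1]) > 1e-12
    if mean_differs and var_differs:
        return "D"  # both mean and variance differ
    if var_differs:
        return "C"  # variance only
    if mean_differs:
        return "B"  # mean only
    return "A"      # no sexual dimorphism


grid = list(product((0.0, 0.5), (0.0, 0.5), (0.0, -0.25, 0.25)))  # (bS, bE, bSE)
counts = Counter(classify(bs, be, bse) for bs, be, bse in grid)
print(len(grid), dict(sorted(counts.items())))
```

On this grid every non-zero βSE changes the male variance, so the split is two scenarios each for panels A and B and four each for panels C and D.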
Simulations for T1E evaluation - design II based on sex-stratified mean and variance
The null scenarios based on model (1) may not fully capture the extremes of sexual dimorphism, thus we further simulated trait values directly according to sex-specific distributions using means (μm and μf) and variances that mimic the values observed in inverse-normally transformed BMI, height, hip and waist circumference from MESA (Table S2). The simulated traits, generated independent of any genotypes, were then tested for variance association with genotypes of 12,206 XCHR SNPs from the MESA dataset, after filtering by a minimum count of 30 observations in the five sex-genotype stratified groups as variance test is sensitive to small group size.
Simulations for power study
Only strategies with satisfactory T1E control were considered for power evaluation. We focused on model-based design I, where power directly depends on the size of the GxE and GxSxE interaction effects and has a clearer genetic interpretation than design II. Under model (1), βGE was varied from 0 to 0.2 in increments of 0.025, and combined with a possible three-way interaction βGSE of 0 or 0.1. Other parameter values were the same as in the null case, except that βS = 0 and βE = 0 were not considered, for a more focused study of power. That is, β0 = 0, βG = βGS = 0, βS = βE = 0.5, and βSE = 0, −0.25 or 0.25. In total, there were 54 scenarios.
Applications
Robust variance testing strategies for X-chromosome SNPs that also had reasonable power performance were then applied to real data. Only reportedly unrelated and ethnically Caucasian individuals were included, and diabetic individuals were excluded based on electronic medical records in the UK Biobank31, and based on blood glucose level greater than 7 mmol/L in MESA32. All quantitative traits were quantile-normally transformed to avoid ‘scale-effect’ where the variance values tend to be proportional to mean values1,2.
The significance level for discovery was set at a nominal level of 5% with Bonferroni correction, depending on the total number of XCHR SNPs examined in each application. Further, the proportion of truly variance-associated variants was estimated using the method proposed by Storey and Tibshirani41.
The UK Biobank (UKB) data31
Available genotyped XCHR SNPs were filtered based on whether they were in pseudo autosomal region and a minimal sample count of 30 across the five sex-genotype groups. In total, 7,344 XCHR SNPs on the Caucasian sample (71,452 females and 61,516 males) were analyzed, and the XCHR-wide significance level was 6.8E-6.
The Multi-Ethnic Study of Atherosclerosis (MESA) data32
The genotype data in MESA, available from dbGaP (study accession: phs000209.v10.p2), were filtered in the same way as the UKB data. In total, 12,206 XCHR SNPs on the Caucasian sample (1,003 females and 1,070 males) were analyzed, and the XCHR-wide significance level was 4.1E-6. We did not perform a multi-ethnic analysis with all ethnicities combined. Instead, we focused on the Caucasian subset and used it to corroborate findings from the UKB data.
Estonian Genome Center at the University of Tartu (EGCUT) cohort34–36
We sought to discover XCHR SNPs influencing the variance of expression traits, as variability of gene expression has been suggested to be associated with genetic variants on autosomes7. The recommended strategies were applied to a sample of 413 male and 421 female Estonians across 648 gene expression traits that had gone through standard quality control procedures and had further been inverse-normal transformed. After filtering using the same criteria as the UKB data, 4,034 XCHR SNPs were analyzed for variance association with each of the 648 gene expression traits, resulting in a total of 2,614,032 tests and a global significance level of 1.9E-8.
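The three Bonferroni thresholds quoted above follow directly from dividing the nominal 5% level by the number of tests in each application. The short check below reproduces them; the dictionary keys are labels chosen for this sketch.

```python
ALPHA = 0.05

# Number of tests per application, as reported in the text.
tests = {
    "UKB": 7_344,
    "MESA": 12_206,
    "EGCUT eQTL": 648 * 4_034,  # expression traits x XCHR SNPs
}

thresholds = {name: ALPHA / m for name, m in tests.items()}
for name, level in thresholds.items():
    print(f"{name}: {tests[name]:,} tests, threshold {level:.1e}")
```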
Results
Simulation studies
As expected, the naïve Levene’s test with either a three-level G factor or a five-level GxS factor produced grossly inflated false positive rates in almost all scenarios, except when βE, βGS and βSE were all set to zero or in the absence of any sexual dimorphism (Table 1).
For generalized two-stage Levene’s tests, good choices of the mean model in stage one and variance test in stage two should remove any effect from sex to avoid inflating the test statistics. Thus, as expected, any strategies involving M1 or V1 had T1E issues, where the degrees of departure from the nominal T1E rate varied according to sizes of the unadjusted sex mean or variance effects (Table 1). The remaining strategies appear to have reasonably controlled T1E rates and are underlined in Table 1. However, when considering design II where sexual dimorphism was more extreme, only M2V3.2 and M3V3.2 have good T1E control and behave quite similarly (Table S3).
The sex-stratified approach, as expected, gave correct T1E rates in females and males separately, and subsequently in the combined sample via Fisher’s methods, under both design I (Table 1) and design II (Table S3).
In terms of statistical power among testing strategies with reasonable T1E control (M2V3.2, M3V3.2 and Fisher), M2V3.2 and M3V3.2 were nearly identical and had slightly better power than Fisher’s method across most of the scenarios considered (Figure S1).
The reason for the similar performance of M2V3.2 and M3V3.2 is that the data were generated (and correctly modeled) under the assumption of no X-inactivation and βG = 0. Interestingly, when there is a strong genotypic main effect (βG ≠ 0), an increased variance in the female heterozygote Bb group could be observed as a result of unknown X-inactivation42. Indeed, additional simulation studies confirmed that variance heterogeneity p-values given by M2V3.2 were influenced by X-inactivation while Fisher’s method and M3V3.2 remained consistent (Figure S2, Table S4). Thus, we recommend the M3V3.2 model-based approach and the complementary sex-stratified Fisher’s method, which were then applied to the three application datasets.
Applications
XCHR-wide analysis of waist circumference showed that the recommended M3V3.2 and Fisher’s tests indeed have good T1E control (Figure 3); similar conclusions were drawn based on results of other traits (Figures S3-5, respectively, for BMI, height and hip circumference). In UKB (Table S5), rs148191803 (MED14; in a region known to escape X-inactivation) was found to be associated with waist circumference, based on the M3V3.2 test (p = 2.08E-06) and Fisher’s method (p = 2.4E-06), at the XCHR-wide significance level (p < 6.8E-06), and the same SNP was associated with variance of BMI (M3V3.2 p = 7.01E-06; Fisher p = 1.20E-05; Table S6). However, no SNPs in the MED14 locus had waist circumference or BMI variance association p-values less than 0.05 in MESA.
The association of rs2023750 at TBL1X with BMI was suggestive (M3V3.2 p = 5.30E-04; Fisher p = 2.61E-04, Table S5) in UKB, and at the same gene locus, rs2521580 was also suggestively associated with BMI (M3V3.2 p = 2.85E-03; Fisher p = 9.04E-04) and waist circumference (M3V3.2 p = 9.77E-03; Fisher p = 5.75E-04) in MESA (Table S6).
Although there were no additional X-chromosome-wide significant SNPs in UKB or MESA, the overall distributions of the p-values suggest enrichment of associated variants for some of the traits. In Figure S4 for example, the estimated genomic lambda λGC based on the M3V3.2 variance test for height was 1.028 and 1.194, respectively in UKB and MESA, and it was 1.014 and 1.186 based on Fisher’s method. The estimated proportion of truly associated SNPs also suggested signal enrichment (Table S7).
To benchmark the observed estimates, a permutation analysis was conducted using the individual-level MESA data available to us. Each quantitative trait under study was permuted within the two sex strata, independently, 100 times. For each permuted dataset, Fisher’s method and M3V3.2 were applied and the corresponding λGC values were then calculated (Figure 4). For height, the permuted values, as expected, centered around λGC = 1. The observed λGC value, based on either the M3V3.2 or Fisher’s method, was larger than all 100 null values (Figure 4), supporting the apparent enrichment of XCHR SNPs associated with variance of height. The same conclusion holds, but to a lesser extent, for waist circumference. However, for hip circumference and BMI, the observed estimates were not visibly different from the null estimates obtained from the permuted datasets. Similar observations were made based on the estimated proportion of truly associated SNPs (Figure S6).
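The permutation benchmark can be sketched in three small pieces: shuffling a trait within each sex, converting a set of p-values to a genomic-control factor, and comparing the observed factor with the permutation values. The text does not spell out how λGC was computed; the version below assumes the common convention of mapping the median p-value to the 1 d.f. chi-square scale and dividing by the null median (about 0.455). All function names are hypothetical.

```python
import random
from statistics import NormalDist, median

# Median of a chi-square with 1 d.f., obtained from the normal quantile.
CHI2_1_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def permute_within_sex(y, sex, rng):
    """Shuffle trait values separately among females and among males."""
    out = list(y)
    for level in set(sex):
        idx = [i for i, s in enumerate(sex) if s == level]
        vals = [y[i] for i in idx]
        rng.shuffle(vals)
        for i, v in zip(idx, vals):
            out[i] = v
    return out


def lambda_gc(p_values):
    """Genomic-control factor from p-values on the 1 d.f. chi-square scale."""
    med_p = median(p_values)
    chi2_at_median = NormalDist().inv_cdf(1.0 - med_p / 2.0) ** 2
    return chi2_at_median / CHI2_1_MEDIAN


def exceedance_fraction(observed, null_values):
    """Share of permutation values at least as large as the observed one."""
    return sum(v >= observed for v in null_values) / len(null_values)


rng = random.Random(1)
y = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
sex = [0, 0, 0, 1, 1, 1]
y_perm = permute_within_sex(y, sex, rng)
print(y_perm)
print(lambda_gc([0.1, 0.3, 0.5, 0.7, 0.9]))
```

Permuting within sex keeps each sex-specific trait distribution intact while breaking any link with genotype, so the permuted λGC values reflect the null in the presence of sexual dimorphism. An observed value that exceeds all 100 permuted values gives an exceedance fraction of 0, which corresponds to the p < 1/100 statement.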
For the eQTL analysis, we observed various forms of sexual dimorphism in expression traits. In total, 182 out of the 648 expression traits had p < 0.05 based on either a t-test for equality of means or an F-test for equality of variances between the two sexes (Figure S7). Among the eQTLs, the top five variance-associated SNPs belonged to three genes, FTX, PLAC1 and TEX11 (Figures S8-10), but no SNPs passed the strict Bonferroni correction at p < 1.9E-8 (Table S8). There was no apparent enrichment of association globally over all 2,614,032 (= 648 × 4,034) SNP-expression p-values (Figures S11-12). However, upon further investigation based on stratifying SNP and gene expression pairs according to whether they were cis or trans acting (using a physical distance of 5 Mb from the start and the end of the gene for each expression trait), we found that the estimated proportion of truly associated SNP-expression pairs appears to be slightly higher for SNPs in cis (0.006 and 0.013 based on p-values of Fisher’s and M3V3.2 methods, respectively), as compared to those in trans (0 based on either). The estimated genomic control λGC was 0.996 and 1.003 for cis-acting pairs using Fisher’s and M3V3.2 methods, respectively, and 0.99 under both methods for SNP-expression pairs in trans, suggesting that additional studies are needed to establish convincing evidence for enrichment of variance-associated eQTLs.
Discussion
This work was motivated by the recent call to include X-chromosome (XCHR) in ‘whole-genome’ scans21, as well as the recent development of identifying autosomal SNPs associated with phenotypic variance1,20. To pave the way for future XCHR-wide study of variance heterogeneity and subsequent joint location-scale test10, we examined a catalogue of analytical strategies and recommended two robust and powerful approaches. We emphasize the importance of recognizing sex as an inherent confounder in analyzing XCHR variants that contribute to phenotypic variance heterogeneity, particularly for traits displaying sexual dimorphism with either sex-specific means or variances, or both; this also holds for the traditional association analysis of XCHR variants studying their effects on phenotypic mean22.
Between the two recommended strategies, Fisher’s method to combine sex-specific Levene’s p-values is intuitive and the most robust, but at the cost of slightly reduced power. By exploiting the recently proposed generalized Levene’s test based on a two-stage regression approach, the model-based M3V3.2 test can directly account for the sex main effect as well as the GxS interaction effect. The model-based regression testing strategy also allows flexible adjustments for other potential confounders such as principal components43. Thus, we recommend applying both methods in practice so that they complement each other.
The naïve strategies that directly test for variance heterogeneity across either the classical three genotype groups or the sex-stratified five groups are inadequate, with grossly inflated T1E rates in the presence of any sexual dimorphism (Table 1). Note that if the status of X-inactivation were known a priori, a non-additive variance model (VNA) in stage two may be considered:
VNA: d = γ0 + γG1 G1 + γG2 G2 + γS S + γG1S G1S + e,
where G1 and G2 are indicator variables for the Bb and BB+B groups under X-inactivation, respectively, or alternatively for the Bb+B and BB groups without X-inactivation. Under this representation, Lev3 is equivalent to M1VNA1, in which the main effect of sex is not accounted for in stage one, while Lev5 is equivalent to the M3VNA3 model-based testing of γG1 = γG2 = γS = γG1S = 0, in which variance heterogeneity due to sex (γS) is erroneously being tested. These observations clearly reveal the source of bias inherent in the naïve methods.
The additive coding for G in stage one is believed to sufficiently capture the genetic main effect37, while this may not be the case for analysis of variance. The ambiguous genotype grouping under unknown X-inactivation status adds another layer of complexity for non-additive variance models (Figure S13). In fact, the choice of reference allele coding matters in XCHR when the mean association model does not include the GxS interaction term, as shown recently29. This has direct consequences for XCHR variance testing since a variance difference in the female heterozygote group could be observed as a result of a strong marginal effect coupled with unknown X-inactivation42. Additional simulation results under model (1) with a non-zero genetic main effect suggested that p-values derived from M2V3.2 varied according to the underlying X-inactivation status (Table S4), although in applications the genetic main effect would have to be extremely large for the M2V3.2 p-values to lead to different conclusions. We also note that X-inactivation and no X-inactivation lead to similar (mean) association results if the GxS interaction term is included in the model, which explains the consistent performance of M3V3.2 irrespective of the X-inactivation status.
Since variance testing requires a larger sample size than mean testing, detecting individual variance signals that are significant at the XCHR-wide or genome-wide level requires studies of very large size that might only be viable through meta-analysis. Meta-analyses of variance heterogeneity for XCHR variants can be conducted in the same way as the analysis of a single study, by incorporating the analytical strategies proposed for autosomal variants4.
Similar to a polygenic model proposed for association studies of main effects, it is possible that a large proportion of genetic variants, though not individually detectable, could collectively contribute to variance heterogeneity in certain complex traits44,45. The permutation-based genomic control values showed a clear enrichment and point to a possible XCHR polygenic inheritance model for variance of height, which suggests that height could be potentially enriched for gene-environment interactions. Some have suggested that X-linked genes contribute to the sex-specific architecture of complex traits46, yet the amount of contribution from XCHR SNPs involved in possible GxE or higher-order interactions is unclear. On that note, it is of interest to point out that a sex-stratified approach could reveal sex-specific enrichment of XCHR SNPs associated with phenotypic variance of, for example, waist circumference in males (Figure 4). It has been reported that genetic loci can have sex-specific marginal effects on traits such as height47 and HDL48, i.e. GxS interactions. Results from this study call for new developments of broad-sense heritability estimation methods that can incorporate variance loci as well as quantify their contributions to sex-specific heritability.
Web Resources
An implementation of our methods is provided as an open-source and user-friendly R package available on github (https://github.com/WeiAkaneDeng/Xvarhet).
Acknowledgements
The authors thank Professor Radu V. Craiu and Mr. Bo Chen for helpful suggestions and critical reading of the original version of the paper. The authors would like to thank Silva Kasela for her help in optimizing the eQTL analysis. We are thankful to all the participants of the Multi-Ethnic Study of Atherosclerosis, the UK Biobank and the Estonian Genome Center at the University of Tartu cohort. This research was funded by the Canadian Institutes of Health Research (CIHR, 201309MOP-310732-G-CEAA-117978) and the Natural Sciences and Engineering Research Council of Canada (NSERC, 250053-2013) to LS. WQD is supported by an NSERC Alexander Graham Bell Canada Graduate Scholarship and an Ontario Graduate Scholarship.