Abstract
Over a decade of genome-wide association studies have led to the finding that significant genetic associations tend to spread across the genome for complex traits. The extreme polygenicity where “all genes affect every complex trait” complicates Mendelian Randomization studies, where natural genetic variations are used as instruments to infer the causal effect of heritable risk factors. We reexamine the assumptions of existing Mendelian Randomization methods and show how they need to be clarified to allow for pervasive horizontal pleiotropy and heterogeneous effect sizes. We propose a comprehensive framework GRAPPLE (Genome-wide mR Analysis under Pervasive PLEiotropy) to analyze the causal effect of target risk factors with heterogeneous genetic instruments and identify possible pleiotropic patterns from data. By using summary statistics from genome-wide association studies, GRAPPLE can efficiently use both strong and weak genetic instruments, detect the existence of multiple pleiotropic pathways, adjust for confounding risk factors, and determine the causal direction. With GRAPPLE, we analyze the effect of blood lipids, body mass index, and systolic blood pressure on 25 disease outcomes, gaining new information on their causal relationships and the potential pleiotropic pathways.
1 Introduction
Understanding the pathogenic mechanism of common diseases is a fundamental goal in clinical research. As randomized controlled experiments are not always possible, researchers are looking towards Mendelian Randomization (MR) as an alternative method for probing the causal mechanisms of common diseases [18]. MR uses inherited genetic variations as instrumental variables (IV) to interrogate the causal effect of heritable risk factor(s) on the disease of interest. The basic idea is that at these variant loci, the inherited alleles are randomly transmitted from the parents to their offspring according to Mendel’s laws. Thus, the genotypes are independent of non-heritable confounding variables which may obfuscate causal estimation in parent-offspring studies. More generally, such independence also approximately holds for population data such as the genome-wide association studies (GWAS) when individuals share the same ancestry [46]. With the accumulation of data from GWAS, there is an increasing interest in MR approaches, especially approaches that only rely on the GWAS summary statistics that are readily available in the public domain [19, 46].
How well Mendelian Randomization works depends on how well the genetic variant loci used as instruments abide by the rules of IV. These rules dictate that, if the genetic locus has an effect on the disease outcome, it should be only through pathways mediated by the risk factor(s) of interest. This rule, termed exclusion restriction, is violated when there is horizontal pleiotropy, defined as the case where the genetic variant can influence the disease through pathways other than the given risk factor(s) [21]. There has been much recent attention on this issue [10, 4, 5, 25, 51, 59, 42, 11, 3, 36, 43] in MR, yet our understanding is far from complete. Current methods rely on different assumptions on the pattern of horizontal pleiotropy, while improper assumptions may lead to biased estimation of the true causal effects. What assumptions on pleiotropy and genetic effects would be suitable? Would it be possible to learn the degree of pleiotropy from the data? Could we perform model diagnosis utilizing only GWAS summary statistics?
The pleiotropy issue that muddles Mendelian Randomization studies is, in large part, due to the fact that complex traits are extremely polygenic [16, 57, 8, 32, 45, 49, 36, 38, 55]. Accumulating evidence from GWAS indicates that complex diseases may share an omnigenic architecture where all genes affect every complex trait [6]. While a few genes might be “core” genes, almost all genes are involved and can exert non-zero effects on both the risk factors and disease. Thus, given a risk factor that explains only part of the causal mechanism of a complex disease, there would be many SNPs affecting the disease through their effects on other unmeasured risk factors. In other words, in an MR analysis, not only would we expect horizontal pleiotropy to be a pervasive issue across all genetic variants, but any disease or complex risk factor would also be associated with a large number of SNPs across the whole genome. Many existing MR methods rely on the assumption that pleiotropic effects sparsely involve only a few SNPs, which directly counters these recent insights. Methods that do not assume sparsity often require that the pleiotropic effects cancel each other across SNPs, known as the instrument strength independent of direct effect (inSIDE) assumption [4], which can be rather optimistic. Recently, a few new methods relaxed the inSIDE assumption to consider “directional pleiotropy” through one pleiotropic pathway [36]. However, there would then be an issue in identifying the true causal effect of the risk factor, and the model is restrictive in allowing for only one pleiotropic pathway. Armed with these assumptions, most existing methods also utilize only the few SNPs that have the strongest association with the risk factor as instruments, ignoring the SNPs that are weakly associated.
In this work, we will show that weakly associated SNPs are also informative, and that a model combining weak and strong SNPs would not harm MR while increasing its accuracy and stability in some scenarios.
We propose a comprehensive statistical framework for causal effect estimation when pleiotropy is pervasive across the genome. The framework, called GRAPPLE (Genome-wide mR Analysis under Pervasive PLEiotropy), facilitates interactive identification of multiple pleiotropic pathways and the incorporation of all SNPs associated with the risk factor into the analysis. GRAPPLE builds on the statistical framework MR-RAPS [59]. However, we emphasize the detection of multiple pleiotropic pathways when the inSIDE assumption in MR-RAPS is violated as well as the discrimination of the direction of causality. Using GRAPPLE, we further address how to jointly estimate the effects with multiple risk factors to reduce directional pleiotropy, as well as how to integrate cohorts with overlapping samples, both common challenges faced by current studies. The estimation accuracy of GRAPPLE is examined through validations involving real studies and simulations.
GRAPPLE is applied to a screening of the causal effects of 5 risk factors (three plasma lipid traits, body mass index, and systolic blood pressure) on 25 common diseases. Although there have been many causal effect screens [51, 39, 36] for these risk factors and diseases, the combined analysis enabled by GRAPPLE brings forth new insights on the pleiotropic landscape across diseases and, thus, an improved understanding of the causal estimates obtained. Specifically, we will reexamine the role of lipid traits on coronary artery disease and type-II diabetes, where the results from the multitude of MR studies [46, 31, 33] have been under heated debate.
2 Results
2.1 Model Overview
2.1.1 From the causal model to GWAS summary statistics
Our framework starts with a set of structural equations that jointly specify the generative model of the disease Y, which depends on K observed risk factors X = (X1, ⋯, XK) of interest and all genetic variants Z = (Z1, Z2,…) (Fig 1a):

$$X_k = g_k(U, Z, E_{X_k}), \quad k = 1, \cdots, K, \qquad Y = X^{\top}\beta + f(U, Z, E_Y), \tag{1}$$

where U represents unknown non-heritable confounding factors and EXk and EY are random noise acting on Xk and Y respectively. The parameter of interest, β, quantifies the causal effect of the vector of risk factors X on Y. Due to Mendel’s law of inheritance, the genotypes Z are independent of (U, EY, EXk). The function f (U, Z, EY) represents the causal effect of unmeasured risk factors on Y, which can be heritable (contributed by Z) or non-heritable (contributed by U). The nonparametric functions f(·) and gk(·) allow interactions among SNPs in Z and variables (U, EY, EXk) in their causal effects on X and Y. Under this model, there is horizontal pleiotropy for a SNP j if Zj has nonzero association with f(U, Z, EY). This is the case, for example, when Zj acts on Y through a pathway affecting unmeasured risk factors, or when Zj is in linkage disequilibrium (LD) with such a locus.
Now consider the case where only GWAS summary statistics, i.e. the estimated marginal associations between each SNP j and the risk factors/disease traits, are available. Let Γj be the true association between SNP j and Y, and γj be the vector of true marginal associations between SNP j and X. Later, we will denote their estimated values from GWAS summary statistics as $\hat{\Gamma}_j$ and $\hat{\gamma}_j$. Then, as shown in Materials and Methods, the model (1) results in the linear relationship

$$\Gamma_j = \gamma_j^{\top}\beta + \alpha_j, \tag{2}$$

where for binary Y, the parameter β in (2) is a conservatively biased version of β in (1). This relationship holds even when the functions f (·) and g(·) in (1) are not linear. Here, αj is the marginal association between Zj and f (U, Z, EY), representing the unknown horizontal pleiotropy of SNP j. In MR, one would typically simultaneously select p SNPs as multiple instruments to estimate the causal effect of X.
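To make relationship (2) concrete, the following minimal sketch computes Γj for one SNP and shows how a nonzero αj shifts the per-SNP ratio Γj/γj away from β when K = 1. The function names and the numerical values are illustrative assumptions, not part of GRAPPLE or of any GWAS dataset.

```python
from typing import Sequence


def outcome_association(gamma_j: Sequence[float],
                        beta: Sequence[float],
                        alpha_j: float = 0.0) -> float:
    """Gamma_j = gamma_j^T beta + alpha_j for one SNP and K risk factors."""
    if len(gamma_j) != len(beta):
        raise ValueError("gamma_j and beta must have the same length K")
    return sum(g * b for g, b in zip(gamma_j, beta)) + alpha_j


def ratio(gamma_j: float, big_gamma_j: float) -> float:
    """Per-SNP ratio Gamma_j / gamma_j for a single risk factor (K = 1)."""
    return big_gamma_j / gamma_j


# Illustrative values only (not taken from any GWAS): beta = 0.4, gamma_j = 0.1.
valid = ratio(0.1, outcome_association([0.1], [0.4]))              # alpha_j = 0
pleiotropic = ratio(0.1, outcome_association([0.1], [0.4], 0.02))  # alpha_j = 0.02
```

Up to floating-point rounding, the valid instrument returns the ratio β = 0.4, while the pleiotropic one returns β + αj/γj = 0.6. A single SNP therefore cannot separate β from its own pleiotropic effect.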
One can immediately see that identifying β is impossible without further assumptions on αj. Early MR methods such as IVW [10] made the simplest assumption that all instruments are valid, satisfying αj = 0. However, as already discussed in the Introduction, the assumption of no pleiotropy, or more generally, the assumption that αj is sparsely nonzero as in Weighted Median [5] or MR-PRESSO [51], contradicts the fact that horizontal pleiotropy is pervasive. One assumption that allows pervasive pleiotropy is the inSIDE assumption [4], where $\alpha_j$ is independent of $\gamma_j$, or alternatively, the random effect model [59, 43], where $\alpha_j \sim N(0, \tau^2)$ for most genetic instruments. Unfortunately, the inSIDE assumption requires all unmeasured heritable risk factors of the disease to be genetically uncorrelated with the target risk factor(s) X, which is likely violated, especially when there are clusters of SNPs associated with both the unmeasured risk factors and X.
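The role of these assumptions can be illustrated with the IVW estimator, written here in its usual form as a weighted regression of the outcome associations on the risk factor associations through the origin. This is a sketch on made-up, noise-free numbers, intended only to contrast pleiotropy that is orthogonal to γj with pleiotropy that is proportional to γj; the helper `outcomes` and all values are hypothetical.

```python
from typing import List, Sequence


def ivw_estimate(gamma_hat: Sequence[float],
                 Gamma_hat: Sequence[float],
                 se_Gamma: Sequence[float]) -> float:
    """Inverse-variance weighted slope of Gamma_hat on gamma_hat through the origin."""
    num = sum(g * G / s ** 2 for g, G, s in zip(gamma_hat, Gamma_hat, se_Gamma))
    den = sum(g * g / s ** 2 for g, s in zip(gamma_hat, se_Gamma))
    return num / den


beta = 0.5                      # true causal effect in this toy example
gamma = [0.1, 0.2, 0.3, 0.4]    # SNP associations with the risk factor
se = [0.01] * 4                 # standard errors of the outcome associations


def outcomes(alpha: Sequence[float]) -> List[float]:
    """Gamma_j = beta * gamma_j + alpha_j for every SNP."""
    return [beta * g + a for g, a in zip(gamma, alpha)]


no_pleiotropy = ivw_estimate(gamma, outcomes([0.0] * 4), se)
# Pleiotropic effects chosen so that sum(gamma_j * alpha_j) = 0 in this sample.
balanced = ivw_estimate(gamma, outcomes([0.02, -0.01, 0.0, 0.0]), se)
# Pleiotropic effects proportional to gamma_j, which violates inSIDE.
correlated = ivw_estimate(gamma, outcomes([0.2 * g for g in gamma]), se)
```

In this toy setting the first two estimates equal β = 0.5 up to rounding, whereas the third equals 0.7: pleiotropy that tracks γj is absorbed into the slope and cannot be told apart from a causal effect.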
Noticing the limitation of the inSIDE assumption, some new MR methods, such as LCV [39], CAUSE [36] and MRMix [42], allow a proportion of genetic instruments to be associated with one common hidden pleiotropic pathway affecting both the risk factor and disease. For instance, under the above notation, both CAUSE and MRMix assume that for the proportion of SNPs that violate the inSIDE assumption, their pleiotropic effects satisfy $\alpha_j = a\gamma_j + \tilde{\alpha}_j$, where $a$ represents the directional pleiotropic effects due to a confounding pathway and $\tilde{\alpha}_j$ is independent of $\gamma_j$. This is a more realistic assumption than inSIDE, though it then becomes difficult to distinguish the true causal effect β from the pleiotropic direction β + a, and the model may be too restrictive in allowing for only one pleiotropic pathway.
2.1.2 Identify multiple pleiotropic pathways and the direction of causality
The key idea underlying GRAPPLE is to detect multiple pleiotropic pathways by using the shape of the data profile likelihood under no pleiotropy to probe the underlying causal mechanism, without explicit assumptions of the pleiotropic patterns (Fig 1b). When K = 1, the GWAS summary statistics reduce to the scalars $\hat{\gamma}_j$ and $\hat{\Gamma}_j$, with their standard errors $\sigma_{1j}$ and $\sigma_{2j}$. From the central limit theorem, the joint distribution of $(\hat{\gamma}_j, \hat{\Gamma}_j)$ approximately follows a multivariate normal distribution

$$\begin{pmatrix} \hat{\gamma}_j \\ \hat{\Gamma}_j \end{pmatrix} \sim N\left( \begin{pmatrix} \gamma_j \\ \Gamma_j \end{pmatrix}, \begin{pmatrix} \sigma_{1j}^2 & \theta\sigma_{1j}\sigma_{2j} \\ \theta\sigma_{1j}\sigma_{2j} & \sigma_{2j}^2 \end{pmatrix} \right), \tag{3}$$

where θ is a shared sample correlation that can be estimated as $\hat{\theta}$ (see Materials and Methods).
When there is no horizontal pleiotropy in the p selected independent genetic instruments (αj = 0 for j = 1,2, ⋯, p), the robustified profile likelihood is [59]

$$l(b) = -\sum_{j=1}^{p} \rho\left( \frac{\hat{\Gamma}_j - b\hat{\gamma}_j}{\sqrt{\sigma_{2j}^2 - 2b\theta\sigma_{1j}\sigma_{2j} + b^2\sigma_{1j}^2}} \right),$$

where ρ(·) is Tukey’s Biweight loss. As described in more detail in Materials and Methods, the profile likelihood is obtained by profiling out the nuisance parameters γ1, ⋯, γp in the full likelihood from (3), and is further robustified by replacing the $L_2$ loss with Tukey’s Biweight loss to increase the sensitivity of mode detection. Under no pleiotropy or the inSIDE assumption, l(b) would have only one mode, near the true causal effect b = β.
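As a rough illustration of how such a likelihood can be evaluated and scanned for modes, the sketch below implements a robustified profile likelihood for K = 1 on a grid of candidate values b. It is not the GRAPPLE implementation: the tuning constant 4.685 is the conventional choice for Tukey’s biweight and is an assumption here, the function names are hypothetical, and the data are noise-free toy values in which half of the SNPs follow slope 0.5 and half follow slope −0.5.

```python
import math
from typing import List, Sequence

TUKEY_C = 4.685  # conventional biweight tuning constant (assumed, not from the paper)


def tukey_biweight(r: float, c: float = TUKEY_C) -> float:
    """Tukey's biweight loss: behaves like r^2 / 2 near 0 and is capped at c^2 / 6."""
    if abs(r) >= c:
        return c * c / 6.0
    u = 1.0 - (r / c) ** 2
    return (c * c / 6.0) * (1.0 - u ** 3)


def profile_loglik(b: float,
                   gamma_hat: Sequence[float], Gamma_hat: Sequence[float],
                   se_gamma: Sequence[float], se_Gamma: Sequence[float],
                   theta: float = 0.0) -> float:
    """Negative sum of biweight losses of the standardized residuals Gamma_hat - b * gamma_hat."""
    total = 0.0
    for g, G, s1, s2 in zip(gamma_hat, Gamma_hat, se_gamma, se_Gamma):
        # Variance of Gamma_hat - b * gamma_hat when the two estimates have correlation theta.
        var = s2 * s2 - 2.0 * b * theta * s1 * s2 + b * b * s1 * s1
        total += tukey_biweight((G - b * g) / math.sqrt(var))
    return -total


def local_modes(grid: Sequence[float], values: Sequence[float]) -> List[float]:
    """Interior grid points whose value is strictly larger than both neighbours."""
    return [grid[i] for i in range(1, len(grid) - 1)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]


# Toy data: the first three SNPs follow slope 0.5, the last three follow slope -0.5.
gamma_hat = [0.1, 0.2, 0.3, 0.1, 0.2, 0.3]
Gamma_hat = [0.05, 0.10, 0.15, -0.05, -0.10, -0.15]
se = [0.01] * 6
grid = [i / 20 for i in range(-20, 21)]
values = [profile_loglik(b, gamma_hat, Gamma_hat, se, se) for b in grid]
modes = local_modes(grid, values)
```

On this toy input the grid search finds two local modes, at −0.5 and 0.5. Because the biweight loss is bounded, SNPs far from a candidate slope contribute a constant, which is what allows each cluster of SNPs to produce its own mode instead of being averaged together.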
Now consider the case where a second genetic pathway (Pathway 2) also contributes substantially to the disease, and where some of the loci that we include as instruments are also associated with Pathway 2 (Fig 1b). In this scenario, SNPs that are associated with X only through Pathway 2 can contribute to a second mode in the profile likelihood at location β + κ/δ, where κ and δ quantify the causal effect of Pathway 2 on Y and its marginal association with X, respectively (Materials and Methods). By a similar logic, multiple pleiotropic pathways result in multiple modes in l(b). Thus, we can use the presence of multiple modes in l(b) to diagnose the presence of horizontal pleiotropic effects that are grouped into different directions.
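The location of the second mode can be motivated by a short calculation. The sketch below assumes a single risk factor (K = 1) and introduces a hypothetical symbol, $\pi_j$, for the marginal association of SNP j with Pathway 2. It is an illustration consistent with the statement above, not the derivation given in Materials and Methods.

```latex
% Illustrative assumption: SNP j reaches X only through Pathway 2, so that
% \gamma_j = \delta \pi_j, and its pleiotropic effect on Y is \alpha_j = \kappa \pi_j.
\begin{align*}
\Gamma_j &= \beta \gamma_j + \alpha_j \\
         &= \beta \gamma_j + \kappa \pi_j \\
         &= \beta \gamma_j + \frac{\kappa}{\delta}\, \gamma_j
          = \left( \beta + \frac{\kappa}{\delta} \right) \gamma_j .
\end{align*}
```

Under this assumption, such SNPs line up along the slope β + κ/δ rather than β, which is where their contribution to l(b) peaks.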
The existence of pleiotropic pathways not only complicates MR but, more seriously, makes the causal effects of the risk factors unidentifiable. Specifically, when Pathway 2 exists, the GWAS summary statistics alone cannot provide information to distinguish β from β + κ/δ. Instead of making further assumptions to identify the true causal effect, when multiple modes are detected, we suggest collecting more GWAS data to adjust for confounding risk factors that contribute to these modes. To help find the confounding risk factors, GRAPPLE identifies marker SNPs of each mode, as well as the mapped genes and GWAS traits of each marker SNP (see Materials and Methods), so that researchers can use their expert knowledge to infer possible confounding risk factors that contribute to each mode. With the GWAS summary statistics of these confounding traits, GRAPPLE can perform a multivariate MR analysis under the inSIDE assumption on the remaining horizontal pleiotropic effects. GRAPPLE uses an adjusted robustified profile likelihood approach that can jointly estimate β and τ² (Materials and Methods).
With multiple-mode detection, we can also consider the question of whether X indeed causes Y, as our structural equation (1) presumes, or whether it is the reverse case of Y causing X. If the direction of causality runs from Y to X, then an instrument is associated with X either through Y, or through unmeasured heritable risk factors of X unrelated to Y. In the latter case, a SNP j satisfies γj ≠ 0 while Γj = 0, and would contribute to a mode at 0. In the former case, γj = βΓj where β is the causal effect of Y on X, and these SNPs may contribute to a mode around 1/β. This idea shares similarities with bidirectional MR [50, 26]. Bidirectional MR assumes that, when MR is performed in reverse, all selected instruments affect Y not through X, and it filters out suspicious SNPs that may violate this assumption by checking their associations with X. Though it sometimes works, there is no guarantee that the filtering does not introduce bias. In GRAPPLE, we identify the direction by checking if there is a mode at 0 after switching the roles of X and Y, while tolerating the existence of another mode around 1/β.
2.1.3 Weak genetic instruments: A curse or a blessing?
Besides the assumption of no horizontal pleiotropy, for a SNP to be a valid genetic instrument, it needs to have a non-zero association with the risk factor of interest. In most MR pipelines, SNPs are selected as instruments only when their p-values are below 10⁻⁸, which is required to guarantee a low family-wise error rate (FWER) for GWAS data. Using such a stringent threshold also avoids weak instrument bias [13], where measurement errors in $\hat{\gamma}_j$ are large enough to bias $\hat{\beta}$. However, such a stringent selection threshold may result in very few, or even zero, instruments for underpowered GWAS, and may still not be adequate to avoid weak instrument bias. Further, when our goal is to jointly model the effects of multiple risk factors (the setting where X is a vector), it is unrealistic to assume that all selected SNPs have strong effects on every risk factor. In addition, the high polygenicity of complex traits indicates that weak instruments far outnumber strong instruments, and collectively, they may improve the estimation accuracy.
In GRAPPLE, we use a flexible p-value threshold, which can be either as stringent as 10⁻⁸ or as mild as 10⁻², for instrument selection. Based on the profile likelihood framework of MR-RAPS [58], GRAPPLE can provide valid inference of β that avoids weak instrument bias for multiple risk factors with SNPs selected at any given p-value threshold, when the horizontal pleiotropy of most SNPs follows the random effect model $\alpha_j \sim N(0, \tau^2)$. This flexible p-value threshold is beneficial for several reasons. First, including moderate and weak instruments may increase power, especially for under-powered GWAS data where there are too few strongly associated SNPs. Second, for MR with multiple risk factors where it is inevitable to include SNPs that have weak associations with some of the risk factors, we can obtain more accurate causal effect estimates than methods that can only deal with strongly associated SNPs. More importantly, comparing estimates across a series of p-value thresholds can show the stability of our estimates and give a more complete picture of the underlying horizontal pleiotropy. In practice, we suggest that researchers vary the selection p-value threshold from a stringent one (say 10⁻⁸) to a mild one (say 10⁻²), both in the detection of multiple modes and in estimating causal effects. We would expect to see consistent results across the p-value thresholds if there are truly multiple pleiotropic pathways or if our assumptions hold in estimating the causal effects of the risk factors.
2.1.4 The three-sample design to guard against instrument selection bias
Selecting instruments from GWAS summary statistics can also introduce bias, known as the “winner’s curse”. Conditional on a SNP being selected, the magnitude of $\hat{\gamma}_j$ is inflated, which biases the estimate of β. When K = 1, that is, when there is only one risk factor, the estimate is biased towards 0, but there is no guarantee of the direction of the bias when K > 1. Typically, it is believed that the selection bias is negligible when only the strongly associated SNPs are selected as instruments.
However, we find that for commonly used MR methods, instrument selection can introduce bias even when only genetic variants with genome-wide significant p-values (≤ 10⁻⁸) are selected (Fig S1a). Thus, unlike the usual two-sample GWAS summary statistics design, which involves one GWAS dataset for the risk factor and one for the disease, we strongly advocate using a three-sample GWAS summary statistics design (Fig 1d). To avoid the selection bias, selection of genetic instruments is done on another GWAS dataset for the risk factor, whose cohort has no overlapping samples with either the risk factor or the disease cohort. In addition, to ease calculation (see Materials and Methods), currently we only include independent SNPs in GRAPPLE, and we use LD clumping to obtain them [41]. The three-sample design will also avoid possible selection bias introduced during clumping.
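The selection effect and its remedy can be illustrated with a small simulation. The sketch below draws, for many SNPs with the same true association, one estimate from a cohort used for selection and one from an independent cohort; the true association, standard error, z-score cutoff and function name are all made-up values chosen for illustration, not settings used in GRAPPLE.

```python
import random
from statistics import mean
from typing import Tuple


def simulate_selection(true_gamma: float = 0.02, se: float = 0.01,
                       z_cut: float = 3.0, n_snps: int = 10000,
                       seed: int = 0) -> Tuple[float, float]:
    """Mean association among selected SNPs, in the selection cohort and in an independent one."""
    rng = random.Random(seed)
    in_selection_cohort, in_independent_cohort = [], []
    for _ in range(n_snps):
        est_selection = rng.gauss(true_gamma, se)    # estimate in the cohort used for selection
        est_independent = rng.gauss(true_gamma, se)  # estimate in a non-overlapping cohort
        if est_selection / se > z_cut:               # keep SNPs passing the z-score cutoff
            in_selection_cohort.append(est_selection)
            in_independent_cohort.append(est_independent)
    return mean(in_selection_cohort), mean(in_independent_cohort)


mean_reused, mean_independent = simulate_selection()
```

Every selected SNP has a selection-cohort estimate above 0.03 by construction, so reusing that cohort overstates the true association of 0.02, whereas the estimates from the independent cohort remain centered on the true value. This is the motivation for separating the selection cohort from the estimation cohorts.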
Summarizing the above points, a complete diagram of the GRAPPLE workflow is shown in Fig 1d. A researcher may start with a single target risk factor of interest. The shape of the robustified profile likelihood provides information on possible pleiotropic pathways. If multiple modes are detected, then one may need to adjust for pleiotropic pathways. Unfortunately, this step cannot be done automatically, as summary statistics themselves do not provide enough information to distinguish a causal mode from a pleiotropic mode. Researchers can use the marker SNP/gene/trait information that GRAPPLE provides to understand each mode, decide what confounding risk factors to adjust for, and collect extra GWAS data for them. GRAPPLE can then jointly estimate the causal effects of multiple risk factors to adjust for the confounding effects of the added risk factors.
2.2 Assessment of GRAPPLE with real studies
2.2.1 Inference from both weak and strong genetic instruments under no pleiotropy
We first examine whether GRAPPLE provides reliable statistical inference combining weak and strong instruments under an artificial setting with real GWAS summary statistics. In this setting, we let X and Y be the same trait from two non-overlapping cohorts, thus γj = Γj while $\hat{\gamma}_j$ and $\hat{\Gamma}_j$ are independent for any SNP. Though the structural equation describing the causal effect of X on Y does not exist, the linear relationship model (2) from which we estimate β still holds with β = 1 and αj = 0. In other words, we are not estimating a meaningful “causal” effect, but are in a special case where the true β is known, which can be used to test whether MR methods provide valid inference under no pleiotropy. Specifically, we consider three traits: body mass index (BMI), Type II diabetes (T2D) and height from the GIANT and DIAGRAM consortia, where sex-specific GWAS data are available [30, 35]. The female cohort is used to get $\hat{\gamma}_j$ and the male cohort is used to get $\hat{\Gamma}_j$. As a three-sample design, the UK Biobank data for the corresponding traits are used for SNP selection. The true β is 1 when we assume that all selected instruments have no gender-specific association with the traits. For benchmarking, we compare the performance of GRAPPLE with CAUSE [36] and three other widely adopted MR methods, inverse-variance weighted (IVW) [10], MR-Egger [4] and weighted median [5], with the same three-sample design.
We compare across different p-value thresholds for instrument selection, ranging from a stringent threshold 10⁻⁸ to a mild threshold 10⁻² (Fig 2a). GRAPPLE keeps providing unbiased estimates of β, showing that it does not suffer from the weak instrument bias. Surprisingly, biases exist in other MR methods even with a stringent p-value threshold, which is most likely due to the power discrepancy between the GWAS data for selection and estimating γj. In addition, the confidence intervals do get narrower with GRAPPLE for T2D, showing the potential benefit of including weak instruments for less powerful GWAS studies.
Finally, we demonstrate that the three-sample design to avoid selection bias is necessary not only for GRAPPLE, but also for other MR methods. As shown in Fig S1a, the two-sample design where we use the same cohort of the risk factor for selection can result in biased causal effect estimation, and the biases appear for most MR methods even when only the strongly associated SNPs are selected.
2.2.2 Level of pleiotropy in SNPs with heterogeneous strengths
Next, we examine whether or not the weak instruments are more vulnerable to pleiotropy, which can be a concern when including the weak SNPs. We compare four risk factor and disease pairs that cover eight different complex traits, including the effect of BMI on T2D, low-density lipoprotein cholesterol (LDL-C) on coronary artery disease (CAD), height on smoking, and systolic blood pressure (SBP) on stroke (Fig 2b).
We test whether independent sets of strongly and weakly associated SNPs can provide consistent estimates of the causal effects of the risk factors. SNPs passing the p-value threshold 10⁻² in the cohort for selection are divided into three groups after LD clumping: “strong” (pj ≤ 10⁻⁸), “moderate” (10⁻⁸ < pj ≤ 10⁻⁵), and “weak” (10⁻⁵ < pj ≤ 10⁻²). The SNPs in each group are used separately to obtain group-specific estimates of the causal effect β. We observe that for all four pairs, the estimates are stable across groups (Fig 2b). Though the “weaker” SNPs provide estimates with more uncertainty due to limited power, the estimates are consistent with those from the “strong” group. Other MR methods also show some level of consistency in estimating β across different sets of instruments, but perform worse due to weak instrument bias (Fig S1b). To conclude, in the analysis of these four pairs of traits, we do not see any evidence that weakly associated SNPs provide more biased estimates than strong instruments due to horizontal pleiotropy. On the contrary, like the strong SNPs, they may also provide useful information to infer the causal effects of the risk factors. GRAPPLE can expand the ability to evaluate the causal effects of risk factors with both strong and weak genetic instruments.
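The grouping rule above is simple enough to state in code. The sketch below assumes that the selection p-values of LD-clumped SNPs are available as a dictionary; the function names and the SNP identifiers are hypothetical placeholders.

```python
from typing import Dict, List, Optional


def strength_group(p: float) -> Optional[str]:
    """Map a selection p-value to the strong / moderate / weak group, or None if p > 1e-2."""
    if p <= 1e-8:
        return "strong"
    if p <= 1e-5:
        return "moderate"
    if p <= 1e-2:
        return "weak"
    return None


def split_by_strength(selection_pvalues: Dict[str, float]) -> Dict[str, List[str]]:
    """Group SNP identifiers by instrument strength, dropping SNPs that fail the 1e-2 cutoff."""
    groups: Dict[str, List[str]] = {"strong": [], "moderate": [], "weak": []}
    for snp, p in selection_pvalues.items():
        label = strength_group(p)
        if label is not None:
            groups[label].append(snp)
    return groups


# Placeholder identifiers and p-values, for illustration only.
groups = split_by_strength({"rsA": 3e-12, "rsB": 1e-8, "rsC": 4e-7, "rsD": 2e-3, "rsE": 0.2})
```

Because the three groups are disjoint, the group-specific estimates are based on non-overlapping sets of instruments, so agreement among them is not an artifact of shared SNPs.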
2.2.3 Identify direction of causality for known causal relationships
Then, we examine the performance of GRAPPLE in identifying the causal direction with the shape of the profile likelihood. For the causal direction, we focus on the two pairs of traits with known causal relationship: BMI on T2D, and LDL-C on CAD. We switch the roles of the risk factor and disease to see if the correct direction can be revealed. Specifically, we treat T2D and CAD as the “risk factor”, and BMI and LDL-C as the corresponding “disease” (Fig 2c). For T2D, the cohort for the other gender is used for SNP selection and for CAD, the risk factor cohort used is from [17] and the selection p-values are from [44]. As expected, we see that when the roles of the risk factor and disease are reversed, the robustified profile likelihood shows a main mode at 0, and a weaker mode around 1/β.
2.2.4 Multiple pleiotropic pathways in the effect of C-reactive protein
Finally, we test our ability to identify multiple pleiotropic pathways with the analysis of the C-reactive protein (CRP) effect on CAD. C-reactive protein has been found to be strongly associated with the risk of heart disease, while many SNPs that are associated with the C-reactive protein also seem to have pleiotropic effects on lipid traits [22]. Previous MR analyses only included SNPs that are near the gene CRP to guarantee a free-of-pleiotropy analysis [14] and found that CRP has no causal effect on CAD, validated also by randomized experiments [28]. However, if the SNP selection near the CRP gene is not performed, can GRAPPLE identify the existence of multiple pathways and obtain the correct estimate of the C-reactive protein effect from its associated SNPs across the whole genome?
CRP GWAS data from [40] are used for selection, and the data from [20], which use a larger cohort, are used for getting $\hat{\gamma}_j$. The robustified profile likelihood shows a pattern of three modes, indicating the existence of at least three different pathways (Fig 2d). One mode is negative, one is positive and the third is around zero. The negative mode involves a few marker genes including HNF1A and PVRL2, with a marker trait LDL-C. The positive mode has marker traits pulmonary function and the C-reactive protein, and the few marker genes (IL6R, ARHGAP10, BCL7B, PABPC4) are also involved in immune response and lung cancer progression [47, 48]. The mode at 0 has marker genes CRP and LEPR, and only one marker trait, the C-reactive protein.
We compare across 3 p-value thresholds (10⁻⁸, 10⁻⁵, 10⁻³) and check how the existence of multiple pathways affects causal estimates of the effect of C-reactive protein in MR methods using SNPs across the genome. Including the C-reactive protein as the only risk factor, all benchmarking methods give a negative estimate of the CRP effect, which is possibly driven by the bias from an LDL-C induced pleiotropic pathway (Fig 2e). MR-RAPS, which is the estimation method used in GRAPPLE when there is only one risk factor, and the three other benchmarking methods give incorrect inference of the CRP effect, with a p-value for β below 0.01 for at least one SNP selection threshold (notice that the weak instrument bias is a bias towards zero as shown in Fig 2a, thus the significance at p-value threshold 10⁻³ for MR-Egger and IVW is not due to weak instrument bias). In contrast, after using two risk factors, the C-reactive protein and LDL-C, where LDL-C is an apparent confounding risk factor from Fig 2d, the estimates of the CRP effect remain insignificant across p-value thresholds. In addition, the estimates themselves are much closer to 0 than those obtained without including LDL-C. This analysis illustrates how GRAPPLE can detect pleiotropic pathways, provide information on which confounding risk factors to adjust for, and obtain reliable inference after adjusting for additional risk factors.
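The effect of adding a confounding risk factor can be sketched with a two-factor weighted least squares fit without intercept, i.e. a multivariable IVW estimator. This is a deliberately simpler stand-in and not the adjusted robustified profile likelihood that GRAPPLE uses. The variable names echo the CRP example, but every number below is made up: the toy outcome associations are generated with no CRP effect and an LDL-C effect of 0.5.

```python
from typing import Sequence, Tuple


def mv_ivw_two_factors(g1: Sequence[float], g2: Sequence[float],
                       Gamma: Sequence[float], se: Sequence[float]) -> Tuple[float, float]:
    """Weighted least squares of Gamma on (g1, g2) without intercept, weights 1 / se^2."""
    w = [1.0 / s ** 2 for s in se]
    s11 = sum(wi * a * a for wi, a in zip(w, g1))
    s22 = sum(wi * b * b for wi, b in zip(w, g2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, g1, g2))
    s1y = sum(wi * a * y for wi, a, y in zip(w, g1, Gamma))
    s2y = sum(wi * b * y for wi, b, y in zip(w, g2, Gamma))
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("risk-factor associations are collinear")
    # Solve the 2x2 normal equations by Cramer's rule.
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Toy SNP associations: the LDL-C associations are negatively related to the CRP ones.
gamma_crp = [0.1, 0.2, 0.3, 0.4]
gamma_ldl = [0.2, 0.1, -0.1, -0.2]
Gamma_cad = [0.0 * c + 0.5 * l for c, l in zip(gamma_crp, gamma_ldl)]
se = [0.01] * 4

# Single-factor slope through the origin (equal weights), ignoring LDL-C.
crp_alone = sum(c * y for c, y in zip(gamma_crp, Gamma_cad)) / sum(c * c for c in gamma_crp)
crp_adjusted, ldl_adjusted = mv_ivw_two_factors(gamma_crp, gamma_ldl, Gamma_cad, se)
```

In this toy example the single-factor slope for CRP is negative even though its true effect is zero, and adding the LDL-C column recovers a CRP effect of 0 and an LDL-C effect of 0.5 up to rounding. The same mechanism, pleiotropy through a measured trait that is removed once that trait enters the model, is what the two-risk-factor analysis above relies on.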
As a complement to the above real data analysis, we have also conducted a set of simulations to evaluate GRAPPLE’s performance in detecting multiple pleiotropic pathways. For details, see Supplementary Note 2 and Fig S2.
2.3 A causal landscape from 5 risk factors to 25 common diseases
Finally, we apply GRAPPLE to interrogate the causal effects of 5 risk factors on 25 complex diseases through a multivariate genome-wide screen. The five risk factors are three plasma lipid traits (LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG)), BMI, and SBP. The diseases include heart disease, Type II diabetes, kidney disease, common psychiatric disorders, inflammatory disease and cancer (Fig 3a). For each pair of risk factor and disease, we compare across p-value thresholds from 10⁻⁸ to 10⁻². As a summary of the results, Fig 3a illustrates the average number of modes detected across the p-value thresholds for SNP selection (for modes at each p-value threshold, see Figure S2). Besides the number of modes, Fig 3a also shows the p-values for each risk factor when GRAPPLE is performed with only the single risk factor (see also Fig S3, Materials and Methods). These p-values are not valid when there are pleiotropic pathways.
Fig 3a shows that multi-modality can be detected in many risk factor and disease pairs. Multi-modality is most easily seen using the stringent p-value threshold 10⁻⁸ (Fig S3). However, we find that some modes are contributed by a single SNP and are thus more likely to reflect an outlier than a pathway. For instance, the effect of stroke on LDL-C shows two modes when the p-value threshold is 10⁻⁸ or 10⁻⁷ (one mode around −2.3 and another mode near 0.08). However, the negative mode only has one marker SNP (rs3184504), which has been found to be strongly associated with hundreds of different traits according to the GWAS Catalog [9], while the other mode has hundreds of marker genes. After removing the SNP rs3184504, the mode disappears. Such a mode also disappears when we increase the p-value threshold to include more SNPs as instruments. Thus, the average number of modes serves as a measure of the strength of evidence for the existence of multiple pleiotropic pathways. Some risk factor and disease pairs show multi-modality without having a significant p-value for β, suggesting that the risk factor and disease are genetically correlated through multiple pathways but there is no evidence that the risk factor has a causal effect on the disease.
We then focus on two diseases: CAD and T2D. For CAD, all five risk factors show very significant effects, though multi-modality is detected in HDL-C and SBP. First, consider the well-studied, often-debated relationship between CAD and the lipid traits. In our results for HDL-C, with different p-value thresholds, three modes in total can show up, two being negative and one positive, indicating that the pathways from HDL-C to CAD are complicated (Fig 3b). Fig 3b shows that one negative mode is contributed by SNPs near genes LPL and BUD13, which are strongly associated with triglycerides. The positive mode is contributed by SNPs near genes ALDH1A2 and PSKH1, which are related to respiratory diseases [52]. The markers of the other negative mode are mapped to genes including LIPG and CETP.
Since the effects of the lipid traits are generally complicated, we combine all 5 risk factors and run an MR jointly with GRAPPLE (Fig 3c) with different p-value thresholds. After adjusting for other risk factors, the two most prominent risk factors for heart disease are LDL-C and SBP, while the protective effect of HDL-C and the risk brought by TG both remain negligible. These results show that HDL-C as a single measurement does not seem to have a protective effect on heart disease, while complicated multiple pathways are involved. Researchers have suggested analyzing different subgroups of HDL-C, as smaller particles tend to have a stronger protective effect [60].
Lipids are involved in a number of biological functions including energy storage, signaling, and acting as structural components of cell membranes, and have been reported to be associated with various diseases [54, 24, 56, 27, 34, 1]. Besides CAD, another disease that most likely involves the lipid traits is type 2 diabetes (T2D) (Fig 3a). T2D is associated with dyslipidemia (i.e., higher concentrations of TG and LDL-C, and lower concentrations of HDL-C), though the causal relationship is still unclear [23]. Meanwhile, evidence has emerged that LDL-C reduction with statin therapy results in a modest increase in the risk of T2D [54]. In the MR analyses of each risk factor alone, we see potential protective effects of LDL-C and HDL-C on T2D, but also multi-modality patterns. Two modes show up in the profile likelihood from HDL-C to T2D: a negative mode with marker gene LPL and a mode near 0 with marker genes CETP and AC012181.1. Thus we include all 3 lipid traits along with BMI, and run a joint model for these 4 risk factors using GRAPPLE (Fig 3d). Our result indicates a mild protective effect of HDL-C on T2D, while there is not enough evidence for an effect of either LDL-C or TG.
3 Discussion
We propose a comprehensive framework that utilizes both strongly and weakly associated SNPs to understand the causal relationship between complex traits. GRAPPLE is robust to pervasive pleiotropy and can identify multiple pleiotropic pathways. The multivariate MR in GRAPPLE can adjust for known confounding risk factors.
GRAPPLE incorporates several improvements over existing MR methods. It removes weak instrument bias by accounting for the measurement errors of the SNP associations with the risk factors through the profile likelihood. Our likelihood is similar to the approach in [12], while allowing pervasive pleiotropy under the InSIDE assumption. The multi-modality visualization shares similarities with [25], which estimates the causal effect by the global mode, but we provide a more comprehensive analysis that identifies multiple pleiotropic pathways through the local modes. Our identification of the causal direction is related to bidirectional MR, which uses the assumption that if we reverse the roles of risk factor and disease, the estimated causal effect is likely to be 0. We use this idea in a more principled way and can avoid bias when SNPs affecting the disease through the target risk factors are also selected in the reversed MR. Finally, the intercept term in MR-Egger is not invariant to the arbitrary assignment of effect alleles for each SNP, which is a deficiency of that method, so GRAPPLE does not include any intercept term.
GRAPPLE needs a separate GWAS cohort of the exposure for SNP selection, which is necessary for valid inference with weakly associated SNPs. Actually, as shown in Fig S1a, the three-sample design is needed for other MR methods as well to avoid selection bias. Currently, we find it hard to obtain multiple good-quality public GWAS summary statistics with non-overlapping cohorts. We suggest that the stage-specific or study-specific GWAS data before meta-analysis may be released to the public in the future.
In GRAPPLE, we still require a p-value threshold, though it can be as mild as 10⁻², instead of requiring no p-value threshold at all. There are two main reasons for this requirement. One consideration is to increase power, as including too many SNPs with γj equal to 0 or extremely small would instead increase the variance of the estimate of β [59, 58]. Another consideration is that we would not want unmeasured risk factors that are unassociated (or very weakly associated) with the target risk factors to bring in large pleiotropic effects through SNPs that mainly affect these unmeasured risk factors. The chance of including these SNPs is much lower when a mild p-value threshold is required.
To adjust for confounding risk factors, GRAPPLE requires that these factors are either known a priori, or can be identified from the marker SNPs / genes / traits. However, this step can be hard to execute in practice. The pleiotropic pathways may not be well tagged, and GRAPPLE may not have the power to return enough markers. As a future direction, instead of adjusting for unknown confounding risk factors, we may consider directly adjusting for confounding gene expressions that can be more easily identified.
Finally, when discussing the causal effect of a risk factor, one implicit assumption we use is consistency, which assumes that there is one clearly defined version of the intervention that can be performed on the risk factor. However, interventions on risk factors such as BMI are typically vague [15]. For instance, there can be multiple ways to change weight, such as exercising, switching to a different diet, or undergoing surgery. It is common sense that these different interventions would have different effects on diseases, though they may change BMI by the same amount. Similarly, cholesterol has many functions in our body and is involved in multiple biological processes. Intervening on different biological processes to change the concentrations of lipid traits may also have different effects on diseases. With MR, the interventions change risk factor levels through natural mutations, which may differ from interventions with drugs that have a rapid and strong effect on the risk factors. We think that our causal inference using GRAPPLE, along with the markers we detect, would provide abundant information to deepen our understanding of the risk factors. However, one still needs to be careful when giving causal interpretations of the results. One recommendation in practice is to triangulate the results from MR with other sources of evidence [37].
Materials and Methods
Model details
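The structural equations (1), where X = (X1, X2, …, XK) and β = (β1, β2, …, βK), describe how individual-level data are generated. To link them with the GWAS summary statistics, denote
$$\gamma_{jk} = \frac{\mathrm{Cov}(Z_j, X_k)}{\mathrm{Var}(Z_j)},$$
which is the true marginal association between a SNP $Z_j$ and risk factor $X_k$, and
$$\alpha_j = \frac{\mathrm{Cov}(Z_j, Y - X^T\beta)}{\mathrm{Var}(Z_j)},$$
which is the marginal association between $Z_j$ and the causal effects of unmeasured risk factors on Y, i.e. the horizontal pleiotropic effect of $Z_j$ on Y given X. Then we can rewrite the structural equations into the following linear models:
$$X_k = \gamma_{jk} Z_j + \epsilon_{jk}, \tag{5}$$
$$Y = X^T\beta + \alpha_j Z_j + e_j, \tag{6}$$
where $\mathrm{corr}(Z_j, \epsilon_{jk}) = 0$ for any k and $\mathrm{corr}(Z_j, e_j) = 0$, as guaranteed by the definitions of $\gamma_{jk}$ and $\alpha_j$. By replacing X in (6) with (5), we get
$$Y = \Gamma_j Z_j + e_j',$$
where $\Gamma_j = \gamma_j^T\beta + \alpha_j$ and $e_j' = \epsilon_j^T\beta + e_j$, with $\gamma_j = (\gamma_{j1}, \ldots, \gamma_{jK})$ and $\epsilon_j = (\epsilon_{j1}, \ldots, \epsilon_{jK})$. As $\mathrm{Corr}(Z_j, e_j') = 0$, we conclude that $\Gamma_j$ also satisfies
$$\Gamma_j = \frac{\mathrm{Cov}(Z_j, Y)}{\mathrm{Var}(Z_j)}.$$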
Thus, the parameters Γj also represent the true marginal associations between SNP Zj and the disease trait. This is how we arrive at Eq (2).
When the disease is a binary trait, the structural equation of Y changes to
With the same argument, we have
If we further assume that for each genetic instrument j, Zj is actually independent of ej (instead of just being uncorrelated), then the odds ratio that is estimated from the marginal logistic regression will be approximately Γj/c with a constant c > 1 determined by the distribution of ej. In other words, for binary disease outcomes, Eq (2) is still approximately correct with the β in (2) being a conservatively biased (by a ratio of 1/c) version of the β in (7) (for a detailed calculation, see A.1 of [59]).
GWAS summary statistics from overlapping cohorts
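The GWAS estimated effect sizes (log odds ratios for binary traits) of SNP j are $\hat\Gamma_j$ for the disease and a length-K vector $\hat\gamma_j = (\hat\gamma_{j1}, \ldots, \hat\gamma_{jK})$ for the risk factors. As shown in [7] and derived in Supplementary Note 3.1, for any risk factor k we have
$$\mathrm{Corr}(\hat\gamma_{jk}, \hat\Gamma_j) \approx \frac{N_{sk}}{\sqrt{N_o N_{ek}}}\,\mathrm{Corr}[Y_s, X_{ks}], \tag{8}$$
where $N_o$ and $N_{ek}$ are the total sample sizes for the disease and the kth risk factor, $N_{sk}$ is the number of shared samples, and $\mathrm{Corr}[Y_s, X_{ks}]$ is the correlation of $X_k$ and Y in any shared sample. Eq (8) shows that all the SNPs share the same correlation. As a consequence, we assume
$$(\hat\Gamma_j, \hat\gamma_{j1}, \ldots, \hat\gamma_{jK})^T \sim N\left((\Gamma_j, \gamma_{j1}, \ldots, \gamma_{jK})^T, \Sigma_j\right), \tag{9}$$
where the covariance matrix $\Sigma_j$ is determined by the standard errors of the estimated effects of SNP j and by Σ, the unknown shared correlation matrix.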
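As a rough numerical illustration of the sample-overlap formula from [7] that underlies Eq (8), the sketch below computes the correlation induced between the two summary statistics of a SNP, N_s · ρ / sqrt(N_o · N_e). The function name, the cohort sizes, and the phenotypic correlation are hypothetical and are not taken from GRAPPLE.

```python
import math

def overlap_induced_correlation(n_shared, n_outcome, n_exposure, trait_correlation):
    """Approximate correlation between the exposure and outcome effect estimates
    of one SNP that is caused purely by overlapping samples.

    Assumption: the SNP effects are small, so the correlation comes only from
    individuals that appear in both GWAS cohorts.
    """
    if n_shared > min(n_outcome, n_exposure):
        raise ValueError("shared samples cannot exceed either cohort size")
    return n_shared / math.sqrt(n_outcome * n_exposure) * trait_correlation

# Hypothetical cohorts: 200,000 disease samples, 50,000 risk factor samples,
# every risk factor sample also in the disease GWAS, phenotypic correlation 0.3.
rho_overlap = overlap_induced_correlation(50_000, 200_000, 50_000, 0.3)

# With no shared samples the induced correlation vanishes.
rho_disjoint = overlap_induced_correlation(0, 200_000, 50_000, 0.3)
```

The value does not depend on the SNP, which is why a single shared correlation matrix can be used for all instruments.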
Estimate the shared correlation Σ
To estimate Σ from summary statistics, we can use Eq (8). We first need to choose SNPs where γjk = 0 for all risk factors k, so that we can estimate the shared correlation using the sample correlation of the estimated effects across the chosen SNPs. We choose all SNPs whose selection p-values pjk are above a cutoff for all k.
For these selected SNPs, denote the Z-values of $(\hat\Gamma_j, \hat\gamma_{j1}, \ldots, \hat\gamma_{jK})$ for j = 1,…,T as the matrix $Z_{T \times (K+1)}$, where T is the number of selected SNPs. Then Σ is estimated as the sample correlation matrix of the columns of $Z_{T \times (K+1)}$.
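The estimation of Σ described above can be sketched in a few lines. This is an illustrative pure-Python version, not the GRAPPLE implementation: the function names, the toy data, and the cutoff value 0.5 are assumptions made only for the example.

```python
import math

def correlation_matrix(z_rows):
    """Sample correlation matrix of the columns of a T x (K+1) list of Z-value rows."""
    n_rows, n_cols = len(z_rows), len(z_rows[0])
    means = [sum(row[c] for row in z_rows) / n_rows for c in range(n_cols)]
    centered = [[row[c] - means[c] for c in range(n_cols)] for row in z_rows]

    def dot(a, b):
        return sum(row[a] * row[b] for row in centered)

    norms = [math.sqrt(dot(c, c)) for c in range(n_cols)]
    return [[dot(a, b) / (norms[a] * norms[b]) for b in range(n_cols)]
            for a in range(n_cols)]

def estimate_shared_correlation(effects, std_errors, selection_pvalues, p_cutoff):
    """effects[j] and std_errors[j] hold K+1 values for SNP j (disease first,
    then the K risk factors); selection_pvalues[j] holds K values.
    A SNP is kept only if all of its selection p-values exceed p_cutoff."""
    z_rows = [
        [b / s for b, s in zip(eff, se)]
        for eff, se, pvals in zip(effects, std_errors, selection_pvalues)
        if all(p > p_cutoff for p in pvals)
    ]
    return correlation_matrix(z_rows)

# Toy data with K = 1 risk factor; the fourth SNP is strongly associated
# with the risk factor and is therefore excluded.
effects = [[1.0, 0.5], [-0.5, -1.5], [1.5, 1.0], [4.0, 6.0], [-1.0, 0.0]]
std_errors = [[0.5, 0.5]] * 5
selection_pvalues = [[0.8], [0.6], [0.9], [1e-9], [0.7]]
sigma_hat = estimate_shared_correlation(effects, std_errors, selection_pvalues, p_cutoff=0.5)
```

In practice T would be very large (most SNPs in the genome are null for a given trait), so the sample correlation is estimated with high precision.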
Instruments selection using LD clumping
In GRAPPLE, we first need to select a set of SNPs as genetic instruments to estimate the causal effects β. Here, we only select independent SNPs to simplify the calculation. Besides the independence requirement, we only include SNPs that pass a p-value threshold, to reduce the inclusion of false positives that can decrease power. To avoid selection bias, a separate cohort for each risk factor is used, and the p-values reported in that cohort are used for instrument selection. Denote the selection p-value for SNP j and risk factor k as pjk. For multiple risk factors and a given selection threshold, we require the Bonferroni combined p-value K · min_k pjk to pass the threshold. After that, we use LD clumping with PLINK [29] to select independent genetic instruments. The LD r² threshold for PLINK is set to 0.001.
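The p-value screening step for multiple risk factors can be sketched as follows. This covers only the Bonferroni combination K · min_k pjk; LD clumping would still be run afterwards with PLINK and is not shown. The function names and SNP identifiers are hypothetical, capping the combined p-value at 1 is an added convention, and reading "pass the threshold" as strictly below it is an assumption.

```python
def bonferroni_combined_pvalue(pvalues_for_snp):
    """Combine the K selection p-values of one SNP as K * min(p), capped at 1."""
    k = len(pvalues_for_snp)
    return min(1.0, k * min(pvalues_for_snp))

def screen_snps(selection_pvalues, threshold):
    """Return the SNP ids whose combined selection p-value is below the threshold.

    selection_pvalues maps a SNP id to its list of K selection p-values,
    one per risk factor, taken from the separate selection cohorts.
    """
    return [
        snp for snp, pvals in selection_pvalues.items()
        if bonferroni_combined_pvalue(pvals) < threshold
    ]

# Hypothetical selection p-values for K = 3 risk factors.
selection_pvalues = {
    "snpA": [1e-5, 0.3, 0.8],   # combined 3e-5, kept
    "snpB": [0.02, 0.04, 0.5],  # combined 0.06, dropped
    "snpC": [0.002, 0.9, 0.9],  # combined 0.006, kept
}
candidates = screen_snps(selection_pvalues, threshold=1e-2)
```

A SNP therefore only needs to be associated with one of the risk factors, but the evidence required grows with the number of risk factors in the joint model.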
Estimate the effects β
Here, we perform statistical analysis assuming αj ~ N(0, τ²) for the pleiotropic effects, while remaining robust to outliers, i.e., a few instruments whose pleiotropic effects are large.
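Under model (9), Eq (2) and given Σ, the log-likelihood with GWAS summary statistics satisfies
$$l(\beta, \gamma_1, \ldots, \gamma_p, \tau^2) = -\frac{1}{2} \sum_{j=1}^{p} \left[ \begin{pmatrix} \hat\Gamma_j - \gamma_j^T\beta \\ \hat\gamma_j - \gamma_j \end{pmatrix}^T \left(\Sigma_j + \tau^2 e e^T\right)^{-1} \begin{pmatrix} \hat\Gamma_j - \gamma_j^T\beta \\ \hat\gamma_j - \gamma_j \end{pmatrix} + \log\left|\Sigma_j + \tau^2 e e^T\right| \right]$$
up to some additive constant. Here, e = (1, 0,…, 0).
Define for each SNP j the statistic
$$t_j(\beta, \tau^2) = \frac{\hat\Gamma_j - \hat\gamma_j^T\beta}{\sqrt{\beta^T \Sigma_{X_j} \beta - 2\beta^T \Sigma_{X_j Y_j} + \sigma_{Y_j}^2 + \tau^2}}, \tag{10}$$
where $\Sigma_{X_j}$ is the variance of $\hat\gamma_j$, $\Sigma_{X_j Y_j}$ is the covariance between $\hat\gamma_j$ and $\hat\Gamma_j$, and $\sigma_{Y_j}^2$ is the variance of $\hat\Gamma_j$ in $\Sigma_j$. Then the profile log-likelihood that profiles out the parameters (γ1,…, γp) is
$$L(\beta, \tau^2) = -\frac{1}{2} \sum_{j=1}^{p} \left[ t_j(\beta, \tau^2)^2 + \log\left|\Sigma_j + \tau^2 e e^T\right| \right].$$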
As discussed in [59], maximizing L(β, τ²) would not give a consistent estimate of τ². Because of this, and to make the estimate of β robust to outlier SNPs with large pleiotropic effects, our optimization function is the adjusted robustified profile likelihood defined as
$$-\sum_{j=1}^{p} \rho\left(t_j(\beta, \tau^2)\right), \tag{11}$$
where ρ(·) is some robust loss function. By default, GRAPPLE uses Tukey's biweight loss function with the tuning constant c set to its common default value 4.6851. We maximize (11) with respect to β while solving an estimating equation for the heterogeneity τ². The estimating equation has expectation zero at the true values of β and τ², and thus can yield a consistent estimate of τ². For the details of estimating β and τ², as well as building confidence intervals for them, see Supplementary Note 3.2.
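To make the role of the robust loss concrete, here is a minimal sketch of a robustified objective of the form −Σ ρ(t_j). It makes several simplifying assumptions that are not claims about GRAPPLE's code: a single risk factor, no sample overlap (so the standardized residual uses only the two standard errors), and Tukey's biweight in the normalization that behaves like r²/2 near zero. All function names and data are hypothetical.

```python
import math

def tukey_biweight(r, c=4.6851):
    """Tukey's biweight loss: about r^2 / 2 near zero, constant c^2 / 6 for |r| > c."""
    scale = c * c / 6.0
    if abs(r) > c:
        return scale
    return scale * (1.0 - (1.0 - (r / c) ** 2) ** 3)

def standardized_residual(beta, tau2, gamma_hat, Gamma_hat, se_gamma, se_Gamma):
    """Residual of one SNP for a single risk factor, assuming independent errors."""
    variance = beta ** 2 * se_gamma ** 2 + se_Gamma ** 2 + tau2
    return (Gamma_hat - beta * gamma_hat) / math.sqrt(variance)

def robust_objective(beta, tau2, snps, c=4.6851):
    """Negative sum of robust losses; larger is better."""
    return -sum(
        tukey_biweight(standardized_residual(beta, tau2, *snp), c) for snp in snps
    )

# Each tuple is (gamma_hat, Gamma_hat, se_gamma, se_Gamma).
# Three SNPs follow the ratio 0.5 exactly; the last one is a pleiotropic outlier.
snps = [
    (0.10, 0.05, 0.01, 0.01),
    (0.20, 0.10, 0.01, 0.01),
    (0.15, 0.075, 0.01, 0.01),
    (0.10, 0.60, 0.01, 0.01),
]
value_at_truth = robust_objective(0.5, 0.0, snps)
```

At β = 0.5 the three regular SNPs contribute nothing and the outlier contributes only the bounded amount c²/6, which is what keeps a few large pleiotropic effects from dominating the fit.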
Identify pleiotropic pathways via the multi-modality diagnosis
We use the mode detection of the robustified profile likelihood (11) to detect multiple pleiotropic pathways. To increase sensitivity, we set τ2 = 0 and reduce the tuning parameter in the Tukey’s Biweight loss function to c = 3. Here we present a detailed argument on why mode detection can identify pleiotropic pathways.
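If there is a confounding Genetic Pathway 2, denoted by $\tilde X$ and shown in Figure 1a, that is missed, then we have the structural equation
$$Y = X\beta + \tilde X\kappa + \tilde f,$$
where $\tilde f$ collects the remaining effects on Y, and also the linear model
$$X = \delta \tilde X + \tilde\epsilon_j$$
for a SNP j that is only associated with Genetic Pathway 2 and is uncorrelated with X conditional on $\tilde X$, so that $\mathrm{corr}(Z_j, \tilde\epsilon_j) = 0$. Similar to (5), writing $\tilde\gamma_j$ for the marginal association between $Z_j$ and $\tilde X$, we have
$$\gamma_j = \delta \tilde\gamma_j.$$
Plugging this into the structural equation, we have
$$\Gamma_j = \gamma_j\beta + \tilde\gamma_j\kappa + \alpha_j = \left(\beta + \kappa/\delta\right)\gamma_j + \alpha_j.$$
Thus, if there are enough SNPs like SNP j, they would contribute to another mode of (4) at β + κ/δ.
The same argument works for identification of the causal direction. Say there is another $\tilde X$ that affects Y but is uncorrelated with the risk factor X (δ = 0). The existence of such an $\tilde X$ is common, unless X is the only heritable risk factor of Y. SNPs strongly associated with $\tilde X$ would not likely be selected when X is the exposure, but would appear when the roles of X and Y are switched. These SNPs can be used to identify the causal direction: in the reverse MR, they contribute to a mode at 0, while the SNPs that affect Y through X contribute to a mode at 1/β.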
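The mode argument can be checked on a toy example. The sketch below builds noiseless summary statistics for two groups of SNPs, one acting through the risk factor with ratio β and one acting through a second pathway with ratio β + κ/δ, and then scans a robustified objective (τ² = 0, c = 3) over a grid of β values for local maxima. It is an illustration under simplifying assumptions (single risk factor, independent errors, grid search with a small tolerance, one particular normalization of Tukey's biweight) and not GRAPPLE's mode-detection code; all names and numbers are hypothetical.

```python
import math

def tukey_biweight(r, c):
    """Tukey's biweight loss, constant (c^2 / 6) once |r| exceeds c."""
    scale = c * c / 6.0
    if abs(r) > c:
        return scale
    return scale * (1.0 - (1.0 - (r / c) ** 2) ** 3)

def profile_objective(beta, snps, c=3.0):
    """Negative sum of robust losses of the standardized residuals with tau^2 = 0."""
    total = 0.0
    for gamma_hat, Gamma_hat, se_gamma, se_Gamma in snps:
        t = (Gamma_hat - beta * gamma_hat) / math.sqrt(
            beta ** 2 * se_gamma ** 2 + se_Gamma ** 2
        )
        total += tukey_biweight(t, c)
    return -total

def find_modes(snps, grid, c=3.0, tol=1e-9):
    """Grid points whose objective exceeds both neighbours by more than tol."""
    values = [profile_objective(b, snps, c) for b in grid]
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if values[i] > values[i - 1] + tol and values[i] > values[i + 1] + tol
    ]

beta, kappa, delta = 0.5, -0.75, 0.5   # second pathway ratio: beta + kappa/delta = -1.0
se = 0.005
through_x = [(g, beta * g, se, se) for g in (0.10 + 0.02 * i for i in range(10))]
pathway_2 = [(g, (beta + kappa / delta) * g, se, se)
             for g in (0.10 + 0.03 * i for i in range(6))]

grid = [-2.0 + 0.01 * i for i in range(401)]
modes = find_modes(through_x + pathway_2, grid)
rounded_modes = [round(m, 2) for m in modes]
```

Because the loss is bounded, each group of SNPs stops influencing the objective once β is far from its own ratio, so the two groups produce two separate local maxima rather than one compromise estimate.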
Select marker SNPs and genes for each mode
GRAPPLE uses LD clumping with a stringent r² threshold (0.001) to guarantee independence among the genetic instruments. However, to obtain more biologically meaningful markers, marker SNPs are not restricted to these independent instruments. Instead, marker SNPs are selected from a larger SNP set obtained by LD clumping with r² threshold 0.05.
Assume that there are M modes detected at positions β1, β2, ⋯, βM. Define the residual rjm of SNP j for mode m as the statistic tj(·, ·) of Eq (10) evaluated at βm. SNP j is selected as a marker for mode m if |rjm′| > t1 for any m′ ≠ m and |rjm| ≤ t0. By default, t1 is set to 2 and t0 is set to 1, which gives reasonable results in practice. Once the marker SNPs are selected, GRAPPLE further maps the SNPs to the ENCODE genes where the marker SNPs are located, and searches for the traits that these SNPs are strongly associated with in GWAS by querying HaploReg v4.1 [53] using the R package HaploR. The ratios of the marker SNPs are also returned for reference (shown as the vertical bars in Fig 3b).
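The marker rule can be written as a short filter over a table of residuals. In this sketch the residuals are supplied directly rather than computed from Eq (10), the condition on the other modes is read as "far from every other mode" (an interpretation of the rule above, stated here as an assumption), and the function name and SNP identifiers are hypothetical.

```python
def select_markers(residuals, t0=1.0, t1=2.0):
    """Assign marker SNPs to modes.

    residuals maps a SNP id to a list [r_j1, ..., r_jM] of its residuals at the
    M detected modes. A SNP marks mode m if it fits mode m (|r_jm| <= t0) and
    is far from all other modes (|r_jm'| > t1 for every m' != m).
    """
    n_modes = len(next(iter(residuals.values())))
    markers = {m: [] for m in range(n_modes)}
    for snp, r in residuals.items():
        for m in range(n_modes):
            fits_this_mode = abs(r[m]) <= t0
            far_from_others = all(
                abs(r[other]) > t1 for other in range(n_modes) if other != m
            )
            if fits_this_mode and far_from_others:
                markers[m].append(snp)
    return markers

# Hypothetical residuals at M = 2 modes.
residuals = {
    "snp1": [0.3, 5.2],   # fits mode 0, far from mode 1
    "snp2": [4.1, -0.8],  # fits mode 1, far from mode 0
    "snp3": [0.5, 1.5],   # fits mode 0 but not clearly separated from mode 1
    "snp4": [3.0, 2.5],   # fits neither mode
}
markers = select_markers(residuals)
```

The gap between t0 and t1 leaves ambiguous SNPs such as "snp3" unassigned, so only SNPs that clearly distinguish one mode from the others are passed on to gene and trait annotation.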
Compute replicability p-values across SNP selection thresholds
Each p-value shown in Fig 3a summarizes a vector of p-values across 7 different selection p-value thresholds, ranging from 10⁻⁸ to 10⁻², for each risk factor and disease pair. It reflects how consistent the significance is across SNP selection thresholds. Specifically, it is the partial conjunction p-value [2] for rejecting the null that β is non-zero for at most 2 of the selection thresholds. For a risk factor and disease pair k, let the p-values computed by using SNPs selected with the 7 thresholds be pks, where s = 1, 2, ⋯, 7. Ranking them as pk(1) ≤ pk(2) ≤ ⋯ ≤ pk(7), the partial conjunction p-value for the pair k is computed as 5pk(3).
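The computation of the replicability p-value is small enough to show in full. The sketch below implements the Bonferroni-style partial conjunction p-value (n − u + 1) · p(u), which reduces to 5pk(3) for n = 7 thresholds and u = 3. The function name and the example p-values are hypothetical, and capping the result at 1 is an added convention.

```python
def partial_conjunction_pvalue(pvalues, u):
    """P-value for the null that fewer than u of the n hypotheses are non-null.

    Uses the Bonferroni-style combination (n - u + 1) * p_(u), where p_(u) is
    the u-th smallest p-value, capped at 1.
    """
    n = len(pvalues)
    if not 1 <= u <= n:
        raise ValueError("u must be between 1 and the number of p-values")
    ordered = sorted(pvalues)
    return min(1.0, (n - u + 1) * ordered[u - 1])

# Hypothetical p-values for one risk factor and disease pair,
# one per selection threshold from 1e-8 to 1e-2.
pvalues_by_threshold = [1e-6, 3e-4, 0.002, 0.04, 0.2, 0.5, 0.01]
replicability_p = partial_conjunction_pvalue(pvalues_by_threshold, u=3)
```

With u = 3, one or two very small p-values are not enough to produce a small replicability p-value, so a pair is highlighted only when the effect is significant at three or more selection thresholds.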
Code Availability
The R package GRAPPLE can be installed from GitHub at https://github.com/jingshuw/GRAPPLE.
Data Availability
All GWAS summary statistics used in the analyses of the manuscript are downloaded from public resources, most of them from the GWAS Catalog [9] and the websites of the GWAS consortia GIANT, DIAGRAM, PGC, GLGC, and UK Biobank. A complete list of the datasets used in each analysis and their sources is provided in Supplementary Tables 1 and 2 and Supplementary Note 2. Intermediate results for the screening of 5 risk factors on 25 diseases are available at https://www.dropbox.com/sh/myh8xgxne8fo17v/AABWJf781VrCGnqNFMLtnqIea?dl=0.
Author Contribution
J.W. and Q.Z. conceptualized the study and formulated the model, with discussions with D.S. and N.Z. J.W. developed the method and algorithm, and performed data analysis. J.B. and G.D.S. helped with designing the validation experiments. G.H. provided data for the GWAS summary statistics of C-reactive protein and LD clumping. J.W., Q.Z. and N.Z. wrote the paper.