Abstract
Mendelian randomization is a promising approach to help improve causal inference in observational studies, with widespread potential applications, including to prioritization of pharmacotherapeutic targets for evaluation in RCTs. From its initial proposal the limitations of Mendelian randomization approaches have been widely recognised and discussed, and recently Pickrell has reiterated these1. However this critique did not acknowledge recent developments in both methodological and empirical research, nor did it recognise many future opportunities for application of the Mendelian randomization approach. These issues are briefly reviewed here.
Whilst providing an appropriate note of caution with respect to the interpretation of Mendelian randomization study findings1, Pickrell’s reasons for viewing Mendelian randomization as having so far failed to fulfil its promise are based on some misconceptions.
30 years of Mendelian randomization?
Pickrell says that the “first reason for scepticism is that in the nearly 30 years of Mendelian randomization, arguably no new causal relationship has been identified with this approach and subsequently verified in a randomized controlled trial” 1. The thirty-year figure appears to derive from the notion that the method began to be applied following a publication by Martijn Katan in 19862. Katan’s insightful contribution was a letter to the Lancet which proposed the use of an APOE genetic marker for cholesterol level to interrogate the claim that low circulating cholesterol increases the risk of cancer. The major issue with observational studies that had examined the association between circulating cholesterol level and cancer was that early stages of disease could lower cholesterol level, but would not influence genotype (i.e. the reverse causation problem that exists for conventional observational studies would not influence a genetic association). Katan also pointed out that genotypes would be related to long-term (since birth) differences in cholesterol levels. This would provide a powerful test of the low cholesterol-cancer risk hypothesis. No data were presented in the letter, and it did not use the term Mendelian randomization. In the subsequent 17 years the letter was only cited twice3,4, neither time in relation to its proposed methodology. No “Mendelian randomization” studies followed its publication, and it has only become widely cited (one suspects often by people who have not read it, given the disconnect between what the citations suggest was in the letter and what was actually there) after it was referenced as one of the antecedents of Mendelian randomization in the first extended presentation of the approach in 20035, and after we then reprinted it in the International Journal of Epidemiology in 20046,7.
Pickrell is not alone in misstating the time span over which Mendelian randomization studies have been performed, with others claiming that the method has been used in epidemiological studies for more than 20 years8. Some confusion is created by the fact that the term “Mendelian randomization” was introduced in a very different context, in studies utilizing whether or not acute myeloid leukemia patients had HLA compatible siblings as a way of obtaining evidence as to the survival benefits of bone marrow transplants.9 Strangely, this paper is quoted as the source of study designs that use “genetically determined differences in exposure to test whether a biomarker affects disease”, and to provide evidence of the more than two decades over which the use of “genetic variation in a biomarker to deduce the causal effects of that biomarker on a disease” has been in use10. In fact the term Mendelian randomization has only been used in its current sense since the early 2000s5,11,12,13.
This archaeology of the concept of Mendelian randomization (provided in more depth elsewhere)14 is of relevance to Pickrell’s critique, as the main plank of his first “reason for scepticism” is based on a misperception of time scale over which such studies have been carried out. Indeed it is only since the era of robust identification of common genetic variant associations with phenotypes in the post-GWAS era that Mendelian randomization studies have generally become feasible, reflected in the very considerable increase in publication of Mendelian randomization studies over the last five years.
No evidence of Mendelian randomization studies leading to clinical trials?
Given the well documented long time-course required for drug development15, or the development of non-drug public health and clinical interventions, it is of course entirely unsurprising that there has not been a cascade of new causal relationships that have been identified with this approach and subsequently verified in randomized controlled trials, as Pickrell seems to expect should have occurred in the 12 years since the use of this study design was first articulated. He suggests, indeed, that there are no such cases. This is incorrect – remarkably (given the short time period over which Mendelian randomization studies have been applied, and the time required for drug, public health and clinical intervention development through to completion of long-term phase 3 trials) – there are several cases. For example, proprotein convertase subtilisin/kexin type 9 serine protease (PCSK9) genetic variation was identified as relating to LDL cholesterol level and coronary heart disease in 200616, and the two largest randomised controlled trials (RCTs) of monoclonal antibodies targeting PCSK9 that have appeared since publication of that Mendelian randomization study both suggest17,18 a reduction in cardiovascular events, as does a meta-analysis including smaller trials19.
In conventional observational studies lipoprotein-associated phospholipase A2 (Lp-PLA2) levels have been shown to predict coronary heart disease (CHD) risk for many years, with this being apparently independent of conventional CHD risk factors (see e.g. the 2010 large scale overview of such studies20). This led to the development of pharmacotherapeutic agents which lowered Lp-PLA2. However genetic studies on the V279F variant in PLA2G7, which is common in East Asian but not European-origin populations and is associated with particularly low levels of Lp-PLA2, suggested that there would not be a substantial effect of lowering Lp-PLA2 levels on CHD risk (see e.g. a 2010 meta-analysis of these)21. Subsequently large scale trials have failed to find the benefit that was anticipated from a naive interpretation of the observational epidemiological data22,23,24. Whilst other trials are ongoing, the magnitude of anticipated benefit (if any) has certainly been down-scaled as a result of the Mendelian randomization studies.
Additionally there are cases where genetic studies and the development of therapies co-evolved. For example, findings with respect to Niemann-Pick C1-Like 1 Protein (NPC1L1) genetic variation and ezetimibe, the drug that targets NPC1L1, were cited in a recent review of Mendelian randomization applied to drug development as being apparently discrepant25. Recent large-scale genetic and RCT data now suggest the findings are concordant26,27,28. Furthermore there are several examples of ongoing trials testing expectations from Mendelian randomization studies, for example trials of IL6 receptor blockade29 that will evaluate the predictions from Mendelian randomization studies30,31.
As well as contributing to the elucidation of some drug targets, Mendelian randomization has also deprioritized the development of some pharmacotherapeutic agents. For example the further development of therapies targeting C-reactive protein (CRP) levels was not encouraged by the many Mendelian randomization studies suggesting CRP was not causal with respect to a range of cardiometabolic outcomes32,33,34,35, and targeting the endothelial lipase gene (LIPG Asn396Ser) to raise HDL cholesterol was discouraged by a large-scale Mendelian randomization study of this36.
To support his argument Pickrell provides a supplementary table presenting a comparison of Mendelian randomization studies and RCTs supposedly investigating the same issue, implying that this is a comprehensive list of Mendelian randomization studies. In fact the table includes only a very limited number of studies, which appeared purely as illustrative examples in two overviews of Mendelian randomization. There are a very large number of Mendelian randomization studies; for a partial but more systematic listing, see Boef et al37. This presents nearly 200 studies, but remains only a sample since it is based on search and selection criteria that have missed a large number of such studies (e.g. consortia papers, which are increasingly common in the field, those using different terminologies but nonetheless applying Mendelian randomization methods, etc.). An attempt to evaluate Mendelian randomization on the basis of a (in any case flawed) comparison should take a somewhat more systematic approach. Furthermore Pickrell fails to recognise the widely understood limitations of RCTs. As just one example, in the table he suggests there are no RCTs of alcohol reduction and blood pressure. This is far from the truth: already by 2001 a widely cited meta-analysis included 15 such RCTs38, and more have appeared since. However such trials have problems in producing large sustained changes in alcohol consumption. The power of Mendelian randomization to provide useful evidence in situations where trials are difficult or impossible to successfully implement is a very considerable strength, not a weakness, of the approach.
Pleiotropy: what’s new?
The second reason for scepticism that Pickrell raises, that of pleiotropy, has been discussed in very considerable detail, from the initial 2003 paper5 onwards, e.g. recently by Glymour et al39 and VanderWeele40. As we will see, the simulation he provides does not add greatly to what is known and widely recognised about the potential effect of pleiotropy. Furthermore the paper fails to acknowledge the extensive (and empirically useful) work in the instrumental variables (IV) field, much of which has been directly applied to and utilised in the Mendelian randomization context, and is well known to practitioners of IV analysis. These methods allow for robustness checks, and for estimation which remains valid under relaxed IV assumptions. For a limited sample of this literature, see e.g.41,42,43,44,45,46,47,48.
Against the background of this exciting work Pickrell makes two unreliable observations. First, he cites Bulik-Sullivan’s groundbreaking work using whole-genome LD score regression approaches49 as showing that “genetic variants that influence HDL cholesterol levels have correlated effects on whether an individual went to college. A naive interpretation of this might suggest the (rather nonsensical) conclusion that HDL-raising drugs should increase education levels”.1 This makes it sound as though Bulik-Sullivan’s work is an example of Mendelian randomization, which it is not; it simply suggests there is a genetic correlation, which could be generated by vertical (real) phenotypic effects, by horizontal (spurious) pleiotropy, or by a combination of the two. Furthermore there is no direction implied by there being a genetic correlation, whilst the nonsensical conclusion that Pickrell highlights assumes that direction of effect can be inferred. This is not the case. It is the case, however, that higher college education, and things that follow on from this (such as greater awareness of healthy diet, improved exercise behaviour and the use of medications), will influence HDL cholesterol. The genetic correlation could, at least in part, be through phenotypic effects.
Second, Pickrell states that “it is sometimes suggested that using a large number of genetic variants in Mendelian randomization (combined into a single score) offers a way to partially avoid this problem”1, and the simulations that are then performed examine this situation. Unsurprisingly, if you simulate a situation with pleiotropy you demonstrate the existence of pleiotropic effects. However approaches to utilising multiple genetic variants in Mendelian randomization studies have considered comparing potentially large numbers of independent estimates, which allow evaluation of the extent to which pleiotropy may be biasing effect estimates, as the particular strength of this situation. They do not suggest that combinations into a single score provide such reassurance. For example, in an introductory paper on Mendelian randomization published some years ago, it was suggested that
“In some cases, it may be possible to identify two separate genetic variants, which are not in linkage disequilibrium with each other, but which both serve as proxies for the environmentally modifiable risk factor of interest. If both variants are related to the outcome of interest and point to the same underlying association, then it becomes much less plausible that reintroduced confounding explains the association, since it would have to be acting in the same way for these two unlinked variants. This can be likened to RCTs of different blood pressure-lowering agents, which work through different mechanisms and have different potential side effects. If the different agents produce the same reductions in cardiovascular disease risk, then it is unlikely that this is through agent-specific (pleiotropic) effects of the drugs; rather, it points to blood pressure lowering as being key. The latter is indeed what is in general observed50. In another context, two distinct genetic variants acting as instruments for higher body fat content have been used to demonstrate that greater adiposity is related to higher bone mineral density51. With the large number of genetic variants that are being identified in genome wide association studies in relation to particular phenotypes—e.g. >50 independent variants that are related to height; >90 that are related to total cholesterol and >20 related to fasting glucose—it is possible to generate many independent combinations of such variants and from these many independent instrumental variable estimates of the causal associations between an environmentally modifiable risk factor and a disease outcome. The independent estimates will not be plausibly influenced by any common pleiotropy or LD-induced confounding, and therefore if they display consistency this provides strong evidence against the notion that reintroduced confounding is generating the associations”52.
A simple use of multiple genetic instruments for evaluating the plausibility of distortion of Mendelian randomization findings by pleiotropy is illustrated in the figure. It is increasingly improbable for 2, 3, 4 or more genetic variants out of LD with each other to lead to precisely the same quantitative causal effect estimate due to perfectly balancing pleiotropy. Here we see data from 9 SNPs from 6 genes which lead to remarkably similar predicted causal effects of LDL cholesterol on CHD risk.
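The consistency check described above can be sketched in a few lines. The following is a minimal illustration, not the analysis behind the figure: it assumes summary-level per-allele effects for each variant, computes a ratio (Wald) causal estimate per variant, pools them by inverse-variance weighting, and reports Cochran's Q as a measure of disagreement between variants. All variant names and effect sizes are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the variant-exposure effect.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Variant:
    name: str             # hypothetical label, not a real SNP identifier
    beta_exposure: float  # per-allele effect on the exposure (e.g. LDL cholesterol)
    beta_outcome: float   # per-allele effect on the outcome (e.g. log odds of CHD)
    se_outcome: float     # standard error of beta_outcome


def wald_ratio(v: Variant) -> Tuple[float, float]:
    """Ratio estimate of the causal effect from one variant, with an
    approximate standard error (exposure effect treated as known)."""
    estimate = v.beta_outcome / v.beta_exposure
    se = v.se_outcome / abs(v.beta_exposure)
    return estimate, se


def pooled_estimate_and_q(variants: List[Variant]) -> Tuple[float, float]:
    """Inverse-variance weighted pooled estimate and Cochran's Q.

    Under no pleiotropy Q is expected to be close to its degrees of
    freedom (number of variants minus one); a much larger Q suggests
    that the variants do not point to a single causal effect."""
    ratios = [wald_ratio(v) for v in variants]
    weights = [1.0 / se ** 2 for _, se in ratios]
    pooled = sum(w * est for w, (est, _) in zip(weights, ratios)) / sum(weights)
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, ratios))
    return pooled, q


# Hypothetical variants that all imply the same causal effect (0.5).
consistent = [
    Variant("variant_A", 0.10, 0.050, 0.01),
    Variant("variant_B", 0.20, 0.100, 0.02),
    Variant("variant_C", 0.05, 0.025, 0.01),
]
# The same set plus one hypothetical variant with an extra (pleiotropic) effect.
with_outlier = consistent + [Variant("variant_D", 0.10, 0.150, 0.01)]

pooled_ok, q_ok = pooled_estimate_and_q(consistent)
pooled_bad, q_bad = pooled_estimate_and_q(with_outlier)
print(f"consistent set: estimate={pooled_ok:.3f}, Q={q_ok:.2f}, df={len(consistent) - 1}")
print(f"with outlier:   estimate={pooled_bad:.3f}, Q={q_bad:.2f}, df={len(with_outlier) - 1}")
```

In this toy setting the first set gives a Q close to zero, whereas adding the discordant variant inflates Q well beyond its degrees of freedom, which is the signal that at least one instrument may be acting through a pathway other than the exposure.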
The more recent developments in instrumental variables approaches referenced earlier move well beyond even this interrogation of the potential influence of pleiotropy in the Mendelian randomization setting, and provide an extensive range of sensitivity analyses that can inform interpretation of the findings41,42,43,47. Ironically, one of the conclusions of Pickrell’s exciting recent work on shared genetic influences on human traits53 - that “the effect sizes of the variants on the different traits appear to be largely uncorrelated”53 – provides the basis for one approach to utilising multiple potentially pleiotropic genetic variants in the Mendelian randomization context.43
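One family of sensitivity analyses that relies on pleiotropic effects being uncorrelated with the variant-exposure effects is Egger-type regression. The sketch below assumes this is the kind of approach meant; it is an illustration under that assumption, not a reproduction of any cited method. Variant-outcome effects are regressed on variant-exposure effects with an intercept: the intercept absorbs average directional pleiotropy and the slope estimates the causal effect. All numbers are hypothetical and constructed so that every variant carries the same pleiotropic effect (0.02) on top of a true causal effect of 0.5.

```python
from typing import List, Tuple


def egger_regression(beta_exposure: List[float],
                     beta_outcome: List[float],
                     se_outcome: List[float]) -> Tuple[float, float]:
    """Weighted linear regression of outcome effects on exposure effects.

    Variants are first oriented so that every exposure effect is positive.
    Returns (intercept, slope): the intercept estimates average directional
    pleiotropy, the slope estimates the causal effect."""
    rows = []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        sign = 1.0 if bx >= 0 else -1.0
        rows.append((sign * bx, sign * by, 1.0 / se ** 2))
    total_w = sum(w for _, _, w in rows)
    mean_x = sum(w * x for x, _, w in rows) / total_w
    mean_y = sum(w * y for _, y, w in rows) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def ivw_slope(beta_exposure: List[float],
              beta_outcome: List[float],
              se_outcome: List[float]) -> float:
    """Inverse-variance weighted estimate: the same regression forced
    through the origin, i.e. assuming no directional pleiotropy."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den


# Hypothetical summary statistics: outcome effect = 0.02 + 0.5 * exposure effect.
bx = [0.05, 0.10, 0.15, 0.20]
by = [0.02 + 0.5 * x for x in bx]
se = [0.01, 0.01, 0.01, 0.01]

intercept, slope = egger_regression(bx, by, se)
naive = ivw_slope(bx, by, se)
print(f"Egger intercept={intercept:.3f}, slope={slope:.3f}; IVW slope={naive:.3f}")
```

In this constructed example the regression through the origin overstates the causal effect, while allowing an intercept recovers it. With real data the intercept and slope are estimated with considerable uncertainty, so a sketch like this is a sensitivity analysis rather than a replacement for the primary estimate.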
Fulfilling the potential of Mendelian randomization?
The suggestions Pickrell makes regarding fulfilling the promises of Mendelian randomization are useful considerations. However they fail to anticipate some transformational levels of evidence that can be provided by Mendelian randomization approaches. To give a short selection of these related to just one issue, that of pharmaceutical target evaluation:
(1) Predicting the comprehensive phenotypic effect of pharmaceutical treatments (including unwanted side effects) based on interrogation of the phenome-wide associations of a genetic proxy. One example suggests that IL1 manipulation – currently being trialled in autoimmune disease contexts – may increase rather than decrease cardiovascular risk.54
(2) The separation of on-target from off-target effects of pharmacotherapy. A recent example suggests that the increase in diabetes risk seen in statin trials is a consequence of the mechanism by which statins reduce cholesterol, rather than an off-target side effect unrelated to the pathway through which cholesterol is reduced55.
(3) The separation of specific mechanism from intermediate phenotype effects. In the above statin example the on-target effect could be due to HMG-CoA reductase manipulation specifically, or to cholesterol lowering in general. In the latter case all approaches to cholesterol reduction would have the same influence on diabetes risk, proportional to their success in actually reducing cholesterol level.
(4) Providing evidence on generalizability of findings from RCTs. It is not feasible to conduct adequately powered large scale RCTs of a treatment in every possible subgroup of the global population (defined, for example, by combinations of age, gender, ethnicity and an extensive range of comorbidities), but Mendelian randomization studies using genetic proxies could be carried out at a tiny fraction of the cost.
(5) Providing evidence on combined drug treatment – answering the question ‘will combined therapy produce additional benefits?’ – through analysis of combination of genetic variants mimicking the different drugs, in the factorial Mendelian randomization design28.
Recent commentaries, illustrating the considerable potential in this field, are available elsewhere56,57. Similar and greatly expanded lists could be created in many other domains of research, translation and practice.
Mendelian randomization in its place
Pickrell draws attention to some limitations of Mendelian randomization, and many others of course exist and have been discussed at length in the published literature. Ways of moving beyond these are being developed, however. There are some fundamental issues not discussed by Pickrell that deserve attention as Mendelian randomization develops. First, most of the currently available GWAS data are not the ones that are required to answer some of the most important questions in medicine and related disciplines. Mendelian randomization has largely been applied to disease aetiology (e.g. in case-control studies of particular diseases), but this does not have any direct bearing on therapy. For example, GWAS clearly identified a proxy for smoking intensity as the strongest common genetic variant related to lung cancer, reaffirming the causal influence of smoking on lung cancer. However, once lung cancer has developed stopping smoking is not an effective treatment. It is likely that many diseases demonstrate a similar disconnect between triggers (which cannot be reversed through the same process once the disease has been initiated) and factors related to progression and prognosis. Mendelian randomization studies of disease incidence are a powerful tool for identifying factors that could prevent disease (with smoking and lung cancer as a proof of principle example). For some conditions, e.g. CHD, factors that relate to disease incidence (higher blood pressure, higher cholesterol, and smoking) also relate to disease progression and future events. However in other conditions this will not be the case. Mendelian randomization applied to studies of disease progression may provide evidence regarding factors that could be manipulated to improve prognosis, and establishing such studies would represent a major advance.
Second, Mendelian randomization generally utilises genetic variants which influence life-long exposure to different levels of a potential risk factor. This means that such studies can establish the very long-term effects of such exposures58, and has advantages in terms of demonstrating what would be possible with preventative initiatives starting in early life, and the public health effect of factors that produce population-level shifts in a risk factor such as blood pressure or cholesterol. However a potential downside is that if exposure at one period of life leads to a change in risk that is not reversible, the Mendelian randomization findings would not be reproduced in a trial modifying the factor in later life. For example, if lower levels of antioxidant exposure (e.g. vitamin C, uric acid, bilirubin) in infancy or childhood lead to changes in the arterial wall which cannot be reversed by later antioxidant supplementation, Mendelian randomization findings would correctly suggest adverse effects of low antioxidant levels, but this would not translate into benefit of antioxidant treatment. In principle it could be possible to utilise gene by environment interactions to identify critical periods during which exposures act52, but in reality obtaining datasets of adequate sample size to establish robust gene by environment interactions will be a seriously limiting factor. This illustrates the need to combine Mendelian randomization evidence with other sources of information.
As indicated above, Mendelian randomization provides one plank of evidence on a particular question. The evaluation of any particular question requires the triangulation of evidence from different approaches, particularly from approaches which, whilst potentially biased, suffer from nonassociated forms of bias59,60,61. The approach provides a potentially powerful narrowing down of hypotheses for further testing, for example, the selection of candidate pharmacotherapeutic agents for evaluation in expensive RCTs. As we said in the initial extended presentation of Mendelian randomization “It is probably fair to say that the method offers a more robust approach to understanding the effect of some modifiable exposures on health outcomes than does much conventional observational epidemiology. Where possible randomized controlled trials remain the final arbiter of the effects of interventions intended to influence health, however”5.
Acknowledgements
Thanks to Stephen Burgess, Dave Evans, Marcus Munafo, Caroline Relton and Nic Timpson for comments on an earlier draft of this piece.
Footnotes
i Ference BA, Yoo W, Alesh I, Mahajan N, Mirowska KK, Mewada A, Kahn J, Afonso L, Williams KA Sr, Flack JM. Effect of long-term exposure to lower low-density lipoprotein cholesterol beginning early in life on the risk of coronary heart disease: a Mendelian randomization analysis. J. Am. Coll. Cardiol. 2012, 60, 2631–2639.