TY  - JOUR
T1  - Constraints on eQTL fine mapping in the presence of multi-site local regulation of gene expression
JF  - bioRxiv
DO  - 10.1101/084293
SP  - 084293
AU  - Biao Zeng
AU  - Luke R. Lloyd-Jones
AU  - Alexander Holloway
AU  - Urko M. Marigorta
AU  - Andres Metspalu
AU  - Grant W. Montgomery
AU  - Tonu Esko
AU  - Kenneth L. Brigham
AU  - Arshed A. Quyyumi
AU  - Youssef Idaghdour
AU  - Jian Yang
AU  - Peter M. Visscher
AU  - Joseph E. Powell
AU  - Greg Gibson
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/10/29/084293.abstract
N2  - Expression QTL (eQTL) detection has emerged as an important tool for unravelling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and magnitude of effects. In this study, we address the prevalence of secondary signals and bias in estimation of their effects by performing multi-site linear regression on two large human cohort peripheral blood gene expression datasets (each greater than 2,500 samples) with accompanying whole genome genotypes, namely the CAGE compendium of Illumina microarray studies, and the Framingham Heart Study Affymetrix data. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ~40% of over 3,500 eGenes in both datasets, and the number of loci with additional signals reduces by approximately two-thirds with each conditioning step. However, the concordance of specific signals between the two studies is only ~30%, indicating that the expression profiling platform is a large source of variance in effect estimation. Furthermore, a series of simulation studies implies that in the presence of multi-site regulation, up to 10% of the secondary signals could be artefacts of incomplete tagging, and at least 5% but up to one quarter of credible intervals may not even include the causal site, which is thus mis-localized. Joint multi-site effect estimation recalibrates effect size estimates by just a small amount on average. Presumably similar conclusions apply to most types of quantitative trait. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine-mapping of causal variants needs to be adjusted for multi-site influences, as conditional estimates can be highly biased by interference among linked sites.
ER  -