RT Journal Article
SR Electronic
T1 Constraints on eQTL fine mapping in the presence of multi-site local regulation of gene expression
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 084293
DO 10.1101/084293
A1 Biao Zeng
A1 Luke R. Lloyd-Jones
A1 Alexander Holloway
A1 Urko M. Marigorta
A1 Andres Metspalu
A1 Grant W. Montgomery
A1 Tonu Esko
A1 Kenneth L. Brigham
A1 Arshed A. Quyyumi
A1 Youssef Idaghdour
A1 Jian Yang
A1 Peter M. Visscher
A1 Joseph E. Powell
A1 Greg Gibson
YR 2016
UL http://biorxiv.org/content/early/2016/10/29/084293.abstract
AB Expression QTL (eQTL) detection has emerged as an important tool for unravelling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and magnitude of effects. In this study, we address the prevalence of secondary signals and bias in estimation of their effects by performing multi-site linear regression on two large human cohort peripheral blood gene expression datasets (each greater than 2,500 samples) with accompanying whole genome genotypes, namely the CAGE compendium of Illumina microarray studies, and the Framingham Heart Study Affymetrix data. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ~40% of over 3,500 eGenes in both datasets, and the number of loci with additional signals reduces by approximately two-thirds with each conditioning step. However, the concordance of specific signals between the two studies is only ~30%, indicating that the expression profiling platform is a large source of variance in effect estimation.
Furthermore, a series of simulation studies implies that in the presence of multi-site regulation, up to 10% of the secondary signals could be artefacts of incomplete tagging, and at least 5% but up to one quarter of credible intervals may not even include the causal site, which is thus mis-localized. Joint multi-site effect estimation recalibrates effect size estimates by only a small amount on average. Presumably similar conclusions apply to most types of quantitative trait. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine-mapping of causal variants needs to be adjusted for multi-site influences, as conditional estimates can be highly biased by interference among linked sites.
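
The sequential conditional procedure summarized in the abstract (single-marker regression to find a primary signal, then repeated conditioning on the signals already retained) might be sketched as follows. This is a minimal illustration on simulated data, not the authors' pipeline: the genotype simulator, the toy LD model, the causal sites and effect sizes, the significance threshold, and the function names `simulate_genotypes`, `conditional_z` and `stepwise_conditional` are all assumptions introduced here, and p-values use a normal approximation that is only reasonable for large samples.

```python
import math
import numpy as np


def simulate_genotypes(rng, n, m, maf=0.3, copy_prob=0.5):
    """Toy LD model (assumption): each SNP copies its left neighbour
    with probability copy_prob, otherwise it is drawn afresh."""
    G = np.empty((n, m))
    G[:, 0] = rng.binomial(2, maf, size=n)
    for k in range(1, m):
        fresh = rng.binomial(2, maf, size=n)
        copy = rng.random(n) < copy_prob
        G[:, k] = np.where(copy, G[:, k - 1], fresh)
    return G


def conditional_z(y, covariates, g):
    """Z statistic for SNP g in the OLS fit y ~ 1 + covariates + g."""
    n = len(y)
    X = np.column_stack([np.ones(n)] + covariates + [g])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.pinv(X.T @ X)
    return beta[-1] / math.sqrt(cov[-1, -1])


def stepwise_conditional(G, y, p_threshold=1e-5, max_signals=5):
    """Add the strongest remaining SNP, conditioning on those already
    retained, until no SNP passes the (hypothetical) threshold."""
    selected = []
    while len(selected) < max_signals:
        covariates = [G[:, j] for j in selected]
        candidates = [j for j in range(G.shape[1]) if j not in selected]
        zs = {j: conditional_z(y, covariates, G[:, j]) for j in candidates}
        best = max(candidates, key=lambda j: abs(zs[j]))
        # Two-sided p-value under a normal approximation (large n).
        p_best = math.erfc(abs(zs[best]) / math.sqrt(2))
        if p_best >= p_threshold:
            break
        selected.append(best)
    return selected


rng = np.random.default_rng(0)
n, m = 2500, 30
G = simulate_genotypes(rng, n, m)
causal = {5: 0.5, 20: 0.3}  # hypothetical causal sites and effect sizes
y = sum(b * G[:, j] for j, b in causal.items()) + rng.normal(size=n)

selected = stepwise_conditional(G, y)
print("signals retained, in order of discovery:", selected)
```

In this toy setting the two simulated causal sites are far apart and only weakly linked, so conditioning should behave well. Moving the causal sites closer together or raising `copy_prob` is one way to explore the interference among linked sites that the abstract warns about.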