RT Journal Article
SR Electronic
T1 EigenGWAS: finding loci under selection through genome-wide association studies of eigenvectors in structured populations
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 023457
DO 10.1101/023457
A1 Guo-Bo Chen
A1 Sang Hong Lee
A1 Zhi-Xiang Zhu
A1 Beben Benyamin
A1 Matthew R. Robinson
YR 2015
UL http://biorxiv.org/content/early/2015/11/17/023457.abstract
AB We apply the statistical framework for genome-wide association studies (GWAS) to eigenvector decomposition (EigenGWAS), which is commonly used in population genetics to characterise the structure of genetic data. The approach does not require discrete sub-populations, so it can be used with any genetic data where the underlying population structure is unknown, or where the interest is in assessing divergence along a gradient. Through theory and a simulation study, we show that our approach can identify regions under selection along gradients of ancestry. In real data, we confirm this by demonstrating that LCT is under selection between the HapMap CEU-TSI cohorts, and we validate this selection signal across European countries in the POPRES samples. HERC2 was also found to be differentiated both between the CEU-TSI cohorts and within the POPRES sample, reflecting the likely anthropological differences in skin and hair colour between northern and southern European populations. Controlling for population stratification is of great importance in any quantitative genetic study, and our approach also provides a simple, fast, and accurate way of predicting principal components in independent samples. With ever-increasing sample sizes across many fields, this approach is likely to be widely used to obtain individual-level eigenvectors while avoiding the computational challenges associated with conducting singular value decomposition in large datasets. We have developed freely available software to facilitate the application of the methods.
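
The core idea in the abstract, treating an eigenvector of the genetic data as the phenotype in a single-marker association scan, can be illustrated with a minimal NumPy sketch. This is not the authors' software: the function name `eigengwas_scan`, the per-SNP standardisation, and the use of a squared t-statistic as the test statistic are assumptions made here for illustration, and any further corrections the paper may apply are not shown.

```python
import numpy as np


def eigengwas_scan(genotypes, component=0):
    """Hypothetical helper: regress one eigenvector on each SNP in turn.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Returns (chi2, eigvec); chi2 is NaN for monomorphic SNPs.
    """
    G = np.asarray(genotypes, dtype=float)
    n, m = G.shape

    # Standardise each SNP to mean 0 and variance 1; skip monomorphic SNPs.
    mean = G.mean(axis=0)
    sd = G.std(axis=0)
    keep = sd > 0
    Z = (G[:, keep] - mean[keep]) / sd[keep]

    # Genetic relationship matrix and its eigen-decomposition.
    A = Z @ Z.T / Z.shape[1]
    evals, evecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]       # largest eigenvalue first
    y = evecs[:, order[component]]
    y = (y - y.mean()) / y.std()          # eigenvector used as the "phenotype"

    # With Z and y both standardised, the single-SNP regression slope
    # equals the Pearson correlation r between SNP and eigenvector.
    r = Z.T @ y / n
    stat = (n - 2) * r**2 / np.maximum(1.0 - r**2, 1e-12)  # squared t-statistic

    chi2 = np.full(m, np.nan)
    chi2[keep] = stat
    return chi2, y
```

Under these assumptions, SNPs whose allele frequencies change most strongly along the chosen eigenvector receive the largest statistics, which is the kind of signal the abstract reports for LCT and HERC2.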