TY - JOUR T1 - Comparison of one-stage and two-stage genome-wide association studies JF - bioRxiv DO - 10.1101/099291 SP - 099291 AU - Shang Xue AU - Funda Ogut AU - Zachary Miller AU - Janu Verma AU - Peter J. Bradbury AU - James B. Holland Y1 - 2017/01/01 UR - http://biorxiv.org/content/early/2017/01/09/099291.abstract N2 - Linear mixed models are widely used in humans, animals, and plants to conduct genome-wide association studies (GWAS). A characteristic of experimental designs for plants is that experimental units are typically multiple-plant plots of families or lines that are replicated across environments. This structure can present computational challenges to conducting a genome scan on raw (plot-level) data. Two-stage methods have been proposed to reduce the complexity and increase the computational speed of whole-genome scans. The first stage of the analysis fits raw data to a model including environment and line effects, but no individual marker effects. The second stage involves the whole genome scan of marker tests using summary values for each line as the dependent variable. Missing data and unbalanced experimental designs can result in biased estimates of marker association effects from two-stage analyses. In this study, we developed a weighted two-stage analysis to reduce bias and improve power of GWAS while maintaining the computational efficiency of two-stage analyses. Simulation based on real marker data of a diverse panel of maize inbred lines was used to compare power and false discovery rate of the new weighted two-stage method to single-stage and other two-stage analyses and to compare different two-stage models. In the case of severely unbalanced data, only the weighted two-stage GWAS has power and false discovery rate similar to the one-stage analysis. The weighted GWAS method has been implemented in the open-source software TASSEL. ER -