TY  - JOUR
T1  - Powerful tests for multi-marker association analysis using ensemble learning
JF  - bioRxiv
DO  - 10.1101/005405
SP  - 005405
AU  - Badri Padhukasahasram
Y1  - 2014/01/01
UR  - http://biorxiv.org/content/early/2014/05/23/005405.abstract
N2  - Multi-marker approaches are gaining considerable interest in genome-wide association studies and can enhance power to detect new associations under certain conditions. Gene- and pathway-based association tests are increasingly viewed as useful complements to the more widely used single-marker association analyses, which have successfully uncovered numerous disease variants. A major drawback of single-marker methods is that they do not consider pairwise and higher-order interactions between variants. Here, we describe multivariate methods for gene- and pathway-based association analyses using phenotype predictions based on machine learning algorithms. Instead of utilizing only a linear or logistic regression model, we propose the use of ensembles of diverse machine learning algorithms for testing multivariate associations. As the true mathematical relationship between a phenotype and any group of genetic and clinical variables is unknown in advance and may be complex, such a strategy gives us a general and flexible framework to approximate this relationship across different sets of SNPs. We show how phenotype prediction based on our method can be used for constructing tests for SNP-set association analysis. We first apply our method to simulated datasets to demonstrate its power and correctness. Then, we apply our method to previously studied asthma-related genes in two independent asthma cohorts to conduct association tests.
ER  - 