Abstract
Microarray and RNA-sequencing technologies have enabled rapid quantification of transcriptomes in large numbers of samples. Although dimension reduction methods are commonly applied to transcriptome datasets to visualize and interpret variation among samples, the results can be obscured by confounding factors, either biological or technical. In this study, we propose a Principal Component Analysis-based approach to Adjust for Confounding variation (AC-PCA). We show that AC-PCA can adjust for variation across individual donors in a human brain exon array dataset. Our approach recovers the anatomical structure of neocortex regions, including the frontal-temporal and dorsal-ventral axes, and reveals temporal dynamics of the interregional variation, mimicking the “hourglass” pattern of spatiotemporal dynamics. For gene selection, we extend AC-PCA with sparsity constraints and propose and implement an efficient algorithm. The top genes selected by this algorithm show frontal/temporal and dorsal/ventral expression gradients and strong functional conservation.