Abstract
Data normalization is a crucial step in gene expression analysis, as it determines the validity of downstream analyses. Although many metrics have been designed to evaluate the relative performance of normalization methods, the results obtained with different metrics have not been consistent. Building on previous work, we designed a new metric named Area Under normalized CV threshold Curve (AUCVC) to evaluate 13 commonly used normalization methods, and achieved consistent evaluation results using both bulk RNA-seq and scRNA-seq data from the same library construction protocol. These gene expression data, normalization methods and evaluation metrics have been included in an R package named NormExpression. NormExpression provides a framework and a fast, simple way for researchers to evaluate and select normalization methods, particularly data-driven methods or their own methods.