PT - JOURNAL ARTICLE
AU - Sebastian Matuszewski
AU - Marcel E. Hildebrandt
AU - Ana-Hermina Ghenu
AU - Jeffrey D. Jensen
AU - Claudia Bank
TI - A statistical guide to the design of deep mutational scanning experiments
AID - 10.1101/048892
DP - 2016 Jan 01
TA - bioRxiv
PG - 048892
4099 - http://biorxiv.org/content/early/2016/06/29/048892.short
4100 - http://biorxiv.org/content/early/2016/06/29/048892.full
AB - The characterization of the distribution of mutational effects is a key goal in evolutionary biology. Recently developed deep-sequencing approaches allow for accurate and simultaneous estimation of the fitness effects of hundreds of engineered mutations by monitoring their relative abundance across time points in a single bulk competition. Naturally, the achievable resolution of the estimated fitness effects depends on the specific experimental setup, the organism and type of mutations studied, and the sequencing technology utilized, among other factors. By means of analytical approximations and simulations, we provide guidelines for optimizing time-sampled deep-sequencing bulk competition experiments, focusing on the number of mutants, the sequencing depth, and the number of sampled time points. Our analytical results show that sampling more time points together with extending the duration of the experiment improves the achievable precision disproportionately as compared with increasing the sequencing depth, or reducing the number of competing mutants. Even if the duration of the experiment is fixed, sampling more time points and clustering these at the beginning and the end of the experiment increases experimental power, and allows for efficient and precise assessment of the entire range of selection coefficients. Finally, we provide a formula for calculating the 95%-confidence interval for the measurement error estimate, which we implement as an interactive web tool.
This allows for quantification of the maximum expected a priori precision of the experimental setup, as well as for a statistical threshold for determining deviations from neutrality for specific selection coefficient estimates.