TY  - JOUR
T1  - Longitudinal samples of bacterial genomes potentially bias evolutionary analyses
JF  - bioRxiv
DO  - 10.1101/103465
SP  - 103465
AU  - B.J. Arnold
AU  - W.P. Hanage
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/01/28/103465.abstract
N2  - Samples of bacteria collected over a period of time are attractive for several reasons, including the ability to estimate the molecular clock rate and to detect fluctuations in allele frequencies over time. However, longitudinal datasets are occasionally used in analyses that assume samples were collected contemporaneously. Using both simulations and genomic data from Neisseria gonorrhoeae, Streptococcus mutans, Campylobacter jejuni, and Helicobacter pylori, we show that longitudinal samples (spanning more than a decade in real data) may suffer from considerable bias that inflates estimates of recombination and the number of rare mutations in a sample of genomic sequences. While longitudinal data are frequently accounted for using the serial coalescent, many studies use other programs or metrics, such as Tajima’s D, that are sensitive to these sampling biases and contain genomic data collected across many years. Notably, longitudinal samples from a population of constant size may exhibit evidence of exponential growth. We suggest that population genomic studies of bacteria should routinely account for temporal diversity in samples or provide evidence that longitudinal sampling bias does not affect conclusions.
ER  - 