TY  - JOUR
T1  - Shared genomic variants: identification of transmission routes using pathogen deep sequence data
JF  - bioRxiv
DO  - 10.1101/032458
SP  - 032458
AU  - Colin J. Worby
AU  - Marc Lipsitch
AU  - William P. Hanage
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/08/01/032458.abstract
N2  - Sequencing pathogen samples during a communicable disease outbreak is becoming an increasingly common procedure in epidemiological investigations. Identifying who infected whom sheds considerable light on transmission patterns, high-risk settings and subpopulations, and infection control effectiveness. Genomic data shed new light on transmission dynamics, and can be used to identify clusters of individuals likely to be linked by direct transmission. However, identification of individual routes of infection via single genome samples typically remains uncertain. Here, we investigate the potential of deep sequence data to provide greater resolution on transmission routes, via the identification of shared genomic variants. We assess several easily implemented methods to identify transmission routes using both shared variants and genetic distance, demonstrating that shared variants can provide considerable additional information in most scenarios. While shared variant approaches identify relatively few links in the presence of a small transmission bottleneck, these links are identified with high confidence. Furthermore, we propose a hybrid approach that additionally incorporates phylogenetic distance to provide greater resolution. We apply our methods to data collected during the 2014 Ebola outbreak, identifying several likely routes of transmission. Our study highlights the power of pathogen deep sequence data as a component of outbreak investigation and epidemiological analyses.
ER  - 