RT Journal Article
SR Electronic
T1 EcOH: In silico serotyping of E. coli from short read data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 032151
DO 10.1101/032151
A1 Danielle J. Ingle
A1 Mary Valcanis
A1 Alex Kuzevski
A1 Marija Tauschek
A1 Michael Inouye
A1 Tim Stinear
A1 Myron M. Levine
A1 Roy M. Robins-Browne
A1 Kathryn E. Holt
YR 2015
UL http://biorxiv.org/content/early/2015/11/18/032151.abstract
AB The lipopolysaccharide (O) and flagellar (H) surface antigens of Escherichia coli are targets for serotyping that have traditionally been used to identify pathogenic lineages of E. coli. As serotyping has several limitations, public health reference laboratories are increasingly moving towards whole genome sequencing (WGS) for the rapid characterisation of bacterial isolates. Here we present a method to rapidly and accurately serotype E. coli isolates from raw, short read sequence data, leveraging the known genetic basis for the biosynthesis of O- and H-antigens. Our approach bypasses the need for de novo genome assembly by directly screening WGS reads against a curated database of alleles linked to known E. coli O-groups and H-types (the EcOH database) using the software package SRST2. We validated our approach by comparing in silico results with those obtained via serological phenotyping of 197 enteropathogenic (EPEC) isolates. We also demonstrated the utility of our method to characterise enterotoxigenic E. coli (ETEC) and the uropathogenic E. coli (UPEC) epidemic clone ST131, and for in silico serotyping of foodborne outbreak-related isolates in the public GenomeTrakr database.