RT Journal Article
SR Electronic
T1 EcOH: In silico serotyping of E. coli from short read data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 032151
DO 10.1101/032151
A1 Danielle J. Ingle
A1 Mary Valcanis
A1 Alex Kuzevski
A1 Marija Tauschek
A1 Michael Inouye
A1 Tim Stinear
A1 Myron M. Levine
A1 Roy M. Robins-Browne
A1 Kathryn E. Holt
YR 2015
UL http://biorxiv.org/content/early/2015/11/18/032151.abstract
AB The lipopolysaccharide (O) and flagellar (H) surface antigens of Escherichia coli are targets for serotyping that have traditionally been used to identify pathogenic lineages of E. coli. As serotyping has several limitations, public health reference laboratories are increasingly moving towards whole genome sequencing (WGS) for the rapid characterisation of bacterial isolates. Here we present a method to rapidly and accurately serotype E. coli isolates from raw, short read sequence data, leveraging the known genetic basis for the biosynthesis of O- and H-antigens. Our approach bypasses the need for de novo genome assembly by directly screening WGS reads against a curated database of alleles linked to known E. coli O-groups and H-types (the EcOH database) using the software package SRST2. We validated our approach by comparing in silico results with those obtained via serological phenotyping of 197 enteropathogenic (EPEC) isolates. We also demonstrated the utility of our method to characterise enterotoxigenic E. coli (ETEC) and the uropathogenic E. coli (UPEC) epidemic clone ST131, and for in silico serotyping of foodborne outbreak-related isolates in the public GenomeTrakr database.