Abstract
Rationale: Sepsis is a life-threatening condition with high mortality and high treatment costs. To improve short- and long-term outcomes, it is critical to detect at-risk sepsis patients at an early stage.
Objective: Our primary goal was to develop machine learning models capable of predicting sepsis in real time from streaming physiological data.
Methods: In this IRB-approved retrospective observational cohort study, we analyzed high-frequency physiological data from 1,161 critically ill patients admitted to the intensive care unit (ICU). Of these, 634 patients were identified as having developed sepsis. In this paper, we define sepsis as meeting the Systemic Inflammatory Response Syndrome (SIRS) criteria in the presence of suspected infection. In addition to the physiological data, we included white blood cell count (WBC) to develop a model that can signal the future occurrence of sepsis. A random forest classifier was trained to discriminate between sepsis and non-sepsis patients using a total of 108 features extracted from 2-hour moving time windows. The models were trained on 80% of the patients and tested on the remaining 20%, for two observational periods of 3 and 6 hours.
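The windowing and patient-level split described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `window_features` and `split_patients`, the 30-minute step, the four summary statistics, and the random seed are all assumptions, since the abstract does not list the 108 features or say how the 80/20 split was drawn.

```python
import random
from statistics import mean, stdev


def window_features(series, window_min=120, step_min=30):
    """Summarize one physiological signal over moving time windows.

    `series` is a list of (minute, value) pairs sorted by time.
    The statistics below are placeholders for illustration only;
    the actual 108 features are not specified in the abstract.
    """
    if not series:
        return []
    rows = []
    last = series[-1][0]
    end = series[0][0] + window_min
    while end <= last:
        # Keep samples that fall inside the half-open window [end - window, end).
        vals = [v for t, v in series if end - window_min <= t < end]
        if len(vals) >= 2:
            rows.append({
                "window_end": end,
                "mean": mean(vals),
                "std": stdev(vals),
                "min": min(vals),
                "max": max(vals),
            })
        end += step_min
    return rows


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Split by patient, so no patient contributes windows to both sets."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    rng.shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return set(ids[:cut]), set(ids[cut:])


# Synthetic heart-rate trace sampled once per minute for four hours.
heart_rate = [(m, 80 + m % 5) for m in range(0, 241)]
features = window_features(heart_rate)

# 1,161 patients, as in the cohort size reported above.
train_ids, test_ids = split_patients(range(1161))
```

Splitting on patient identifiers rather than on individual windows keeps overlapping windows from the same patient out of both the training and test sets, which would otherwise inflate test performance.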
Results: For the 3- and 6-hour observational periods, respectively, the models achieved F1 scores of 75% and 69% half an hour before sepsis onset and 79% and 76% ten minutes before sepsis onset. On average, the models predicted sepsis 210 minutes (3.5 hours) before onset.
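For readers less familiar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it from binary labels; the label vectors are made-up values for illustration and are not study data.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 = sepsis and 0 = non-sepsis."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only (not from the study).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```

Because F1 ignores true negatives, it reflects how well the sepsis class itself is identified rather than overall agreement across both classes.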
Conclusions: The use of robust machine learning algorithms, continuous streams of physiological data, and WBC allows for early identification of at-risk patients in real time with high accuracy.
Footnotes
Copyright form disclosure: Dr. Davis received funding from GlaxoSmithKline. The remaining authors have disclosed that they do not have any potential conflicts of interest.