Development and validation of a quick, automated, and reproducible ATR FT-IR spectroscopy machine-learning model for Klebsiella pneumoniae typing

J Clin Microbiol. 2024 Feb 14;62(2):e0121123. doi: 10.1128/jcm.01211-23. Epub 2024 Jan 29.

Abstract

The reliability of Fourier-transform infrared (FT-IR) spectroscopy for Klebsiella pneumoniae typing and outbreak control has been previously assessed, but issues remain in standardization and reproducibility. We developed and validated a reproducible FT-IR with attenuated total reflectance (ATR) workflow for the identification of K. pneumoniae lineages. We used 293 isolates representing multidrug-resistant K. pneumoniae lineages causing outbreaks worldwide (2002-2021) to train a random forest (RF) classification model based on capsular (KL)-type discrimination. This model was validated with 280 contemporaneous isolates (2021-2022), using wzi sequencing and whole-genome sequencing as references. Repeatability and reproducibility were tested in different culture media and on different instruments over time. Our RF model allowed the classification of 33 KL-types and up to 36 clinically relevant K. pneumoniae lineages based on the discrimination of specific KL- and O-type combinations. We obtained high rates of accuracy (89%), sensitivity (88%), and specificity (92%), including from cultures obtained directly from the clinical sample, so that typing information can be obtained on the same day the bacteria are identified. The workflow was reproducible on different instruments over time (>98% correct predictions). Direct colony application, spectral acquisition, and automated KL prediction through Clover MS Data analysis software allow a short time-to-result (5 min/isolate). We demonstrated that FT-IR ATR spectroscopy provides meaningful, reproducible, and accurate information at a very early stage (as soon as the bacteria are identified) to support infection control and public health surveillance. The high robustness, together with automated and flexible workflows for data analysis, provides opportunities to consolidate real-time applications at a global level.
IMPORTANCE We created and validated a simple, automated workflow for the identification of clinically relevant Klebsiella pneumoniae lineages by FT-IR spectroscopy and machine learning, a method that can provide quick and reliable typing information to support real-time decisions on outbreak management and infection control. This method and workflow are of interest to support clinical microbiology diagnostics and to aid public health surveillance.
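
As an illustration of how the accuracy, sensitivity, and specificity figures quoted above can be derived for a multi-class typing model, the following minimal sketch compares predicted KL-types against reference KL-types (for example, from wzi sequencing) and averages one-vs-rest sensitivity and specificity across classes. The abstract does not state how the authors aggregated these metrics, so the macro-averaging used here is an assumption, the function name `one_vs_rest_metrics` is hypothetical, and the labels are toy data rather than study results.

```python
def one_vs_rest_metrics(reference, predicted):
    """Return accuracy plus macro-averaged one-vs-rest sensitivity and specificity.

    Assumption: metrics are averaged with equal weight per reference class;
    the abstract does not specify the aggregation the authors used.
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("reference and predicted must be non-empty and equal in length")

    pairs = list(zip(reference, predicted))
    n = len(pairs)
    accuracy = sum(r == p for r, p in pairs) / n

    sensitivities, specificities = [], []
    # Only classes present in the reference set have a defined sensitivity.
    for label in sorted(set(reference)):
        tp = sum(r == label and p == label for r, p in pairs)
        fn = sum(r == label and p != label for r, p in pairs)
        fp = sum(r != label and p == label for r, p in pairs)
        tn = n - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        if tn + fp > 0:  # undefined when every isolate belongs to this class
            specificities.append(tn / (tn + fp))

    return {
        "accuracy": accuracy,
        "sensitivity": sum(sensitivities) / len(sensitivities),
        "specificity": sum(specificities) / len(specificities) if specificities else float("nan"),
    }


# Toy example (illustrative labels only, not data from the study).
reference_kl = ["KL2", "KL2", "KL17", "KL17", "KL107", "KL107"]
predicted_kl = ["KL2", "KL2", "KL17", "KL107", "KL107", "KL107"]
metrics = one_vs_rest_metrics(reference_kl, predicted_kl)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this toy case one KL17 isolate is misassigned to KL107, which lowers the sensitivity for KL17 and the specificity for KL107 while leaving KL2 unaffected. This shows why the three summary figures can differ from one another.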

Keywords: Fourier-transform infrared spectroscopy; KL-type; attenuated total reflectance; bacteria; classification model; infection control; machine-learning; nosocomial; outbreak; random forest; typing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Ataxia Telangiectasia Mutated Proteins
  • Bacteria*
  • Humans
  • Klebsiella pneumoniae* / genetics
  • Reproducibility of Results
  • Spectroscopy, Fourier Transform Infrared / methods
  • Whole Genome Sequencing

Substances

  • ATR protein, human
  • Ataxia Telangiectasia Mutated Proteins