Data-driven methodology to predict the ICU length of stay: A multicentre study of 99,492 admissions in 109 Brazilian units

Anaesth Crit Care Pain Med. 2022 Dec;41(6):101142. doi: 10.1016/j.accpm.2022.101142. Epub 2022 Aug 18.

Abstract

Purpose: The length of stay (LoS) is one of the most commonly used metrics of resource use in Intensive Care Units (ICU). We propose a structured data-driven methodology to predict the ICU length of stay and the risk of prolonged stay, and apply it to a large multicentre Brazilian ICU database.

Methods: Demographic data, comorbidities, complications, laboratory data, and primary and secondary diagnoses were prospectively collected and retrospectively analysed using a data-driven methodology that includes eight different machine learning models and a stacking model. The study setting included 109 mixed-type ICUs from 38 Brazilian hospitals, and external validation was performed in 93 medical-surgical ICUs from 55 hospitals in Brazil.
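To illustrate what a stacking model looks like in practice, the following is a minimal sketch, not the authors' implementation. It assumes scikit-learn (the abstract does not name the software used), uses synthetic stand-in data rather than the study data, and assumes a Random Forest and a Linear Regression as base learners with a linear meta-learner; the abstract reports only that the best stacking model combined these two methods, not how they were arranged. All variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study data): 500 admissions, 5 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))

# Hypothetical length of stay in days, clipped so it is never negative.
noise = rng.normal(scale=1.0, size=500)
y = np.clip(3.0 + 1.5 * X[:, 0] + 2.0 * np.abs(X[:, 1]) + noise, 0.0, None)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumed arrangement: two base learners whose out-of-fold predictions
# (5-fold cross-validation) are combined by a linear meta-learner.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("lr", LinearRegression()),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

# Predicted length of stay for the held-out admissions.
pred = stack.predict(X_test)
```

The meta-learner is trained on out-of-fold predictions of the base learners, which limits the leakage that would occur if it saw predictions made on their own training data.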

Results: A cohort of 99,492 adult ICU admissions from the 1st of January to the 31st of December 2019 was included. The stacking model combining Random Forests and Linear Regression gave the best results for predicting ICU length of stay (RMSE = 3.82; MAE = 2.52; R² = 0.36). The prediction model for the risk of long stay was accurate in the early identification of prolonged-stay patients (Brier Score = 0.04, AUC = 0.87, PPV = 0.83, NPV = 0.95).
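For readers less familiar with the reported metrics, the sketch below computes each of them from their standard definitions in plain Python. The toy values, the function names and the 0.5 classification threshold are assumptions made for illustration; the abstract does not state the threshold used, and none of the numbers below come from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for predicted vs. observed values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def classification_metrics(labels, probs, threshold=0.5):
    """Return Brier score, AUC, PPV and NPV for binary outcomes."""
    n = len(labels)
    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - y) ** 2 for y, p in zip(labels, probs)) / n

    # AUC: share of (positive, negative) pairs ranked correctly, ties count 0.5.
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    # Confusion-matrix counts at the assumed threshold.
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, d in zip(labels, preds) if y == 1 and d == 1)
    fp = sum(1 for y, d in zip(labels, preds) if y == 0 and d == 1)
    tn = sum(1 for y, d in zip(labels, preds) if y == 0 and d == 0)
    fn = sum(1 for y, d in zip(labels, preds) if y == 1 and d == 0)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {"brier": brier, "auc": auc, "ppv": ppv, "npv": npv}


# Toy length-of-stay values in days (illustrative only).
reg = regression_metrics([2, 4, 6, 8], [3, 4, 5, 10])

# Toy prolonged-stay labels and predicted probabilities (illustrative only).
clf = classification_metrics([0, 0, 1, 1], [0.1, 0.6, 0.4, 0.9])
```

RMSE and MAE are in days, so an MAE of 2.52 means predictions were off by about two and a half days on average. PPV and NPV depend on both the chosen threshold and the prevalence of prolonged stay in the cohort.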

Conclusion: The data-driven methodology to predict ICU length of stay and the risk of long stay proved accurate in a large multicentre cohort of general ICU patients. The proposed models can help predict individual length of stay and identify patients at high risk of prolonged stay early.

Keywords: Intensive care unit; Length of stay; Machine learning; Prediction model; Resource use.

Publication types

  • Multicenter Study

MeSH terms

  • Adult
  • Brazil
  • Critical Care*
  • Humans
  • Intensive Care Units*
  • Length of Stay
  • Retrospective Studies