Cohort profile for development of machine learning models to predict healthcare-related adverse events (Demeter): clinical objectives, data requirements for modelling and overview of data set for 2016-2018

BMJ Open. 2023 Aug 17;13(8):e070929. doi: 10.1136/bmjopen-2022-070929.

Abstract

Purpose: In-hospital healthcare-related adverse events (HAEs) are a major concern for hospitals worldwide. In high-income countries, approximately 1 in 10 patients experience an HAE associated with their hospital stay. Estimating the risk of an HAE at the individual patient level as accurately as possible is one of the first steps towards improving patient outcomes. Risk assessment can enable healthcare providers to target resources to patients in greatest need by adapting processes and procedures. Electronic health data facilitate the application of machine-learning methods for risk analysis. We aim, first, to reveal correlations between HAE occurrence and patients' characteristics and/or the procedures they undergo during hospitalisation and, second, to build models that allow the early identification of patients at elevated risk of HAE.

Participants: 143 865 adult patients hospitalised at Grenoble Alpes University Hospital (France) between 1 January 2016 and 31 December 2018.

Findings to date: In this set-up phase of the project, we describe the preconditions for big data analysis using machine-learning methods. We present an overview of the retrospective, de-identified, multisource data for 2016-2018 extracted from the hospital's Clinical Data Warehouse, along with social determinants of health data from the National Institute of Statistics and Economic Studies, to be used for training and validating machine-learning (artificial intelligence) models. The information system will not require any supplementary information or evaluation from medical staff for risk assessment.

Future plans: We are using this data set to develop predictive models for several general HAEs, including secondary intensive care admission, prolonged hospital stay, 7-day and 30-day re-hospitalisation, nosocomial bacterial infection, hospital-acquired venous thromboembolism and in-hospital mortality.

Keywords: PREVENTIVE MEDICINE; REGISTRIES; Risk management; STATISTICS & RESEARCH METHODS.

Publication types

  • Review

MeSH terms

  • Cohort Studies
  • Computer Simulation*
  • Datasets as Topic
  • Female
  • Humans
  • Iatrogenic Disease*
  • Length of Stay*
  • Machine Learning*
  • Male
  • Risk Assessment