MedalCare-XL: 16,900 healthy and pathological synthetic 12 lead ECGs from electrophysiological simulations

Sci Data. 2023 Aug 8;10(1):531. doi: 10.1038/s41597-023-02416-4.

Abstract

Mechanistic cardiac electrophysiology models allow for personalized simulations of the electrical activity in the heart and the ensuing electrocardiogram (ECG) on the body surface. As such, synthetic signals possess known ground-truth labels of the underlying disease and can be employed for validation of machine learning ECG analysis tools in addition to clinical signals. Recently, synthetic ECGs were used to enrich sparse clinical data or even replace them completely during training, leading to improved performance on real-world clinical test data. We thus generated a novel synthetic database comprising a total of 16,900 12-lead ECGs based on electrophysiological simulations equally distributed into healthy control and 7 pathology classes. The pathological case of myocardial infarction had 6 sub-classes. A comparison of extracted features between the virtual cohort and a publicly available clinical ECG database demonstrated that the synthetic signals represent clinical ECGs for healthy and pathological subpopulations with high fidelity. The ECG database is split into training, validation, and test folds for development and objective assessment of novel machine learning algorithms.
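
The class counts quoted in the abstract can be checked with a short calculation. The sketch below assumes one reading of "equally distributed" that the abstract does not spell out: the healthy control group, the six pathology classes other than myocardial infarction, and the six infarction sub-classes each hold the same number of signals. The variable names are illustrative and not taken from the dataset.

```python
# Figures quoted in the abstract.
TOTAL_ECGS = 16_900
HEALTHY_GROUPS = 1
PATHOLOGY_CLASSES = 7   # includes myocardial infarction
MI_SUBCLASSES = 6

# Assumption: the infarction class is counted through its 6 sub-classes,
# so it is removed from the 7 pathology classes before adding them.
groups = HEALTHY_GROUPS + (PATHOLOGY_CLASSES - 1) + MI_SUBCLASSES

# Equal distribution implies the total divides evenly across the groups.
per_group, remainder = divmod(TOTAL_ECGS, groups)

print(f"{groups} groups, {per_group} ECGs per group, remainder {remainder}")
```

Under this reading the total divides evenly, whereas dividing 16,900 by 8 (healthy plus 7 pathology classes, ignoring the sub-classes) does not give a whole number.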

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Electrocardiography*
  • Heart*
  • Humans
  • Machine Learning
  • Myocardium