The Importance of Real-World Validation of Machine Learning Systems in Wearable Exercise Biofeedback Platforms: A Case Study

Sensors (Basel). 2021 Mar 27;21(7):2346. doi: 10.3390/s21072346.

Abstract

Machine learning models are being utilized to provide wearable sensor-based exercise biofeedback to patients undertaking physical therapy. However, most systems are validated at a technical level using lab-based cross validation approaches. These results do not necessarily reflect the performance levels that patients and clinicians can expect in the real-world environment. This study aimed to conduct a thorough evaluation of an example wearable exercise biofeedback system from laboratory testing through to clinical validation in the target setting, illustrating the importance of context when validating such systems. Each of the various components of the system were evaluated independently, and then in combination as the system is designed to be deployed. The results show a reduction in overall system accuracy between lab-based cross validation (>94%), testing on healthy participants (n = 10) in the target setting (>75%), through to test data collected from the clinical cohort (n = 11) (>59%). This study illustrates that the reliance on lab-based validation approaches may be misleading key stakeholders in the inertial sensor-based exercise biofeedback sector, makes recommendations for clinicians, developers and researchers, and discusses factors that may influence system performance at each stage of evaluation.

Keywords: biofeedback; biomedical technology; exercise therapy; human factors; inertial measurement unit; machine learning; wearables.

MeSH terms

  • Biofeedback, Psychology
  • Exercise
  • Healthy Volunteers
  • Humans
  • Machine Learning
  • Wearable Electronic Devices*