Leveraging Unlabelled Data in Multiple-Instance Learning Problems for Improved Detection of Parkinsonian Tremor in Free-Living Conditions

IEEE J Biomed Health Inform. 2023 Jul;27(7):3569-3578. doi: 10.1109/JBHI.2023.3267095. Epub 2023 Jun 30.

Abstract

Data-driven approaches for remote detection of Parkinson's Disease (PD) and its motor symptoms have proliferated in recent years, owing to the potential clinical benefits of early diagnosis. The holy grail of such approaches is the free-living scenario, in which data are collected continuously and unobtrusively during everyday life. However, obtaining fine-grained ground truth while remaining unobtrusive is contradictory, so the problem is usually addressed via multiple-instance learning. Yet for large-scale studies, obtaining even the necessary coarse ground truth is not trivial, as a complete neurological evaluation is required. In contrast, large-scale collection of data without any ground truth is much easier. Nevertheless, utilizing unlabelled data in a multiple-instance setting is not straightforward, as the topic has received very little research attention. Here we try to fill this gap by introducing a new method for combining semi-supervised with multiple-instance learning. Our approach builds on the Virtual Adversarial Training principle, a state-of-the-art approach for regular semi-supervised learning, which we adapt and modify appropriately for the multiple-instance setting. We first establish the validity of the proposed approach through proof-of-concept experiments on synthetic problems generated from two well-known benchmark datasets. We then move on to the actual task of detecting PD tremor from hand acceleration signals collected in the wild, in the presence of additional, completely unlabelled data. We show that by leveraging the unlabelled data of 454 subjects we can achieve large performance gains (up to 9% increase in F1-score) in per-subject tremor detection for a cohort of 45 subjects with known tremor ground truth. In doing so, we confirm the validity of our approach on a real-world problem where the need for semi-supervised and multiple-instance learning arises naturally.
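
To make the multiple-instance and semi-supervised ingredients concrete, the following is a minimal toy sketch, not the authors' method. It assumes a linear-logistic instance scorer with max pooling over a bag (one common multiple-instance assumption) and a consistency penalty that needs no label. For simplicity the perturbation direction is random rather than adversarial, so this is a simplification of Virtual Adversarial Training, which would instead search for the direction that changes the prediction most. All names (`bag_probability`, `random_consistency_penalty`, `w`, `b`, `radius`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bag_probability(bag, w, b):
    """Bag-level probability of a toy model (an assumption, not the paper's).

    bag: array of shape (n_instances, n_features), e.g. signal windows.
    Each instance is scored by a logistic model and the bag takes the
    maximum instance score (max-pooling multiple-instance assumption).
    """
    instance_probs = sigmoid(bag @ w + b)
    return float(instance_probs.max())


def bernoulli_kl(p, q, eps=1e-8):
    """KL divergence between two Bernoulli distributions with means p and q."""
    p = np.clip(p, eps, 1.0 - eps)
    q = np.clip(q, eps, 1.0 - eps)
    return float(p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q)))


def random_consistency_penalty(bag, w, b, radius, rng):
    """Label-free penalty: how much the bag prediction moves under a
    perturbation of Frobenius norm `radius` applied to the whole bag.

    The direction is random here; VAT would use an adversarial direction.
    """
    noise = rng.standard_normal(bag.shape)
    noise *= radius / np.linalg.norm(noise)
    clean = bag_probability(bag, w, b)
    perturbed = bag_probability(bag + noise, w, b)
    return bernoulli_kl(clean, perturbed)


rng = np.random.default_rng(0)
unlabelled_bag = rng.standard_normal((5, 3))  # 5 instances, 3 features
w = np.array([1.0, -0.5, 0.2])
b = 0.0
penalty = random_consistency_penalty(unlabelled_bag, w, b, radius=0.5, rng=rng)
```

In a semi-supervised setup of this kind, a penalty like `penalty` would be averaged over unlabelled bags and added to the supervised loss computed on the labelled bags, so that subjects without any ground truth still shape the decision boundary.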

MeSH terms

  • Humans
  • Parkinson Disease* / diagnosis
  • Social Conditions
  • Supervised Machine Learning
  • Tremor* / diagnosis