Rapid health data repository allocation using predictive machine learning

Health Informatics J. 2020 Dec;26(4):3009-3036. doi: 10.1177/1460458220957486. Epub 2020 Sep 24.

Abstract

Health-related data is stored in a number of repositories that are managed and controlled by different entities. For instance, Electronic Health Records are usually administered by governments. Electronic Medical Records are typically controlled by health care providers, whereas Personal Health Records are managed directly by patients. Recently, Blockchain-based health record systems largely regulated by technology have emerged as another type of repository. Repositories for storing health data differ from one another based on cost, level of security and quality of performance. Not only has the type of repositories increased in recent years, but the quantum of health data to be stored has increased. For instance, the advent of wearable sensors that capture physiological signs has resulted in an exponential growth in digital health data. The increase in the types of repository and amount of data has driven a need for intelligent processes to select appropriate repositories as data is collected. However, the storage allocation decision is complex and nuanced. The challenges are exacerbated when health data are continuously streamed, as is the case with wearable sensors. Although patients are not always solely responsible for determining which repository should be used, they typically have some input into this decision. Patients can be expected to have idiosyncratic preferences regarding storage decisions depending on their unique contexts. In this paper, we propose a predictive model for the storage of health data that can meet patient needs and make storage decisions rapidly, in real-time, even with data streaming from wearable sensors. The model is built with a machine learning classifier that learns the mapping between characteristics of health data and features of storage repositories from a training set generated synthetically from correlations evident from small samples of experts. Results from the evaluation demonstrate the viability of the machine learning technique used.

Keywords: Big Health data; Blockchain; classifier; deep learning; digital health record storage; electronic health record; machine learning; quality of performance; security and privacy; stream data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Blockchain*
  • Electronic Health Records
  • Health Records, Personal*
  • Humans
  • Machine Learning
  • Technology