Analyzing data and data sources towards a unified approach for ensuring end-to-end data and data sources quality in healthcare 4.0

Comput Methods Programs Biomed. 2019 Nov:181:104967. doi: 10.1016/j.cmpb.2019.06.026. Epub 2019 Jun 29.

Abstract

Background and objective: Healthcare 4.0 is hailed as the current industrial revolution in the healthcare domain. It deals with billions of heterogeneous IoT data sources that are connected over the Internet and aim to provide real-time health-related information to citizens and patients. Obtaining reliable health data therefore requires an automated way to identify the quality levels of these data sources.

Methods: In this manuscript, we demonstrate an innovative mechanism for assessing the quality of datasets in correlation with the quality of their corresponding data sources. The mechanism follows a five-step approach in which the available data sources are detected, identified, and connected to health platforms, and their data is then gathered. Once the data is obtained, the mechanism cleans it and correlates it with the quality measurements captured from each data source, in order to decide whether that data source is of sufficient quality and, if so, to keep its data for further analysis.
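
The final steps described above (cleaning the gathered data, correlating its quality with the quality of the data source, and deciding whether to keep the source) can be illustrated with a minimal Python sketch. The abstract does not specify the metrics, the combination rule, or any threshold, so everything here is an assumption made for illustration: the Reading type, the range-based cleaning, the "fraction of readings surviving cleaning" data quality score, the unweighted mean, and the 0.7 threshold are all hypothetical, and the sample readings are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """One measurement gathered from a data source (hypothetical shape)."""
    value: Optional[float]  # None marks a missing measurement


def clean(readings, low, high):
    """Drop missing values and values outside an assumed plausible range."""
    return [r for r in readings if r.value is not None and low <= r.value <= high]


def data_quality(readings, cleaned):
    """Fraction of gathered readings that survive cleaning (assumed metric)."""
    return len(cleaned) / len(readings) if readings else 0.0


def keep_source(source_quality, dq, threshold=0.7):
    """Combine both scores and decide whether the source's data is kept.

    The unweighted mean and the 0.7 threshold are illustrative assumptions,
    not values taken from the paper.
    """
    combined = (source_quality + dq) / 2
    return combined >= threshold, combined


# Invented body-weight readings in kg, including one gap and one outlier.
readings = [Reading(71.2), Reading(None), Reading(70.9), Reading(710.0), Reading(71.0)]
cleaned = clean(readings, low=2.0, high=300.0)
dq = data_quality(readings, cleaned)
# source_quality=0.9 stands in for quality measurements captured from the device.
kept, combined = keep_source(source_quality=0.9, dq=dq)
print(f"data quality={dq:.2f}, combined={combined:.2f}, kept={kept}")
```

In this sketch a source is kept only when the combined score clears the threshold, so a reliable device that produces mostly unusable data, or clean data from an unreliable device, can still be rejected.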

Results: The proposed mechanism was evaluated in an experiment using a sample of 18 existing heterogeneous medical data sources. From the captured results, we identified a data source of unknown type as a body weight scale. We then determined that the API method responsible for gathering data from this data source was getMeasurements(). Finally, by combining the quality of the body weight scale with the quality of its derived data, we concluded that this data source was of sufficient quality.

Conclusions: By measuring and correlating both the quality of a data source itself and the quality of its derived data, the proposed mechanism provides efficient results and can ensure the end-to-end quality of both the data sources and their data.

Keywords: Data quality; Data sources quality; Healthcare 4.0; Internet of things; Quality assessment.

MeSH terms

  • Body Weight
  • Data Accuracy*
  • Data Analysis*
  • Data Collection
  • Decision Making
  • Delivery of Health Care
  • Female
  • Humans
  • Information Storage and Retrieval / standards*
  • Male
  • Medical Informatics / methods*
  • Observer Variation
  • Registries
  • Reproducibility of Results