Potential value and impact of data mining and machine learning in clinical diagnostics

Crit Rev Clin Lab Sci. 2021 Jun;58(4):275-296. doi: 10.1080/10408363.2020.1857681. Epub 2021 Mar 19.

Abstract

Data mining involves the use of mathematical sciences, statistics, artificial intelligence, and machine learning to determine the relationships between variables from a large sample of data. It has previously been shown that data mining can improve the prediction and diagnostic precision of type 2 diabetes mellitus. A few studies have applied machine learning to assess hypertension and metabolic syndrome-related biomarkers, as well as to refine the assessment of cardiovascular disease risk. Machine learning methods have also been applied to assess new biomarkers and survival outcomes in patients with renal diseases, in order to predict the development of chronic kidney disease, disease progression, and renal graft survival. In these studies, random forest methods were found to perform best for the prediction of chronic kidney disease. Some studies have used decision tree models to investigate the prognosis of nonalcoholic fatty liver disease and acute liver failure, as well as to predict therapy response in patients with viral disorders. Machine learning techniques, such as the Sparse High-Order Interaction Model with Rejection Option, have been used for diagnosing Alzheimer's disease. Data mining techniques have also been applied to identify the risk factors for serious mental illness, such as depression and dementia, and to help diagnose these conditions and predict the quality of life of affected patients. In relation to child health, some studies have determined the best algorithms for predicting obesity and malnutrition. Machine learning has also been used to identify important risk factors for preterm birth and low birth weight. Published studies of patients with cancer and bacterial diseases are limited, and these areas should perhaps be addressed more comprehensively in future studies.
Herein, we provide an in-depth review of studies in which biochemical biomarker data were analyzed using machine learning methods to assess the risk of several common diseases, in order to summarize the potential applications of data mining methods in clinical diagnosis. Data mining techniques are increasingly being applied to clinical diagnostics, and they have the potential to support this field.

Keywords: Data mining; decision tree; machine learning.

Publication types

  • Review

MeSH terms

  • Artificial Intelligence
  • Child
  • Data Mining
  • Diabetes Mellitus, Type 2*
  • Female
  • Humans
  • Infant, Newborn
  • Machine Learning
  • Pregnancy
  • Premature Birth*
  • Quality of Life