Data Mining for Cardiovascular Disease Prediction

J Med Syst. 2021 Jan 5;45(1):6. doi: 10.1007/s10916-020-01682-8.

Abstract

Cardiovascular diseases (CVDs) are disorders of the heart and blood vessels and are a major cause of disability and premature death worldwide. Individuals at higher risk of developing CVD must be identified at an early stage to prevent premature deaths. Advances in the field of computational intelligence, together with the vast amount of data produced daily in clinical settings, have made it possible to create recognition systems capable of identifying hidden patterns and useful information. This paper focuses on the application of Data Mining Techniques (DMTs) to clinical data collected during medical examinations in an attempt to predict whether or not an individual has a CVD. To this end, the CRoss-Industry Standard Process for Data Mining (CRISP-DM) methodology was followed, in which five classifiers were applied, namely Decision Tree (DT), Optimized DT, Rule Induction (RI), Random Forest (RF), and Deep Learning (DL). The models were mainly developed using the RapidMiner software with the assistance of the WEKA tool and were analyzed based on accuracy, precision, sensitivity, and specificity. The results obtained were considered promising in the search for effective means of diagnosing CVD, with the best model being Optimized DT, which achieved the highest values for all evaluation metrics: 73.54%, 75.82%, 68.89%, 78.16%, and 0.788 for accuracy, precision, sensitivity, specificity, and area under the curve (AUC), respectively.
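For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, sensitivity, and specificity are conventionally derived from a binary confusion matrix. It is not taken from the paper: the function name `classification_metrics` and the counts in the usage example are hypothetical and serve only to illustrate the standard definitions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: CVD cases correctly predicted as CVD (true positives)
    fp: healthy individuals wrongly predicted as CVD (false positives)
    tn: healthy individuals correctly predicted as healthy (true negatives)
    fn: CVD cases wrongly predicted as healthy (false negatives)
    """
    total = tp + fp + tn + fn
    return {
        # Share of all predictions that are correct
        "accuracy": (tp + tn) / total,
        # Share of predicted CVD cases that truly have CVD
        "precision": tp / (tp + fp),
        # Share of true CVD cases that the model detects (recall)
        "sensitivity": tp / (tp + fn),
        # Share of healthy individuals correctly ruled out
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only (not the study's data)
metrics = classification_metrics(tp=70, fp=20, tn=80, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a screening context, sensitivity and specificity are usually read together: a model with high specificity but lower sensitivity, as reported for the Optimized DT, rules out healthy individuals more reliably than it detects true cases.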

Keywords: CRISP-DM; Cardiovascular disease; Classification; Data mining; Decision support systems; Health information systems.

MeSH terms

  • Artificial Intelligence
  • Cardiovascular Diseases* / diagnosis
  • Cardiovascular Diseases* / epidemiology
  • Data Mining
  • Humans