Application of Data Mining Algorithms for Dementia in People with HIV/AIDS

Comput Math Methods Med. 2021 Jul 9:2021:4602465. doi: 10.1155/2021/4602465. eCollection 2021.

Abstract

Dementia interferes with an individual's motor, behavioural, and intellectual functions, leaving the person unable to perform instrumental activities of daily living. This study aimed to identify, through data mining, the best-performing algorithm and the most relevant characteristics for categorising individuals with HIV/AIDS at high risk of dementia. Principal component analysis (PCA) was applied, and the following machine learning algorithms were compared: logistic regression, decision tree, neural network, k-nearest neighbours (KNN), and random forest. The database was built from data collected from 270 individuals infected with HIV/AIDS who were followed up at the outpatient clinic of a reference hospital for infectious and parasitic diseases in the State of Ceará, Brazil, from January to April 2019. The performance of the algorithms was first analysed using all 104 characteristics available in the database; after dimensionality reduction, the quality of the machine learning algorithms improved during the tests, even though about 30% of the variation was lost. When only 23 characteristics were considered, the precision of the algorithms was 86% for random forest, 56% for logistic regression, 68% for decision tree, 60% for KNN, and 59% for neural network. The random forest algorithm proved more effective than the others, obtaining 84% precision and 86% accuracy.
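
The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. It is not the authors' code. The data are synthetic and match only the stated shape (270 individuals, 104 characteristics). The sketch assumes that the 23 characteristics correspond to 23 principal components, which the abstract does not state. The train/test split, the feature scaling step, and all hyperparameters are also assumptions, so the printed scores bear no relation to the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the study database (assumption): same shape only,
# 270 rows and 104 features, with a binary "high risk" label.
X, y = make_classification(
    n_samples=270, n_features=104, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The five classifier families named in the abstract; hyperparameters are
# illustrative defaults, not values taken from the paper.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "neural network": MLPClassifier(max_iter=500, random_state=0),
    "KNN": KNeighborsClassifier(),
    "random forest": RandomForestClassifier(random_state=0),
}


def evaluate(models, n_components=23):
    """Fit scale -> PCA -> classifier and score each model on the test split."""
    results = {}
    for name, model in models.items():
        pipe = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components, random_state=0),
            model,
        )
        pipe.fit(X_train, y_train)
        pred = pipe.predict(X_test)
        pca = pipe.named_steps["pca"]
        results[name] = {
            "precision": precision_score(y_test, pred, zero_division=0),
            "accuracy": accuracy_score(y_test, pred),
            # Fraction of training-set variance kept by the retained components.
            "explained_variance": float(pca.explained_variance_ratio_.sum()),
        }
    return results


results = evaluate(models)
for name, scores in results.items():
    print(
        f"{name}: precision={scores['precision']:.2f}, "
        f"accuracy={scores['accuracy']:.2f}, "
        f"variance kept={scores['explained_variance']:.2f}"
    )
```

Precision and accuracy are reported separately because they measure different things: precision is the share of individuals flagged as high risk who truly are, whereas accuracy is the share of all individuals classified correctly.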

MeSH terms

  • AIDS Dementia Complex / diagnosis*
  • AIDS Dementia Complex / epidemiology
  • AIDS Dementia Complex / etiology
  • Acquired Immunodeficiency Syndrome / complications*
  • Aged
  • Algorithms*
  • Brazil / epidemiology
  • Computational Biology
  • Data Mining / methods
  • Data Mining / statistics & numerical data
  • Databases, Factual
  • Decision Trees
  • Dementia / etiology*
  • Female
  • Follow-Up Studies
  • Humans
  • Logistic Models
  • Machine Learning
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Risk Factors