Discovery of Knowledge in the Incidence of a Type of Lung Cancer for Patients through Data Mining Models

Comput Intell Neurosci. 2022 May 31;2022:6058213. doi: 10.1155/2022/6058213. eCollection 2022.

Abstract

This paper presents research results on the contribution of user-centered data mining based on standard principles, focusing on the analysis of survival and mortality in lung cancer cases. The researchers used anonymized data from previously diagnosed cases in a health database to predict the condition of new patients who had not yet received their results. Medical professionals specializing in this field provided feedback on the usefulness of the new software, which was built using the WEKA data mining tools and the Naive Bayes method. The results provide elements for discussing the value of identifying or discovering relationships in apparently "hidden" information in order to propose strategies that counteract health problems or prevent future complications, and thus contribute to improving the quality of care and of life for the population. Data mining in the health area has shown applicability in the early detection and prevention of diseases and in the analysis of genetic markers to determine the probability of a satisfactory response to medical treatment. The most accurate model was Naive Bayes (91.1%). Its closest competitor, bagging, came second with 90.8%. The ZeroR algorithm had the lowest success rate, at 80%.
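As a rough illustration of why ZeroR serves as the floor in such a comparison, the sketch below contrasts a majority-class baseline with a small categorical Naive Bayes classifier. It is plain Python, not WEKA and not the authors' software. The feature names (smoker, stage), the ten toy records, and all function names are hypothetical, invented only for this example. Accuracy is measured on the training records for brevity, which is not the evaluation protocol of the study.

```python
from collections import Counter, defaultdict
import math


def zero_r_fit(labels):
    """ZeroR: ignore every feature and always predict the majority class."""
    return Counter(labels).most_common(1)[0][0]


def naive_bayes_fit(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def naive_bayes_predict(model, row):
    """Pick the class with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((count + 1) / (n + len(feature_values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy records: (smoker, stage at diagnosis) -> outcome.
rows = [
    ("yes", "late"), ("yes", "late"), ("yes", "early"), ("no", "late"),
    ("no", "early"), ("no", "early"), ("yes", "early"), ("no", "early"),
    ("no", "early"), ("no", "late"),
]
labels = [
    "died", "died", "survived", "died",
    "survived", "survived", "survived", "survived",
    "survived", "survived",
]

majority_class = zero_r_fit(labels)
zero_r_accuracy = accuracy([majority_class] * len(labels), labels)

model = naive_bayes_fit(rows, labels)
nb_predictions = [naive_bayes_predict(model, row) for row in rows]
nb_accuracy = accuracy(nb_predictions, labels)

print("ZeroR accuracy:", zero_r_accuracy)
print("Naive Bayes accuracy:", nb_accuracy)
```

ZeroR's score equals the share of the majority class in the data, so any classifier that uses the features at all is expected to match or exceed it. Bagging, the third method named in the abstract, is not shown here.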

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Data Mining* / methods
  • Humans
  • Incidence
  • Lung Neoplasms* / epidemiology