Sub-clustering based recommendation system for stroke patient: Identification of a specific drug class for a given patient

Comput Biol Med. 2024 Mar:171:108117. doi: 10.1016/j.compbiomed.2024.108117. Epub 2024 Feb 7.

Abstract

Stroke is one of the leading causes of death worldwide. Previous studies have explored machine learning techniques for the early detection of stroke using content-based recommendation systems. However, these models often struggle to identify suitable medications in a timely manner, which can be critical for patient management and for decisions about prescribing new drugs. In this study, we developed a content-based recommendation model using three machine learning algorithms, Gaussian Mixture Model (GMM), Affinity Propagation (AP), and K-Nearest Neighbors (KNN), to help healthcare professionals (HCPs) quickly identify medications based on the symptoms of a patient with stroke. Our model focused on three classes of drugs: antihypertensives, anticoagulants, and fibrates. Each machine learning algorithm was assigned a specific task, which reduces the partial search space and the computational cost while still detecting a primary drug class without loss of precision or accuracy. Our proposed model, called CRGANNC (Clustering Recommendation Gaussian Affinity Nearest Neighbors Classifier), effectively addresses the sparsity and scalability issues faced by content-based recommendation models. The CRGANNC model dynamically partitions clusters into a variable number of sub-clusters depending on the group, can classify patients as healthy, sick, or at-risk, and recommends drugs to the HCP. In addition to our analysis, we used a pipeline to develop a semi-artificial dataset with new features such as weakness, dizziness, headache, nausea, and vomiting. This dataset serves as a valuable resource for researchers in the sensitive domain of stroke, providing a starting point for building and testing models when access to real data is restricted. Our work not only contributes to the development of predictive models for stroke but also establishes a framework for creating similar datasets in other sensitive domains, accelerating research efforts and improving patient care.
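The three-stage idea described above (GMM grouping, AP sub-clustering, KNN recommendation within a sub-cluster) can be illustrated with a minimal sketch. This is not the authors' CRGANNC implementation: the abstract does not state which task each algorithm performs, so the division of labor, the toy data, the hyperparameters, and the helper names `make_toy_data`, `fit_pipeline`, and `recommend` are all assumptions, built on scikit-learn.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KNeighborsClassifier

DRUG_CLASSES = ["antihypertensive", "anticoagulant", "fibrate"]


def make_toy_data(seed=0, n_per_class=60, n_features=5):
    """Hypothetical symptom vectors: one well-separated blob per drug class."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label in range(len(DRUG_CLASSES)):
        center = np.full(n_features, 10.0 * label)
        X.append(center + rng.normal(size=(n_per_class, n_features)))
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


def fit_pipeline(X, y, n_groups=3, k=5, seed=0):
    # Stage 1 (assumed role): GMM assigns each patient to a coarse group.
    gmm = GaussianMixture(n_components=n_groups, random_state=seed).fit(X)
    groups = gmm.predict(X)
    ap_models, knn_models = {}, {}
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        # Stage 2 (assumed role): AP splits the group into sub-clusters;
        # the number of sub-clusters is chosen by AP, so it varies by group.
        ap = AffinityPropagation(damping=0.9, max_iter=500, random_state=seed)
        ap.fit(X[idx])
        ap_models[int(g)] = ap
        for s in np.unique(ap.labels_):
            sub = idx[ap.labels_ == s]
            # Stage 3 (assumed role): KNN searches only inside one sub-cluster.
            knn = KNeighborsClassifier(n_neighbors=min(k, len(sub)))
            knn_models[(int(g), int(s))] = knn.fit(X[sub], y[sub])
    # Fallback over all records, used if a group or sub-cluster is unseen.
    fallback = KNeighborsClassifier(n_neighbors=min(k, len(X))).fit(X, y)
    return {"gmm": gmm, "ap": ap_models, "knn": knn_models, "fallback": fallback}


def recommend(model, x):
    """Return a drug class name for one symptom vector."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    g = int(model["gmm"].predict(x)[0])
    ap = model["ap"].get(g)
    s = int(ap.predict(x)[0]) if ap is not None else -1
    knn = model["knn"].get((g, s), model["fallback"])
    return DRUG_CLASSES[int(knn.predict(x)[0])]


if __name__ == "__main__":
    X, y = make_toy_data()
    model = fit_pipeline(X, y)
    print(recommend(model, X[0]))
```

In this sketch the search space shrinks because the final KNN query only compares a new patient against the members of one sub-cluster rather than against every record. The toy data contains only patients who already have a drug label, so the healthy and at-risk diagnosis mentioned in the abstract is not modeled here.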
Our experiments were conducted on our dataset of 9691 patient records, comprising 1206 records of patients with stroke attacks and 8485 records of healthy patients. The CRGANNC model achieved an average precision of 0.98, recall of 0.95, and F1-score of 0.96 across all three drug classes. Furthermore, our model was markedly more computationally efficient than existing content-based recommendation models, reducing processing time by 25.80%. These results indicate that our model is effective at accurately identifying medications for stroke patients based on their symptoms.
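As a quick consistency check on the reported figures, the short sketch below recomputes the dataset total, the share of stroke records, and the F1-score implied by the averaged precision and recall. The helper name `f1_from` is an assumption for illustration. An F1-score averaged per class need not equal the F1 of the averaged precision and recall, so this only shows that the reported numbers are mutually plausible.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Record counts reported in the abstract.
stroke_records = 1206
healthy_records = 8485
total_records = stroke_records + healthy_records

# Fraction of the dataset made up of stroke records (class imbalance).
stroke_share = stroke_records / total_records

# F1 implied by the reported average precision and recall.
implied_f1 = f1_from(0.98, 0.95)

if __name__ == "__main__":
    print(total_records)
    print(round(stroke_share, 3))
    print(round(implied_f1, 2))
```

The counts sum to the stated 9691 records, stroke records make up roughly one eighth of the dataset, and the implied F1 rounds to the reported 0.96.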

Keywords: Content based filtering; Machine learning; Recommender system; Stroke disease.

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Dizziness*
  • Fibric Acids
  • Head
  • Humans

Substances

  • Fibric Acids