Suboptimal capability of individual machine learning algorithms in modeling small-scale imbalanced clinical data of local hospital

Gang Li; Chenbi Li; Chengli Wang; Zeheng Wang

doi:10.1371/journal.pone.0298328

PLoS One. 2024 Feb 23;19(2):e0298328. doi: 10.1371/journal.pone.0298328. eCollection 2024.

Authors

Gang Li¹, Chenbi Li¹, Chengli Wang¹, Zeheng Wang² ³

Affiliations

¹ Department of ICU, 3201 Hospital, Hanzhong, Shaanxi, China.
² Data61, CSIRO, Clayton, VIC, Australia.
³ Manufacturing, CSIRO, West Lindfield, NSW, Australia.

Abstract

In recent years, artificial intelligence (AI) has shown promising applications in various scientific domains, including biochemical analysis research. However, the effectiveness of AI in modeling small-scale, imbalanced datasets remains an open question in such fields. This study explores the capabilities of eight basic AI algorithms, including ridge regression, logistic regression, random forest regression, and others, in modeling a small, imbalanced clinical dataset (total n = 387, class 0 = 27, class 1 = 360) drawn from the biochemical blood test records of patients with multiple wasp stings (MWS). Through rigorous evaluation using k-fold cross-validation and comprehensive scoring, we found that none of the models could effectively model the data. Even after fine-tuning the hyperparameters of the best-performing models, the results remained below acceptable thresholds. The study highlights the challenges of applying AI to small-scale datasets with imbalanced groups in biochemical or clinical research and emphasizes the need for novel algorithms tailored to small-scale data. The findings also call for further exploration of techniques such as transfer learning and data augmentation, and they underline the importance of understanding the minimum dataset scale required for effective AI modeling in biochemical contexts.
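To make the class imbalance concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It uses only the class counts reported in the abstract (27 versus 360). The number of folds (k = 5), the round-robin stratified split, and the choice of balanced accuracy as a metric are all assumptions made for illustration; the abstract does not state them. The sketch shows how few minority-class samples land in each fold, and why a trivial classifier that always predicts class 1 looks strong on plain accuracy yet scores at chance level on balanced accuracy.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recall values."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)

# Class counts taken from the abstract: class 0 = 27, class 1 = 360.
labels = [0] * 27 + [1] * 360

k = 5  # assumption: the abstract does not state the number of folds
folds = stratified_folds(labels, k)

# Number of minority-class (class 0) samples available in each fold.
fold_minority_counts = [sum(1 for i in fold if labels[i] == 0) for fold in folds]

# Trivial baseline: always predict the majority class (class 1).
y_pred = [1] * len(labels)
plain_accuracy = sum(1 for t, p in zip(labels, y_pred) if t == p) / len(labels)
bal_acc = balanced_accuracy(labels, y_pred)

print("minority samples per fold:", fold_minority_counts)
print("majority-baseline accuracy:", round(plain_accuracy, 3))
print("majority-baseline balanced accuracy:", bal_acc)
```

Under these assumptions each validation fold contains only five or six class 0 records, so any per-fold estimate of minority-class performance rests on a handful of samples. The majority-class baseline reaches 360/387 (about 0.93) plain accuracy while its balanced accuracy is 0.5, which is why imbalance-aware scoring matters for a dataset of this shape.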

Copyright: © 2024 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Algorithms
Animals
Artificial Intelligence
Humans
Insect Bites and Stings*
Machine Learning
Wasps*

Grants and funding

The authors received no specific funding for this work.