An Explainable Artificial Intelligence Model Proposed for the Prediction of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome and the Identification of Distinctive Metabolites

Diagnostics (Basel). 2023 Nov 21;13(23):3495. doi: 10.3390/diagnostics13233495.

Abstract

Background: Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a complex and debilitating illness affecting over 65 million individuals worldwide. It involves multiple systems, including the immune, neurological, gastrointestinal, and circulatory systems. Studies have reported abnormalities in immune cell types, elevated inflammatory cytokines, and brain abnormalities. Further research is needed to identify consistent biomarkers and develop targeted therapies. This study uses explainable artificial intelligence and machine learning techniques to identify discriminative metabolites for ME/CFS.

Material and methods: The study analyzes a metabolomics dataset of 26 ME/CFS patients and 26 healthy controls aged 22-72. The dataset comprised 768 metabolites grouped into nine metabolic super-pathways: amino acids, carbohydrates, cofactors, vitamins, energy, lipids, nucleotides, peptides, and xenobiotics. Random forest and other classifiers were applied to the data to classify individuals as ME/CFS patients or healthy controls. The classifiers' performance in the validation step was evaluated using three approaches: traditional hold-out validation, cross-validation, and bootstrap validation. Explainable artificial intelligence approaches were applied to clinically explain the optimum model's prediction decisions.
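To make the bootstrap validation step concrete, the following is a minimal sketch under stated assumptions, not the authors' pipeline. It uses only the Python standard library, so a simple nearest-centroid classifier stands in for the random forest; the data are synthetic (26 controls and 26 patients with four made-up features), and all function names (`make_synthetic_data`, `bootstrap_oob_accuracy`, and so on) are hypothetical. Each iteration resamples the participants with replacement, trains on the resample, and scores the model on the participants left out of that resample (the out-of-bag set), which is one common way to run bootstrap validation; the abstract does not say which variant was used.

```python
import random
from statistics import mean


def make_synthetic_data(n_per_class=26, n_features=4, seed=42):
    """Synthetic stand-in data (NOT the study's metabolomics dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 3.0)):  # 0 = control, 1 = patient
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def nearest_centroid_fit(X, y):
    """Stand-in classifier (assumption: replaces the paper's random forest)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def bootstrap_oob_accuracy(X, y, n_iter=1000, seed=0):
    """Mean out-of-bag accuracy over n_iter bootstrap resamples."""
    rng = random.Random(seed)
    n = len(X)
    scores = []
    for _ in range(n_iter):
        # Draw n indices with replacement: the bootstrap training sample.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        # Skip degenerate resamples (no held-out rows or a single class).
        if not oob or len({y[i] for i in idx}) < 2:
            continue
        model = nearest_centroid_fit([X[i] for i in idx], [y[i] for i in idx])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in oob)
        scores.append(correct / len(oob))
    return mean(scores)


X, y = make_synthetic_data()
acc = bootstrap_oob_accuracy(X, y, n_iter=1000)
print(f"Mean out-of-bag accuracy over bootstrap resamples: {acc:.3f}")
```

Scoring only out-of-bag participants keeps each evaluation set disjoint from its training resample, which matters with a sample as small as 52 participants. Swapping in a random forest would only require replacing the two `nearest_centroid_*` functions.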

Results: The metabolites C-glycosyltryptophan, oleoylcholine, cortisone, and 3-hydroxydecanoate were determined to be crucial for ME/CFS diagnosis. The random forest model outperformed the other classifiers in ME/CFS prediction under the 1000-iteration bootstrap validation, achieving 98% accuracy, precision, recall, and F1 score, a Brier score of 0.01, and an AUC of 99%. Of the three validation approaches, bootstrap validation yielded the highest classification performance.
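For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, F1 score, Brier score, and AUC can be computed from true labels and predicted probabilities. It is an illustration only: the six labels and probabilities are made up, the function name `classification_metrics` and the 0.5 decision threshold are assumptions, and none of the numbers come from the study.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute common binary-classification metrics with the standard library.

    y_true: 0/1 labels (1 = patient); y_prob: predicted probability of class 1.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Brier score: mean squared gap between predicted probability and outcome
    # (0 is perfect; lower is better).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    # AUC: fraction of (positive, negative) pairs in which the positive case
    # receives the higher probability; ties count as half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "brier": brier, "auc": auc}


# Made-up example: three patients (1) and three controls (0).
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.2, 0.1, 0.6]
metrics = classification_metrics(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike the other five metrics, the Brier score is an error measure, so the reported 0.01 indicates well-calibrated, confident probabilities rather than a low score.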

Conclusion: The proposed model accurately classifies ME/CFS patients based on the selected biomarker candidate metabolites. It offers a clear interpretation of risk estimation for ME/CFS, aiding physicians in comprehending the significance of key metabolomic features within the model.

Keywords: clinical classification; explainable artificial intelligence; metabolomics data; myalgic encephalomyelitis/chronic fatigue syndrome.