Application of logistic regression and machine learning methods for idiopathic inflammatory myopathies malignancy prediction

Clin Exp Rheumatol. 2023 Mar;41(2):330-339. doi: 10.55563/clinexprheumatol/8ievtq. Epub 2023 Mar 1.

Abstract

Objectives: Malignancy is associated with idiopathic inflammatory myopathies (IIM) and leads to a poor prognosis. Early prediction of malignancy is thought to improve the prognosis. However, predictive models have rarely been reported in IIM. Herein, we aimed to establish a machine learning (ML) model to identify possible risk factors for malignancy and predict malignancy in patients with IIM.

Methods: We retrospectively reviewed the medical records of 168 patients diagnosed with IIM at Shantou Central Hospital from 2013 to 2021. We randomly divided the patients into two groups: a training set (70%) for constructing the prediction model and a validation set (30%) for evaluating model performance. We constructed six types of ML models and used the area under the receiver operating characteristic (ROC) curve (AUC) to describe the performance of each model. Finally, we built a web version of the best prediction model to make it more widely available.
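The split-and-evaluate procedure described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `split_train_validation` and `roc_auc`, the fixed random seed, and the use of integer patient IDs are all assumptions, and the abstract does not state the exact number of patients in each set. The AUC is computed here through its rank interpretation (the probability that a randomly chosen patient with malignancy receives a higher score than a randomly chosen patient without).

```python
import random


def split_train_validation(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and validation sets.

    The 70/30 ratio follows the abstract; the seed is a hypothetical choice.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = malignancy, 0 = none).

    Counts, over all positive/negative pairs, how often the positive case
    has the higher score; tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical stand-in for the 168 patient records: integer IDs only.
patient_ids = list(range(168))
train_ids, validation_ids = split_train_validation(patient_ids)
```

In practice, each candidate model would be fitted on the training set, and `roc_auc` would be applied to its predicted probabilities on both sets so that the models can be compared on the validation AUC.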

Results: In the multivariable regression analysis, three predictors (age, ALT < 80 U/L, and anti-TIF1-γ) were identified as risk factors and used to establish the prediction model, whereas interstitial lung disease (ILD) was found to be a protective factor. Compared with the five other ML models, the traditional logistic regression (LR) model performed as well as or better than them in predicting malignancy in IIM. The AUC of the LR model was 0.900 in the training set and 0.784 in the validation set. We therefore selected the LR model as the final prediction model. Accordingly, a nomogram was constructed using the above four factors. A web version was built and can be accessed online or by scanning a QR code.
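For orientation, an LR model with these four predictors has the general form below. This is the standard logistic regression form, not the fitted model: the coefficients are placeholders because the abstract reports no coefficient values, and coding age as a continuous variable and the other three predictors as binary indicators is an assumption.

```latex
\[
\log\frac{p}{1-p}
  = \beta_0
  + \beta_1\,\mathrm{age}
  + \beta_2\,\mathbf{1}[\mathrm{ALT} < 80\ \mathrm{U/L}]
  + \beta_3\,\mathbf{1}[\text{anti-TIF1-}\gamma\ \text{positive}]
  + \beta_4\,\mathbf{1}[\mathrm{ILD}]
\]
```

Here p is the predicted probability of malignancy. A risk factor corresponds to a positive coefficient (β1, β2, β3 > 0) and a protective factor to a negative one (β4 < 0). A nomogram is a graphical representation of such a linear predictor, mapping each predictor value to points whose sum gives the predicted probability.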

Conclusions: The LR algorithm appears to be a good predictor of malignancy and may help clinicians screen, evaluate and follow up high-risk patients with IIM.

MeSH terms

  • Humans
  • Logistic Models
  • Machine Learning
  • Myositis* / diagnosis
  • Neoplasms* / diagnosis
  • Neoplasms* / therapy
  • Retrospective Studies