Applying logistic LASSO regression for the diagnosis of atypical Crohn's disease

Sci Rep. 2022 Jul 5;12(1):11340. doi: 10.1038/s41598-022-15609-5.

Abstract

In countries with a high incidence of tuberculosis, the typical clinical features of Crohn's disease (CD) may be masked by tuberculosis infection, and distinguishing atypical CD from intestinal tuberculosis (ITB) remains a dilemma for clinicians. Least absolute shrinkage and selection operator (LASSO) regression has been applied to select variables in disease diagnosis. However, its value in discriminating ITB from atypical CD remains unknown. A total of 400 patients were enrolled from January 2014 to January 2019 at the Second Xiangya Hospital of Central South University. For these patients, 57 indicators, including clinical manifestations, laboratory results, endoscopic findings, and computed tomography enterography features, were collected for further analysis. R software version 3.6.1 (glmnet package) was used to perform the LASSO logistic regression analysis. SPSS 20.0 was used to perform the Pearson chi-square test and binary logistic regression analysis. In the first step (variable selection), LASSO regression and the Pearson chi-square test were applied to select the most valuable variables as candidates for further logistic regression analysis. In the second step, the variables identified in step 1 were used to construct binary logistic regression models. Receiver operating characteristic (ROC) curve analysis was performed on these models to assess their diagnostic ability and to determine the optimal cutoff value for diagnosis. The area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy rate, together with their 95% confidence intervals (CIs), were calculated. MedCalc software (version 16.8) was used to analyze the ROC curves of the models. A total of 332 patients were eventually included to build a binary logistic regression model to discriminate CD (including comprehensive CD and tuberculosis-infected CD) from ITB.
However, we did not obtain satisfactory diagnostic performance when the binary logistic regression model built for comprehensive CD and ITB was applied to predict tuberculosis-infected CD and ITB (accuracy rate: 79.2% vs. 65.1%). Therefore, we further established binary logistic regression models to discriminate atypical CD from ITB, based on the Pearson chi-square test (model 1) and LASSO regression (model 2). Model 1 showed 89.9% specificity, 65.9% sensitivity, 88.5% PPV, 68.9% NPV, 76.9% diagnostic accuracy, and an AUC of 0.811, and model 2 showed 80.6% specificity, 84.4% sensitivity, 82.3% PPV, 82.9% NPV, 82.6% diagnostic accuracy, and an AUC of 0.887. The AUCs of model 1 and model 2 differed significantly (P < 0.05). Tuberculosis infection increases the difficulty of discriminating CD from ITB. Logistic regression based on LASSO variable selection discriminated atypical CD from ITB more effectively than logistic regression based on the Pearson chi-square test.
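
The variable selection step described above relies on the L1 penalty, which shrinks the coefficients of uninformative indicators to exactly zero. The following is a minimal sketch of that idea in plain Python, not the authors' analysis (which used the glmnet package in R). It fits an L1-penalized logistic regression by proximal gradient descent on a made-up toy data set; the function names, the toy data, the step size, and the penalty values are all assumptions chosen for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=2000):
    """Minimize mean logistic loss + lam * sum(|w_j|) by proximal gradient descent.

    The intercept b is not penalized. Returns (b, w).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(n_iter):
        preds = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        resid = [pi - yi for pi, yi in zip(preds, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        # Plain gradient step for the intercept, gradient step followed by
        # soft-thresholding (the L1 proximal operator) for the coefficients.
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return b, w


# Hypothetical toy data: 8 patients, 2 binary indicators.
# Indicator 0 tracks the label; indicator 1 is balanced within each class.
X_toy = [[1, 0], [1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1]]
y_toy = [1, 1, 1, 1, 0, 0, 0, 0]

for lam in (0.1, 0.3):
    b, w = fit_lasso_logistic(X_toy, y_toy, lam)
    selected = [j for j, wj in enumerate(w) if wj != 0.0]
    print(f"lam={lam}: intercept={b:.3f}, coefficients={w}, selected={selected}")
```

In this sketch, indicators whose coefficient remains nonzero would be the candidates passed on to an ordinary binary logistic regression, mirroring the two-step procedure in the abstract. In practice the penalty strength is usually chosen by cross-validation rather than fixed by hand.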

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Crohn Disease* / diagnostic imaging
  • Humans
  • Latent Tuberculosis*
  • Logistic Models
  • Tuberculosis, Gastrointestinal* / diagnostic imaging
  • Tuberculosis, Lymph Node*