A Comparison of LASSO Regression and Tree-Based Models for Delayed Cerebral Ischemia in Elderly Patients With Subarachnoid Hemorrhage

Front Neurol. 2022 Mar 10;13:791547. doi: 10.3389/fneur.2022.791547. eCollection 2022.

Abstract

Background: Although tree-based algorithms are among the most widely used machine learning methods, they have not been applied to predict delayed cerebral ischemia (DCI) in elderly patients with aneurysmal subarachnoid hemorrhage (aSAH). This study therefore aims to develop conventional regression and tree-based models and to determine which has better prediction performance for DCI development in hospitalized elderly patients after aSAH.

Methods: This was a multicenter, retrospective, observational cohort study of elderly patients with aSAH aged 60 years and older. We randomly divided the multicenter data into model training and validation cohorts in a 70:30 ratio. One conventional regression model, the least absolute shrinkage and selection operator (LASSO), and three tree-based models, decision tree (DT), random forest (RF), and eXtreme Gradient Boosting (XGBoost), were developed. Accuracy, sensitivity, specificity, area under the precision-recall curve (AUC-PR), and area under the receiver operating characteristic curve (AUC-ROC) with 95% CI were used to evaluate prediction performance. A DeLong test was conducted to assess the statistical differences among models. Finally, we calculated the importance weight of each feature to visualize its contribution to DCI.
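The evaluation metrics named in the Methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the function names (`split_70_30`, `confusion_metrics`, `auc_roc`, `auc_pr`), the 0.5 decision threshold, the random seed, and the toy labels and scores are all assumptions made for illustration. AUC-PR is estimated here as average precision, which is one common estimator and may differ from the one used in the study. Model fitting, confidence intervals, and the DeLong test are omitted.

```python
import random


def split_70_30(rows, seed=0):
    """Randomly split rows into (training, validation) at roughly 70:30."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed (assumed) threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc_roc(y_true, y_score):
    """AUC-ROC as the probability that a random positive case outranks
    a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_pr(y_true, y_score):
    """Average precision: mean precision at the rank of each positive case.
    Assumes no tied scores; ties would be broken by sort order."""
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data (hypothetical): 1 = DCI, 0 = no DCI; scores are predicted risks.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

train, valid = split_70_30(range(10))
print(len(train), len(valid))
print(confusion_metrics(y_true, y_score))
print(round(auc_roc(y_true, y_score), 3), round(auc_pr(y_true, y_score), 3))
```

In practice these metrics would be computed on each model's predicted probabilities for the validation cohort, so that LASSO and the tree-based models are compared on the same patients.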

Results: The model training and validation cohorts included 111 and 42 patients, respectively, and 53 patients developed DCI. In internal validation, DT (AUC-ROC 0.836; 95% CI: 0.747-0.926, p = 0.15), RF (1; 95% CI: 1-1, p < 0.05), and XGBoost (0.931; 95% CI: 0.885-0.978, p = 0.01) outperformed LASSO (0.793; 95% CI: 0.692-0.893). In independent external validation, however, LASSO achieved the highest AUC-ROC (0.894; 95% CI: 0.8-0.989), compared with DT (0.764; 95% CI: 0.6-0.928, p = 0.05), RF (0.821; 95% CI: 0.683-0.959, p = 0.27), and XGBoost (0.865; 95% CI: 0.751-0.979, p = 0.69). LASSO also had the highest AUC-PR in external validation (0.681), compared with DT (0.615), RF (0.667), and XGBoost (0.622). In addition, CT values of subarachnoid clots, aneurysm therapy, and white blood cell counts were the most important features for DCI in elderly patients with aSAH.

Conclusions: LASSO had superior predictive power to the tree-based models in external validation. We therefore recommend the conventional LASSO regression model for predicting DCI in elderly patients with aSAH.

Keywords: LASSO; aneurysm; delayed cerebral ischemia; subarachnoid hemorrhage; tree model.