Construction of a predictive model for bone metastasis from first primary lung adenocarcinoma within 3 cm based on machine learning algorithm: a retrospective study

PeerJ. 2024 Mar 14:12:e17098. doi: 10.7717/peerj.17098. eCollection 2024.

Abstract

Background: Adenocarcinoma, the most prevalent histological subtype of non-small cell lung cancer, is associated with a significantly higher likelihood of bone metastasis than other subtypes. Bone metastasis has a profound adverse impact on patient prognosis. To date, however, accurate models for predicting bone metastasis are lacking. This study therefore uses machine learning algorithms to predict the risk of bone metastasis in patients with primary lung adenocarcinoma within 3 cm.

Method: We collected a dataset comprising 19,454 cases of solitary, primary lung adenocarcinoma with pulmonary nodules measuring less than 3 cm. These cases were diagnosed between 2010 and 2015 and were sourced from the Surveillance, Epidemiology, and End Results (SEER) database. Using clinical features as predictors, we developed predictive models with seven machine learning algorithms: extreme gradient boosting (XGBoost), logistic regression (LR), light gradient boosting machine (LightGBM), adaptive boosting (AdaBoost), Gaussian naive Bayes (GNB), multilayer perceptron (MLP), and support vector machine (SVM).

Results: XGBoost performed best among the seven algorithms (training set AUC: 0.913; test set AUC: 0.853). Furthermore, for convenient application, we created an online scoring system accessible at the following URL: https://www.xsmartanalysis.com/model/predict/?mid=731symbol=7Fr16wX56AR9Mk233917, which is based on the best-performing model.
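The AUC values reported above summarize how well a model ranks patients with bone metastasis above those without it. As a minimal sketch of that metric (not the authors' code), the snippet below computes AUC as the fraction of positive/negative pairs in which the positive case receives the higher predicted score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations and are not drawn from the SEER data.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve.

    Computed as the probability that a randomly chosen positive case
    (label 1) gets a higher score than a randomly chosen negative case
    (label 0); tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = bone metastasis, 0 = no bone metastasis.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]  # made-up predicted risks
toy_auc = roc_auc(toy_labels, toy_scores)
```

Here three of the four positive/negative pairs are ranked correctly, so `toy_auc` is 0.75. Computing the same quantity separately on training and held-out cases gives the kind of training/test comparison reported above; the pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for a dataset of this size.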

Conclusion: XGBoost is an effective algorithm for predicting bone metastasis in patients with solitary, primary lung adenocarcinoma with pulmonary nodules smaller than 3 cm. Its availability as an online scoring system supports its clinical applicability.

Keywords: Bone metastasis; Lung adenocarcinoma; Machine learning; XGBoost.

MeSH terms

  • Adenocarcinoma of Lung*
  • Adenocarcinoma*
  • Algorithms
  • Bayes Theorem
  • Bone Neoplasms*
  • Carcinoma, Non-Small-Cell Lung*
  • Humans
  • Lung Neoplasms*
  • Machine Learning
  • Retrospective Studies

Associated data

  • figshare/10.6084/m9.figshare.24481693.v1

Grants and funding

The authors received no funding for this work.