Development and validation of a machine learning model to predict the risk of lymph node metastasis in renal carcinoma

Xiaowei Feng; Tao Hong; Wencai Liu; Chan Xu; Wanying Li; Bing Yang; Yang Song; Ting Li; Wenle Li; Hui Zhou; Chengliang Yin

doi:10.3389/fendo.2022.1054358

Front Endocrinol (Lausanne). 2022 Nov 18:13:1054358. doi: 10.3389/fendo.2022.1054358. eCollection 2022.

Authors

Xiaowei Feng¹, Tao Hong², Wencai Liu³, Chan Xu⁴, Wanying Li⁴, Bing Yang⁵, Yang Song⁶, Ting Li⁷, Wenle Li¹,⁸, Hui Zhou⁹, Chengliang Yin¹⁰

Affiliations

¹ Department of Neuro Rehabilitation, Shaanxi Provincial Rehabilitation Hospital, Xi'an, China.
² Department of Cardiac Surgery, Fuwai Hospital Chinese Academy of Medical Sciences, Shenzhen, Shenzhen, China.
³ Department of Orthopaedic Surgery, the First Affiliated Hospital of Nanchang University, Nanchang, China.
⁴ Department of Clinical Medical Research Center, Xianyang Central Hospital, Xianyang, China.
⁵ Life Science Department, Tianjin Prosel Biological Technology Co., Ltd, Tianjin, China.
⁶ Department of Gastroenterology and Hepatology, Chinese People's Liberation Army (PLA) General Hospital, Beijing, China.
⁷ Department of Cell Biology, College of Basic Medical Sciences, Tianjin Medical University, Tianjin, China.
⁸ State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics & Center for Molecular Imaging and Translational Medicine, School of Public Health, Xiamen University, Fujian, China.
⁹ School of Pharmacy, Tianjin Medical University, Tianjin, China.
¹⁰ Faculty of Medicine, Macau University of Science and Technology, Macau, Macau SAR China.

Abstract

Simple summary: Studies have shown that about 30% of kidney cancer patients develop metastasis, and lymph node metastasis (LNM) may be related to a poor prognosis. Our retrospective study aims to provide a reliable machine learning-based model to predict the occurrence of LNM in kidney cancer. Using a training group (n=39,016) drawn from the SEER database and a validation group (n=771) from a medical center, we identified pathological grade, liver metastasis, M stage, primary site, T stage, and tumor size as independent predictors of LNM in these patients. We built prediction models with six different algorithms and found that the prediction performance of the XGB model in both the training group and the validation group was significantly better than that of the other machine learning models. The results show that prediction tools based on machine learning can accurately predict the probability of LNM in patients with kidney cancer and have satisfactory clinical application prospects.

Background: Lymph node metastasis (LNM) is associated with the prognosis of patients with kidney cancer. This study aimed to provide reliable machine learning-based (ML-based) models to predict the probability of LNM in kidney cancer.

Methods: Data on patients diagnosed with kidney cancer from 2010 to 2017 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database, and variables were filtered by least absolute shrinkage and selection operator (LASSO) regression and by univariate and multivariate logistic regression analyses. Statistically significant risk factors were used to build the predictive models. The models were validated with 10-fold cross-validation, and the area under the receiver operating characteristic curve (AUC) was used to assess their performance. Correlation heat maps were used to investigate the correlation between features, and permutation analysis was used to assess the importance of the predictors. Probability density functions (PDFs) and clinical utility curves (CUCs) were used to determine clinical utility thresholds.
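The two evaluation steps named in the Methods, 10-fold cross-validation and AUC, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `roc_auc` and `k_fold_indices` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the toy labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.
    Assumption: no shuffling or stratification; a real analysis would
    typically shuffle and stratify by outcome first."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy example (made-up values): 1 = LNM present, 0 = LNM absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 positive/negative pairs are ranked correctly

folds = list(k_fold_indices(25, k=10))     # 10 (train, test) index pairs
```

In a full pipeline, a model would be fitted on each `train` split and `roc_auc` computed on the matching `test` split, with the ten AUC values then averaged.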

Results: The training cohort of this study included 39,016 patients, and the validation cohort included 771 patients. In the two cohorts, 2544 (6.5%) and 66 (8.1%) patients had LNM, respectively. Pathological grade, liver metastasis, M stage, primary site, T stage, and tumor size were independent predictive factors of LNM. In validation on both cohorts, the XGB model significantly outperformed the other machine learning models, with an AUC value of 0.916. A web calculator (https://share.streamlit.io/liuwencai4/renal_lnm/main/renal_lnm.py) was built based on the XGB model. Based on the PDF and CUC, we suggested 54.6% as a threshold probability for guiding the diagnosis of LNM, which could distinguish about 89% of LNM patients.
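As a rough illustration of how a threshold probability such as the reported 54.6% might be applied, the sketch below flags patients whose predicted probability reaches the threshold and computes the share of true LNM cases that are flagged. Several things here are assumptions rather than details from the study: the helper names `flag_high_risk` and `sensitivity` are hypothetical, the comparison is taken to be "greater than or equal to", reading "distinguish about 89% of LNM patients" as sensitivity is only one possible interpretation, and the probabilities and labels are invented.

```python
THRESHOLD = 0.546  # threshold probability reported in the abstract


def flag_high_risk(probabilities, threshold=THRESHOLD):
    """Return True for each predicted probability at or above the threshold.
    Assumption: the boundary value itself counts as high risk."""
    return [p >= threshold for p in probabilities]


def sensitivity(labels, flags):
    """Fraction of true LNM cases (label 1) that were flagged as high risk."""
    true_pos = sum(1 for y, f in zip(labels, flags) if y == 1 and f)
    false_neg = sum(1 for y, f in zip(labels, flags) if y == 1 and not f)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in labels")
    return true_pos / (true_pos + false_neg)


# Invented example: five patients with model outputs and true LNM status.
example_probs = [0.10, 0.60, 0.55, 0.30, 0.90]
example_labels = [0, 1, 1, 1, 0]

example_flags = flag_high_risk(example_probs)
example_sensitivity = sensitivity(example_labels, example_flags)
```

Here two of the three true LNM cases lie at or above the threshold, so the sketch yields a sensitivity of two thirds; the figures bear no relation to the study cohorts.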

Conclusions: The machine learning-based predictive tool can accurately estimate the probability of LNM in kidney cancer patients and has promising prospects for application in clinical practice.

Keywords: kidney cancer; lymph node metastasis; machine learning; predictive model; renal cell cancer; web calculator.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Carcinoma, Renal Cell* / diagnosis
Humans
Kidney Neoplasms* / diagnosis
Liver Neoplasms* / diagnosis
Lymphatic Metastasis
Machine Learning
Retrospective Studies