Segmentation of patients with small cell lung cancer into responders and non-responders using the optimal cross-validation technique

Elham Majd; Li Xing; Xuekui Zhang

doi:10.1186/s12874-024-02185-7

BMC Med Res Methodol. 2024 Apr 8;24(1):83. doi: 10.1186/s12874-024-02185-7.

Authors

Elham Majd¹, Li Xing², Xuekui Zhang³

Affiliations

¹ Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada.
² Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK, Canada.
³ Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada. xuekui@uvic.ca.

Abstract

Background: The timing of treatment is an essential factor in its efficacy for cancer patients, so patients who will not respond to their current therapy should receive a different treatment as early as possible. Machine learning models can be built to classify patients as responders or non-responders. Such classification models predict the probability that a patient is a responder, and most methods use a probability threshold of 0.5 to convert these probabilities into binary group membership. However, a cutoff of 0.5 is not always the optimal choice.

Methods: In this study, we propose a novel data-driven approach that selects a better cutoff value based on the optimal cross-validation technique. To illustrate the method, we applied it to three clinical trial datasets of small-cell lung cancer patients. We used two datasets to build a scoring system for segmenting patients, and then applied the resulting models to segment the patients in the test data.
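The idea of choosing a cutoff from cross-validated probabilities, rather than fixing it at 0.5, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the abstract does not state which criterion the cutoff optimizes, so Youden's J (sensitivity + specificity - 1) is used here as a stand-in, the data are synthetic, and all function names (`fit_logistic`, `out_of_fold_probs`, `youden_j`, `select_cutoff`) are hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def predict_proba(model, x):
    """Predicted probability of being a responder."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def out_of_fold_probs(xs, ys, k=5, seed=0):
    """Return one held-out probability per sample from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    probs = [0.0] * len(xs)
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            probs[i] = predict_proba(model, xs[i])
    return probs


def youden_j(probs, ys, cutoff):
    """Sensitivity + specificity - 1 at a given cutoff (stand-in criterion)."""
    tp = sum(1 for p, y in zip(probs, ys) if p >= cutoff and y == 1)
    fn = sum(1 for p, y in zip(probs, ys) if p < cutoff and y == 1)
    tn = sum(1 for p, y in zip(probs, ys) if p < cutoff and y == 0)
    fp = sum(1 for p, y in zip(probs, ys) if p >= cutoff and y == 0)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


def select_cutoff(probs, ys, grid=None):
    """Pick the grid cutoff with the highest criterion value (first one on ties)."""
    grid = grid or [i / 100 for i in range(5, 96)]
    return max(grid, key=lambda c: youden_j(probs, ys, c))


# Synthetic, imbalanced example: 50 responders (y=1) and 150 non-responders (y=0).
rng = random.Random(1)
ys = [1] * 50 + [0] * 150
xs = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in ys]

probs = out_of_fold_probs(xs, ys)
cutoff = select_cutoff(probs, ys)
print("selected cutoff:", cutoff)
print("J at selected cutoff:", youden_j(probs, ys, cutoff))
print("J at 0.5:", youden_j(probs, ys, 0.5))
```

Because every probability is predicted by a model that never saw that sample, the cutoff is tuned on held-out predictions rather than on fitted values. With imbalanced classes, as in this synthetic example, the selected cutoff will often differ from 0.5.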

Results: We found that, in the test data, the predicted responders and non-responders had significantly different long-term survival outcomes. Our proposed method segments patients better than the standard approach using a cutoff of 0.5. Comparing clinical outcomes of responders versus non-responders, our method had a p-value of 0.009 with a hazard ratio of 0.668 using the Cox proportional hazards model and a p-value of 0.011 using the accelerated failure time model, which confirmed a significant difference between responders and non-responders. In contrast, the standard approach had a p-value of 0.194 with a hazard ratio of 0.823 using the Cox proportional hazards model and a p-value of 0.240 using the accelerated failure time model, indicating that its responders and non-responders do not differ significantly in survival.
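As a reading aid, the reported hazard ratios can be interpreted through the standard form of the Cox proportional hazards model. The notation below is an assumption rather than taken from the abstract: g = 1 is taken to denote a predicted responder and g = 0 a predicted non-responder, and the abstract does not state which group serves as the reference.

\[
h(t \mid g) = h_0(t)\, e^{\beta g},
\qquad
\mathrm{HR} = \frac{h(t \mid g = 1)}{h(t \mid g = 0)} = e^{\beta}
\]

\[
\mathrm{HR} = 0.668 \;\Rightarrow\; \beta = \ln 0.668 \approx -0.40, \quad 1 - 0.668 = 0.332
\qquad
\mathrm{HR} = 0.823 \;\Rightarrow\; \beta = \ln 0.823 \approx -0.19, \quad 1 - 0.823 = 0.177
\]

Under that assumption, the proposed cutoff corresponds to an estimated 33.2% lower hazard of death for predicted responders, compared with 17.7% for the 0.5 cutoff, where the difference is not statistically significant.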

Conclusion: In summary, our novel prediction method can successfully segment new patients into responders and non-responders. Clinicians can use our prediction to decide if a patient should receive a different treatment or stay with the current treatment.

Keywords: Best overall response; Clinical trials; Cross-validation; Overall survival.

MeSH terms

Humans
Lung Neoplasms* / diagnosis
Lung Neoplasms* / therapy
Research Design
Small Cell Lung Carcinoma* / therapy
Treatment Outcome
