Predicting Phase 1 Lymphoma Clinical Trial Durations Using Machine Learning: An In-Depth Analysis and Broad Application Insights

Bowen Long; Shao-Wen Lai; Jiawen Wu; Srikar Bellur

doi:10.3390/clinpract14010007

Clin Pract. 2023 Dec 29;14(1):69-88. doi: 10.3390/clinpract14010007.

Authors

Bowen Long¹, Shao-Wen Lai², Jiawen Wu¹, Srikar Bellur¹

Affiliations

¹ Department of Analytics, Harrisburg University of Science and Technology, Harrisburg, PA 17101, USA.
² Zippin, Mill Valley, CA 94941, USA.

Abstract

Lymphoma diagnoses in the US are substantial, with an estimated 89,380 new cases in 2023, necessitating innovative treatment approaches. Phase 1 clinical trials play a pivotal role in this context. We developed a binary predictive model to assess whether a trial adheres to the expected average duration, analyzing 1089 completed Phase 1 lymphoma trials from clinicaltrials.gov. Among the machine learning models evaluated, the Random Forest model achieved an accuracy of 0.7248 and an ROC-AUC of 0.7677 on lymphoma trials. Its accuracy differs from that of the alternative models by a statistically significant margin, as determined by 95% confidence intervals on the testing set. Importantly, the model maintained an ROC-AUC of 0.7701 when applied to lung cancer trials, showcasing its versatility. A key finding is that higher predicted probabilities correlate with longer trial durations, which offers information beyond the binary prediction itself. Our research contributes to enhanced clinical research planning and potential improvements in patient outcomes in oncology.
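
To make the reported metrics concrete, the following is a minimal sketch of how a binary duration label, an accuracy with a 95% confidence interval, and an ROC-AUC could be computed. It is not the authors' pipeline: the labeling rule (duration above the mean), the normal-approximation interval, the 0.5 decision cutoff, the function names, and the toy numbers are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def binarize_durations(durations, threshold):
    """Label 1 when a trial ran longer than the threshold, else 0.

    Assumed labeling rule; the abstract does not state the exact cutoff.
    """
    return [1 if d > threshold else 0 for d in durations]


def accuracy_with_ci(y_true, y_pred, z=1.96):
    """Accuracy plus a normal-approximation (Wald) interval.

    z=1.96 corresponds to a 95% interval. This is one common choice,
    assumed here; the paper may have used a different construction.
    """
    n = len(y_true)
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data: trial durations in years and hypothetical model scores.
durations_years = [2.0, 6.5, 4.0, 7.2, 3.1, 5.9]
threshold = sum(durations_years) / len(durations_years)  # assumed: mean duration
y_true = binarize_durations(durations_years, threshold)

scores = [0.2, 0.8, 0.6, 0.9, 0.3, 0.4]  # hypothetical predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

acc, ci_low, ci_high = accuracy_with_ci(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f} (95% CI {ci_low:.4f} to {ci_high:.4f}), ROC-AUC={auc:.4f}")
```

With only six toy trials the interval is very wide; on a held-out set drawn from about a thousand trials it would be far narrower, which is what makes comparisons between models on a testing set meaningful.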

Keywords: clinical research planning; lymphoma clinical trials; machine learning prediction; trial duration.

Grants and funding

This research received no external funding.