Development of prediction model to estimate future risk of ovarian lesions: A multi-center retrospective study

Prev Med Rep. 2023 Jun 23:35:102296. doi: 10.1016/j.pmedr.2023.102296. eCollection 2023 Oct.

Abstract

Background: To develop preoperative prediction models for ovarian lesions in patients in China, using regression-based statistical analyses and machine learning methods applied to multiple serological biomarkers.

Methods: A total of 1137 patients with ovarian lesions from Zhujiang Hospital and 518 patients from other hospitals in China were randomly assigned to training, test, and external validation cohorts. Five machine learning classifiers, namely Random Forest (RF), Extreme Gradient Boosting (XGB), Support Vector Classifier (SVC), K-nearest Neighbor (KN), and Multi-Layer Perceptron (MLP), together with a Lasso-logistic regression prediction model (LLRM), were used to derive diagnostic information from 23 predictors.
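As an illustration of the cohort assignment described in the Methods, the following is a minimal sketch and not the authors' procedure. It assumes that the single-center patients are split at random into training and test sets and that the patients from other hospitals form the external validation cohort; the abstract does not state this mapping or the split ratio. The function `split_cohort`, the 70/30 ratio, the seed, and the patient IDs are all hypothetical; only the cohort sizes (1137 and 518) come from the text.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle IDs reproducibly and return (training, test) lists.

    train_fraction and seed are illustrative assumptions, not study values.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # local generator so the split is repeatable
    rng.shuffle(ids)
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]


# Hypothetical IDs; only the counts are taken from the abstract.
single_center = [f"ZJ{i:04d}" for i in range(1137)]
other_centers = [f"EXT{i:04d}" for i in range(518)]

training, test = split_cohort(single_center)
# Assumed role: kept apart from model fitting and used only for validation.
external_validation = list(other_centers)

print(len(training), len(test), len(external_validation))
```

Fixing the seed makes the assignment reproducible, and keeping the external cohort out of the shuffle ensures that no patient from the other centers can leak into model fitting.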

Results: The RF model had high diagnostic value (AUC = 0.968) in distinguishing benign from malignant ovarian disease. Besides tumor markers, age and MLR were also potential diagnostic indicators for ovarian disease. The RF model was also able to distinguish borderline ovarian tumors (AUC = 0.742), and it had high predictive power for identifying ovarian serous adenocarcinoma (AUC = 0.943) and ovarian endometriosis cysts (AUC = 0.914).
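For readers less familiar with the AUC values reported in the Results, the sketch below shows one standard way to compute the area under the ROC curve: as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the probability that a randomly chosen positive
    case is scored higher than a randomly chosen negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))  # 8 of the 9 pairs are ordered correctly
```

Under this reading, an AUC of 0.968 means that a randomly chosen malignant case would be scored above a randomly chosen benign case about 97% of the time, whereas 0.5 corresponds to chance-level ranking.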

Conclusions: The RF models can effectively predict adnexal lesions and are promising adjuncts to the preoperative prediction of ovarian cancer.

Keywords: AUC; Disease Prediction; Lasso Regression; Machine Learning; Ovarian Disease.