Multiclass risk models for ovarian malignancy: an illustration of prediction uncertainty due to the choice of algorithm

Ashleigh Ledger; Jolien Ceusters; Lil Valentin; Antonia Testa; Caroline Van Holsbeke; Dorella Franchi; Tom Bourne; Wouter Froyman; Dirk Timmerman; Ben Van Calster

doi:10.1186/s12874-023-02103-3

BMC Med Res Methodol. 2023 Nov 24;23(1):276. doi: 10.1186/s12874-023-02103-3.

Authors

Ashleigh Ledger¹, Jolien Ceusters¹ ², Lil Valentin³ ⁴, Antonia Testa⁵ ⁶, Caroline Van Holsbeke⁷, Dorella Franchi⁸, Tom Bourne¹ ⁹ ¹⁰, Wouter Froyman¹ ⁹, Dirk Timmerman¹ ⁹, Ben Van Calster¹¹ ¹² ¹³

Affiliations

¹ Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium.
² Department of Oncology, Leuven Cancer Institute, Laboratory of Tumor Immunology and Immunotherapy, KU Leuven, Leuven, Belgium.
³ Department of Obstetrics and Gynecology, Skåne University Hospital, Malmö, Sweden.
⁴ Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden.
⁵ Department of Woman, Child and Public Health, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy.
⁶ Dipartimento Universitario Scienze della Vita e Sanità Pubblica, Università Cattolica del Sacro Cuore, Rome, Italy.
⁷ Department of Obstetrics and Gynecology, Ziekenhuis Oost-Limburg, Genk, Belgium.
⁸ Preventive Gynecology Unit, Division of Gynecology, European Institute of Oncology IRCCS, Milan, Italy.
⁹ Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium.
¹⁰ Queen Charlotte's and Chelsea Hospital, Imperial College, London, UK.
¹¹ Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium. ben.vancalster@kuleuven.be.
¹² Department of Biomedical Data Sciences, Leiden University Medical Centre (LUMC), Leiden, Netherlands. ben.vancalster@kuleuven.be.
¹³ Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium. ben.vancalster@kuleuven.be.

Abstract

Background: Assessing malignancy risk is important to choose appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic.

Methods: This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with or without CA125.

Results: Most tumors were benign (3980 in development and 1688 in validation data); secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, and from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20 percentage points) depending on the model. Net Benefit for diagnosing malignancy was similar across algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. When models were compared pairwise, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold.
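
To make the last two results concrete, the following is a minimal sketch of how Net Benefit at a risk threshold and the share of patients falling on opposite sides of that threshold could be computed. It uses the standard decision-curve definition, Net Benefit = TP/N - (FP/N) × t/(1 - t), and is not the authors' code. The function names, the toy outcome labels, and the toy probabilities are hypothetical, and the sketch assumes that the risk of malignancy is taken as one minus the estimated probability of a benign tumor.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], p_malignant: Sequence[float],
                threshold: float) -> float:
    """Net Benefit of classifying patients with risk >= threshold as malignant.

    y_true: 1 for a malignant tumor, 0 for a benign tumor.
    p_malignant: estimated risk of malignancy (assumed here to be 1 - P(benign)).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, p >= threshold) for y, p in zip(y_true, p_malignant)]
    true_pos = sum(1 for y, high in flagged if high and y == 1)
    false_pos = sum(1 for y, high in flagged if high and y == 0)
    # False positives are weighted by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def opposite_side_fraction(p_a: Sequence[float], p_b: Sequence[float],
                           threshold: float) -> float:
    """Fraction of patients whom two models place on opposite sides of the threshold."""
    disagreements = sum(
        1 for a, b in zip(p_a, p_b) if (a >= threshold) != (b >= threshold)
    )
    return disagreements / len(p_a)


# Hypothetical toy data: five patients, two models (not taken from the study).
outcomes = [1, 0, 0, 1, 0]
risk_model_a = [0.60, 0.05, 0.12, 0.08, 0.02]
risk_model_b = [0.55, 0.15, 0.04, 0.30, 0.03]

nb_a = net_benefit(outcomes, risk_model_a, threshold=0.10)
nb_b = net_benefit(outcomes, risk_model_b, threshold=0.10)
discordance = opposite_side_fraction(risk_model_a, risk_model_b, threshold=0.10)
```

In this toy example the two models disagree on the management-relevant side of the 10% threshold for three of the five patients. The same mechanism underlies the 3% to 30% range reported above: summary measures such as Net Benefit can look similar across models while individual patients receive different recommendations.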

Conclusion: Although several models had similarly good performance, individual probability estimates varied substantially.

Keywords: Calibration; Machine learning; Multiclass models; Ovarian Neoplasms; Prediction models.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
CA-125 Antigen
Female
Humans
Logistic Models
Ovarian Neoplasms* / diagnostic imaging
Ovarian Neoplasms* / pathology
Retrospective Studies
Uncertainty

Substances

CA-125 Antigen