Early prediction of preeclampsia via machine learning

Ivana Marić; Abraham Tsur; Nima Aghaeepour; Andrea Montanari; David K Stevenson; Gary M Shaw; Virginia D Winn

doi:10.1016/j.ajogmf.2020.100100

Am J Obstet Gynecol MFM. 2020 May;2(2):100100. doi: 10.1016/j.ajogmf.2020.100100. Epub 2020 Mar 14.

Authors

Ivana Marić¹, Abraham Tsur², Nima Aghaeepour³, Andrea Montanari⁴, David K Stevenson⁵, Gary M Shaw⁵, Virginia D Winn⁶

Affiliations

¹ Department of Pediatrics, Stanford University School of Medicine, Stanford, CA. Electronic address: ivanam@stanford.edu.
² Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA; Department of Obstetrics and Gynecology, The Sheba Medical Center, Tel Hashomer, Israel.
³ Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA.
⁴ Department of Electrical Engineering and Department of Statistics, Stanford University, Stanford, CA.
⁵ Department of Pediatrics, Stanford University School of Medicine, Stanford, CA.
⁶ Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA.

PMID: 33345966

Abstract

Background: Early prediction of preeclampsia is challenging because its causes are poorly understood, its risk factors are varied, and it likely has multiple pathogenic phenotypes. Statistical learning methods are well-equipped to deal with a large number of variables, such as patients' clinical and laboratory data, and to select the most informative features automatically.

Objective: Our objective was to use statistical learning methods to analyze all available clinical and laboratory data that were obtained during routine prenatal visits in early pregnancy and to use them to develop a prediction model for preeclampsia.

Study design: This was a retrospective cohort study that used data from 16,370 births at Lucile Packard Children's Hospital at Stanford, CA, from April 2014 to January 2018. Two statistical learning algorithms were used to build a predictive model: (1) elastic net and (2) gradient boosting. Models for all preeclampsia and for early-onset preeclampsia (<34 weeks of gestation) were fitted with patient data that were available at <16 weeks of gestational age. The 67 variables that were considered in the models included maternal characteristics, medical history, routine prenatal laboratory results, and medication intake. The area under the receiver operating characteristic curve, true-positive rate, and false-positive rate were assessed via cross-validation.
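
To illustrate how an elastic net penalty can select informative features automatically, the following is a minimal sketch in plain Python and is not the authors' implementation. It fits a logistic regression by proximal gradient descent on synthetic data. The function names, the penalty settings (lam, l1_ratio), the learning rate, and the data are all assumptions chosen for illustration; the study's variables, software, and tuning procedure are not described in this abstract.

```python
import math
import random


def soft_threshold(value, amount):
    """Shrink value toward zero; anything within `amount` of zero becomes exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_elastic_net_logistic(X, y, lam=0.2, l1_ratio=0.5, lr=0.1, n_iter=300):
    """Logistic regression with an elastic net penalty (illustrative settings).

    Penalty: lam * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2).
    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
            err = p - label
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b
        for j in range(d):
            # Gradient step on the smooth part (log loss plus ridge term) ...
            step = w[j] - lr * (grad_w[j] + lam * (1.0 - l1_ratio) * w[j])
            # ... then the proximal step for the lasso term, which can zero out w[j].
            w[j] = soft_threshold(step, lr * lam * l1_ratio)
    return w, b


# Synthetic data (hypothetical): only feature 0 drives the outcome, features 1-5 are noise.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    row = [rng.gauss(0.0, 1.0) for _ in range(6)]
    y.append(1 if row[0] + rng.gauss(0.0, 0.5) > 0 else 0)
    X.append(row)

weights, intercept = fit_elastic_net_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
```

The lasso part of the penalty sets weakly informative coefficients to exactly zero, which is what makes the fitted model a subset of the candidate variables. The ridge part stabilizes the fit when predictors are correlated.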

Results: Using the elastic net algorithm, we developed a prediction model that contained a subset of the most informative features from all variables. The obtained prediction model for preeclampsia yielded an area under the curve of 0.79 (95% confidence interval, 0.75-0.83), true-positive rate (sensitivity) of 45.2%, and false-positive rate of 8.1%. The prediction model for early-onset preeclampsia achieved an area under the curve of 0.89 (95% confidence interval, 0.84-0.95), true-positive rate of 72.3%, and false-positive rate of 8.8%.
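
For readers less familiar with the metrics reported above, the following sketch shows how the area under the curve, true-positive rate, and false-positive rate can be computed from predicted risk scores. The function names, scores, labels, and the 0.5 threshold are made up for illustration and are not taken from the study.

```python
def tpr_fpr(scores, labels, threshold):
    """True-positive and false-positive rates when score >= threshold is called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


def roc_auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs in which the case scores higher.

    Tied pairs count as half. This pairwise form is equivalent to the area
    under the receiver operating characteristic curve.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for six pregnancies (1 = developed preeclampsia).
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]

auc = roc_auc(scores, labels)
tpr, fpr = tpr_fpr(scores, labels, threshold=0.5)
print(f"AUC={auc:.3f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
```

In this toy example, 7 of the 9 case and non-case pairs are ranked correctly, and the 0.5 threshold flags 2 of 3 cases and 1 of 3 non-cases. Raising the threshold lowers both rates, which is why a true-positive rate is usually reported together with the false-positive rate at which it was measured.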

Conclusion: Statistical learning methods in a retrospective cohort study automatically identified a set of significant features for prediction and yielded high prediction performance for preeclampsia risk from routine early pregnancy information.

Keywords: early prediction of preeclampsia; elastic net; gradient boosting algorithm; machine learning; preeclampsia; statistical learning.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Child
Female
Gestational Age
Humans
Machine Learning
Pre-Eclampsia* / diagnosis
Pregnancy
Retrospective Studies
Risk Factors

Grants and funding

R01 HL139844/HL/NHLBI NIH HHS/United States