Prediction of gestational diabetes mellitus in Asian women using machine learning algorithms

Byung Soo Kang; Seon Ui Lee; Subeen Hong; Sae Kyung Choi; Jae Eun Shin; Jeong Ha Wie; Yun Sung Jo; Yeon Hee Kim; Kicheol Kil; Yoo Hyun Chung; Kyunghoon Jung; Hanul Hong; In Yang Park; Hyun Sun Ko

doi:10.1038/s41598-023-39680-8

Sci Rep. 2023 Aug 16;13(1):13356. doi: 10.1038/s41598-023-39680-8.

Authors

Byung Soo Kang¹, Seon Ui Lee², Subeen Hong¹, Sae Kyung Choi³, Jae Eun Shin⁴, Jeong Ha Wie⁵, Yun Sung Jo², Yeon Hee Kim⁶, Kicheol Kil⁷, Yoo Hyun Chung⁸, Kyunghoon Jung⁹, Hanul Hong⁹, In Yang Park¹, Hyun Sun Ko¹⁰

Affiliations

¹ Department of Obstetrics and Gynecology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
² Department of Obstetrics and Gynecology, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
³ Department of Obstetrics and Gynecology, Incheon St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁴ Department of Obstetrics and Gynecology, Bucheon St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁵ Department of Obstetrics and Gynecology, Eunpyeong St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁶ Department of Obstetrics and Gynecology, Uijeongbu St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁷ Department of Obstetrics and Gynecology, Yeouido St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁸ Department of Obstetrics and Gynecology, Daejeon St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
⁹ Innerwave Co., Ltd, Seoul, Korea.
¹⁰ Department of Obstetrics and Gynecology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea. mongkoko@catholic.ac.kr.

Abstract

This study developed a machine learning algorithm to predict gestational diabetes mellitus (GDM) using retrospective data from 34,387 pregnancies at multiple centers in South Korea. Variables were collected at baseline, E0 (until 10 weeks' gestation), E1 (11-13 weeks' gestation), and M1 (14-24 weeks' gestation). The data set was randomly divided into training and test sets (7:3 ratio) to compare the performance of light gradient boosting machine (LGBM) and extreme gradient boosting (XGBoost) algorithms using the full set of variables (original). A prediction model built on the whole cohort achieved area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) values of 0.711 and 0.246 at baseline, 0.720 and 0.256 at E0, 0.721 and 0.262 at E1, and 0.804 and 0.442 at M1, respectively. Three models with different variable sets were then compared: [a] variables from clinical guidelines; [b] variables selected by Shapley additive explanations (SHAP) values; and [c] variables selected by the Boruta algorithm. Based on model [c], which used the fewest variables and performed similarly to or better than the other models, simple questionnaires were developed. The combined use of maternal factors and laboratory data could effectively predict individual risk of GDM using a machine learning model.
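The two evaluation metrics reported above, AUC and AUPR on a 7:3 train/test split, can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names (`split_70_30`, `roc_auc`, `average_precision`), the synthetic cohort, its 10% outcome rate, and the Gaussian risk score standing in for an LGBM or XGBoost output are all assumptions made for illustration. AUPR is approximated here by step-wise average precision, which is one common estimator and may differ from the one used in the study.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and split it 7:3 into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC via the rank-sum formula; tied scores receive their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Mean of the precision values at each true positive, ranked by score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


if __name__ == "__main__":
    # Synthetic cohort (assumption): 10% outcome rate, score shifted by +1 for cases.
    rng = random.Random(42)
    cohort = []
    for _ in range(1000):
        outcome = 1 if rng.random() < 0.10 else 0
        score = rng.gauss(1.0 if outcome else 0.0, 1.0)
        cohort.append((outcome, score))

    _, test = split_70_30(cohort)
    y_test = [y for y, _ in test]
    s_test = [s for _, s in test]
    print("test AUC :", round(roc_auc(y_test, s_test), 3))
    print("test AUPR:", round(average_precision(y_test, s_test), 3))
```

When reading such numbers, note that an uninformative score yields an AUC near 0.5, whereas its expected AUPR is close to the outcome prevalence in the test set, so AUPR values are best interpreted against that prevalence.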

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Diabetes, Gestational* / diagnosis
Diabetes, Gestational* / epidemiology
East Asian People
Female
Humans
Machine Learning
Pregnancy
Republic of Korea
Retrospective Studies