A Bayesian latent class model for predicting gestational age in health administrative data

Pharm Stat. 2022 Nov;21(6):1199-1218. doi: 10.1002/pst.2225. Epub 2022 May 10.

Abstract

Health administrative data are often of limited use in epidemiological studies of drug safety in pregnancy because they lack information on gestational age at birth (GAB). Although several studies have proposed algorithms to estimate GAB from claims databases, failing to account for the distinctive distributional shape of GAB can introduce bias in the estimates and in subsequent modeling. Hence, we develop a Bayesian latent class model to predict GAB. The model employs a mixture of Gaussian distributions with linear covariates within each class. This approach allows modeling heterogeneity in the population by identifying latent subgroups and estimating class-specific regression coefficients. We fit the model in a Bayesian framework, conducting posterior computation with Markov chain Monte Carlo methods. The method is illustrated with a dataset of 10,043 Rhode Island Medicaid mother-child pairs. We found that the three-class and six-class mixture specifications maximized prediction accuracy. Based on our results, Medicaid women were partitioned into three classes, characterized by extreme preterm or preterm birth, preterm or "early" term birth, and "late" term birth. Obstetrical complications appeared to have a significant influence on class membership. Altogether, compared with traditional linear models, our approach shows an advantage in predictive accuracy because of its superior flexibility in modeling a skewed response and population heterogeneity.
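As an illustration of the model structure described in the abstract (a mixture of Gaussian distributions with a linear predictor within each class), the following is a minimal sketch, not the authors' implementation. It assumes fixed mixing weights, whereas the abstract suggests that covariates such as obstetrical complications influence class membership. It also takes the parameters as given rather than sampling them by Markov chain Monte Carlo. All function names and all numeric values (weights, coefficients, standard deviations) are hypothetical and chosen only to make the example runnable; none come from the paper.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def linear_predictor(beta, x):
    """Class-specific mean: dot product of coefficients and covariates."""
    return sum(b * xi for b, xi in zip(beta, x))

def class_posterior(y, x, weights, betas, sds):
    """Probability of each latent class given an observed outcome y and covariates x."""
    parts = [
        w * normal_pdf(y, linear_predictor(beta, x), sd)
        for w, beta, sd in zip(weights, betas, sds)
    ]
    total = sum(parts)  # this is the mixture density of y given x
    return [p / total for p in parts]

def predict_mean(x, weights, betas):
    """Predicted outcome: weight-averaged class-specific means."""
    return sum(w * linear_predictor(beta, x) for w, beta in zip(weights, betas))

# Hypothetical three-class parameters (illustration only, not estimates from the paper).
# Covariates: x = [intercept, complication indicator]; outcome in weeks.
weights = [0.1, 0.3, 0.6]
betas = [[30.0, -2.0], [37.0, -1.0], [39.5, -0.5]]
sds = [2.5, 1.0, 0.8]

x = [1.0, 1.0]
posterior = class_posterior(39.0, x, weights, betas, sds)
predicted = predict_mean(x, weights, betas)
```

In a full Bayesian fit, the weights, coefficients, and standard deviations would each have a prior and be drawn from the posterior, and predictions would be averaged over those draws rather than computed from a single parameter set.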

Keywords: Bayesian latent class model; administrative data; finite mixture model; gestational age at birth.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Female
  • Gestational Age
  • Humans
  • Infant, Newborn
  • Latent Class Analysis
  • Models, Statistical*
  • Pregnancy
  • Premature Birth* / epidemiology