Multiple Imputation of Missing Race/Ethnicity Information in the National Assisted Reproductive Technology Surveillance System

J Womens Health (Larchmt). 2024 Mar;33(3):328-338. doi: 10.1089/jwh.2023.0267. Epub 2023 Dec 19.

Abstract

Background: Missing race/ethnicity data are common in many surveillance systems and registries, which may limit complete and accurate assessments of racial and ethnic disparities. The Centers for Disease Control and Prevention's National Assisted Reproductive Technology (ART) Surveillance System (NASS) has a congressional mandate to collect data on all ART cycles performed by fertility clinics in the United States and provides valuable information on ART utilization and treatment outcomes. However, race/ethnicity data are missing for many ART cycles in NASS.

Materials and Methods: We multiply imputed missing race/ethnicity data using variables from NASS and additional zip code-level race/ethnicity information from U.S. Census data. To evaluate the quality of the imputed data, we generated training data by imposing missing values on known race/ethnicity under a missing-at-random assumption, imputed them, and examined the relationship between race/ethnicity and the rate of stillbirth per pregnancy.

Results: The distribution of imputed race/ethnicity was comparable to the reported distribution, with the largest difference being 0.53% for non-Hispanic Asian. Our imputation procedure was well calibrated and, on average, correctly identified 89.91% (standard error = 0.18) of known race/ethnicity values in the training data. Compared with complete-case analysis, using multiply imputed data reduced the bias of parameter estimates (the range of bias for stillbirth per pregnancy across race/ethnicity groups was 0.02%-0.18% for imputed-data analysis versus 0.04%-0.66% for complete-case analysis) and yielded narrower confidence intervals.

Conclusions: Our results underscore the importance of collecting complete race/ethnicity information for ART surveillance. However, when missingness exists, multiply imputed race/ethnicity can improve the accuracy and precision of health outcomes estimated across racial/ethnic groups.
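
Illustrative sketch: the evaluation design described in the abstract (impose missingness on known values under a missing-at-random assumption, impute several times, then check how often the known value is recovered) can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' imputation model. The data are synthetic, the group labels "g1"/"g2" and areas "A"/"B" are hypothetical placeholders, and the imputation step is a simple random draw from the observed distribution within the same area, standing in for the far richer model built from NASS variables and zip code-level Census information.

```python
import random
from collections import Counter, defaultdict


def mask_mar(records, p_missing_by_area, rng):
    """Hide some known labels. The chance of hiding depends only on the
    observed covariate `area`, which is what missing at random means here."""
    return [None if rng.random() < p_missing_by_area[area] else label
            for area, label in records]


def impute_once(records, masked, rng):
    """Fill each hidden label with a random draw from the labels still
    observed in the same area (assumes every area keeps some observed labels)."""
    observed = defaultdict(Counter)
    for (area, _), label in zip(records, masked):
        if label is not None:
            observed[area][label] += 1
    completed = []
    for (area, _), label in zip(records, masked):
        if label is None:
            cats = list(observed[area])
            weights = [observed[area][c] for c in cats]
            label = rng.choices(cats, weights=weights, k=1)[0]
        completed.append(label)
    return completed


def recovery_rate(records, masked, completed):
    """Share of hidden labels whose imputed value equals the known value."""
    hidden = [i for i, label in enumerate(masked) if label is None]
    hits = sum(completed[i] == records[i][1] for i in hidden)
    return hits / len(hidden)


# Synthetic data with hypothetical areas and group labels (not NASS data).
rng = random.Random(0)
composition = {"A": {"g1": 0.9, "g2": 0.1}, "B": {"g1": 0.2, "g2": 0.8}}
records = []
for _ in range(2000):
    area = rng.choice(["A", "B"])
    cats = list(composition[area])
    label = rng.choices(cats, weights=[composition[area][c] for c in cats], k=1)[0]
    records.append((area, label))

masked = mask_mar(records, {"A": 0.1, "B": 0.3}, rng)

M = 5  # number of imputed data sets
imputations = [impute_once(records, masked, rng) for _ in range(M)]
rates = [recovery_rate(records, masked, completed) for completed in imputations]
mean_rate = sum(rates) / M
print(f"mean share of hidden labels recovered over {M} imputations: {mean_rate:.3f}")
```

Because each imputation is a random draw rather than a single best guess, the recovery rate varies across the M completed data sets; averaging over them mirrors the "on average" wording of the reported result, although the number printed here has no bearing on the published figures.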

Keywords: assisted reproductive technology; complete-case analysis; missing race/ethnicity; multiple imputation; stillbirth.

MeSH terms

  • Ethnicity*
  • Female
  • Humans
  • Population Surveillance
  • Pregnancy
  • Racial Groups
  • Reproductive Techniques, Assisted
  • Stillbirth*
  • United States / epidemiology