Using Multiple Imputation with GEE with Non-monotone Missing Longitudinal Binary Outcomes

Psychometrika. 2020 Dec;85(4):890-904. doi: 10.1007/s11336-020-09729-y. Epub 2020 Oct 2.

Abstract

This paper considers multiple imputation (MI) approaches for handling non-monotone missing longitudinal binary responses when estimating parameters of a marginal model using generalized estimating equations (GEE). GEE has been shown to yield consistent estimates of the regression parameters for a marginal model when data are missing completely at random (MCAR). However, when data are missing at random (MAR), the GEE estimates may not be consistent; the MI approaches proposed in this paper minimize bias under MAR. The first MI approach proposed is based on a multivariate normal distribution, but with the addition of pairwise products among the binary outcomes to the multivariate normal vector. Even though the multivariate normal does not impute 0 or 1 values for the missing binary responses, as discussed by Horton et al. (Am Stat 57:229-232, 2003), we suggest not rounding when filling in the missing binary data because rounding could increase bias. The second MI approach considered is the fully conditional specification (FCS) approach. In this approach, we specify a logistic regression model for each outcome given the outcomes at the other time points and the covariates. Typically, one would include only main effects of the outcomes at the other time points as predictors in the FCS approach, but we explore whether bias can be reduced by also including pairwise interactions of the outcomes at the other time points. In a study of asymptotic bias with non-monotone missing data, the proposed MI approaches are also compared to GEE without imputation. Finally, the proposed methods are illustrated using data from a longitudinal clinical trial comparing four psychosocial treatments from the National Institute on Drug Abuse Collaborative Cocaine Treatment Study, in which patients' cocaine use was recorded monthly for 6 months during treatment.
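
The two variable constructions described above can be illustrated with a minimal Python sketch. It only builds the inputs to the imputation models and does not fit or draw from them. The function names (pairwise_products, mvn_augmented_vector, fcs_predictors), the use of None to mark a missing response, and the choice to treat a product as missing whenever either factor is missing are assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def pairwise_products(values):
    """Products v[j] * v[k] for all j < k, in itertools.combinations order.

    Assumption of this sketch: a product is marked missing (None)
    whenever either of its factors is missing.
    """
    return [
        None if a is None or b is None else a * b
        for a, b in combinations(values, 2)
    ]


def mvn_augmented_vector(y):
    """First approach: the binary outcomes followed by their pairwise
    products, forming the vector that would be modeled as multivariate
    normal. Imputed values would be left unrounded."""
    return list(y) + pairwise_products(y)


def fcs_predictors(y, target, covariates, interactions=False):
    """Second approach: predictor row for the logistic model of y[target].

    Always includes an intercept, the covariates, and the main effects of
    the outcomes at the other time points; with interactions=True it also
    includes the pairwise interactions among those other outcomes.
    Expects y to be fully filled in (observed or currently imputed).
    """
    others = [v for t, v in enumerate(y) if t != target]
    row = [1.0] + list(covariates) + others
    if interactions:
        row += pairwise_products(others)
    return row


# One hypothetical subject with four visits; visit 2 is missing.
augmented = mvn_augmented_vector([1, None, 0, 1])

# The same subject after a provisional fill-in of visit 2, with one
# hypothetical covariate value of 2.5.
main_only = fcs_predictors([1, 0, 0, 1], target=1, covariates=[2.5])
with_pairs = fcs_predictors(
    [1, 0, 0, 1], target=1, covariates=[2.5], interactions=True
)
```

With T time points the augmented vector has T + T(T-1)/2 elements, and the interaction version of the FCS model adds (T-1)(T-2)/2 predictors per outcome, so both constructions grow quadratically in the number of visits.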

Keywords: fully conditional specification; generalized estimating equations; missing at random; missing completely at random; multivariate normal.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Computer Simulation
  • Humans
  • Logistic Models
  • Longitudinal Studies
  • Models, Statistical*
  • Normal Distribution
  • Psychometrics