A Bayesian Approach Towards Missing Covariate Data in Multilevel Latent Regression Models

Psychometrika. 2023 Dec;88(4):1495-1528. doi: 10.1007/s11336-022-09888-0. Epub 2022 Nov 23.

Abstract

Measuring latent traits and investigating their relations to a potentially large set of explanatory variables is typical in psychology, economics, and the social sciences. Such analyses often rely on survey data from large-scale studies that involve hierarchical structures and missing values in the covariates considered. This paper proposes a Bayesian estimation approach, based on data augmentation, for handling missing values in multilevel latent regression models. Population heterogeneity is modeled via multiple groups enriched with random intercepts. Bayesian estimation is implemented via Markov chain Monte Carlo sampling. To handle missing values, the sampling scheme is augmented to include draws from the full conditional distributions of the missing values. We suggest modeling these full conditional distributions by means of non-parametric classification and regression trees. This makes it possible to incorporate information from latent quantities that function as sufficient statistics. A simulation study reveals that this Bayesian approach provides valid inference and outperforms complete-case analysis and multiple imputation in terms of statistical efficiency and computation time. An empirical illustration using data on mathematical competencies demonstrates the usefulness of the suggested approach.
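
The augmentation step for missing covariates can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single covariate x with missing entries coded as None, a single predictor (the current draw of the latent trait, here called theta), a small hand-written regression tree, and imputations drawn by sampling an observed donor value from the leaf a case falls into. All function names are hypothetical.

```python
import random


def sse(ys):
    """Sum of squared deviations from the mean (the node impurity)."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(rows, targets, min_leaf):
    """Return the (feature, threshold) pair that lowers the impurity most, or None."""
    best, best_sse = None, sse(targets)
    for j in range(len(rows[0])):
        for t in sorted(set(r[j] for r in rows)):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            total = sse(left) + sse(right)
            if total < best_sse:
                best, best_sse = (j, t), total
    return best


def grow_tree(rows, targets, min_leaf=5):
    """Grow a regression tree; each leaf stores the observed values (donors)."""
    split = best_split(rows, targets, min_leaf)
    if split is None:
        return {"donors": list(targets)}
    j, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow_tree([r for r, _ in left], [y for _, y in left], min_leaf),
        "right": grow_tree([r for r, _ in right], [y for _, y in right], min_leaf),
    }


def draw_imputation(tree, row, rng):
    """Drop a case down the tree and sample one donor value from its leaf."""
    node = tree
    while "donors" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return rng.choice(node["donors"])


def impute_step(theta, x, rng, min_leaf=5):
    """One augmentation update (illustrative assumption, not the paper's sampler).

    The tree is fitted on cases with observed x, using the current latent
    draw theta as the only predictor; missing entries (None) are then redrawn.
    """
    observed = [(t, v) for t, v in zip(theta, x) if v is not None]
    tree = grow_tree([[t] for t, _ in observed], [v for _, v in observed], min_leaf)
    return [v if v is not None else draw_imputation(tree, [t], rng)
            for t, v in zip(theta, x)]
```

In a full sampler, a call such as `impute_step(theta, x, random.Random(1))` would be repeated in every iteration, after theta and the regression parameters have been updated, so that the imputations reflect the current state of the latent quantities. Observed entries are never changed; only the entries coded as None are redrawn.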

Keywords: Item response theory; Markov chain Monte Carlo; classification and regression trees; missing values; population heterogeneity.

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Data Interpretation, Statistical
  • Models, Statistical*
  • Monte Carlo Method
  • Psychometrics
  • Social Sciences*